BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
thesasuser
Pyrite | Level 9

Imputing missing numeric values is well discussed and solutions are well defined.
Is there a way we can replace a missing value of character variable with the values of observation with largest frequency?

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

See a rather brute force example:

/* create example data */
data class;
set sashelp.class end=done;
output;
if done
then do;
  call missing(of _all_);
  output;
end;
run;

/* update  with largest freq */
proc sql;
create table upd (drop=count) as
select sex, count(*) as count from class
where sex ne ' '
group by sex
order by count descending;
create table want as
select name, coalesce(a.sex,b.sex) as sex, age, height, weight
from class a, upd (obs=1) b;
drop table upd;
quit;

View solution in original post

8 REPLIES 8
Kurt_Bremser
Super User

See a rather brute force example:

/* create example data */
data class;
set sashelp.class end=done;
output;
if done
then do;
  call missing(of _all_);
  output;
end;
run;

/* update  with largest freq */
proc sql;
create table upd (drop=count) as
select sex, count(*) as count from class
where sex ne ' '
group by sex
order by count descending;
create table want as
select name, coalesce(a.sex,b.sex) as sex, age, height, weight
from class a, upd (obs=1) b;
drop table upd;
quit;
PeterClemmensen
Tourmaline | Level 20

Yes there are ways. Show us your data to get a code answer

 

SuryaKiran
Meteorite | Level 14

What do you mean by values of observation with largest frequency, Can you be more specific. 

 

If you want all the missing characters to be replaced by some other values then you can use arrays as following:

data want;
set sashelp.class;
array char _Character_;
do over char;
if missing(char) then char="Some value";
end;
run;
Thanks,
Suryakiran
DanielLangley
Quartz | Level 8

Here's a solution. I would imagine that there are plenty of better solutions. For example if there is an equal amount of the most common it will just pick one of them.

 

DATA have;
INPUT ID Imp $;
*---+----1----+----2----+----3----+----4----+----5;
CARDS;
1 A
2 A
3 .
4 D
5 D
6 D
;

Proc sql outobs=1;
create table mostcommon as
select t1.imp, count(t1.imp) AS Count
from work.have as t1
where t1.imp is not missing
group by t1.imp
order by CALCULATED count desc
;
quit;

Proc sql;
create table want as
select have.ID, COALESCEC(have.Imp,mc.Imp) as Imp
from work.have as have
left join work.mostcommon as mc on 1=1
;
quit;

 

SuryaKiran
Meteorite | Level 14
DATA have;
infile datalines missover;
INPUT ID Imp $;
CARDS;
1 A
2 A
3
4 D
5 D
6 D
;

proc sql noprint;
select Imp INTO:Rep
from (select Imp,COUNT(Imp) as count
		from have
			Group by Imp)
Having count=max(Count)
;
quit;


data want;
set Have;
array char _Character_;
do over char;
if missing(char) then char="&Rep";
end;
run;

NOTE: Character missing values are represented by blank, not period.

 

 

Thanks,
Suryakiran
DanielLangley
Quartz | Level 8

Thank you.

 

I didn't know about

 

infile datalines missover;

 

s_lassen
Meteorite | Level 14

First, you have to find the most common values, e.g.:

proc freq data=sashelp.class noprint order=freq;

  tables age /out=age;

  tables height/ out=height;

  tables sex/ out=sex;

run;

Then, the easiest may be to put the values in macro variables:

data _null_;

  merge age sex height; /* with ORDER=FREQ, the first obs is the most common value */

  call symputx('age',age);

  call symputx('height',height);

  call symputx('sex',sex);

  stop;

run;

Given this test data:

data test;

  set sashelp.class;

  if _N_>8 then

    call missing(height,age);

  if name =:'L' then

    sex=' ';

run;

We can now fill in the blanks:

data want;

  set test;

  age=coalesce(age,&age);

  height=coalesce(height,&height);

  sex=coalescec(sex,"&sex");

run;

 

 

thesasuser
Pyrite | Level 9

Thank you all for the response.

All solutions were good . But as I understand it I can accept one as the solution.

So selecting the first one.

Thanks again

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3469 views
  • 2 likes
  • 6 in conversation