BookmarkSubscribeRSS Feed
meriS
Fluorite | Level 6

I'd like to write a macro to replace any missing variables (numeric or character) with the most frequent value of that variable in a new column. 

I don't have much so far, but here is what I've come up with. I basically don't know how to determine the most frequent value. 

 

%MACRO impute_freq (dat, var_name);

proc freq data = &dat;

     by descending &var_name;

run;

 

%if &var_name = . %then

    newcol = ???

 

%else newcol = &var_name;

 

run;

 

%MEND impute_freq;

4 REPLIES 4
Reeza
Super User
That usually is not the recommended approach to dealing with missing data though it's easily achievable.

https://stats.idre.ucla.edu/sas/seminars/multiple-imputation-in-sas/mi_new_1/

You can also look into PROC STDIZE which has several methods to replace missing values.

PaigeMiller
Diamond | Level 26

Yes, with PROC STDIZE and METHOD=IN(ds), no macros are needed.

 

But I can't think of a good reason to use the most frequent value; I would recommend using either the mean or the median.

 

UNTESTED CODE

proc summary data=have;
    var variable1 variable2 ... ;
    output out=mode(drop=_:) mode=;
run;
data modes;
     set mode;
     array x variable1 variable2 ... ;
     _type_='LOCATION';
    output;
    do i=1 to dim(x);
         x(i)=1;
    end;
    _type_='SCALE';
    output;
    drop i;
run;
proc stdize data=have out=want reponly method=in(modes);
    var variable1 variable2 ... ;
run;

    

But really, don't use the MODE, use either MEDIAN or MEAN. A word to the wise ... 

 

And in addition, if you want to replace missings with the median or mean, its even easier, as this is built directly into PROC STDIZE.

--
Paige Miller
sustagens
Pyrite | Level 9

One way of doing it using a macro, despite being doable without

%macro Assign_Mode_to_Null(table,var);
proc sql noprint;
select &var into :mode
from (select &var, count(&var) as N 
      from &table 
      group by &var)
having N=max(N) and monotonic()=1; 
quit;

data want (drop=dsid type rc); 
dsid=open("&table",'i'); 
type=vartype(dsid,varnum(dsid,"&var")); 
rc=close(dsid);
set &table;
if missing(&var) then do; 
if type='N' then &var=&mode; 
if type='C' then do; &var="&mode"; drop &mode; end;
end;
run;
%mend Assign_Mode_to_Null;

%Assign_Mode_to_Null(have,var_name);

 

Ksharp
Super User
data have;
set sashelp.class;
if _n_ in (1 4 7) then call missing(age, sex);
run;




%MACRO impute_freq (dat, var_name);
proc freq data=&dat(where=(&var_name is not missing)) order=freq noprint;
table &var_name/out=mode;
run;
data &dat;
 set &dat;
 if _n_=1 then set mode(keep=&var_name rename=(&var_name=_&var_name));
 if missing(&var_name) then &var_name=_&var_name;
 drop _&var_name;
run;
%MEND impute_freq;

%impute_freq(have,age)
%impute_freq(have,sex)

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1530 views
  • 2 likes
  • 5 in conversation