SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
meriS
Fluorite | Level 6

I'd like to write a macro to replace any missing variables (numeric or character) with the most frequent value of that variable in a new column. 

I don't have much so far, but here is what I've come up with. I basically don't know how to determine the most frequent value. 

 

%MACRO impute_freq (dat, var_name);

proc freq data = &dat;

     by descending &var_name;

run;

 

%if &var_name = . %then

    newcol = ???

 

%else newcol = &var_name;

 

run;

 

%MEND impute_freq;

4 REPLIES 4
Reeza
Super User
That usually is not the recommended approach to dealing with missing data though it's easily achievable.

https://stats.idre.ucla.edu/sas/seminars/multiple-imputation-in-sas/mi_new_1/

You can also look into PROC STDIZE which has several methods to replace missing values.

PaigeMiller
Diamond | Level 26

Yes, with PROC STDIZE and METHOD=IN(ds), no macros are needed.

 

But I can't think of a good reason to use the most frequent value; I would recommend using either the mean or the median.

 

UNTESTED CODE

proc summary data=have;
    var variable1 variable2 ... ;
    output out=mode(drop=_:) mode=;
run;
data modes;
     set mode;
     array x variable1 variable2 ... ;
     _type_='LOCATION';
    output;
    do i=1 to dim(x);
         x(i)=1;
    end;
    _type_='SCALE';
    output;
    drop i;
run;
proc stdize data=have out=want reponly method=in(modes);
    var variable1 variable2 ... ;
run;

    

But really, don't use the MODE, use either MEDIAN or MEAN. A word to the wise ... 

 

And in addition, if you want to replace missings with the median or mean, its even easier, as this is built directly into PROC STDIZE.

--
Paige Miller
sustagens
Pyrite | Level 9

One way of doing it using a macro, despite being doable without

%macro Assign_Mode_to_Null(table,var);
proc sql noprint;
select &var into :mode
from (select &var, count(&var) as N 
      from &table 
      group by &var)
having N=max(N) and monotonic()=1; 
quit;

data want (drop=dsid type rc); 
dsid=open("&table",'i'); 
type=vartype(dsid,varnum(dsid,"&var")); 
rc=close(dsid);
set &table;
if missing(&var) then do; 
if type='N' then &var=&mode; 
if type='C' then do; &var="&mode"; drop &mode; end;
end;
run;
%mend Assign_Mode_to_Null;

%Assign_Mode_to_Null(have,var_name);

 

Ksharp
Super User
data have;
set sashelp.class;
if _n_ in (1 4 7) then call missing(age, sex);
run;




%MACRO impute_freq (dat, var_name);
proc freq data=&dat(where=(&var_name is not missing)) order=freq noprint;
table &var_name/out=mode;
run;
data &dat;
 set &dat;
 if _n_=1 then set mode(keep=&var_name rename=(&var_name=_&var_name));
 if missing(&var_name) then &var_name=_&var_name;
 drop _&var_name;
run;
%MEND impute_freq;

%impute_freq(have,age)
%impute_freq(have,sex)

sas-innovate-white.png

Join us for our biggest event of the year!

Four days of inspiring keynotes, product reveals, hands-on learning opportunities, deep-dive demos, and peer-led breakouts. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2023 views
  • 2 likes
  • 5 in conversation