I am trying to create a macro that drops variables having missing values % greater than 90. It is working fine for numeric variables. Since PROC MEANS does not work for character variables and PROC FREQ produces clumsy output, i ain't able to replice it for character variables.
data have;
set sashelp.class;
if _n_ in (5 10 12) then do;
call missing(height,weight);
end;
if _n_ in (7 9) then do;
call missing(age);
end;
r = ranuni(1);
run;
ods output summary=nmiss;
proc means data=have stackods nmiss n;
var _numeric_;
run;
ods output close;
proc print data=nmiss;
run;
%let pmisslist=;
proc sql noprint;
select variable into :pmisslist separated by ' '
from nmiss where nmiss/(sum(n,nmiss)) ge .15;
quit;
run;
%put NOTE: &=PMISSLIST;
Proc freq is not efficient for large table , But SQL was.
data have; set sashelp.class; if _n_ in (1:18) then do; call missing(height,weight); end; if _n_ in (7 9) then do; call missing(age); end; r = ranuni(1); run; proc sql; select cat('nmiss(',strip(name),')/count(*) as ',strip(name)) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE'; create table temp as select &list from have; quit; proc transpose data=temp out=temp1;run; proc sql; select _NAME_ into : keep separated by ',' from temp1 where col1 lt .9 ; create table want as select &keep from have; quit;
Xia Keshan
This should get you started.
Proc freq is not efficient for large table , But SQL was.
data have; set sashelp.class; if _n_ in (1:18) then do; call missing(height,weight); end; if _n_ in (7 9) then do; call missing(age); end; r = ranuni(1); run; proc sql; select cat('nmiss(',strip(name),')/count(*) as ',strip(name)) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE'; create table temp as select &list from have; quit; proc transpose data=temp out=temp1;run; proc sql; select _NAME_ into : keep separated by ',' from temp1 where col1 lt .9 ; create table want as select &keep from have; quit;
Xia Keshan
Where is your proof?
xia keshan wrote:
Proc freq is not efficient for large table , But SQL was.
If you have many levels and that is the issue try a FORMAT for the character variables that reduces everything to 2 levels:
proc format library=work;
value $dummy
' '='Missing'
other='Not Missing';
run;
and
Proc freq data=have;
table charactervariablename/ missing ;
format charactervariablename dummy.;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.