I am trying to create a macro that drops variables having missing values % greater than 90. It is working fine for numeric variables. Since PROC MEANS does not work for character variables and PROC FREQ produces clumsy output, i ain't able to replice it for character variables.
data have;
set sashelp.class;
if _n_ in (5 10 12) then do;
call missing(height,weight);
end;
if _n_ in (7 9) then do;
call missing(age);
end;
r = ranuni(1);
run;
ods output summary=nmiss;
proc means data=have stackods nmiss n;
var _numeric_;
run;
ods output close;
proc print data=nmiss;
run;
%let pmisslist=;
proc sql noprint;
select variable into :pmisslist separated by ' '
from nmiss where nmiss/(sum(n,nmiss)) ge .15;
quit;
run;
%put NOTE: &=PMISSLIST;
Proc freq is not efficient for large table , But SQL was.
data have; set sashelp.class; if _n_ in (1:18) then do; call missing(height,weight); end; if _n_ in (7 9) then do; call missing(age); end; r = ranuni(1); run; proc sql; select cat('nmiss(',strip(name),')/count(*) as ',strip(name)) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE'; create table temp as select &list from have; quit; proc transpose data=temp out=temp1;run; proc sql; select _NAME_ into : keep separated by ',' from temp1 where col1 lt .9 ; create table want as select &keep from have; quit;
Xia Keshan
This should get you started.
Proc freq is not efficient for large table , But SQL was.
data have; set sashelp.class; if _n_ in (1:18) then do; call missing(height,weight); end; if _n_ in (7 9) then do; call missing(age); end; r = ranuni(1); run; proc sql; select cat('nmiss(',strip(name),')/count(*) as ',strip(name)) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE'; create table temp as select &list from have; quit; proc transpose data=temp out=temp1;run; proc sql; select _NAME_ into : keep separated by ',' from temp1 where col1 lt .9 ; create table want as select &keep from have; quit;
Xia Keshan
Where is your proof?
xia keshan wrote:
Proc freq is not efficient for large table , But SQL was.
If you have many levels and that is the issue try a FORMAT for the character variables that reduces everything to 2 levels:
proc format library=work;
value $dummy
' '='Missing'
other='Not Missing';
run;
and
Proc freq data=have;
table charactervariablename/ missing ;
format charactervariablename dummy.;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.