I need help finding a way to drop any variables that contain all missing values. I have attempted to make a macro, but it seems to be getting a bit cumbersome, and I don't know how to finish what I've started. Can someone help me simplify (or even complete) my code?
In my code below, I have a table TEMP that contains 5 variables. They can be a combination of numeric or character. My macro is looking at the input dataset and determining the number of variables in the table. I then create a table MISSING that creates a corresponding numeric variable for each existing variable in TEMP and store in it the binary 0/1 from the missing() function. I then use PROC MEANS to get the sum of each new variable. In the output table MISSING_SUM, I get the system variable _FREQ_ and the sum of each created binary variable. I figured that if _FREQ_=VARn, then I want to keep that variable and trace it back to the original data.
Is there a better thought process here, or is there a way I can finish my macro? In the end, if I were to use PROC PRINT on the final result, it would display TEMP showing F1-F4 instead of F1-F5 since F5 is completely blank or missing.
data temp;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
%macro missing_var (dsin=);
proc sql noprint;
select nvar into :nvar from sashelp.vtable where libname='WORK' & memname="&dsin.";
%let nvar=%sysfunc(compress(&nvar.));
select name into :var1-:var&nvar. from sashelp.vcolumn where libname='WORK' & memname="&dsin.";
quit;
data missing (keep=var1-var&nvar.);
set &dsin.;
%do i=1 %to &nvar.;
var&i.=missing(&&var&i.);
%end;
run;
proc means data=missing noprint sum;
var _all_;
output out=missing_sum sum=;
run;
%mend missing_var;
%missing_var (dsin=temp);
Hi @djbateman
Somebody here posted this code.
data class;
set sashelp.class;
call missing(height,sex);
run;
proc format ;
value $missfmt ' '="Missing" other="Not Missing";
value missfmt ._-.Z ="Missing" other="Not Missing";
run;
ods select none;
proc freq data=class nlevels;
format _numeric_ missfmt. _character_ $missfmt.;
ods output nlevels=nlevels;
run;
ods select all;
data nlevels;
length TableVar $32 NLevels NMissLevels NNonMissLevels 8;
set nlevels;
if NNonMissLevels eq 0 then Flag=1;
run;
proc print;
run;
Hi @djbateman
Somebody here posted this code.
data class;
set sashelp.class;
call missing(height,sex);
run;
proc format ;
value $missfmt ' '="Missing" other="Not Missing";
value missfmt ._-.Z ="Missing" other="Not Missing";
run;
ods select none;
proc freq data=class nlevels;
format _numeric_ missfmt. _character_ $missfmt.;
ods output nlevels=nlevels;
run;
ods select all;
data nlevels;
length TableVar $32 NLevels NMissLevels NNonMissLevels 8;
set nlevels;
if NNonMissLevels eq 0 then Flag=1;
run;
proc print;
run;
That worked great. I was able to modify and simplify in a way that didn't even require the use of a macro. Thank you so much!
data temp;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
proc format;
value $missfmt ' '="Missing" other="Not Missing";
value missfmt ._-.Z ="Missing" other="Not Missing";
run;
ods select none;
proc freq data=temp nlevels;
format _numeric_ missfmt. _character_ $missfmt.;
ods output nlevels=nlevels;
run;
ods select all;
proc sql noprint;
select tablevar into :dropvars separated by ' ' from nlevels where NNonMissLevels=0;
quit;
data temp;
set temp (drop=&dropvars.);
run;
You are welcome!
And Don't forget SQL way.
data have;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
/*firstly get variable name*/
proc transpose data=have(obs=0) out=vname;
var _all_;
run;
/*then know if variable is all missing*/
proc sql noprint;
select catx(' ','n(',_name_,') as',_name_) into :vnames separated by ','
from vname;
create table temp as
select &vnames from have;
quit;
proc transpose data=temp out=want;
var _all_;
run;
proc sql noprint;
select _name_ into : missing_variables separated by ' '
from want
where col1=0;
quit;
%put Missing variables are : &missing_variables ;
And Don't forget IML way, if you have it.
data have;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
proc iml;
use have;
read all var _char_ into char[c=vname_char];
read all var _num_ into num[c=vname_num];
close;
missing_char=vname_char[loc(countn(char,'col')=0)];
missing_num =vname_num [loc(countn(num ,'col')=0)];
missing_var=missing_char//missing_num;
create want var{missing_var};
append;
close;
quit;
proc sql noprint;
select missing_var into : missing_variables separated by ' '
from want;
quit;
%put Missing variables are : &missing_variables ;
Thank you . I have added those and many of your important other threads to my personal notes 🙂
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.