I need help finding a way to drop any variables that contain all missing values. I have attempted to make a macro, but it seems to be getting a bit cumbersome, and I don't know how to finish what I've started. Can someone help me simplify (or even complete) my code?
In my code below, I have a table TEMP that contains 5 variables. They can be a combination of numeric or character. My macro is looking at the input dataset and determining the number of variables in the table. I then create a table MISSING that creates a corresponding numeric variable for each existing variable in TEMP and store in it the binary 0/1 from the missing() function. I then use PROC MEANS to get the sum of each new variable. In the output table MISSING_SUM, I get the system variable _FREQ_ and the sum of each created binary variable. I figured that if _FREQ_=VARn, then I want to keep that variable and trace it back to the original data.
Is there a better thought process here, or is there a way I can finish my macro? In the end, if I were to use PROC PRINT on the final result, it would display TEMP showing F1-F4 instead of F1-F5 since F5 is completely blank or missing.
data temp;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
%macro missing_var (dsin=);
proc sql noprint;
select nvar into :nvar from sashelp.vtable where libname='WORK' & memname="&dsin.";
%let nvar=%sysfunc(compress(&nvar.));
select name into :var1-:var&nvar. from sashelp.vcolumn where libname='WORK' & memname="&dsin.";
quit;
data missing (keep=var1-var&nvar.);
set &dsin.;
%do i=1 %to &nvar.;
var&i.=missing(&&var&i.);
%end;
run;
proc means data=missing noprint sum;
var _all_;
output out=missing_sum sum=;
run;
%mend missing_var;
%missing_var (dsin=temp);
Hi @djbateman
Somebody here posted this code.
data class;
set sashelp.class;
call missing(height,sex);
run;
proc format ;
value $missfmt ' '="Missing" other="Not Missing";
value missfmt ._-.Z ="Missing" other="Not Missing";
run;
ods select none;
proc freq data=class nlevels;
format _numeric_ missfmt. _character_ $missfmt.;
ods output nlevels=nlevels;
run;
ods select all;
data nlevels;
length TableVar $32 NLevels NMissLevels NNonMissLevels 8;
set nlevels;
if NNonMissLevels eq 0 then Flag=1;
run;
proc print;
run;
Hi @djbateman
Somebody here posted this code.
data class;
set sashelp.class;
call missing(height,sex);
run;
proc format ;
value $missfmt ' '="Missing" other="Not Missing";
value missfmt ._-.Z ="Missing" other="Not Missing";
run;
ods select none;
proc freq data=class nlevels;
format _numeric_ missfmt. _character_ $missfmt.;
ods output nlevels=nlevels;
run;
ods select all;
data nlevels;
length TableVar $32 NLevels NMissLevels NNonMissLevels 8;
set nlevels;
if NNonMissLevels eq 0 then Flag=1;
run;
proc print;
run;
That worked great. I was able to modify and simplify in a way that didn't even require the use of a macro. Thank you so much!
data temp;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
proc format;
value $missfmt ' '="Missing" other="Not Missing";
value missfmt ._-.Z ="Missing" other="Not Missing";
run;
ods select none;
proc freq data=temp nlevels;
format _numeric_ missfmt. _character_ $missfmt.;
ods output nlevels=nlevels;
run;
ods select all;
proc sql noprint;
select tablevar into :dropvars separated by ' ' from nlevels where NNonMissLevels=0;
quit;
data temp;
set temp (drop=&dropvars.);
run;
You are welcome!
And Don't forget SQL way.
data have;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
/*firstly get variable name*/
proc transpose data=have(obs=0) out=vname;
var _all_;
run;
/*then know if variable is all missing*/
proc sql noprint;
select catx(' ','n(',_name_,') as',_name_) into :vnames separated by ','
from vname;
create table temp as
select &vnames from have;
quit;
proc transpose data=temp out=want;
var _all_;
run;
proc sql noprint;
select _name_ into : missing_variables separated by ' '
from want
where col1=0;
quit;
%put Missing variables are : &missing_variables ;
And Don't forget IML way, if you have it.
data have;
input F1 $ F2 F3 F4 F5;
cards;
A 1 2 . .
B 3 4 . .
C 5 6 . .
D . 5 1 .
E 4 6 5 .
F 4 . 6 .
G 3 3 6 .
H 5 3 2 .
;
run;
proc iml;
use have;
read all var _char_ into char[c=vname_char];
read all var _num_ into num[c=vname_num];
close;
missing_char=vname_char[loc(countn(char,'col')=0)];
missing_num =vname_num [loc(countn(num ,'col')=0)];
missing_var=missing_char//missing_num;
create want var{missing_var};
append;
close;
quit;
proc sql noprint;
select missing_var into : missing_variables separated by ' '
from want;
quit;
%put Missing variables are : &missing_variables ;
Thank you . I have added those and many of your important other threads to my personal notes 🙂
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.