I have dataset with around 80-90 variables. I want to create dataset in such way that I have to drop those variables which are null for all records. It should be dynamic in such way if new variable is added with null value it should be dropped.
Thanks
My code is better ,shorter and faster than support . sas . com .
data class; set sashelp.class; call missing(age,name); run; proc sql noprint ; select cat('n(',strip(name),') as ',name) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='CLASS'; create table temp as select &list from class; quit; data _null_; set temp; length drop $ 4000; array x{*} _numeric_; do i=1 to dim(x); if x{i}=0 then drop=catx(' ',drop,vname(x{i})); end; call symputx('drop',drop); run; data want; set class(drop= &drop ); run;
Xia Keshan
Q1 = Why 80-90 variables - consider normalizing your data!
Try the below example and modify for your particular data:
data work.have;
length a b c d $20;
a="GHGYT";b="£$"; output;
run;
proc sql;
create table TO_PROCESS
(
NAME char(50),
CNT num
);
quit;
data tmp;
set sashelp.vcolumn (where=(libname="WORK" and MEMNAME="HAVE"));
call execute('proc sql;
insert into WORK.TO_PROCESS
set NAME="'||strip(NAME)||'",
CNT=(select count('||strip(NAME)||') from WORK.HAVE where '||strip(NAME)||' is not null);
quit;');
run;
proc sql noprint;
select NAME
into :DROP_LIST separated by ' '
from WORK.TO_PROCESS
where CNT=0;
quit;
data want (drop=&drop_list.);
set have;
run;
My code is better ,shorter and faster than support . sas . com .
data class; set sashelp.class; call missing(age,name); run; proc sql noprint ; select cat('n(',strip(name),') as ',name) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='CLASS'; create table temp as select &list from class; quit; data _null_; set temp; length drop $ 4000; array x{*} _numeric_; do i=1 to dim(x); if x{i}=0 then drop=catx(' ',drop,vname(x{i})); end; call symputx('drop',drop); run; data want; set class(drop= &drop ); run;
Xia Keshan
Hi @Ksharp, Is it possible to drop null variables without specifying by variable names because I have hundreds of them?
Check @MikeZeb 's solution.
hi ... you got a very nice answer from Ksharp, but here's another idea that use PROC FREQ rather than PROC SQL to find the variables with all missing values ...
data class;
set sashelp.class;
call missing(age,name);
run;
ods output nlevels=nlvs (where=(nnonmisslevels eq 0));
proc freq data=class nlevels;
ods select nlevels;
run;
proc sql noprint ;
select tablevar into :drop separated by ' ' from nlvs;
quit;
data want;
set class(drop=&drop);
run;
ps more on NLEVELS at 30867 - Modernizing Your SAS Code: PROC FREQ Applications and in Appendix C of http://www.lexjansen.com/nesug/nesug11/ds/ds12.pdf
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.