I have dataset with around 80-90 variables. I want to create dataset in such way that I have to drop those variables which are null for all records. It should be dynamic in such way if new variable is added with null value it should be dropped.
Thanks
My code is better ,shorter and faster than support . sas . com .
data class; set sashelp.class; call missing(age,name); run; proc sql noprint ; select cat('n(',strip(name),') as ',name) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='CLASS'; create table temp as select &list from class; quit; data _null_; set temp; length drop $ 4000; array x{*} _numeric_; do i=1 to dim(x); if x{i}=0 then drop=catx(' ',drop,vname(x{i})); end; call symputx('drop',drop); run; data want; set class(drop= &drop ); run;
Xia Keshan
Q1 = Why 80-90 variables - consider normalizing your data!
Try the below example and modify for your particular data:
data work.have;
length a b c d $20;
a="GHGYT";b="£$"; output;
run;
proc sql;
create table TO_PROCESS
(
NAME char(50),
CNT num
);
quit;
data tmp;
set sashelp.vcolumn (where=(libname="WORK" and MEMNAME="HAVE"));
call execute('proc sql;
insert into WORK.TO_PROCESS
set NAME="'||strip(NAME)||'",
CNT=(select count('||strip(NAME)||') from WORK.HAVE where '||strip(NAME)||' is not null);
quit;');
run;
proc sql noprint;
select NAME
into :DROP_LIST separated by ' '
from WORK.TO_PROCESS
where CNT=0;
quit;
data want (drop=&drop_list.);
set have;
run;
My code is better ,shorter and faster than support . sas . com .
data class; set sashelp.class; call missing(age,name); run; proc sql noprint ; select cat('n(',strip(name),') as ',name) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='CLASS'; create table temp as select &list from class; quit; data _null_; set temp; length drop $ 4000; array x{*} _numeric_; do i=1 to dim(x); if x{i}=0 then drop=catx(' ',drop,vname(x{i})); end; call symputx('drop',drop); run; data want; set class(drop= &drop ); run;
Xia Keshan
Hi @Ksharp, Is it possible to drop null variables without specifying by variable names because I have hundreds of them?
Check @MikeZeb 's solution.
hi ... you got a very nice answer from Ksharp, but here's another idea that use PROC FREQ rather than PROC SQL to find the variables with all missing values ...
data class;
set sashelp.class;
call missing(age,name);
run;
ods output nlevels=nlvs (where=(nnonmisslevels eq 0));
proc freq data=class nlevels;
ods select nlevels;
run;
proc sql noprint ;
select tablevar into :drop separated by ' ' from nlvs;
quit;
data want;
set class(drop=&drop);
run;
ps more on NLEVELS at 30867 - Modernizing Your SAS Code: PROC FREQ Applications and in Appendix C of http://www.lexjansen.com/nesug/nesug11/ds/ds12.pdf
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.