Hi everyone!
I'm having a little trouble with a data set. I've the idea of dropping a variable if all the observations have the value of 1,
The table is like this;
var1 var2 var3 var4
0.5 1 1.2 0.4
0.9 1 1.5 0.2
0.7 1 1.1 0.3
1 1 1 1
I want to drop the variable var2 in this case. My problem is that I can't do it mannualy, as the four variables are being included because of a macro, and the name of them can change.
Is there any way to remove it conditionally?
Thanks in advance.
To remove a variable you need to either DROP it or not KEEP it. Basically you need use code generation create the DROP statement (or DROP= dataset option. One easy way to do that is to put the variable name into a macro variable and then use the value of the macro variable in your code. You could look on-line for many examples of how to do this for variables with all missing values and just adopt one and modify it to work for variables that are all one.
Here is one way.
data have ;
input var1-var4 ;
cards;
0.5 1 1.2 0.4
0.9 1 1.5 0.2
0.7 1 1.1 0.3
1 1 1 1
;
%let droplist=;
data _null_;
array _flags (100) _temporary_ (100*1) ;
set have end=eof;
array _in var1-var4 ;
do _n_ =1 to dim(_in) ;
if _in(_n_) ne 1 then _flags(_n_)=0;
end;
if eof then do _n_=1 to dim(_in);
if _flags(_n_) then call execute(catx(' ','%let droplist=&droplist',nliteral(vname(_in(_n_))),';'));
end;
run;
%put &=droplist ;
data want ;
set have ;
drop &droplist;
run;
If you have more than 100 variables you want to test then just increase the size of the temporary array.
To remove a variable you need to either DROP it or not KEEP it. Basically you need use code generation create the DROP statement (or DROP= dataset option. One easy way to do that is to put the variable name into a macro variable and then use the value of the macro variable in your code. You could look on-line for many examples of how to do this for variables with all missing values and just adopt one and modify it to work for variables that are all one.
Here is one way.
data have ;
input var1-var4 ;
cards;
0.5 1 1.2 0.4
0.9 1 1.5 0.2
0.7 1 1.1 0.3
1 1 1 1
;
%let droplist=;
data _null_;
array _flags (100) _temporary_ (100*1) ;
set have end=eof;
array _in var1-var4 ;
do _n_ =1 to dim(_in) ;
if _in(_n_) ne 1 then _flags(_n_)=0;
end;
if eof then do _n_=1 to dim(_in);
if _flags(_n_) then call execute(catx(' ','%let droplist=&droplist',nliteral(vname(_in(_n_))),';'));
end;
run;
%put &=droplist ;
data want ;
set have ;
drop &droplist;
run;
If you have more than 100 variables you want to test then just increase the size of the temporary array.
I really don't understand very well what it's being doing in the _null_ data set, but it worked perfectly ! Thank you very much, i'll try to study it.
Hi,
If you are comfortable with some statistical stuff then try this.
proc means data=have stackods var;
ods output summary=stats;
run;
proc sql;
select Variable into :drop_vars from stats
where Var=0;
quit;
data want;
set have(drop=&drop_vars);
run;
@stat_sasNice use of the ODS OUTPUT statement. If more than one variable needs to be dropped a SEPARATED BY phrase could be added to the SQL step which creates &DROP_VARS:
select Variable into :drop_vars separated by ' ' from stats
Also this will eliminate any variable with constant value, other than all missing, not just all 1's.
data have ;
input var1-var4 ;
cards;
0.5 1 1.2 0.4
0.9 1 1.5 0.2
0.7 1 1.1 0.3
1 1 1 1
;
proc sql noprint;
select catx(' ','sum(',name,'=1) as ',name) into : list separated by ','
from dictionary.columns
where libname='WORK' and memname='HAVE';
create table temp as
select &list from have;
quit;
proc transpose data=temp out=temp1;
run;
proc sql noprint;
select count(*) into : n from have;
select _name_ into : drop separated by ','
from temp1
where col1=&n;
alter table have
drop &drop;
quit;
The simplest way is using IML .
data have ;
input var1-var4 ;
cards;
0.5 1 1.2 0.4
0.9 1 1.5 0.2
0.7 1 1.1 0.3
1 1 1 1
;
proc iml;
use have nobs nobs;
read all var _all_ into x[c=vname];
close;
drop=vname[loc((x=1)[+,]=nobs)];
submit drop;
data want;
set have;
drop &drop;
run;
endsubmit;
quit;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.