DATA Step, Macro, Functions and more

Removing a variable conditionally

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Removing a variable conditionally

Hi everyone!

I'm having a little trouble with a data set. I've the idea of dropping a variable if all the observations have the value of 1,

The table is like this;

 

var1 var2 var3 var4

0.5  1  1.2  0.4

0.9  1  1.5  0.2

0.7  1   1.1  0.3

1     1    1    1

 

I want to drop the variable var2 in this case. My problem is that I can't do it mannualy, as the four variables are being included because of a macro, and the name of them can change.

Is there any way to remove it conditionally?

Thanks in advance.


Accepted Solutions
Solution
‎01-15-2018 06:45 AM
Super User
Super User
Posts: 7,860

Re: Removing a variable conditionally

To remove a variable you need to either DROP it or not KEEP it.  Basically you need use code generation create the DROP statement (or DROP= dataset option.  One easy way to do that is to put the variable name into a macro variable and then use the value of the macro variable in your code.  You could look on-line for many examples of how to do this for variables with all missing values and just adopt one and modify it to work for variables that are all one.

 

Here is one way.

data have ;
  input var1-var4 ;
cards;
0.5  1  1.2  0.4
0.9  1  1.5  0.2
0.7  1  1.1  0.3
1    1  1    1
;

%let droplist=;
data _null_;
  array _flags (100) _temporary_ (100*1) ;
  set have end=eof;
  array _in  var1-var4 ;
  do _n_ =1 to dim(_in) ;
    if _in(_n_) ne 1 then _flags(_n_)=0;
  end;
  if eof then do _n_=1 to dim(_in);
    if _flags(_n_) then call execute(catx(' ','%let droplist=&droplist',nliteral(vname(_in(_n_))),';'));
  end;
run;
%put &=droplist ;
data want ;
  set have ;
  drop &droplist;
run;

If you have more than 100 variables you want to test then just increase the size of the temporary array.

 

 

 

 

View solution in original post


All Replies
Solution
‎01-15-2018 06:45 AM
Super User
Super User
Posts: 7,860

Re: Removing a variable conditionally

To remove a variable you need to either DROP it or not KEEP it.  Basically you need use code generation create the DROP statement (or DROP= dataset option.  One easy way to do that is to put the variable name into a macro variable and then use the value of the macro variable in your code.  You could look on-line for many examples of how to do this for variables with all missing values and just adopt one and modify it to work for variables that are all one.

 

Here is one way.

data have ;
  input var1-var4 ;
cards;
0.5  1  1.2  0.4
0.9  1  1.5  0.2
0.7  1  1.1  0.3
1    1  1    1
;

%let droplist=;
data _null_;
  array _flags (100) _temporary_ (100*1) ;
  set have end=eof;
  array _in  var1-var4 ;
  do _n_ =1 to dim(_in) ;
    if _in(_n_) ne 1 then _flags(_n_)=0;
  end;
  if eof then do _n_=1 to dim(_in);
    if _flags(_n_) then call execute(catx(' ','%let droplist=&droplist',nliteral(vname(_in(_n_))),';'));
  end;
run;
%put &=droplist ;
data want ;
  set have ;
  drop &droplist;
run;

If you have more than 100 variables you want to test then just increase the size of the temporary array.

 

 

 

 

New Contributor
Posts: 2

Re: Removing a variable conditionally

I really don't understand very well what it's being doing in the _null_ data set, but it worked perfectly ! Thank you very much, i'll try to study it.

Trusted Advisor
Posts: 1,270

Re: Removing a variable conditionally

Hi,

 

If you are comfortable with some statistical stuff then try this.

 

proc means data=have stackods var;
ods output summary=stats;
run;


proc sql;
select Variable into :drop_vars from stats
where Var=0;
quit;

 

data want;
set have(drop=&drop_vars);
run;

Valued Guide
Posts: 653

Re: Removing a variable conditionally

@stat_sasNice use of the ODS OUTPUT statement.  If more than one variable needs to be dropped a SEPARATED BY phrase could be added to the SQL step which creates &DROP_VARS:

select Variable into :drop_vars separated by ' ' from stats

Also this will eliminate any variable with constant value, other than all missing, not just all 1's.

Super User
Posts: 10,621

Re: Removing a variable conditionally

data have ;
  input var1-var4 ;
cards;
0.5  1  1.2  0.4
0.9  1  1.5  0.2
0.7  1  1.1  0.3
1    1  1    1
;

proc sql noprint;
select catx(' ','sum(',name,'=1) as ',name)  into : list	separated by ','
 from dictionary.columns
  where libname='WORK' and memname='HAVE';
create table temp as
 select &list from have;
quit;
proc transpose data=temp out=temp1;
run;
proc sql noprint;
select count(*) into : n from have;
select _name_ into : drop separated by ','
 from temp1
  where col1=&n;
alter table have
 drop &drop;
quit;
Super User
Posts: 10,621

Re: Removing a variable conditionally

The simplest way is using IML .

 

data have ;
  input var1-var4 ;
cards;
0.5  1  1.2  0.4
0.9  1  1.5  0.2
0.7  1  1.1  0.3
1    1  1    1
;
proc iml;
use have nobs nobs;
read all var _all_ into x[c=vname];
close;
drop=vname[loc((x=1)[+,]=nobs)];  
submit drop;
 data want;
  set have;
  drop &drop;
 run;
endsubmit;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 205 views
  • 6 likes
  • 5 in conversation