DATA Step, Macro, Functions and more

How to select the wanted amount of variables?

Reply
Contributor
Posts: 62

How to select the wanted amount of variables?

Hi!

 

I have 15  variables, each variable has just two values: 0 and 1.

 

I want to creat another variable (values 0 and 1) which would indicate that from these 15 variables at least 6 have values 1 (or at least 7 and so on)?

 

thanks

Super User
Posts: 5,257

Re: How to select the wanted amount of variables?

I don't know what you are going to use this for, but to me it sounds like it would be easier if you transpose your data, then you could just use SQL to sum/count for each id how many '1' you have.

Another benefit is that you can add more "variables" whenever you want, without changing any logic or change the structure of your data.

Data never sleeps
Super User
Posts: 10,500

Re: How to select the wanted amount of variables?

Easiest is the compare the sum.

In a data step:

 

Newvar = (sum (varname1, varname2 ..., varname15) ge 6);

 

If the variables have nice names like rate1 rate2 ... rate15 then use a variable list

Newvar = (sum (varname1 - varname15) ge 6);

 

Or possibly you could just use a variable with the total.

NewVar = sum(varname1 - varname15);

Super User
Posts: 5,083

Re: How to select the wanted amount of variables?

The idea of creating a new variable is good.  However, note that the dash is a minus sign here:

 

newvar = sum(var1 - var15);

 

If you want it to represent a list of variables, you have to add "of" before the list:

 

newvar = sum(of var1 - var15);

Super User
Posts: 10,500

Re: How to select the wanted amount of variables?


Astounding wrote:

The idea of creating a new variable is good.  However, note that the dash is a minus sign here:

 

newvar = sum(var1 - var15);

 

If you want it to represent a list of variables, you have to add "of" before the list:

 

newvar = sum(of var1 - var15);


 

Thanks for the reminder, generally I'm using that with arrays and didn't want to add complexity for this solution.

Trusted Advisor
Posts: 1,115

Re: How to select the wanted amount of variables?

[ Edited ]

If your 15 variables have consecutive internal variable numbers, you could proceed like in the following example:

 

data test;
input numberone b c d e f g h i j k l m n nofifteen;
array atleast[15];
do z=1 to 15;
  atleast[z]=(sum(of numberone--nofifteen)>=z);
end;
drop z atleast1-atleast5;
cards;
1 0 0 1 1 0 1 0 0 1 0 1 1 0 1
;

This would create variables ATLEAST6, ATLEAST7, ..., ATLEAST15 containing the desired 0-1-indicators for "at least 6 ones", "at least 7 ones" and so on.

 

 

In order to check if the 15 internal variable numbers are in fact consecutive, run the following step and look where your 15 variables are located in the output:

 

proc contents data=yourdata varnum;
run;

 

 

If it turns out that the list of your 15 variables in the above PROC CONTENTS output is interrupted by other variables, modify the order of the variables by running the following step on your dataset (variable names taken from previous example):

 

data yourdata;
retain numberone b c d e f g h i j k l m n nofifteen;
set yourdata;
run;

This will ensure that the list numberone--nofifteen contains just the relevant 15 variables.

 

Respected Advisor
Posts: 3,893

Re: How to select the wanted amount of variables?

Nothing others have suggested already. Just posting a full code sample.

data have(drop=_:);
  array var_ {15} 8.;
  do _rows=1 to 20;
    do _i=1 to dim(var_);
      var_{_i}=floor(ranuni(0)*1.5);
    end;
    output;
  end;
run;

data want;
  set have;
  Six_Plus_indicator= ( sum(of var_:) >= 6 );
run;
Ask a Question
Discussion stats
  • 6 replies
  • 239 views
  • 1 like
  • 6 in conversation