Greetings,
I am trying to create a summary group variable that evaluates values in multiple columns. Snapshot of data below.
ID | R1 | R2 | R3 | R4 | R |
ABC | 1 | 0 | 9 | 9 | 1 |
DVF | 0 | 0 | 1 | 1 | Multiple |
EF | 1 | 1 | 9 | 0 | Multiple |
RR | 9 | 0 | 1 | 0 | 3 |
GGH | 1 | 0 | 9 | 0 | 1 |
HJY | 0 | 9 | 0 | 1 | 4 |
Partner: 1=Yes, 0-No | |||||
R1-R4: 1=Yes, 0=No, 9=Don't Know |
Column R is what I am trying to create.
If R1=1 and R2-R4 is 0,9 then R=1
If R3=1 and R1,R2,R4 is 0,9 then R=3
If R1=1, R2=1 and R3-R4 is 0,9 then R=10 (Multiple)
THanks in advance.
Is R supposed to be a numeric value or character? You can't have a value of "multiple" if the variable is supposed to be numeric.
For the explicit values you list:
data want; set have; if r1=1 and r2 in (0 9) and r3 in (0 9) and r4 in (0 9) then R=1; else if r3=1 and r1 in (0 9) and r2 in (0 9) and r4 in (0 9) then R=3; else if r1=1 and r2=1 and r3 in (0 9) and r4 in (0 9) then R=10; run;
The IN operator compares a single variable to a list of values (not variables) and is true if the variable is any of the listed values.
This only addresses the specific combinations you asked for. I foresee potentially a lot of missing values for R depending on your actual data.
Thanks Ballard.
Your solution is what I had planned originally. I have several variables I need to create using similar algorithm, so I wanted to see if there was an easier way to do this. If an ID had YES for more than one R then I wanted to code as"Multiple" or 10, a numerical variable.
THanks again.
@rsva wrote:
Thanks Ballard.
Your solution is what I had planned originally. I have several variables I need to create using similar algorithm, so I wanted to see if there was an easier way to do this. If an ID had YES for more than one R then I wanted to code as"Multiple" or 10, a numerical variable.
THanks again.
You do not associate a value of the shown variables with "yes" and made no similar statement about "more than one". All of you rules involved explicit rules without any context and assigned specific variable values, not a count.
@rsva shows the basis of one method of counting 1's, convert all the values to character and concatenate them, then use the COUNTC function to count how many 1's. This is if the 1 are all you want to count.
Data want; set have; if countc('1',cats(r1,r2,r3,r4))=1 then r=1; else if countc('1',cats(r1,r2,r3,r4)) > 1 then r=10; run;
If I understand right what you're after then below could work.
data have;
infile datalines truncover;
input (ID R1 R2 R3 R4) ($);
datalines;
ABC 1 0 9 9 1
DVF 0 0 1 1 Multiple
EF 1 1 9 0 Multiple
RR 9 0 1 0 3
GGH 1 0 9 0 1
HJY 0 9 0 1 4
xx1 0 9 5 1
xx2 0 9 0 0
;
data want(drop=_:);
set have;
array rs {*} r1-r4;
length R $10;
if countc(cats(of rs[*]),'109')=dim(rs) then
do;
_cnt_1=countc(cats(of rs[*]),'1');
if _cnt_1=0 then R=' ';
else if _cnt_1=1 then R=whichc('1',of rs[*]);
else R='Multiple';
end;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.