Hello my current dataset is in the form below (dummy data):
ID | Visit | Group1 | ScoreX | Group2 | ScoreY | Group3 | ScoreZ |
---|---|---|---|---|---|---|---|
1 | 1 | A | 0 | B | 0 | C | 0 |
1 | 2 | A | 0 | B | 1 | C | 0 |
1 | 3 | A | 1 | B | 1 | C | 0 |
1 | 4 | A | 2 | B | 1 | C | 0 |
2 | 1 | C | 1 | A | 0 | B | 0 |
2 | 2 | C | 0 | A | 1 | B | 1 |
2 | 3 | C | 0 | A | 3 | B | 2 |
2 | 4 | C | 3 | A | 3 | B | 3 |
and would like to convert into two ways: 1) i would like to go from wide to long format ***UPDATE part 1 solved: see post 3 below***
ID | Group | Visit | Score |
---|---|---|---|
1 | A | 1 | 0 |
1 | A | 2 | 0 |
1 | A | 3 | 1 |
1 | A | 4 | 2 |
1 | B | 1 | 0 |
1 | B | 2 | 1 |
1 | B | 3 | 1 |
1 | B | 4 | 1 |
1 | C | 1 | 0 |
1 | C | 2 | 0 |
1 | C | 3 | 0 |
1 | C | 4 | 0 |
2 | B | 1 | 0 |
2 | B | 2 | 0 |
2 | B | 3 | 0 |
2 | B | 4 | 3 |
2 | A | 1 | 0 |
2 | A | 2 | 1 |
... | (continued) | ... | ... |
and 2) I would like to distill the data into this below. ***UPDATE part 2 solved: see post 4 and 5 below***
Essentially I am counting the number of visits it took to get a score of 1, 2, or 3.
ID | Group | Visit of first 1 score | Visit of first 2 score | Visit of first 3 score |
---|---|---|---|---|
1 | A | 3 | 4 | . |
1 | B | 2 | . | . |
1 | C | . | . | . |
2 | A | 2 | . | 3 |
2 | B | 2 | 3 | 4 |
2 | C | 1 | . | 4 |
I believe I can do this with a combination of arrays but I am not sure how to handle the groups when my data is in wide format (1st table). So my thoughts are first I need to get it into long format (2nd table) and then compress the data (3rd table). I do not know SQL so any datastep programming help would be greatly appreciated
UNTESTED (and written while watching TV)
data want;
set table2;
by id group;
retain flag1-flag3 visit_score1-visit_score3 counter;
if first.group then do;
visit_score1=.; visit_score2=.; visit_score3=.;
flag1=0; flag2=0;flag3=0;
counter=0;
end;
counter+1;
if score=1 and flag1=0 then do;
visit_score1=counter; flag1=1;
end;
else if score=2 and flag2=0 then do;
visit_score2=counter; flag2=1;
end;
else if score=3 and flag3=0 then do;
visit_score3=counter; flag3=1;
end;
if last.group then output;
run;
Does your data from your second table match up with the first/second table?
Specifically, where does this line come from:
1 | B | 2 | . | . |
Reeza,
so the third table used the data from the first/second table. So to answer your question, for the first subject ID=1 group=B, it took two visits to get a score of 1 but that subject never got a score of 2 or 3 so I left them blank. Another example would be for ID=2 group=A, a score of 1 was achieved on visit 2, a score of 2 was never attained (so its missing) and a score of 3 was achieved on visit 3
Ok I was able to figure out how to convert from the wide format to the long format with the code below. Half way there!
data TABLE1;
input ID Visit Group1 $ ScoreX Group2 $ ScoreY Group3 $ ScoreZ;
datalines;
1 1 A 0 B 0 C 0
1 2 A 0 B 1 C 0
1 3 A 1 B 1 C 0
1 4 A 2 B 1 C 0
2 1 C 1 A 0 B 0
2 2 C 0 A 1 B 1
2 3 C 0 A 3 B 2
2 4 C 3 A 3 B 3
;
data TABLE2(keep=id group visit score);
retain id group visit; set TABLE1;
array grp{3} Group1 Group2 Group3;
array scr{3} ScoreX ScoreY ScoreZ;
do i=1 to 3;
group=grp{i};
score=scr{i};
output;
end;
run;
proc sort data=TABLE2; by id group visit; run;
UNTESTED (and written while watching TV)
data want;
set table2;
by id group;
retain flag1-flag3 visit_score1-visit_score3 counter;
if first.group then do;
visit_score1=.; visit_score2=.; visit_score3=.;
flag1=0; flag2=0;flag3=0;
counter=0;
end;
counter+1;
if score=1 and flag1=0 then do;
visit_score1=counter; flag1=1;
end;
else if score=2 and flag2=0 then do;
visit_score2=counter; flag2=1;
end;
else if score=3 and flag3=0 then do;
visit_score3=counter; flag3=1;
end;
if last.group then output;
run;
Awesome thank you Reeza! I found out that you can also do this with array statements as below:
data TABLE3(keep=id group visit_score1-visit_score3); set TABLE2;
by id group;
array vs{3} visit_score1-visit_score3 (3*0);
array flag{3} flag1-flag3 (3*0);
if first.group then do;
do i=1 to 3;
vs{i}=.; flag{i}=0;
end;
counter=0;
end;
counter+1;
do i=1 to 3;
if score=i and flag{i}=0 then do;
vs{i}=counter; flag{i}=1;
end;
end;
if last.group then output;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.