Hello my current dataset is in the form below (dummy data):
ID | Visit | Group1 | ScoreX | Group2 | ScoreY | Group3 | ScoreZ |
---|---|---|---|---|---|---|---|
1 | 1 | A | 0 | B | 0 | C | 0 |
1 | 2 | A | 0 | B | 1 | C | 0 |
1 | 3 | A | 1 | B | 1 | C | 0 |
1 | 4 | A | 2 | B | 1 | C | 0 |
2 | 1 | C | 1 | A | 0 | B | 0 |
2 | 2 | C | 0 | A | 1 | B | 1 |
2 | 3 | C | 0 | A | 3 | B | 2 |
2 | 4 | C | 3 | A | 3 | B | 3 |
and would like to convert into two ways: 1) i would like to go from wide to long format ***UPDATE part 1 solved: see post 3 below***
ID | Group | Visit | Score |
---|---|---|---|
1 | A | 1 | 0 |
1 | A | 2 | 0 |
1 | A | 3 | 1 |
1 | A | 4 | 2 |
1 | B | 1 | 0 |
1 | B | 2 | 1 |
1 | B | 3 | 1 |
1 | B | 4 | 1 |
1 | C | 1 | 0 |
1 | C | 2 | 0 |
1 | C | 3 | 0 |
1 | C | 4 | 0 |
2 | B | 1 | 0 |
2 | B | 2 | 0 |
2 | B | 3 | 0 |
2 | B | 4 | 3 |
2 | A | 1 | 0 |
2 | A | 2 | 1 |
... | (continued) | ... | ... |
and 2) I would like to distill the data into this below. ***UPDATE part 2 solved: see post 4 and 5 below***
Essentially I am counting the number of visits it took to get a score of 1, 2, or 3.
ID | Group | Visit of first 1 score | Visit of first 2 score | Visit of first 3 score |
---|---|---|---|---|
1 | A | 3 | 4 | . |
1 | B | 2 | . | . |
1 | C | . | . | . |
2 | A | 2 | . | 3 |
2 | B | 2 | 3 | 4 |
2 | C | 1 | . | 4 |
I believe I can do this with a combination of arrays but I am not sure how to handle the groups when my data is in wide format (1st table). So my thoughts are first I need to get it into long format (2nd table) and then compress the data (3rd table). I do not know SQL so any datastep programming help would be greatly appreciated
UNTESTED (and written while watching TV)
data want;
set table2;
by id group;
retain flag1-flag3 visit_score1-visit_score3 counter;
if first.group then do;
visit_score1=.; visit_score2=.; visit_score3=.;
flag1=0; flag2=0;flag3=0;
counter=0;
end;
counter+1;
if score=1 and flag1=0 then do;
visit_score1=counter; flag1=1;
end;
else if score=2 and flag2=0 then do;
visit_score2=counter; flag2=1;
end;
else if score=3 and flag3=0 then do;
visit_score3=counter; flag3=1;
end;
if last.group then output;
run;
Does your data from your second table match up with the first/second table?
Specifically, where does this line come from:
1 | B | 2 | . | . |
Reeza,
so the third table used the data from the first/second table. So to answer your question, for the first subject ID=1 group=B, it took two visits to get a score of 1 but that subject never got a score of 2 or 3 so I left them blank. Another example would be for ID=2 group=A, a score of 1 was achieved on visit 2, a score of 2 was never attained (so its missing) and a score of 3 was achieved on visit 3
Ok I was able to figure out how to convert from the wide format to the long format with the code below. Half way there!
data TABLE1;
input ID Visit Group1 $ ScoreX Group2 $ ScoreY Group3 $ ScoreZ;
datalines;
1 1 A 0 B 0 C 0
1 2 A 0 B 1 C 0
1 3 A 1 B 1 C 0
1 4 A 2 B 1 C 0
2 1 C 1 A 0 B 0
2 2 C 0 A 1 B 1
2 3 C 0 A 3 B 2
2 4 C 3 A 3 B 3
;
data TABLE2(keep=id group visit score);
retain id group visit; set TABLE1;
array grp{3} Group1 Group2 Group3;
array scr{3} ScoreX ScoreY ScoreZ;
do i=1 to 3;
group=grp{i};
score=scr{i};
output;
end;
run;
proc sort data=TABLE2; by id group visit; run;
UNTESTED (and written while watching TV)
data want;
set table2;
by id group;
retain flag1-flag3 visit_score1-visit_score3 counter;
if first.group then do;
visit_score1=.; visit_score2=.; visit_score3=.;
flag1=0; flag2=0;flag3=0;
counter=0;
end;
counter+1;
if score=1 and flag1=0 then do;
visit_score1=counter; flag1=1;
end;
else if score=2 and flag2=0 then do;
visit_score2=counter; flag2=1;
end;
else if score=3 and flag3=0 then do;
visit_score3=counter; flag3=1;
end;
if last.group then output;
run;
Awesome thank you Reeza! I found out that you can also do this with array statements as below:
data TABLE3(keep=id group visit_score1-visit_score3); set TABLE2;
by id group;
array vs{3} visit_score1-visit_score3 (3*0);
array flag{3} flag1-flag3 (3*0);
if first.group then do;
do i=1 to 3;
vs{i}=.; flag{i}=0;
end;
counter=0;
end;
counter+1;
do i=1 to 3;
if score=i and flag{i}=0 then do;
vs{i}=counter; flag{i}=1;
end;
end;
if last.group then output;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.