Hi,
I am not very used to code in SAS. My background is in R. But I want to do the following thing and I could need some help:
Imagine the following data:
SubjectId Seq ActivityId
1 1 A
1 2 B
2 1 A
2 2 B
2 3 C
2 4 D
2 5 A
3 1 A
3 2 C
3 3 D
4 1 C
4 2 B
As you can see I have (different) many rows per unique SubjectID. I have two conditions that I want to check if fulfilled:
Imagine that Seq is a visit at the doctor. In total, there are 6 possible doctor visits(Seq: 1-6), and I want to check if every unique SubjectID has been to the doctor all 6 visits or if some is missing?
I want to see if every unique SubjectId has done ActivityId A and B.
Do you know how to do this in SAS? Please feel free do ask if you miss any information.
What do you expect the output to look like?
A data set (for further processing) a report (people read these)?
The optimal output is a new dataset where the unique SubjectId:s that are not fullfilling the 2 conditions are presented: including two new variables that tell us what conditions are not met, e.g. "Seq_missing" that includes what sequences that are missing (1-6) and "ActivityId_missing" that includes what ActivityId that is missing (A or/and B)
In that case my previous code requires only one DoW-loop:
data have;
input SubjectId Seq ActivityId $;
cards;
1 1 A
1 2 B
2 1 A
2 2 B
2 3 C
2 4 D
2 5 A
2 6 A
3 1 A
3 2 C
3 3 D
4 1 C
4 2 B
;
run;
data want;
keep SubjectId allVisits activity;
allVisits = sum(1,2,3,4,5,6);
activityA = 0;
activityB = 0;
do until(last.SubjectId);
set have;
by SubjectId;
allVisits + (-Seq);
activityA + (ActivityId = "A"); drop activityA;
activityB + (ActivityId = "B"); drop activityB;
end;
allVisits = (not allVisits);
activity = activityA and activityB;
if not(activity and allVisits);
run;
proc print;
run;
All the best
Bart
Hi @Born_Aioli
Here is an attempt to achieve this:
data have;
input SubjectId Seq ActivityId $;
datalines;
1 1 A
1 2 B
2 1 A
2 2 B
2 3 C
2 4 D
2 5 A
3 1 A
3 2 C
3 3 D
4 1 C
4 2 B
;
run;
/* ActivityId_missing */
proc transpose data=have out=have_tr;
var ActivityId;
by SubjectId;
run;
data have2 (keep=SubjectId ActivityId_missing);
set have_tr;
length ActivityId_missing $10.;
if findc(catx("",of col:),"A")>0 and findc(catx("",of col:),"B")>0 then ActivityId_missing = "";
else if findc(cats("",of col:),"A")>0 then ActivityId_missing = "B";
else if findc(cats("",of col:),"B")>0 then ActivityId_missing = "A";
else ActivityId_missing = "A and B";
run;
/* Seq_missing */
proc sql;
create table have3 as
select SubjectId, case when count(Seq)+1<6 then catx("-",put(count(Seq)+1,best.),"6") else "" end as Seq_missing
from have
group by SubjectId;
quit;
/* Final table */
proc sql;
create table want as
select a.SubjectId,b.Seq_missing,a.ActivityId_missing from
have2 as a inner join have3 as b
on a.SubjectId=b.SubjectId;
quit;
The first condition works fine!
According to the second condition, is it possible to get the output for the missing steps like: 3,4,5,6 instead of 3-6? It is because I cannot be sure that the answer wouldn't be: 3,4,6 (like if visit 5 isn't done). I cannot figure out how to do it.
Thanks a lot!
Hi @Born_Aioli ,
it looks like good example for double DoW-loop:
data have;
input SubjectId Seq ActivityId $;
cards;
1 1 A
1 2 B
2 1 A
2 2 B
2 3 C
2 4 D
2 5 A
2 6 A
3 1 A
3 2 C
3 3 D
4 1 C
4 2 B
;
run;
data want;
allVisits = sum(1,2,3,4,5,6);
activityA = 0;
activityB = 0;
do until(last.SubjectId);
set have;
by SubjectId;
allVisits + (-Seq);
activityA + (ActivityId = "A"); drop activityA;
activityB + (ActivityId = "B"); drop activityB;
end;
allVisits = (not allVisits);
activity = activityA and activityB;
do until(last.SubjectId);
set have;
by SubjectId;
output;
end;
run;
proc print;
run;
Read about it in @hashman article here: http://support.sas.com/resources/papers/proceedings13/126-2013.pdf
All the bets
Bart
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.