Hi all,
Repeated measures data is frying my brain 😣... I'm working on a dataset that has multiple visits overtime. I want to reduce the visits per ID down to just one observation per ID. A sample of the data is below:
ID |
Visit |
Label |
AAA |
1 |
. |
AAA |
2 |
No |
AAA |
3 |
Yes |
BBB |
1 |
. |
BBB |
2 |
Yes |
CCC |
1 |
Yes |
DDD |
1 |
. |
DDD |
2 |
No |
DDD |
3 |
No |
EEE |
1 |
. |
EEE |
2 |
. |
FFF |
1 |
Yes |
GGG |
1 |
No |
HHH |
1 |
. |
HHH |
2 |
. |
HHH |
3 |
. |
Everything is sorted in chronological order based on their visit (1->2->3->etc.). Basically I have three criteria to follow per ID:
1) If the ID has at least one label with a "Yes", pick the row where a "Yes" label first appears.
2) If the ID ONLY has "No" / "missing" labels across all visits, pick the row where a "No" label first appears.
3) If the ID ONLY has missing observations across all visits, pick the row where a missing label first appears.
Initially, the code I have only reduces the observations down to their very first visit (regardless of their label). Now I'm having trouble with coding the part of identifying "the first time a "___" appears".
I'd want to hopefully produce a table that looks something like this:
ID |
Visit |
Label |
AAA |
3 |
Yes |
BBB |
2 |
Yes |
CCC |
1 |
Yes |
DDD |
2 |
No |
EEE |
1 |
. |
FFF |
1 |
Yes |
GGG |
1 |
No |
HHH |
1 |
. |
Any help or advice would be really appreciated!
Hi @asgee Solution plagiarized from @Reeza . She stumped with me with this elegant approach. I don't like intelligent people
data have;
input ID $ Visit Label $;
cards;
AAA 1 .
AAA 2 No
AAA 3 Yes
BBB 1 .
BBB 2 Yes
CCC 1 Yes
DDD 1 .
DDD 2 No
DDD 3 No
EEE 1 .
EEE 2 .
FFF 1 Yes
GGG 1 No
HHH 1 .
HHH 2 .
HHH 3 .
;
proc sort data=have out=_have;
by id descending label;
run;
data want;
set _have;
by id;
if first.id;
run;
Hi @asgee Solution plagiarized from @Reeza . She stumped with me with this elegant approach. I don't like intelligent people
data have;
input ID $ Visit Label $;
cards;
AAA 1 .
AAA 2 No
AAA 3 Yes
BBB 1 .
BBB 2 Yes
CCC 1 Yes
DDD 1 .
DDD 2 No
DDD 3 No
EEE 1 .
EEE 2 .
FFF 1 Yes
GGG 1 No
HHH 1 .
HHH 2 .
HHH 3 .
;
proc sort data=have out=_have;
by id descending label;
run;
data want;
set _have;
by id;
if first.id;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.