Hi all,
Repeated measures data is frying my brain 😣... I'm working on a dataset that has multiple visits overtime. I want to reduce the visits per ID down to just one observation per ID. A sample of the data is below:
ID |
Visit |
Label |
AAA |
1 |
. |
AAA |
2 |
No |
AAA |
3 |
Yes |
BBB |
1 |
. |
BBB |
2 |
Yes |
CCC |
1 |
Yes |
DDD |
1 |
. |
DDD |
2 |
No |
DDD |
3 |
No |
EEE |
1 |
. |
EEE |
2 |
. |
FFF |
1 |
Yes |
GGG |
1 |
No |
HHH |
1 |
. |
HHH |
2 |
. |
HHH |
3 |
. |
Everything is sorted in chronological order based on their visit (1->2->3->etc.). Basically I have three criteria to follow per ID:
1) If the ID has at least one label with a "Yes", pick the row where a "Yes" label first appears.
2) If the ID ONLY has "No" / "missing" labels across all visits, pick the row where a "No" label first appears.
3) If the ID ONLY has missing observations across all visits, pick the row where a missing label first appears.
Initially, the code I have only reduces the observations down to their very first visit (regardless of their label). Now I'm having trouble with coding the part of identifying "the first time a "___" appears".
I'd want to hopefully produce a table that looks something like this:
ID |
Visit |
Label |
AAA |
3 |
Yes |
BBB |
2 |
Yes |
CCC |
1 |
Yes |
DDD |
2 |
No |
EEE |
1 |
. |
FFF |
1 |
Yes |
GGG |
1 |
No |
HHH |
1 |
. |
Any help or advice would be really appreciated!
Hi @asgee Solution plagiarized from @Reeza . She stumped with me with this elegant approach. I don't like intelligent people
data have;
input ID $ Visit Label $;
cards;
AAA 1 .
AAA 2 No
AAA 3 Yes
BBB 1 .
BBB 2 Yes
CCC 1 Yes
DDD 1 .
DDD 2 No
DDD 3 No
EEE 1 .
EEE 2 .
FFF 1 Yes
GGG 1 No
HHH 1 .
HHH 2 .
HHH 3 .
;
proc sort data=have out=_have;
by id descending label;
run;
data want;
set _have;
by id;
if first.id;
run;
Hi @asgee Solution plagiarized from @Reeza . She stumped with me with this elegant approach. I don't like intelligent people
data have;
input ID $ Visit Label $;
cards;
AAA 1 .
AAA 2 No
AAA 3 Yes
BBB 1 .
BBB 2 Yes
CCC 1 Yes
DDD 1 .
DDD 2 No
DDD 3 No
EEE 1 .
EEE 2 .
FFF 1 Yes
GGG 1 No
HHH 1 .
HHH 2 .
HHH 3 .
;
proc sort data=have out=_have;
by id descending label;
run;
data want;
set _have;
by id;
if first.id;
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.