Hi all,
I have a data set like this (see below). There are duplicate observations within some survey waves. Now I want to remove the duplicate observations with the highest "visit age" from the sample. What should I do?
The original data set:
ID | WAVE | Visit age |
1231 | 1 | 45 |
1231 | 2 | 46 |
1231 | 2 | 47 |
1232 | 1 | 55 |
1233 | 1 | 56 |
1234 | 1 | 34 |
1234 | 1 | 35 |
1234 | 2 | 38 |
The dataset that I want:
ID | WAVE | Visit age |
1231 | 1 | 45 |
1231 | 2 | 46 |
1232 | 1 | 55 |
1233 | 1 | 56 |
1234 | 1 | 34 |
1234 | 2 | 38 |
Thank you very much!
@zjppdozen wrote:
Hi all,
I have a data set like this (see below). There are duplicate observations within some survey waves. Now I want to remove the duplicate observations with the highest "visit age" from the sample. What should I do?
The original data set:
ID | WAVE | Visit age |
1231 | 1 | 45 |
1231 | 2 | 46 |
1231 | 2 | 47 |
1232 | 1 | 55 |
1233 | 1 | 56 |
1234 | 1 | 34 |
1234 | 1 | 35 |
1234 | 2 | 38 |
The dataset that I want:
ID | WAVE | Visit age |
1231 | 1 | 45 |
1231 | 2 | 46 |
1232 | 1 | 55 |
1233 | 1 | 56 |
1234 | 1 | 34 |
1234 | 2 | 38 |
Thank you very much!
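/* No DATA= option, so this reads the most recently created data set. */
/* NWAY keeps only the full ID*WAVE combinations; MISSING treats */
/* missing ID or WAVE values as valid groups. */
proc summary nway missing;
class id wave;
/* One row per ID/WAVE holding the minimum visit age under its */
/* original name. _FREQ_ (rows per group) stays in the output. */
output out=deduped(drop=_type_) min('visit age'n)=;
run;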
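If the real data set has other variables that need to be kept, a sort followed by a BY-group DATA step may be a better fit, since PROC SUMMARY returns only the class variables and the requested statistics. This is a minimal sketch, not tested against your data: the data set names have, sorted, and want are assumptions, and the variable name 'visit age'n is taken from the code above.

```sas
/* Assumption: the input data set is named HAVE (hypothetical name). */
/* Sort so the lowest visit age comes first within each ID/WAVE.     */
proc sort data=have out=sorted;
  by id wave 'visit age'n;
run;

/* Keep only the first row of each ID/WAVE group, i.e. the one with  */
/* the lowest visit age; every other variable on that row is kept.   */
data want;
  set sorted;
  by id wave;
  if first.wave;
run;
```

If two rows in the same ID/WAVE group share the same lowest visit age, this keeps whichever one the sort happens to place first.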