I have searched everywhere but can't find the answer. I think my problem is simple, but I can't get it right. My problem:
In a dataset, I have patients with multiple diagnoses, some of which are duplicated. I need to delete these duplicate diagnoses per patient, meaning that the same diagnosis on another patient should not be deleted.
Example of dataset (not real data):
Patient_ID | Diagnose | |
1 | 1 | Keep observation |
1 | 1 | Delete observation |
1 | 2 | Keep observation |
1 | 3 | Keep observation |
2 | 1 | Keep observation |
2 | 1 | Delete observation |
2 | 3 | Keep observation |
2 | 3 | Delete observation |
3 | 1 | Keep observation |
3 | 2 | Keep observation |
3 | 5 | Keep observation |
3 | 6 | Keep observation |
4 | 1 | Keep observation |
4 | 1 | Delete observation |
4 | 3 | Keep observation |
4 | 3 | Delete observation |
Thanks in advance!
Is your actual data sorted by Patient_ID and Diagnose?
If so:
data have;
input Patient_ID Diagnose;
datalines;
1 1
1 1
1 2
1 3
2 1
2 1
2 3
2 3
3 1
3 2
3 5
3 6
4 1
4 1
4 3
4 3
;
data want;
set have;
by Patient_ID Diagnose;
if first.Diagnose;
run;
Result:
Patient_ID Diagnose
1 1
1 2
1 3
2 1
2 3
3 1
3 2
3 5
3 6
4 1
4 3
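If you want to review what gets dropped before trusting the result, one option is to write the duplicates to a second dataset in the same step. This is only a sketch: it assumes HAVE is already sorted by Patient_ID and Diagnose, and the dataset name DUPS is made up for illustration.

```sas
/* Sketch: assumes HAVE is sorted by Patient_ID and Diagnose. */
/* DUPS is a hypothetical name for the dataset of dropped rows. */
data want dups;
  set have;
  by Patient_ID Diagnose;
  /* The first row of each Patient_ID/Diagnose group is kept ... */
  if first.Diagnose then output want;
  /* ... and every later row of the same group goes to DUPS. */
  else output dups;
run;
```

With the 16 sample rows above, WANT should hold the 11 rows listed in the result and DUPS the 5 rows marked "Delete observation" in the question.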
simple sort and nodupkey?
proc sort data=have out=want nodupkey;
by patient_id diagnose;
run;
If your dataset is already sorted as shown in the sample, then:
data want;
set have;
by patient_id diagnose;
if first.diagnose;
run;
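Another possibility, if the data is not sorted and you only need these two columns, is PROC SQL with SELECT DISTINCT. Treat this as a sketch rather than a drop-in replacement: the output name WANT_SQL is made up, and DISTINCT applies to every column in the SELECT list, so any additional variables you select would also have to match for a row to count as a duplicate.

```sas
/* Sketch: does not require HAVE to be sorted beforehand. */
/* WANT_SQL is a hypothetical output dataset name. */
proc sql;
  create table want_sql as
  select distinct Patient_ID, Diagnose  /* one row per patient/diagnosis pair */
  from have
  order by Patient_ID, Diagnose;        /* explicit ordering of the output */
quit;
```

Unlike the DATA step and PROC SORT versions, this keeps only the columns named in the SELECT list, so it fits best when the other variables are not needed.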
Thank you both! My data is sorted as in the example dataset, so your solution works perfectly.
Regards