I have searched everywhere but can't find the answer. I think my problem is simple, but I can't get it right. My problem:
In a dataset, I have patients with multiple diagnoses, some of which are duplicated. I need to delete the duplicate diagnoses within each patient; the same diagnosis appearing for a different patient should not be deleted.
Example of dataset (not real data):
| Patient_ID | Diagnose | Desired result |
| 1 | 1 | Keep observation |
| 1 | 1 | Delete observation |
| 1 | 2 | Keep observation |
| 1 | 3 | Keep observation |
| 2 | 1 | Keep observation |
| 2 | 1 | Delete observation |
| 2 | 3 | Keep observation |
| 2 | 3 | Delete observation |
| 3 | 1 | Keep observation |
| 3 | 2 | Keep observation |
| 3 | 5 | Keep observation |
| 3 | 6 | Keep observation |
| 4 | 1 | Keep observation |
| 4 | 1 | Delete observation |
| 4 | 3 | Keep observation |
| 4 | 3 | Delete observation |
Thanks in advance!
Is your actual data sorted by Patient_ID and Diagnose?
If so:
data have;
input Patient_ID Diagnose;
datalines;
1 1
1 1
1 2
1 3
2 1
2 1
2 3
2 3
3 1
3 2
3 5
3 6
4 1
4 1
4 3
4 3
;
data want;
set have;
by Patient_ID Diagnose;
if first.Diagnose;
run;
Result:
Patient_ID Diagnose
1 1
1 2
1 3
2 1
2 3
3 1
3 2
3 5
3 6
4 1
4 3
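If the data is not sorted, the BY statement in the data step above will stop with an error. A minimal sketch of one alternative that does not need sorted input, assuming Patient_ID and Diagnose are the only variables you want to keep (any other columns in have would be dropped by this query):

```sas
/* Keep one row per distinct Patient_ID / Diagnose combination. */
/* Assumption: no other variables from HAVE are needed in WANT. */
proc sql;
    create table want as
    select distinct Patient_ID, Diagnose
    from have;
quit;
```

On the sample data this should leave the same 11 rows as the data step version.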
How about a simple PROC SORT with NODUPKEY?
proc sort data=have out=want nodupkey;
by patient_id diagnose;
run;
If your dataset is already sorted as shown in the sample, then:
data want;
set have;
by patient_id diagnose;
if first.diagnose;
run;
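If you want to check which observations the PROC SORT approach throws away, a small sketch using the DUPOUT= option may help. The dataset name removed is only an illustrative choice, not something from the original post:

```sas
/* NODUPKEY keeps the first observation for each BY-value combination. */
/* DUPOUT= writes the discarded duplicates to a separate dataset.      */
/* "removed" is a hypothetical name chosen for this example.           */
proc sort data=have out=want nodupkey dupout=removed;
    by patient_id diagnose;
run;
```

With the sample data, want should contain 11 observations and removed the 5 rows marked "Delete observation" in the question.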
Thank you both! My data is sorted as in the example dataset, so your solution works perfectly.
Regards