Hello!

I want to create a list of partial duplicates: observations that have the same values for ID, date, and time, but differ in at least one of the other variables. The only condition is that I want to ignore differences due to missing values.

For example, let's say I have the dataset:

ID  VAR1  VAR2  VAR3  VAR4  date  time
 1     1     1     1     2     9    15
 1     2     .     .     2     9    15
 1     2     2     2     1     9    15
 2     1     1     2     2    10    20
 2     1     .     2     3    10    20
 2     3     2     2     .    10    20
 2     1     1     2     3    10    20
 3     .     1     3     4     9    15
 3     2     1     .     4     9    15

Edit: I just realized I framed this wrong. I'd ultimately want to keep only one of row 5 or row 7, since whichever one is kept is still a partial duplicate of the other observations with ID=2, date=10, and time=20. However, I'd want to remove both rows with ID=3, because once one is absorbed by the other, the remaining row no longer has any partial duplicates. Sorry for the confusion!

Desired output:

ID  VAR1  VAR2  VAR3  VAR4  date  time
 1     1     1     1     2     9    15
 1     2     .     .     2     9    15
 1     2     2     2     1     9    15
 2     1     1     2     2    10    20
 2     3     2     2     .    10    20
 2     1     1     2     3    10    20

Is this possible in SAS? Or is it not practical to code? Below is the SAS code to create the data:

data dt;
input ID VAR1 VAR2 VAR3 VAR4 date time;
datalines;
1 1 1 1 2 9 15
1 2 . . 2 9 15
1 2 2 2 1 9 15
2 1 1 2 2 10 20
2 1 . 2 3 10 20
2 3 2 2 . 10 20
2 1 1 2 3 10 20
3 . 1 3 4 9 15
3 2 1 . 4 9 15
;
run;

Thanks!
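
To make the rule concrete, here is a rough sketch of the logic I have in mind, written with PROC SQL. It is only one possible reading of the rule, not a vetted solution. The names numbered, kept, want, and rownum are made up for illustration. It also assumes that when two rows match after ignoring missing values, the earlier row is the one that gets dropped. Because "matches ignoring missing values" is not transitive, a chain of three or more such rows could be handled differently under another tie-breaking rule.

```sas
/* Number the rows so that "later row" is well defined.
   rownum is a hypothetical helper variable. */
data numbered;
  set dt;
  rownum = _n_;
run;

proc sql;
  /* Drop a row when a later row in the same ID/date/time group agrees
     with it on every VAR, counting a missing value on either side as
     agreement (assumed tie-break: the later row is the one kept). */
  create table kept as
  select a.*
  from numbered as a
  where not exists
    (select *
     from numbered as b
     where b.ID = a.ID and b.date = a.date and b.time = a.time
       and b.rownum > a.rownum
       and (a.VAR1 = b.VAR1 or missing(a.VAR1) or missing(b.VAR1))
       and (a.VAR2 = b.VAR2 or missing(a.VAR2) or missing(b.VAR2))
       and (a.VAR3 = b.VAR3 or missing(a.VAR3) or missing(b.VAR3))
       and (a.VAR4 = b.VAR4 or missing(a.VAR4) or missing(b.VAR4)));

  /* Keep only the groups that still contain more than one row.
     SAS remerges the count back onto the detail rows here. */
  create table want as
  select *
  from kept
  group by ID, date, time
  having count(*) > 1
  order by rownum;
quit;
```

By my reading, on the sample data this drops row 5 (absorbed by row 7) and both ID=3 rows, leaving rows 1, 2, 3, 4, 6, and 7. The helper column rownum stays in the output and can be dropped afterwards.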