I have a large dataset with multiple rows per ID. I am trying to create a dataset with only one row per id (no duplicates), but on specific conditions. Data looks like this: ID Month Case Control 1 10 0 1 2 8 0 1 2 9 1 0 3 3 1 0 4 7 0 1 4 11 1 0 5 3 0 1 5 6 0 1 5 8 0 1 The conditions for retaining the observation in the dataset are: 1) Select all cases (case=1). If the ID also has observations with control=1, remove those observations ID 2 month 9 would be retained ID 4 month 11 would be retained and ID 4 month 7 would be removed 2) For IDs with multiple control observations, select the observation with the first/lowest month. ID 5 month 3 would be retained I know I can achieve criteria #2 with something like: Data want; Set have; By ID Month; If first.ID; run; But how can I achieve #1?
... View more