This is my first post, so apologies if this question is not worded/formatted correctly.

I am trying to sort a dataset that has the following rough structure:

First Name  Last Name  Date of Birth  Unique ID
John        Smith      01/01/1970     1000
John        Smith      01/01/1970     2000
Jane        Smith      01/03/1975     3000
Jane        Smith      06/08/1980     4000

The rough operation I want to carry out is to de-duplicate on first name, last name and dob (with all three together representing a single person), and then keep the biggest number for unique ID. What I want to achieve is this:

First Name  Last Name  Date of Birth  Unique ID
John        Smith      01/01/1970     2000
Jane        Smith      01/03/1975     3000
Jane        Smith      06/08/1980     4000

My initial thought was to try something like this:

DATA people_sorted;
SET people;
PROC SORT DATA=people_sorted;
BY id;
RUN;
PROC SORT DATA=people_sorted NODUPKEY;
BY fname lname dob;
RUN;

This does not retain the original sorting as I had initially thought. I was wondering if there was a way to 'chain' these sorts together, so to speak. Alternatively, is there a better approach to this that I'm not seeing?

Thank you in advance for your help!
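For reference, the operation described above (one row per person, keeping the largest ID) might be sketched as a single sort followed by a DATA step that uses BY-group processing. This is only a sketch under assumptions: the variable names fname, lname, dob and id are taken from the code in this post, and the output dataset name people_dedup is hypothetical.

```sas
/* Sort by person, with id ascending inside each person,
   so the largest id is the last row of each BY group. */
PROC SORT DATA=people OUT=people_sorted;
    BY fname lname dob id;
RUN;

/* Keep only the last row of each fname/lname/dob group.
   LAST.dob is 1 on the final row before dob (or any
   earlier BY variable) changes.
   people_dedup is a hypothetical output name. */
DATA people_dedup;
    SET people_sorted;
    BY fname lname dob;
    IF LAST.dob;
RUN;
```

The result comes out ordered by fname, lname and dob rather than in the original row order. Doing the selection in a DATA step avoids relying on which duplicate a second NODUPKEY sort happens to keep.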