How do you deduplicate based on the most recent date? I have duplicate IDs, which may or may not carry the same information from week to week. I want to retain, for each unique ID, the record with the most recent date/time stamp (e.g., 12/8/2009 10:20:53 AM). Thanks.
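PROC SORT data=total;
  by date_time ID;
run;
/* List IDs that occur more than once. The PROC FREQ output data set holds only ID, COUNT and PERCENT, so date_time cannot be kept here. */
PROC FREQ DATA=total noprint;
  table ID / out=ID_DUPS (keep=ID Count where=(Count > 1));
run;
PROC PRINT DATA=ID_DUPS;
run;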
You have a couple of options. One is to perform two sorts: the first with a BY ID DESCENDING DATE; statement (DATE being your date/time variable), followed by a second PROC SORT with NODUPKEY EQUALS and a BY ID; statement.
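A minimal sketch of the two-sort approach, assuming the data set is named total and the timestamp variable is date_time, as in the original post. The output names total_sorted and latest_nodupkey are made up for illustration. It also assumes date_time is stored as a numeric SAS datetime value; a character string such as "12/8/2009 10:20:53 AM" would not sort chronologically.

```sas
/* Step 1: within each ID, put the newest row first.
   Assumes date_time is a numeric SAS datetime, not a character string. */
proc sort data=total out=total_sorted;
  by ID descending date_time;
run;

/* Step 2: keep one row per ID. NODUPKEY keeps the first row of each
   BY group, and EQUALS preserves the incoming order within the group,
   so the row that survives is the newest one from step 1. */
proc sort data=total_sorted out=latest_nodupkey nodupkey equals;
  by ID;
run;
```

Writing to separate output data sets with OUT= leaves total untouched, so the result can be compared against the original before anything is overwritten.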
The other option is to sort by ID DESCENDING DATE again and then use a SAS DATA step with a SET and a BY ID; statement, using BY-group processing with IF FIRST.ID to subset your input and capture only the first occurrence of each ID value.
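The DATA step approach might look like the sketch below. It again assumes the data set total and the variable date_time from the original post, with date_time stored as a numeric SAS datetime; the names total_sorted and latest_first are hypothetical.

```sas
/* Order rows by ID, newest timestamp first within each ID */
proc sort data=total out=total_sorted;
  by ID descending date_time;
run;

/* BY-group processing: FIRST.ID is 1 on the first row of each ID group,
   which after the sort above is the most recent row for that ID */
data latest_first;
  set total_sorted;
  by ID;
  if first.ID;   /* subsetting IF: output only the first row per ID */
run;
```

The BY ID; statement in the DATA step only requires the input to be grouped by ID, which the preceding sort guarantees; the descending date_time order within each group is what makes the first row the newest.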
Suggested Google advanced search argument on this topic: