Help using Base SAS procedures

Deduplicating based on most recent date/time

N/A
Posts: 0

How do you deduplicate based on the most recent date? I have duplicate IDs, which may or may not carry the same information from week to week. I want to retain, for each unique ID, the record with the most recent date/time stamp (e.g., 12/8/2009 10:20:53 AM). Thanks.

PROC SORT data=total;
by date_time ID;
run;
PROC FREQ DATA = total noprint;
by date_time;
table ID/ out = ID_DUPS (keep = date_time ID Count where = (Count > 1)) ;
run;
PROC PRINT DATA = ID_DUPS;
run;

Message was edited by: jcis
Super Contributor
Posts: 3,174

Re: Deduplicating based on most recent date/time

You have a couple of options. One is to perform two sorts: the first with a BY ID DESCENDING DATE; statement, followed by a second PROC SORT with the NODUPKEY and EQUALS options and a BY ID; statement.
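A minimal sketch of the two-sort approach, assuming the data set total and the variables ID and date_time from the original question stand in for DATE here. The output data set names total_sorted and total_latest are hypothetical.

```sas
/* Step 1: order the rows so the newest record for each ID comes first. */
/* total_sorted is a hypothetical output name.                          */
proc sort data=total out=total_sorted;
  by ID descending date_time;
run;

/* Step 2: NODUPKEY keeps the first row in each ID group; EQUALS keeps  */
/* the input order within a group, so that first row is the newest one. */
/* total_latest is a hypothetical output name.                          */
proc sort data=total_sorted out=total_latest nodupkey equals;
  by ID;
run;
```

Writing to separate output data sets leaves total untouched, which makes it easy to compare row counts before and after.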

The other option is to sort again with BY ID DESCENDING DATE; and then use a SAS DATA step with a SET statement and a BY ID; statement. BY-group processing with a subsetting IF FIRST.ID; then keeps only the first occurrence of each ID value.
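The DATA step option might look roughly like this, again assuming the data set total and the variables ID and date_time from the original question. The names sorted_by_id and latest_per_id are hypothetical.

```sas
/* Sort so the newest record for each ID is first within its group. */
/* sorted_by_id is a hypothetical output name.                      */
proc sort data=total out=sorted_by_id;
  by ID descending date_time;
run;

/* FIRST.ID is 1 only on the first row of each ID group, so the     */
/* subsetting IF outputs just the most recent record per ID.        */
/* latest_per_id is a hypothetical output name.                     */
data latest_per_id;
  set sorted_by_id;
  by ID;
  if first.ID;
run;
```

Compared with a second sort, the DATA step makes it easy to add further logic per ID, such as counting how many duplicates were dropped.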

Scott Barry

Suggested Google advanced search argument on this topic:

data step by group processing site:sas.com
N/A
Posts: 0

Re: Deduplicating based on most recent date/time

Thanks! Will try that.
Super Contributor
Posts: 3,174

Re: Deduplicating based on most recent date/time

Please consider consolidating the two current posts. They are closely related; the only difference is, perhaps, date versus date/time.