deleted_user
Not applicable
How do you deduplicate based on most recent date? I have duplicate IDs which may or may not have the same information from week to week. I want to retain the information for each unique ID with the most recent date/time stamp (e.g., 12/8/2009 10:20:53 AM). Thanks.

PROC SORT data=total;
by date_time ID;
run;
PROC FREQ DATA = total noprint;
by date_time;
table ID/ out = ID_DUPS (keep = date_time ID Count where = (Count > 1)) ;
run;
PROC PRINT DATA = ID_DUPS;
run;

Message was edited by: jcis
sbb
Lapis Lazuli | Level 10
You have a couple of options. One is to perform two sorts: the first with a BY ID DESCENDING DATE; statement, followed by a second PROC SORT with the NODUPKEY and EQUALS options and a BY ID; statement.
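
A minimal sketch of the two-sort approach, using the data set and variable names from the original post (total, ID, date_time). The output data set names total_sorted and total_latest are made up for illustration, and the sketch assumes date_time is a numeric SAS datetime value; a character timestamp would sort as text and would need converting first.

```sas
/* First sort: within each ID, put the most recent date_time first. */
/* Assumes date_time is a numeric SAS datetime, not a character string. */
proc sort data=total out=total_sorted;  /* total_sorted is a hypothetical name */
  by ID descending date_time;
run;

/* Second sort: NODUPKEY keeps the first observation for each ID.   */
/* EQUALS preserves the incoming order within each BY group, so the */
/* observation that is kept is the most recent one.                 */
proc sort data=total_sorted out=total_latest nodupkey equals;
  by ID;
run;
```

Writing to separate OUT= data sets leaves the original total data set untouched, which makes it easier to check the result against the input.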

The other option is to again sort with BY ID DESCENDING DATE; and then use a SAS DATA step with a SET and a BY ID; statement, using BY-group processing with IF FIRST.ID to subset your input and capture only the first occurrence of each ID value.
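
The DATA step option might look something like the sketch below. It reuses the names from the original post (total, ID, date_time); total_sorted and total_latest are hypothetical output names, and date_time is again assumed to be a numeric SAS datetime value.

```sas
/* Sort so the most recent date_time comes first within each ID. */
/* Assumes date_time is a numeric SAS datetime, not a character string. */
proc sort data=total out=total_sorted;  /* total_sorted is a hypothetical name */
  by ID descending date_time;
run;

/* BY-group processing: FIRST.ID is 1 only on the first observation */
/* of each ID, which after the sort is the most recent record.      */
data total_latest;
  set total_sorted;
  by ID;
  if first.ID;  /* subsetting IF: keep one observation per ID */
run;
```

Compared with the NODUPKEY approach, the DATA step makes it easy to add further logic while each BY group is being read, such as keeping a count of the records per ID.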

Scott Barry

Suggested Google advanced search argument on this topic:

data step by group processing site:sas.com
deleted_user
Not applicable
Thanks! Will try that.
sbb
Lapis Lazuli | Level 10
Please consider consolidating the two current posts. They are very much related, with the only difference being, maybe, date versus date/time.
