Help with removing duplicates

N/A
Posts: 0

Hi,
I am fairly new to SAS and need some assistance. My data set (Test) contains many duplicates, and I need to remove them based on a date field. Neither the NODUP nor the NODUPKEY option of PROC SORT gives me the results I need. The Test data set contains a list of accounts, with acctnum as the primary identifier. Multiple records are coming back for each account, and I only want to keep the record with the most recent date.
Can someone please help?
Thanks.
Super Contributor
Posts: 3,174

Re: Help with removing duplicates

Explore using two sorts. The first sort puts the record you want to keep at the beginning of each group, ahead of any duplicates, by using DESCENDING in the BY list. The second sort uses fewer BY variables and specifies EQUALS on the PROC SORT statement, so that records within each BY group keep the relative order established by the first sort.
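The two-sort approach above can be sketched as follows. This is only a sketch under assumptions: the date variable is taken to be named date (the original post does not give its name), the output data set names test_sorted and latest are hypothetical, and NODUPKEY is assumed to be the option used in the second sort to drop the duplicates.

```sas
/* Step 1: within each account, put the most recent record first. */
/* "date" is an assumed variable name; substitute your own.       */
proc sort data=test out=test_sorted;
  by acctnum descending date;
run;

/* Step 2: sort on acctnum alone. NODUPKEY keeps the first record */
/* in each BY group, and EQUALS preserves the order from step 1,  */
/* so the record kept is the one with the most recent date.       */
proc sort data=test_sorted out=latest nodupkey equals;
  by acctnum;
run;
```

If two records for the same account share the most recent date, only one of them is kept.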

Another option is to use PROC SORT to get your data in the proper order (with the appropriate BY statement variables and, again, using DESCENDING in the BY list).

Then use a DATA step with a BY statement that lists the sort variables, and test the group boundary with IF FIRST. (or maybe IF LAST.). Whether to use FIRST. or LAST. depends on how you sort your input file (with or without DESCENDING).
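As an illustration of the LAST. variant, here is a minimal sketch that sorts in ascending order and keeps the last record of each account. The variable name date and the data set names test_sorted and latest are assumptions, not taken from the original post.

```sas
/* Ascending sort: the most recent record ends up last in each account. */
/* "date" is an assumed variable name; substitute your own.             */
proc sort data=test out=test_sorted;
  by acctnum date;
run;

/* LAST.acctnum is 1 only on the final record of each acctnum group, */
/* so the subsetting IF keeps one record per account.                */
data latest;
  set test_sorted;
  by acctnum date;
  if last.acctnum;
run;
```

With DESCENDING on the date variable in both BY statements, the same step would use IF FIRST.acctnum instead.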

The SAS support website (http://support.sas.com/) hosts SAS documentation as well as supplemental technical papers and conference materials. Here are a few Google advanced search strings you can use to find discussions and example code on this topic:

remove duplicates equals site:sas.com

by first last processing site:sas.com


Also, this topic has been discussed on the SAS Discussion Forums, if you want to search the archives.

Scott Barry
SBBWorks, Inc.
Occasional Contributor
Posts: 8

Re: Help with removing duplicates

I agree with sbb.

In code, it can be as simple as:

proc sort data=test;
  by id descending date;
run;

data lastdate;
  set test;
  by id descending date;
  if first.id;  /* keep only the most recent record for each id */
run;

Of course, double-check your data and make sure you get what you want!
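One possible way to do that check, offered as a sketch rather than a required step, is to compare the number of rows in the result with the number of distinct ids. It assumes the lastdate data set and the id variable from the code above; the column aliases n_rows and n_ids are made up for illustration.

```sas
/* If the de-duplication worked, both counts should be equal: */
/* one row per distinct id in lastdate.                       */
proc sql;
  select count(*)           as n_rows,
         count(distinct id) as n_ids
  from lastdate;
quit;
```

If n_rows is larger than n_ids, some id still has more than one record in the output.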