Remove Duplicate Rows with priority conditions

Accepted Solution Solved
Reply
Regular Contributor
Posts: 222
Accepted Solution

Remove Duplicate Rows with priority conditions

Hi there....I am trying to remove any duplicates in the dataset. A row or record is considered a duplicate if the ID Number, Last Name and Given Name are identical. However, I would like to keep or retain the row by assigning priority first to the maximum Initial Date. If the Initial Dates for the duplicates are all the same, then priority is given to the maximum Revision Date. If the Revision Dates are all the same them the row with maximum Approval Date is selected and kept in the dataset. I have attached a pre-dataset and a final dataset below that I would like to obtain in the end....Thanks
Id NumberLast NameGiven NameInitial DateRevision DateApproval Date
28871SmithJames201301302013063020130901
28871SmithJames201301302013100220131101
28871SmithJames201301302013100220131115
28873DeckerWilliam201302232013051820130630
28873DeckerWilliam201302252013051920130629
28875DaviesJim201303152013071920130902
28875DaviesJim201303152013071920130904
28875DaviesJim2013031520130730

20130903

Id NumberLast NameGiven NameInitial DateRevision DateApproval Date
28871SmithJames201301302013100220131115
28873DeckerWilliam201302252013051920130629
28875DaviesJim201303152013073020130903

Accepted Solutions
Solution
‎12-08-2013 08:37 PM
PROC Star
Posts: 1,236

Re: Remove Duplicate Rows with priority conditions

Hi,

I think following will do it:

proc sort data=have;

by IDnumber LastName GivenName InitialDate RevisionDate ApprovalDate;

run;

data want;

set have;

by IDnumber LastName GivenName InitialDate RevisionDate ApprovalDate;

if last.GivenName;

run;

HTH!

View solution in original post


All Replies
Solution
‎12-08-2013 08:37 PM
PROC Star
Posts: 1,236

Re: Remove Duplicate Rows with priority conditions

Hi,

I think following will do it:

proc sort data=have;

by IDnumber LastName GivenName InitialDate RevisionDate ApprovalDate;

run;

data want;

set have;

by IDnumber LastName GivenName InitialDate RevisionDate ApprovalDate;

if last.GivenName;

run;

HTH!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 1 reply
  • 207 views
  • 0 likes
  • 2 in conversation