Help using Base SAS procedures

Deleting Duplicate Records But Keep the one with Specific Variable

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Deleting Duplicate Records But Keep the one with Specific Variable

Need SAS procedure!

 

I have a situation where I have duplicate records but there is one variable that is different and desire to keep that specific record. For instance:

 

1)  Customer Name    Customer ID  Address      Customer Type

       Joe Doe                  123              123 Way       Online            (retain)

       Joe Doe                  123              123 Way       In-Store         (delete)

       Ken Moore              456              456 Way       Online           (retain)

       Ken Moore              456              456 Way       In-Store         (delete)

       Lisa Mae                 789              789 Way       In-Store         (retain)

 

I want to keep the "Online" record (if duplicates) and delete the "In-Store" records. However, when there are no duplicates I retain "In-Store" records.

 

Thanks for your help!


Accepted Solutions
Solution
‎01-12-2018 10:23 AM
Super User
Super User
Posts: 9,599

Re: Deleting Duplicate Records But Keep the one with Specific Variable

 

proc sort data=have out=want;
  by customer_name customer_id address descending customer_type;
run;
proc sort data=want nodupkey;
by customer_name customer_id address;
run;

The first sort with descending customer_type is key, it will sort it so that online is always before in-store if present, the second sort with  nodupkey will just take the first observation which should be online if present, or in-store if not.

Note, not tested, post test data in the form of a datastep to get tested code.

 

View solution in original post


All Replies
Solution
‎01-12-2018 10:23 AM
Super User
Super User
Posts: 9,599

Re: Deleting Duplicate Records But Keep the one with Specific Variable

 

proc sort data=have out=want;
  by customer_name customer_id address descending customer_type;
run;
proc sort data=want nodupkey;
by customer_name customer_id address;
run;

The first sort with descending customer_type is key, it will sort it so that online is always before in-store if present, the second sort with  nodupkey will just take the first observation which should be online if present, or in-store if not.

Note, not tested, post test data in the form of a datastep to get tested code.

 

Occasional Contributor
Posts: 7

Re: Deleting Duplicate Records But Keep the one with Specific Variable

This worked! Thank you very much...

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 176 views
  • 0 likes
  • 2 in conversation