I know there have been a lot of posts about ways to delete duplicates, but I can't find exactly what I am looking for.
I have a data set of phone numbers and I am trying to save all of the unique phone numbers for each ID. I want one row per phone number. If a phone number appears multiple times for one record, I want to keep the record with the most recent date.
Here's an example of test data:
ID phone number date
1 (555) 555-5555 5/6/2018
1 (555) 555-5555 6/8/2018
1 (666) 666-6666 9/10/2018
2 (777) 777-7777 4/1/2018
2 (777) 777-7777 5/20/2018
3 (888) 888-8888 6/4/2018
4 (222) 222-2222 5/6/2018
4 (222) 222-2222 6/5/2018
4 (333) 333-3333 8/4/2018
4 (222) 222-2222 10/10/2018
I would like the resulting data set to look like this:
ID phone number date
1 (555) 555-5555 6/8/2018
1 (666) 666-6666 9/10/2018
2 (777) 777-7777 5/20/2018
3 (888) 888-8888 6/4/2018
4 (333) 333-3333 8/4/2018
4 (222) 222-2222 10/10/2018
Where within one ID, each phone number only appears once. If the phone number appears for different dates in the original data, then the most recent date is the record that is kept.
Thanks in advance for any help you can provide!
This approach requires that you have real dates, and not just character strings:
proc sort data=have;
by id phone_number date;
run;
data want;
set have;
by id phone_number date;
if last.phone_number;
run;
This approach requires that you have real dates, and not just character strings:
proc sort data=have;
by id phone_number date;
run;
data want;
set have;
by id phone_number date;
if last.phone_number;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.