BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
katblack
Fluorite | Level 6

I know there have been a lot of posts about ways to delete duplicates, but I can't find exactly what I am looking for. 

 

I have a data set of phone numbers and I am trying to save all of the unique phone numbers for each ID. I want one row per phone number. If a phone number appears multiple times for one record, I want to keep the record with the most recent date. 

 

Here's an example of test data:


ID    phone number      date
1     (555) 555-5555     5/6/2018
1     (555) 555-5555     6/8/2018
1     (666) 666-6666     9/10/2018
2     (777) 777-7777     4/1/2018
2     (777) 777-7777     5/20/2018
3     (888) 888-8888     6/4/2018
4     (222) 222-2222     5/6/2018
4     (222) 222-2222     6/5/2018
4     (333) 333-3333     8/4/2018
4     (222) 222-2222     10/10/2018

 

I would like the resulting data set to look like this:

 

ID     phone number      date
1      (555) 555-5555     6/8/2018
1      (666) 666-6666     9/10/2018
2      (777) 777-7777     5/20/2018
3      (888) 888-8888     6/4/2018
4      (333) 333-3333     8/4/2018
4      (222) 222-2222     10/10/2018

 

Where within one ID, each phone number only appears once. If the phone number appears for different dates in the original data, then the most recent date is the record that is kept. 

 

Thanks in advance for any help you can provide! 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
Opal | Level 21

This approach requires that you have real dates, and not just character strings:

 

proc sort data=have;
  by id phone_number date;
run;

data want;
   set have;
   by id phone_number date;
   if last.phone_number;
run;

View solution in original post

2 REPLIES 2
Astounding
Opal | Level 21

This approach requires that you have real dates, and not just character strings:

 

proc sort data=have;
  by id phone_number date;
run;

data want;
   set have;
   by id phone_number date;
   if last.phone_number;
run;
ed_sas_member
Meteorite | Level 14

Hi @katblack 

 

Another approach could be:

 

proc sql;
	create table want as
	select ID, phone_number, max(date) as date format=MMDDYY10.
	from have
	group by ID, phone_number;
quit;

Capture d’écran 2020-01-04 à 14.30.18.png

 

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 358 views
  • 0 likes
  • 3 in conversation