BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Emma_at_SAS
Lapis Lazuli | Level 10

Based on duplicate Names and Addresses, I had identified duplicate observations (some individuals are repeated more than twice) and had created a subset of my dataset.

Each row of my dataset has unique ID values. I want to get the list of the ID values associated with 2nd, 3rd, 4th, ... repetitions of the DUP_OBS to use it in my original dataset and say remove the rows where ID in (the list of ID values) 

DATA WANT;

     SET HAVE;

     WHERE ID NOT IN ("313086590", "110616230", "570208047", ..., "359502732");

RUN; *my ID variable is defined as a CHAR variable;

Eventually, I want to remove the bold rows in the example table below.

How may I identify these ID values in my data?

Thank you for your help.

ID DUP_OBS
165577390 1
313086590 1
181429833 2
165924338 2
110616230 2
520386083 3
169961292 3
570208047 3
521476491 3
393097972 4
359502732 4
1 ACCEPTED SOLUTION

Accepted Solutions
maguiremq
SAS Super FREQ

This gets you what you want, but I'm not sure how you're sorting before this step:

 

proc sort
	data = have 
	out = have_nodup nodupkey
	dupout = have_dup;
		by dup_obs;
run;
id dup_obs 
165577390 1 
181429833 2 
520386083 3 
393097972 4 

View solution in original post

2 REPLIES 2
maguiremq
SAS Super FREQ

This gets you what you want, but I'm not sure how you're sorting before this step:

 

proc sort
	data = have 
	out = have_nodup nodupkey
	dupout = have_dup;
		by dup_obs;
run;
id dup_obs 
165577390 1 
181429833 2 
520386083 3 
393097972 4 
Emma_at_SAS
Lapis Lazuli | Level 10

Thank you @maguiremq . 

I used a similar method to get to the subset of data with duplicate names and addresses. I need to repeat the step as you suggested. The have_dup dataset would give me the ID values I need to remove from my original data. Thanks

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 920 views
  • 0 likes
  • 2 in conversation