List duplicate observations?

Super Contributor
Posts: 297

Hello:

I would like to list the observations that have duplicate values in the 'found' column. Please advise how. Thanks.

Best,

data Founddup;
  informat name $80.;
  input name $ found;
  cards;
If_True 1
If_True_kary 1
If_True_kary 1
If_True_John 3
If_Not 24
If_Not 24
If_Not_Carol 24
If_Not_Carol 24
If_Not_Carol 24
If_False_Joe 288
If_False_Joe 288
;
run;


Accepted Solutions
Solution
‎07-10-2017 01:05 PM
PROC Star
Posts: 7,363

Re: List duplicate observations?

data want;
  set Founddup;
  by found notsorted;
  if not(first.found and last.found);
run;
proc print data=want;
run;

Art, CEO, AnalystFinder.com
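To see why the subsetting IF above keeps only the duplicates, it can help to expose the automatic first.found and last.found flags. The following is only a sketch: the data set name flags and the variables first_found, last_found, and is_dup are illustrative names that do not appear in the original post.

```sas
/* Sketch: copy the automatic BY-group flags into real variables */
data flags;
  set Founddup;
  by found notsorted;          /* groups are runs of consecutive equal values */
  first_found = first.found;   /* 1 on the first row of each run */
  last_found  = last.found;    /* 1 on the last row of each run */
  /* Only a run of length one has both flags set on the same row */
  is_dup = not (first.found and last.found);
run;

proc print data=flags;
run;
```

With the sample data, the only row where both flags equal 1 is If_True_John 3, so it is the only row the accepted solution drops. Because of NOTSORTED, equal values that are not adjacent would be treated as separate runs.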

 


All Replies

Super Contributor
Posts: 297

Re: List duplicate observations?

Just curious: is there a way to sort the duplicate observations by 'found' after the 'not(first.found and last.found)' statement? Thanks.

PROC Star
Posts: 7,363

Re: List duplicate observations?

Not sure what you're asking. My suggestion was based on the existing order of the records. If you need the output sorted, I'd go with the PROC SORT NOUNIQUEREC option that @Reeza suggested.

 

Art, CEO, AnalystFinder.com

 

Super Contributor
Posts: 297

Re: List duplicate observations?

Well, the proc sort code doesn't work in my actual dataset. Yours works. Basically, I would like to sort the duplicate numbers from zero to largest. I found that your code doesn't do this, so I need to add a proc sort as one more step. I wish I could do it in one data step.

 

data want;
  set Founddup;
  by found notsorted;
  if not(first.found and last.found);
run;

proc sort data=want;
  by found;
run;
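For what it's worth, the filter and the sort can be combined in a single step with PROC SQL rather than a DATA step. This is a sketch, not something proposed in the thread; the output name want_sorted is an assumption made for illustration.

```sas
/* Sketch: keep rows whose 'found' value occurs more than once, sorted by found */
proc sql;
  create table want_sorted as
  select *
  from Founddup
  group by found
  having count(*) > 1      /* drops values of found that occur only once */
  order by found, name;
quit;
```

SAS writes a log note about remerging summary statistics for this query, which is expected here. Unlike the BY ... NOTSORTED approach, the count is taken over the whole data set, so equal values of found are treated as duplicates even when they are not on adjacent rows.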

Super User
Posts: 17,829

Re: List duplicate observations?


ybz12003 wrote:

Well, the proc sort code doesn't work in my actual dataset.


It's not clear who you're responding to; please quote the original post in your response.

PROC SORT will work for your situation in a single step. If it doesn't, you're doing something wrong.

PROC Star
Posts: 7,363

Re: List duplicate observations?

Couldn't you just use:

proc sort data=have out=duplicates  nouniquerec;
  by found name;
run;

Art, CEO, AnalystFinder.com

 

Super User
Posts: 17,829

Re: List duplicate observations?

The NOUNIQUEREC option in PROC SORT does exactly this. You can also use NOUNIQUEKEY if you're identifying duplicates by specific variables.

 

/*This code demonstrates how to keep only duplicate observations in a data set*/

%*Create sample data set;
data have;
informat name $80.;
input name $ found;
cards;
If_True 1
If_True_kary 1
If_True_kary 1
If_True_John 3
If_Not 24
If_Not 24
If_Not_Carol 24
If_Not_Carol 24
If_Not_Carol 24
If_False_Joe 288
If_False_Joe 288
;
run;

%*Sort with NOUNIQUEREC option;
proc sort data=have out=duplicates  nouniquerec;
by name found;
run;

https://gist.githubusercontent.com/statgeek/1ff5a4aaca9b3f875d4defdaa598ae5e/raw/87a7485a46add02b5f0...
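Since the original question is about duplicates in the found column alone, the NOUNIQUEKEY variant mentioned above may be the closer match. A minimal sketch, assuming the have data set from the example; the output names dup_found and single_found are illustrative, and UNIQUEOUT= is optional.

```sas
/* Sketch: keep observations whose BY value (found) occurs more than once */
proc sort data=have
          out=dup_found            /* observations with a repeated found value */
          uniqueout=single_found   /* observations whose found value is unique */
          nouniquekey;
  by found;
run;
```

With the sample data, If_True_John 3 should be the only observation routed to single_found, and dup_found comes out ordered by found in the same step.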
