deleted_user
Not applicable
I'm working on a large clinical dataset. I'd like to extract duplicates into a new table. Any idea on how to do this?
prholland
Fluorite | Level 6
Not really an EG issue, but you could put this into a Code node:

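/* Sort so that records sharing the same key values are adjacent. */
PROC SORT DATA = inputdsn OUT = temp;
  BY var1 var2 var3;
RUN;

/* Keep the last record of each key group in "unique"; send the rest to "duplicates". */
DATA unique duplicates;
  SET temp;
  BY var1 var2 var3;
  IF NOT LAST.var3 THEN OUTPUT duplicates;
  ELSE OUTPUT unique;
RUN;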

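"var1 var2 var3" are the variables used to identify the duplicated records. Within each group of records that share the same values of these variables, the last record goes to the "unique" data set and every other record goes to the "duplicates" data set. So "unique" ends up with exactly one record per key combination, and "duplicates" holds the surplus copies.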
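
If the goal is instead to pull out every record whose key occurs more than once (including the copy that the step above keeps in "unique"), a variant that tests both FIRST. and LAST. may be closer to what is needed. This is only a sketch: it assumes "temp" has already been sorted as above, and the output names "singles" and "all_dupes" are made up for illustration.

```sas
/* Sketch: assumes "temp" is already sorted BY var1 var2 var3. */
/* "singles" and "all_dupes" are hypothetical data set names.  */
DATA singles all_dupes;
  SET temp;
  BY var1 var2 var3;
  /* A record that is both first and last in its BY group has no duplicates. */
  IF FIRST.var3 AND LAST.var3 THEN OUTPUT singles;
  ELSE OUTPUT all_dupes;
RUN;
```

With this version, "all_dupes" contains all copies of any duplicated key, which can be handy when the copies need to be compared side by side.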

Is this what you were looking for?

.............Phil

Message was edited by: prholland
deleted_user
Not applicable
It's just what I'm looking for, although I was hoping there would be a feature in Enterprise Guide that would do it...
Colin
Calcite | Level 5
If your client is version 9, then you can use the DUPOUT= option on PROC SORT:

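/* Build a test data set: x = 1 to 6, followed by repeats of x = 1 and x = 2. */
data in;
  do x=1 to 6; output; end;
  do x=1 to 2; output; end;
run;

/* NODUPKEY keeps one record per value of x in "out"; DUPOUT= writes the removed records to "dupes". */
proc sort data=in out=out nodupkey dupout=dupes;
  by x;
run;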
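
Applied to the original question, the same idea might look like the sketch below. It reuses the placeholder names from the earlier reply ("inputdsn", "var1 var2 var3", "unique", "duplicates"), which are assumptions rather than real names from the clinical data.

```sas
/* Sketch: inputdsn and var1-var3 are placeholder names from the earlier reply. */
/* One record per key combination is written to "unique";                       */
/* the records removed by NODUPKEY are written to "duplicates".                 */
PROC SORT DATA=inputdsn OUT=unique NODUPKEY DUPOUT=duplicates;
  BY var1 var2 var3;
RUN;
```

One difference from the DATA step approach: NODUPKEY retains the first record of each BY group, whereas the IF NOT LAST.var3 logic retains the last, so the two methods can keep different copies when the duplicated records differ in their other variables.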

Colin
