NonSleeper
Quartz | Level 8

My data is identified by a group of variables, say three. No single variable is unique on its own, but together the three should identify a unique observation. It looks like this:

Var1     Var2     Var3
John     PHIL     PA
Mike     PHIL     PA
John     CHIC     IL
John     PHIL     PA

Observations 1 and 4 are duplicates, so my question is: how can I identify duplicates in this data?

If it's some single ID variable I can do:

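data dups nodups;
  set have;   /* assumes have is already sorted by ID */
  by ID;
  /* an ID that is both first and last in its BY group occurs only once */
  if first.ID and last.ID then output nodups;
  else output dups;
run;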

But I'm not sure how to do this when the key is several variables. I could handle them one by one, but I wonder if there's a more efficient way.

1 ACCEPTED SOLUTION

AskoLötjönen
Quartz | Level 8

Just sort by all three vars:

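proc sort data=have;
  by var1 var2 var3;
run;

data dups nodups;
  set have;
  by var1 var2 var3;
  /* first.var3 and last.var3 also reflect changes in var1 and var2, */
  /* so both are 1 only when the full three-variable key occurs once */
  if first.var3 and last.var3 then output nodups;
  else output dups;
run;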
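
To see why testing only the last BY variable is enough, here is a minimal sketch run against the four sample rows from the question. The dataset names have_sorted, dups2 and nodups2 are made up for illustration, and the final PROC SORT step is an alternative that assumes SAS 9.3 or later, where the NOUNIQUEKEY and UNIQUEOUT= options are available.

```sas
/* Sample data from the question; the dataset name "have" follows the thread */
data have;
  input var1 $ var2 $ var3 $;
  datalines;
John PHIL PA
Mike PHIL PA
John CHIC IL
John PHIL PA
;
run;

/* BY-group processing needs the data sorted by the same keys.          */
/* have_sorted is a hypothetical name, used so "have" is not replaced.  */
proc sort data=have out=have_sorted;
  by var1 var2 var3;
run;

data dups nodups;
  set have_sorted;
  by var1 var2 var3;
  /* first.var3 is 1 when var1, var2 or var3 differs from the previous row; */
  /* last.var3 is 1 when any of them differs on the next row.               */
  /* Both being 1 means this key combination occurs exactly once.           */
  if first.var3 and last.var3 then output nodups;
  else output dups;
run;

/* Alternative sketch (assumes SAS 9.3 or later): PROC SORT does the split. */
/* Rows with a repeated key go to OUT=, rows with a unique key to UNIQUEOUT=. */
proc sort data=have nouniquekey out=dups2 uniqueout=nodups2;
  by var1 var2 var3;
run;
```

With this sample, nodups should hold the John/CHIC/IL and Mike/PHIL/PA rows, and dups should hold both John/PHIL/PA rows. Note that this keeps every copy of a repeated key in dups, which is different from PROC SORT with NODUPKEY, where the first copy is kept in the output and only the extra copies are set aside.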


2 REPLIES

NonSleeper
Quartz | Level 8

Oh... Wow... Yeah...

I think I'm gonna go home. Well, no : )
