How to identify duplicates for data with IDs as a group of variables

My data is identified by a group of variables, say three. Each variable is not necessarily unique on its own, but together the three variables identify a unique observation. It looks like this:

Var1   Var2   Var3
John   PHIL   PA
Mike   PHIL   PA
John   CHIC   IL
John   PHIL   PA

Observations 1 and 4 are duplicates, so the question is: how can I identify duplicates in this data?

With a single ID variable I can do:

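/* assumes HAVE is already sorted by ID */
data dups nodups;
  set have;
  by ID;
  /* an ID that is both first and last in its BY group occurs only once */
  if first.ID and last.ID then output nodups;
  else output dups;
run;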

But I'm not sure how to do it in this situation. I can sort them out one by one, but I wonder if there's a more efficient way.


Accepted Solution (06-04-2015 02:38 AM)

Re: How to identify duplicates for data with IDs as a group of variables

Just sort by all three vars:

proc sort data=have;
  by var1 var2 var3;
run;

data dups nodups;
  set have;
  by var1 var2 var3;
  /* first.var3 and last.var3 flag the boundaries of each var1*var2*var3 combination */
  if first.var3 and last.var3 then output nodups;
  else output dups;
run;
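
For anyone who wants to try this on the sample data from the question, here is a minimal sketch. The dataset names sorted, unique and extra are just placeholders I made up, and the second step (PROC SORT with NODUPKEY and DUPOUT=) is an alternative I'm adding, not part of the answer above. Note the two approaches split the data differently: the DATA step sends every member of a duplicated group to dups, while NODUPKEY keeps the first row of each key and sends only the extra copies to the DUPOUT= dataset.

```sas
/* Sample data from the question */
data have;
  input var1 $ var2 $ var3 $;
  datalines;
John PHIL PA
Mike PHIL PA
John CHIC IL
John PHIL PA
;
run;

/* Approach 1: sort by all three keys, then use first./last. on the LAST BY variable */
proc sort data=have out=sorted;   /* "sorted" is a placeholder name */
  by var1 var2 var3;
run;

data dups nodups;
  set sorted;
  by var1 var2 var3;
  if first.var3 and last.var3 then output nodups;  /* key combination occurs once */
  else output dups;                                /* key combination occurs 2+ times */
run;
/* nodups: John CHIC IL, Mike PHIL PA   dups: both John PHIL PA rows */

/* Approach 2 (alternative, different split): keep one row per key,
   write only the surplus copies to a separate dataset */
proc sort data=have out=unique nodupkey dupout=extra;  /* placeholder names */
  by var1 var2 var3;
run;
/* unique: 3 rows (one per key)   extra: 1 row (the second John PHIL PA) */
```

Checking first. and last. on the last BY variable works because SAS sets first.var3 whenever var3 or any BY variable listed before it changes, so it marks the start of each full key combination.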

Re: How to identify duplicates for data with IDs as a group of variables

Oh...Wao...Yeh...

I think I'm gonna go home. Well, no : )
