Duplicate value within by group

Regular Contributor
Posts: 215

Hi All,

Can someone tell me how to get the duplicate values within a BY group? Thanks,

 

ID     Subject
100    English
100    Math
100    Biology
111    English
111    Biology
111    Math
111    Math
111    Biology
112    Chemistry
112    English
112    Physics
112    Math
112    English

 

The output table should be:

 

ID     Subject
111    Biology
111    Biology
112    English
112    English
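
To try the replies in this thread, the sample data first has to exist as a dataset. A minimal sketch, assuming the dataset is named have (the name the replies use), that ID is numeric, and that Subject is a character variable of at most 10 characters (these types are assumptions, the question does not state them):

```sas
/* Sample data from the question.                            */
/* Assumptions: dataset name HAVE, numeric ID, Subject $10.  */
data have;
  input ID Subject :$10.;
  datalines;
100 English
100 Math
100 Biology
111 English
111 Biology
111 Math
111 Math
111 Biology
112 Chemistry
112 English
112 Physics
112 Math
112 English
;
run;
```

Note that in this sample 111 Math also occurs twice, so any method that flags duplicates within an ID will report it alongside 111 Biology and 112 English.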

Super User
Posts: 17,870

Re: Duplicate value within by group

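/* Sort so that identical ID/Subject pairs are adjacent */
proc sort data=have;
  by id subject;
run;

/* Keep every observation whose ID/Subject pair occurs more than once */
data dups;
  set have;
  by id subject;
  if not (first.subject and last.subject) then output;
run;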
Regular Contributor
Posts: 234

Re: Duplicate value within by group

proc sort data=have nodupkey dupout=dups;
  by id subject;
run;
Super User
Posts: 17,870

Re: Duplicate value within by group

PROC SORT with NODUPKEY and DUPOUT= captures only the extra copies of each duplicate, not the first occurrence, so you do not get both records.

If you want to identify both using PROC SORT, use the NOUNIQUEKEY option instead.

 

proc sort data=have nouniquekey out=want;
  by id subject;
run;

proc print data=want;
run;
Super User
Posts: 7,413

Re: Duplicate value within by group

Or:

proc sort data=have out=nondups dupout=want nodupkey;
  by id subject;
run;

This gives you a dataset want containing the extra copies of each duplicated ID/Subject pair (every occurrence after the first). You can sort it again with nodupkey to get distinct values. Alternatively, you can use PROC SQL:

proc sql;
create table WANT as
select distinct ID,SUBJECT
from HAVE
group by ID,SUBJECT
having count(*) > 1;
quit;
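
The query above returns one row per duplicated ID/Subject pair. If every duplicate observation is wanted instead, as in the output the question asks for, one possible sketch is to join have back to that list of duplicated pairs. The table name want_all and the aliases a and b are illustrative, not from the thread:

```sas
/* Sketch: return every observation whose ID/Subject pair occurs  */
/* more than once. WANT_ALL, A and B are hypothetical names.      */
proc sql;
  create table want_all as
  select a.ID, a.Subject
  from have as a
       inner join
       /* One row per duplicated ID/Subject pair, as in the query above */
       (select distinct ID, Subject
        from have
        group by ID, Subject
        having count(*) > 1) as b
       on a.ID = b.ID and a.Subject = b.Subject
  order by a.ID, a.Subject;
quit;
```

This should give the same observations as the NOUNIQUEKEY sort or the first./last. DATA step, without requiring have to be sorted first.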