DATA Step, Macro, Functions and more

If duplicate, add flag

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 95
Accepted Solution

If duplicate, add flag

hi,

 

does anyone know how i can flag for duplicate value? I'd like to create a dummy variable where if variable is a duplicate, dummy_var=1, else 0...

 

the variable contains char value.

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

 

Thank you


Accepted Solutions
Solution
‎03-02-2017 03:38 PM
Frequent Contributor
Posts: 75

Re: If duplicate, add flag

data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;

View solution in original post


All Replies
Frequent Contributor
Posts: 75

Re: If duplicate, add flag

[ Edited ]

Try this:

data t;
input ID$;
cards;
a010
a010
a011
a012
a012
;
run;

data t2;
set t;
by ID;
DupFlag=  first.ID ne last.ID; 
run;
Super User
Posts: 5,499

Re: If duplicate, add flag

Posted in reply to nehalsanghvi

That solution would work for the given data, but would fail when you have 3 or more observations for the same ID.  There are many ways to come up with a more robust program, such as:

 

data want;

set have;

by id;

if first.id=0 or last.id=0 then flag=1;

else flag=0;

run;

Frequent Contributor
Posts: 95

Re: If duplicate, add flag

Posted in reply to nehalsanghvi

hi nehalsanghvi ,

 

thanks for tip... I should have mentioned, my data can have more than 1 duplicate

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

Solution
‎03-02-2017 03:38 PM
Frequent Contributor
Posts: 75

Re: If duplicate, add flag

data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;
Super User
Posts: 19,775

Re: If duplicate, add flag

proc sort data=have;
by id;
run;

data want;
set have;
by id;

flag=1;
if first.id and last.id then flag=0; *change flag for unique records;

*flag=0;
*if not (first.id and last.id) flag=1; *change flag for duplicate records;

run;

 

SAS Employee
Posts: 15

Re: If duplicate, add flag

 

data t;

input ID$;

cards;

a010

a010

a010

a011

a012

a012

a013

a013

a013

a013

a014

;

run;

 

proc sql;

create table t2 as

select *

, case when

     ID in(select distinct ID from t group by ID having count(ID)>1) then 1 else 0 end as Dummy_Var

from t;

quit;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 375 views
  • 1 like
  • 5 in conversation