BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brulard
Pyrite | Level 9

hi,

 

does anyone know how i can flag for duplicate value? I'd like to create a dummy variable where if variable is a duplicate, dummy_var=1, else 0...

 

the variable contains char value.

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
nehalsanghvi
Pyrite | Level 9
data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;

View solution in original post

6 REPLIES 6
nehalsanghvi
Pyrite | Level 9

Try this:

data t;
input ID$;
cards;
a010
a010
a011
a012
a012
;
run;

data t2;
set t;
by ID;
DupFlag=  first.ID ne last.ID; 
run;
Astounding
PROC Star

That solution would work for the given data, but would fail when you have 3 or more observations for the same ID.  There are many ways to come up with a more robust program, such as:

 

data want;

set have;

by id;

if first.id=0 or last.id=0 then flag=1;

else flag=0;

run;

brulard
Pyrite | Level 9

hi nehalsanghvi ,

 

thanks for tip... I should have mentioned, my data can have more than 1 duplicate

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

nehalsanghvi
Pyrite | Level 9
data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;
Reeza
Super User
proc sort data=have;
by id;
run;

data want;
set have;
by id;

flag=1;
if first.id and last.id then flag=0; *change flag for unique records;

*flag=0;
*if not (first.id and last.id) flag=1; *change flag for duplicate records;

run;

 

jhlaramore
SAS Employee

 

data t;

input ID$;

cards;

a010

a010

a010

a011

a012

a012

a013

a013

a013

a013

a014

;

run;

 

proc sql;

create table t2 as

select *

, case when

     ID in(select distinct ID from t group by ID having count(ID)>1) then 1 else 0 end as Dummy_Var

from t;

quit;

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 19794 views
  • 1 like
  • 5 in conversation