BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brulard
Pyrite | Level 9

hi,

 

does anyone know how i can flag for duplicate value? I'd like to create a dummy variable where if variable is a duplicate, dummy_var=1, else 0...

 

the variable contains char value.

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
nehalsanghvi
Pyrite | Level 9
data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;

View solution in original post

6 REPLIES 6
nehalsanghvi
Pyrite | Level 9

Try this:

data t;
input ID$;
cards;
a010
a010
a011
a012
a012
;
run;

data t2;
set t;
by ID;
DupFlag=  first.ID ne last.ID; 
run;
Astounding
PROC Star

That solution would work for the given data, but would fail when you have 3 or more observations for the same ID.  There are many ways to come up with a more robust program, such as:

 

data want;

set have;

by id;

if first.id=0 or last.id=0 then flag=1;

else flag=0;

run;

brulard
Pyrite | Level 9

hi nehalsanghvi ,

 

thanks for tip... I should have mentioned, my data can have more than 1 duplicate

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

nehalsanghvi
Pyrite | Level 9
data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;
Reeza
Super User
proc sort data=have;
by id;
run;

data want;
set have;
by id;

flag=1;
if first.id and last.id then flag=0; *change flag for unique records;

*flag=0;
*if not (first.id and last.id) flag=1; *change flag for duplicate records;

run;

 

jhlaramore
SAS Employee

 

data t;

input ID$;

cards;

a010

a010

a010

a011

a012

a012

a013

a013

a013

a013

a014

;

run;

 

proc sql;

create table t2 as

select *

, case when

     ID in(select distinct ID from t group by ID having count(ID)>1) then 1 else 0 end as Dummy_Var

from t;

quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 18161 views
  • 1 like
  • 5 in conversation