BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brulard
Pyrite | Level 9

hi,

 

does anyone know how i can flag for duplicate value? I'd like to create a dummy variable where if variable is a duplicate, dummy_var=1, else 0...

 

the variable contains char value.

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
nehalsanghvi
Pyrite | Level 9
data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;

View solution in original post

6 REPLIES 6
nehalsanghvi
Pyrite | Level 9

Try this:

data t;
input ID$;
cards;
a010
a010
a011
a012
a012
;
run;

data t2;
set t;
by ID;
DupFlag=  first.ID ne last.ID; 
run;
Astounding
PROC Star

That solution would work for the given data, but would fail when you have 3 or more observations for the same ID.  There are many ways to come up with a more robust program, such as:

 

data want;

set have;

by id;

if first.id=0 or last.id=0 then flag=1;

else flag=0;

run;

brulard
Pyrite | Level 9

hi nehalsanghvi ,

 

thanks for tip... I should have mentioned, my data can have more than 1 duplicate

 

Example output:

IDs                  Dummy_Var

a010                     1

a010                     1

a010                     1

a010                     1

a011                     0

a012                     1

a012                     1

nehalsanghvi
Pyrite | Level 9
data t2;
set t;
by ID;
DupFlag=  ^(first.ID and last.ID); 
run;
Reeza
Super User
proc sort data=have;
by id;
run;

data want;
set have;
by id;

flag=1;
if first.id and last.id then flag=0; *change flag for unique records;

*flag=0;
*if not (first.id and last.id) flag=1; *change flag for duplicate records;

run;

 

jhlaramore
SAS Employee

 

data t;

input ID$;

cards;

a010

a010

a010

a011

a012

a012

a013

a013

a013

a013

a014

;

run;

 

proc sql;

create table t2 as

select *

, case when

     ID in(select distinct ID from t group by ID having count(ID)>1) then 1 else 0 end as Dummy_Var

from t;

quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 17052 views
  • 1 like
  • 5 in conversation