hi,
does anyone know how i can flag for duplicate value? I'd like to create a dummy variable where if variable is a duplicate, dummy_var=1, else 0...
the variable contains char value.
Example output:
IDs Dummy_Var
a010 1
a010 1
a011 0
a012 1
a012 1
Thank you
data t2;
set t;
by ID;
DupFlag= ^(first.ID and last.ID);
run;
Try this:
data t;
input ID$;
cards;
a010
a010
a011
a012
a012
;
run;
data t2;
set t;
by ID;
DupFlag= first.ID ne last.ID;
run;
That solution would work for the given data, but would fail when you have 3 or more observations for the same ID. There are many ways to come up with a more robust program, such as:
data want;
set have;
by id;
if first.id=0 or last.id=0 then flag=1;
else flag=0;
run;
hi nehalsanghvi ,
thanks for tip... I should have mentioned, my data can have more than 1 duplicate
Example output:
IDs Dummy_Var
a010 1
a010 1
a010 1
a010 1
a011 0
a012 1
a012 1
data t2;
set t;
by ID;
DupFlag= ^(first.ID and last.ID);
run;
proc sort data=have;
by id;
run;
data want;
set have;
by id;
flag=1;
if first.id and last.id then flag=0; *change flag for unique records;
*flag=0;
*if not (first.id and last.id) flag=1; *change flag for duplicate records;
run;
data t;
input ID$;
cards;
a010
a010
a010
a011
a012
a012
a013
a013
a013
a013
a014
;
run;
proc sql;
create table t2 as
select *
, case when
ID in(select distinct ID from t group by ID having count(ID)>1) then 1 else 0 end as Dummy_Var
from t;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.