hi,
does anyone know how i can flag for duplicate value? I'd like to create a dummy variable where if variable is a duplicate, dummy_var=1, else 0...
the variable contains char value.
Example output:
IDs Dummy_Var
a010 1
a010 1
a011 0
a012 1
a012 1
Thank you
data t2;
set t;
by ID;
DupFlag= ^(first.ID and last.ID);
run;
Try this:
data t;
input ID$;
cards;
a010
a010
a011
a012
a012
;
run;
data t2;
set t;
by ID;
DupFlag= first.ID ne last.ID;
run;
That solution would work for the given data, but would fail when you have 3 or more observations for the same ID. There are many ways to come up with a more robust program, such as:
data want;
set have;
by id;
if first.id=0 or last.id=0 then flag=1;
else flag=0;
run;
hi nehalsanghvi ,
thanks for tip... I should have mentioned, my data can have more than 1 duplicate
Example output:
IDs Dummy_Var
a010 1
a010 1
a010 1
a010 1
a011 0
a012 1
a012 1
data t2;
set t;
by ID;
DupFlag= ^(first.ID and last.ID);
run;
proc sort data=have;
by id;
run;
data want;
set have;
by id;
flag=1;
if first.id and last.id then flag=0; *change flag for unique records;
*flag=0;
*if not (first.id and last.id) flag=1; *change flag for duplicate records;
run;
data t;
input ID$;
cards;
a010
a010
a010
a011
a012
a012
a013
a013
a013
a013
a014
;
run;
proc sql;
create table t2 as
select *
, case when
ID in(select distinct ID from t group by ID having count(ID)>1) then 1 else 0 end as Dummy_Var
from t;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.