03-02-2017 03:24 PM
Does anyone know how I can flag duplicate values? I'd like to create a dummy variable where, if the value of a variable is duplicated, dummy_var=1, else 0.
The variable contains character values.
03-02-2017 03:28 PM - edited 03-02-2017 03:30 PM
data t;
  input ID$;
  cards;
a010
a010
a011
a012
a012
;
run;

data t2;
  set t;
  by ID;
  DupFlag= first.ID ne last.ID;
run;
03-02-2017 03:32 PM
That solution would work for the given data, but would fail when you have 3 or more observations for the same ID. There are many ways to come up with a more robust program, such as:
if first.id=0 or last.id=0 then flag=1; else flag=0;
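To make the failure case concrete, here is a small sketch that puts both conditions side by side. The data set names t3 and check, the variables F, L, NaiveFlag and RobustFlag, and the three-row a010 sample are made up for illustration; the data is assumed to be already sorted by ID.

```sas
/* Hypothetical sample: a010 occurs three times, a011 once */
data t3;
  input ID $;
  cards;
a010
a010
a010
a011
;
run;

data check;
  set t3;
  by ID;
  F = first.ID;                             /* 1 on the first row of each ID group */
  L = last.ID;                              /* 1 on the last row of each ID group  */
  NaiveFlag  = (first.ID ne last.ID);       /* middle row has F=0 and L=0, so this is 0 there */
  RobustFlag = not (first.ID and last.ID);  /* 1 on every row of a group with 2+ rows */
run;

/* a010 rows: NaiveFlag = 1, 0, 1 and RobustFlag = 1, 1, 1 */
/* a011 row : NaiveFlag = 0       and RobustFlag = 0       */
```

The middle a010 row is neither first nor last in its group, which is why comparing first.ID with last.ID misses it.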
03-02-2017 03:34 PM
Hi nehalsanghvi,
Thanks for the tip. I should have mentioned that my data can have more than one duplicate of the same value.
03-02-2017 03:42 PM
proc sort data=have;
  by id;
run;

data want;
  set have;
  by id;
  flag=1;
  if first.id and last.id then flag=0; *change flag for unique records;
  *Alternative: start from 0 and set 1 for duplicate records;
  *flag=0;
  *if not (first.id and last.id) then flag=1;
run;
03-02-2017 04:19 PM
proc sql;
create table t2 as
select *
, case when
ID in(select distinct ID from t group by ID having count(ID)>1) then 1 else 0 end as Dummy_Var
from t;
quit;
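As a possible variation on the SQL idea, the per-ID count can be computed once in an inline view and joined back to the rows. This is only a sketch: the output table name flagged and the column n are hypothetical, and the sample data is the one from the first reply in this thread.

```sas
/* Sample data from the first reply in this thread */
data t;
  input ID $;
  cards;
a010
a010
a011
a012
a012
;
run;

proc sql;
  create table flagged as
  select a.*,
         (b.n > 1) as Dummy_Var        /* Boolean expression evaluates to 1 or 0 */
  from t as a
       inner join
       (select ID, count(*) as n      /* one row per ID with its frequency */
          from t
         group by ID) as b
       on a.ID = b.ID;
quit;
```

Unlike the BY-group data step, this does not need the input sorted first, but PROC SQL does not promise to keep the original row order, so add an ORDER BY clause if the order matters.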