if duplicate, check if var1 has content, if no then flag

Solved
Frequent Contributor
Posts: 123

if duplicate, check if var1 has content, if no then flag

[ Edited ]

hi,

can someone recommend some code so that sas check for condition: if value IDs is duplicate [could be more than 1 duplicate], then scan Var1 for content, if no content, create new_dummy_var=1. If value IDs not duplicate, skip.

Var1 is numeric.

UPDATING Example outcome:

IDs          Var 1      new_dummy_var

A001           5

A001           6

A001           5

A002                            1

A002                            1

A003           5

A003

A004

**Update: In the example outcome, note that because one of the duplicate observation A003 has a 5 in Var1, the program should move on, and not assign a value in New_dummy_var.

**New_dummy_var is only meant to be a flag, so not care if char or numeric.

Thank you very much

Dave

Accepted Solutions
Solution
‎03-07-2017 08:07 AM
PROC Star
Posts: 8,149

Re: if duplicate, check if var1 has content, if no then flag

This can also be done in a data step:

```data want (drop=_:);
do until (last.IDs);
set have;
by IDs;
if first.IDs then do;
_Vars=0;
_Recs=0;
end;
_Recs+1;
if not missing(var) then _Vars+1;
end;
do until (last.IDs);
set have;
by IDs;
if _Recs GT 1 and _Vars=0 then new_dummy_var=1;
output;
end;
run;
```

Art, CEO, AnalystFinder.com

All Replies
Frequent Contributor
Posts: 75

Re: if duplicate, check if var1 has content, if no then flag

[ Edited ]

Expand on the previous solution @brulard

``````data t;
input ID\$ Var1\$;
cards;
a001 A
a001 A
a001 A
a002 .
a002 .
a003 .
a004 .
;
run;

data t2;
set t;
by id;
new_dummy_var= ^(first.ID and last.ID) and var1='';
run;``````

Edit: I created Var1 as a character variable, but if yours is numeric, this statement would work:

``new_dummy_var= ^(first.ID and last.ID) and var1=.;``

The missing(var1) function that Art297 used below works with either character or numeric data type.

PROC Star
Posts: 8,149

Re: if duplicate, check if var1 has content, if no then flag

```data want;
set have;
by ID;
if not(first.id and last.id) then do;
if missing(Var1) then new_dummy_var=1;
end;
run;
```

Art, CEO, AnalystFinder.com

Super User
Posts: 6,633

Re: if duplicate, check if var1 has content, if no then flag

Just to clarify the question, what should the dummy variable be in this case?

IDs          Var 1      new_dummy_var

A001

A001

A001           5

Your answer makes a big difference in the complexity of the programming.

Frequent Contributor
Posts: 123

Re: if duplicate, check if var1 has content, if no then flag

It could be a number or char. thanks

Frequent Contributor
Posts: 75

Re: if duplicate, check if var1 has content, if no then flag

@brulard the question is, if there are multiple rows for any ID, but not all rows have a value in Var1 (some do), then what would you want the new_dummy_var to be? Should new_dummy_var be set to 1 only if there are no values in Var1 at all? Or should it be set to 1 even if there are values in Var1 on some rows, but not on all rows?

Frequent Contributor
Posts: 123

Re: if duplicate, check if var1 has content, if no then flag

hi nehalsanghvi,

if there are multiple rows for any ID, but not all rows have a value in Var1 (some do), then what would you want the new_dummy_var to be?

The new_dummy_var to be set to 1 only if there are no values in Var1 at all. Thanks,

Super User
Posts: 10,689

Re: if duplicate, check if var1 has content, if no then flag

As Astounding said.

``````data have;
input IDs  \$        Var    ;
cards;
A001           5 .
A001           6 .
A001           5    .
A002            .
A002            .
A003 . .
A004 . .
;
run;

proc sql;
select *,case when(count(*) ne 1 and sum(missing(var))=count(*)) then 1 else . end as dummy
from have
group by ids;
quit;``````
Solution
‎03-07-2017 08:07 AM
PROC Star
Posts: 8,149

Re: if duplicate, check if var1 has content, if no then flag

This can also be done in a data step:

```data want (drop=_:);
do until (last.IDs);
set have;
by IDs;
if first.IDs then do;
_Vars=0;
_Recs=0;
end;
_Recs+1;
if not missing(var) then _Vars+1;
end;
do until (last.IDs);
set have;
by IDs;
if _Recs GT 1 and _Vars=0 then new_dummy_var=1;
output;
end;
run;
```

Art, CEO, AnalystFinder.com

☑ This topic is solved.