DATA Step, Macro, Functions and more

if duplicate, check if var1 has content, if no then flag

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 95
Accepted Solution

if duplicate, check if var1 has content, if no then flag

[ Edited ]

hi,

 

can someone recommend some code so that sas check for condition: if value IDs is duplicate [could be more than 1 duplicate], then scan Var1 for content, if no content, create new_dummy_var=1. If value IDs not duplicate, skip.

 

Var1 is numeric.

 

UPDATING Example outcome:

 

IDs          Var 1      new_dummy_var

 

A001           5

A001           6

A001           5   

A002                            1

A002                            1

A003           5

A003

A004

 

**Update: In the example outcome, note that because one of the duplicate observation A003 has a 5 in Var1, the program should move on, and not assign a value in New_dummy_var.

 

**New_dummy_var is only meant to be a flag, so not care if char or numeric.

 

Thank you very much

 

Dave

 


Accepted Solutions
Solution
‎03-07-2017 08:07 AM
PROC Star
Posts: 7,492

Re: if duplicate, check if var1 has content, if no then flag

This can also be done in a data step:

 

data want (drop=_:);
  do until (last.IDs);
    set have;
    by IDs;
    if first.IDs then do;
      _Vars=0;
      _Recs=0;
    end;
    _Recs+1;
    if not missing(var) then _Vars+1;
  end;
  do until (last.IDs);
    set have;
    by IDs;
    if _Recs GT 1 and _Vars=0 then new_dummy_var=1;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

View solution in original post


All Replies
Frequent Contributor
Posts: 75

Re: if duplicate, check if var1 has content, if no then flag

[ Edited ]

Expand on the previous solution @brulard

data t;
input ID$ Var1$;
cards;
a001 A
a001 A
a001 A
a002 .
a002 .
a003 .
a004 .
;
run;

data t2;
set t;
by id;
new_dummy_var= ^(first.ID and last.ID) and var1='';
run;

 

Edit: I created Var1 as a character variable, but if yours is numeric, this statement would work:

new_dummy_var= ^(first.ID and last.ID) and var1=.;

 

The missing(var1) function that Art297 used below works with either character or numeric data type.

PROC Star
Posts: 7,492

Re: if duplicate, check if var1 has content, if no then flag

data want;
  set have;
  by ID;
  if not(first.id and last.id) then do;
    if missing(Var1) then new_dummy_var=1;
  end;
run;

Art, CEO, AnalystFinder.com

Super User
Posts: 5,518

Re: if duplicate, check if var1 has content, if no then flag

Just to clarify the question, what should the dummy variable be in this case?

 

IDs          Var 1      new_dummy_var

 

A001           

A001           

A001           5   

 

Your answer makes a big difference in the complexity of the programming.

Frequent Contributor
Posts: 95

Re: if duplicate, check if var1 has content, if no then flag

Posted in reply to Astounding

It could be a number or char. thanks

Frequent Contributor
Posts: 75

Re: if duplicate, check if var1 has content, if no then flag

@brulard the question is, if there are multiple rows for any ID, but not all rows have a value in Var1 (some do), then what would you want the new_dummy_var to be? Should new_dummy_var be set to 1 only if there are no values in Var1 at all? Or should it be set to 1 even if there are values in Var1 on some rows, but not on all rows?

 

 

Frequent Contributor
Posts: 95

Re: if duplicate, check if var1 has content, if no then flag

Posted in reply to nehalsanghvi

hi nehalsanghvi,

 

if there are multiple rows for any ID, but not all rows have a value in Var1 (some do), then what would you want the new_dummy_var to be?

 

The new_dummy_var to be set to 1 only if there are no values in Var1 at all. Thanks,

Super User
Posts: 10,046

Re: if duplicate, check if var1 has content, if no then flag

As Astounding said.

 

 

data have;
input IDs  $        Var    ;
cards;
A001           5 .
A001           6 .
A001           5    .
A002            .                
A002            .                
A003 . .
A004 . .
;
run;

proc sql;
select *,case when(count(*) ne 1 and sum(missing(var))=count(*)) then 1 else . end as dummy
 from have
  group by ids;
quit;
Solution
‎03-07-2017 08:07 AM
PROC Star
Posts: 7,492

Re: if duplicate, check if var1 has content, if no then flag

This can also be done in a data step:

 

data want (drop=_:);
  do until (last.IDs);
    set have;
    by IDs;
    if first.IDs then do;
      _Vars=0;
      _Recs=0;
    end;
    _Recs+1;
    if not missing(var) then _Vars+1;
  end;
  do until (last.IDs);
    set have;
    by IDs;
    if _Recs GT 1 and _Vars=0 then new_dummy_var=1;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 175 views
  • 4 likes
  • 5 in conversation