BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brulard
Pyrite | Level 9

hi,

 

can someone recommend some code so that sas check for condition: if value IDs is duplicate [could be more than 1 duplicate], then scan Var1 for content, if no content, create new_dummy_var=1. If value IDs not duplicate, skip.

 

Var1 is numeric.

 

UPDATING Example outcome:

 

IDs          Var 1      new_dummy_var

 

A001           5

A001           6

A001           5   

A002                            1

A002                            1

A003           5

A003

A004

 

**Update: In the example outcome, note that because one of the duplicate observation A003 has a 5 in Var1, the program should move on, and not assign a value in New_dummy_var.

 

**New_dummy_var is only meant to be a flag, so not care if char or numeric.

 

Thank you very much

 

Dave

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

This can also be done in a data step:

 

data want (drop=_:);
  do until (last.IDs);
    set have;
    by IDs;
    if first.IDs then do;
      _Vars=0;
      _Recs=0;
    end;
    _Recs+1;
    if not missing(var) then _Vars+1;
  end;
  do until (last.IDs);
    set have;
    by IDs;
    if _Recs GT 1 and _Vars=0 then new_dummy_var=1;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

View solution in original post

8 REPLIES 8
nehalsanghvi
Pyrite | Level 9

Expand on the previous solution @brulard

data t;
input ID$ Var1$;
cards;
a001 A
a001 A
a001 A
a002 .
a002 .
a003 .
a004 .
;
run;

data t2;
set t;
by id;
new_dummy_var= ^(first.ID and last.ID) and var1='';
run;

 

Edit: I created Var1 as a character variable, but if yours is numeric, this statement would work:

new_dummy_var= ^(first.ID and last.ID) and var1=.;

 

The missing(var1) function that Art297 used below works with either character or numeric data type.

art297
Opal | Level 21
data want;
  set have;
  by ID;
  if not(first.id and last.id) then do;
    if missing(Var1) then new_dummy_var=1;
  end;
run;

Art, CEO, AnalystFinder.com

Astounding
PROC Star

Just to clarify the question, what should the dummy variable be in this case?

 

IDs          Var 1      new_dummy_var

 

A001           

A001           

A001           5   

 

Your answer makes a big difference in the complexity of the programming.

brulard
Pyrite | Level 9

It could be a number or char. thanks

nehalsanghvi
Pyrite | Level 9

@brulard the question is, if there are multiple rows for any ID, but not all rows have a value in Var1 (some do), then what would you want the new_dummy_var to be? Should new_dummy_var be set to 1 only if there are no values in Var1 at all? Or should it be set to 1 even if there are values in Var1 on some rows, but not on all rows?

 

 

brulard
Pyrite | Level 9

hi nehalsanghvi,

 

if there are multiple rows for any ID, but not all rows have a value in Var1 (some do), then what would you want the new_dummy_var to be?

 

The new_dummy_var to be set to 1 only if there are no values in Var1 at all. Thanks,

Ksharp
Super User

As Astounding said.

 

 

data have;
input IDs  $        Var    ;
cards;
A001           5 .
A001           6 .
A001           5    .
A002            .                
A002            .                
A003 . .
A004 . .
;
run;

proc sql;
select *,case when(count(*) ne 1 and sum(missing(var))=count(*)) then 1 else . end as dummy
 from have
  group by ids;
quit;
art297
Opal | Level 21

This can also be done in a data step:

 

data want (drop=_:);
  do until (last.IDs);
    set have;
    by IDs;
    if first.IDs then do;
      _Vars=0;
      _Recs=0;
    end;
    _Recs+1;
    if not missing(var) then _Vars+1;
  end;
  do until (last.IDs);
    set have;
    by IDs;
    if _Recs GT 1 and _Vars=0 then new_dummy_var=1;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2317 views
  • 4 likes
  • 5 in conversation