BookmarkSubscribeRSS Feed
jgreenberg321
Fluorite | Level 6

Hello All,

 

I have a fairly long list of variable (all 0/1) that all have the prefix "_any." These also all happen to be in consecutive columns. I am trying to use an array to define a new variable "outcome" as having value "1" if any of the variables with prefix "_any" have "1" as their value. I have tried the following code: 

data want;

set have;

array any(*) any_:;

do i=1 to dim(any);

if any(i) in: ("1") then outcome=1;

else outcome=0;

end;

run;

 

However, this is not returning the correct answer (the count for "outcome" is far less than I know the true value should be). Does anyone have any suggestions regarding what I should change? Thank you!

13 REPLIES 13
ballardw
Super User

The code you show will only ever have the result of 0 or 1 and will only have the "outcome" of the last variable in the array.

If you intend to accumulate a value when the conditions are met something like

data want;
   set have;
   array any(*) any_:;
   do i=1 to dim(any);
      if any(i) in: ("1") then outcome=sum(outcome,1);
   end;
run;

 

What your code was doing was comparing the first value and setting 1 or 0.

Then the next variable was compared and setting 1 or 0 (not accumulating from the previous loop)

Repeat until running out of variables.

 

If you are looking for a "run" count, longest sequence then the above code would use the "else outcome=0". otherwise that zero assignment is not needed.

 

jgreenberg321
Fluorite | Level 6

Thank you very much! This successfully created the counter variable. I didn't realize that in my previous code the array would not look for the variable "1" in any of the array variables. I realize I can always do this after the fact using the summed variable, but what would be the correct way to request the array set outcome=1 if "1" is identified in any of the array variables?

 

As an aside, I tried the "whichc" method, but that did not seem to work (returns "0" for everyone). 

Tom
Super User Tom
Super User

WHICHC() returns a the index into the list of variables where the value was first found.  So it always returns a number. It returns zero when the string is not found.  So whatever string you where searching for was not in the variables you listed.

If your ANY_: variables are numeric then if you use them with WHICHC() they will first be converted to stings using the BEST12. format.  So 1 would become the digit 1 with eleven spaces in front of it so it would not match a string like '1'.

For numeric variables use WHICHN() and search for a numeric value instead of character string.

jgreenberg321
Fluorite | Level 6

Ah I see. So using "whichn" would make more sense, given that my variables are numeric. I switched to that and now have an answer that seems much more reasonable.

 

I am trying to understand how the full statement you provided is getting to that result. I see that whichn is supposed to return the index of the first matching value, but this statement (fortunately) just returns 0/1.

Tom
Super User Tom
Super User

 The code is testing if the result of the function call is zero or not.

0 ne ....

SAS evaluates boolean expression with 0 for FALSE and 1 for TRUE.

jgreenberg321
Fluorite | Level 6

I see. So it returns a value of "1" for outcome when the whichn expression is not equal to 0. Very clever. Since I ultimately wanted to have outcome=1 if the "any_" statements are "1" or if another variable >=1, it looks like I can do the following:

 

outcome=0 ne (whichn("1", of any_:) or var2 >=1);

Tom
Super User Tom
Super User

When using complex expressions it helps to add ( ) for both the SAS parser and the human readers.

outcome=(0 ne whichn(1, of any_:)) or (var2 >=1);

Since SAS will treat any non-zero, non-missing value as TRUE you can shorten that to:

outcome=whichn(1, of any_:) or (var2 >=1);
jgreenberg321
Fluorite | Level 6
Makes sense. Thanks!
ballardw
Super User

You may want to run the following code and then read the log.

SAS will treat any numeric value as a "boolean" when used with "If Variablename " constructs. It is a good idea to know how they are treated.

data example;
   do x = ., -1, -.5, 0, 1 , 3 , 3.1416;
      if x then result ="SAS considers this value as true";
      else result = 'Treated as False';
      put x= result;
   end;
run;

Depending on the actual needed use later in a process sometimes you can save creating two variables by using one to hold both a count(treated as 1 if greater than 0) instead of a count plus an "any in a range".

Tom
Super User Tom
Super User

Are the variables prefixed with ANY_ or _ANY?

Anyway if you just want to know if any of the 'anys' is "1" then use the WHICHC() function. No need for an array or a do loop.

data want;
  set have;
  outcome= 0 ne whichc("1", of any_:);
run;

 

Reeza
Super User

Here is a reference that illustrates several options to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

s_lassen
Meteorite | Level 14

I think your initial code was almost there, you just have to stop looking when you find a valid value:

data want;
  set have;
  array any(*) any_:;
  do i=1 to dim(any) until(outcome=1);
    outcome=any(i) in: ("1");
    end;
run;

I also changed the the code in the loop a bit to make it shorter, but the important thing is the UNTIL in the DO loop header.

jgreenberg321
Fluorite | Level 6
That makes sense. Thank you!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2172 views
  • 1 like
  • 5 in conversation