Hello All,
I have a fairly long list of variable (all 0/1) that all have the prefix "_any." These also all happen to be in consecutive columns. I am trying to use an array to define a new variable "outcome" as having value "1" if any of the variables with prefix "_any" have "1" as their value. I have tried the following code:
data want;
set have;
array any(*) any_:;
do i=1 to dim(any);
if any(i) in: ("1") then outcome=1;
else outcome=0;
end;
run;
However, this is not returning the correct answer (the count for "outcome" is far less than I know the true value should be). Does anyone have any suggestions regarding what I should change? Thank you!
The code you show will only ever have the result of 0 or 1 and will only have the "outcome" of the last variable in the array.
If you intend to accumulate a value when the conditions are met something like
data want; set have; array any(*) any_:; do i=1 to dim(any); if any(i) in: ("1") then outcome=sum(outcome,1); end; run;
What your code was doing was comparing the first value and setting 1 or 0.
Then the next variable was compared and setting 1 or 0 (not accumulating from the previous loop)
Repeat until running out of variables.
If you are looking for a "run" count, longest sequence then the above code would use the "else outcome=0". otherwise that zero assignment is not needed.
Thank you very much! This successfully created the counter variable. I didn't realize that in my previous code the array would not look for the variable "1" in any of the array variables. I realize I can always do this after the fact using the summed variable, but what would be the correct way to request the array set outcome=1 if "1" is identified in any of the array variables?
As an aside, I tried the "whichc" method, but that did not seem to work (returns "0" for everyone).
WHICHC() returns a the index into the list of variables where the value was first found. So it always returns a number. It returns zero when the string is not found. So whatever string you where searching for was not in the variables you listed.
If your ANY_: variables are numeric then if you use them with WHICHC() they will first be converted to stings using the BEST12. format. So 1 would become the digit 1 with eleven spaces in front of it so it would not match a string like '1'.
For numeric variables use WHICHN() and search for a numeric value instead of character string.
Ah I see. So using "whichn" would make more sense, given that my variables are numeric. I switched to that and now have an answer that seems much more reasonable.
I am trying to understand how the full statement you provided is getting to that result. I see that whichn is supposed to return the index of the first matching value, but this statement (fortunately) just returns 0/1.
The code is testing if the result of the function call is zero or not.
0 ne ....
SAS evaluates boolean expression with 0 for FALSE and 1 for TRUE.
I see. So it returns a value of "1" for outcome when the whichn expression is not equal to 0. Very clever. Since I ultimately wanted to have outcome=1 if the "any_" statements are "1" or if another variable >=1, it looks like I can do the following:
outcome=0 ne (whichn("1", of any_:) or var2 >=1);
When using complex expressions it helps to add ( ) for both the SAS parser and the human readers.
outcome=(0 ne whichn(1, of any_:)) or (var2 >=1);
Since SAS will treat any non-zero, non-missing value as TRUE you can shorten that to:
outcome=whichn(1, of any_:) or (var2 >=1);
You may want to run the following code and then read the log.
SAS will treat any numeric value as a "boolean" when used with "If Variablename " constructs. It is a good idea to know how they are treated.
data example; do x = ., -1, -.5, 0, 1 , 3 , 3.1416; if x then result ="SAS considers this value as true"; else result = 'Treated as False'; put x= result; end; run;
Depending on the actual needed use later in a process sometimes you can save creating two variables by using one to hold both a count(treated as 1 if greater than 0) instead of a count plus an "any in a range".
Are the variables prefixed with ANY_ or _ANY?
Anyway if you just want to know if any of the 'anys' is "1" then use the WHICHC() function. No need for an array or a do loop.
data want;
set have;
outcome= 0 ne whichc("1", of any_:);
run;
Here is a reference that illustrates several options to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html
I think your initial code was almost there, you just have to stop looking when you find a valid value:
data want;
set have;
array any(*) any_:;
do i=1 to dim(any) until(outcome=1);
outcome=any(i) in: ("1");
end;
run;
I also changed the the code in the loop a bit to make it shorter, but the important thing is the UNTIL in the DO loop header.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.