I have a general question. Thank you in advance.
I am new to arrays. I am used to writing single if/then/else statements, but I want to be more efficient in my code.
I have a set of variables, var1-var3 and I need to create a new variable based on their condition.
if any of the variables=1 then new_var=1
if all variables are 0 then new_var=0
but if variables are only 0 or -1 then new_var=99
My var1-var3 can have values of 0, 1 or -1
Here's what I have. I think this is giving me the OR version, but I am not sure and I don't know how to code the other 2 statements appropriately.
array var (3) var1-var3;
do n=1 to 3;
if var(n)=1 then new_var=1; /*this is saying if var1 or var2 or var3=1 then new_var=1?*/
else if var(n) not in 1 and var(n)=-1 then new_var=.;/*my attempt at getting the next 2 statements to be part of the loop, but I don't know if this is close to evaluating it properly*/
thank you again!
Change your logic.
if any of the variables=1 then new_var=1
Assuming the variables are numeric, use WHICHN() to see if any are 1:
if whichn(1, of var1-var3) then new_var = 1;
if all variables are 0 then new_var=0
If all variables are 0, the min and max are both zero as well.
else if max(of var1-var3) = min(of var1-var3) = 0 then new_var = 0;
but if variables are only 0 or -1 then new_var=99
else new_var = 99;
The final ELSE covers your last condition, since only combinations of 0 and -1 remain at that point, but you can add another explicit check if needed.
No arrays needed.
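Put together, the three statements above might look like the following in a complete DATA step. This is only a sketch: the data set names HAVE and WANT and the test rows are made up for illustration, and the chained comparison is spelled out with AND, which should behave the same way.

```sas
/* Hypothetical test data: one row per pattern of interest */
data have;
    input var1 var2 var3;
    datalines;
0 0 0
1 0 -1
0 -1 0
-1 -1 -1
0 1 0
;
run;

data want;
    set have;
    /* WHICHN returns the position of the first 1, or 0 if there is none */
    if whichn(1, of var1-var3) then new_var = 1;
    /* max = 0 and min = 0 only when every value is 0 */
    else if max(of var1-var3) = 0 and min(of var1-var3) = 0 then new_var = 0;
    /* with values limited to 0, 1 and -1, what remains has at least one -1 and no 1 */
    else new_var = 99;
run;

proc print data=want;
run;
```

With these five rows new_var should come out as 0, 1, 99, 99, 1. Note that a row where all three variables are missing would also fall through to the final ELSE, so add an explicit check if that can happen in your data.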
If you really want to use arrays, here is a tutorial on arrays in SAS:
https://stats.idre.ucla.edu/sas/seminars/sas-arrays/
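For comparison, here is one way the same rules could be written with an array. It is a sketch, not the only approach: the array name v, the flag variables any_one and any_neg, the data set name WANT_ARRAY and the test rows are all invented for this example. The loop only records what it sees, and new_var is assigned once after the loop ends. Assigning new_var inside the loop, as in the original attempt, would let a later element overwrite the result from an earlier one.

```sas
data want_array;
    input var1 var2 var3;
    array v{3} var1-var3;
    any_one = 0;   /* becomes 1 if any element equals 1  */
    any_neg = 0;   /* becomes 1 if any element equals -1 */
    do n = 1 to dim(v);
        if v{n} = 1 then any_one = 1;
        else if v{n} = -1 then any_neg = 1;
    end;
    /* decide once, after every element has been checked */
    if any_one then new_var = 1;
    else if any_neg then new_var = 99;
    else new_var = 0;
    drop n any_one any_neg;
    datalines;
0 0 0
1 0 -1
0 -1 0
-1 -1 -1
0 1 0
;
run;
```

This version assumes the values really are limited to 0, 1 and -1. A row of missing values would end up with new_var = 0 here, so it is not identical to the function-based version in that edge case.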
Thank you so much @Reeza! I really appreciate your help and quick reply. This is great! Also, thank you for sharing the link to tutorials.
If you are using the 99 value to indicate something, you might share how you expect to use it. If you want to exclude the 99 values from calculations, you might be better off leaving the variable missing instead.
Consider that with only 0, 1 and missing, summing new_var across the data set (or a subset of it) gives a count of the records where any of those variables had a value of 1, and the mean gives the proportion of such records as a decimal. If you keep the 99 value, you will have to do a bit more work to get those numbers.
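A small sketch of that idea, assuming new_var is left missing where 99 would otherwise be assigned. The data set name FLAGS and the four test rows are made up for illustration.

```sas
data flags;
    input var1 var2 var3;
    if whichn(1, of var1-var3) then new_var = 1;
    else if max(of var1-var3) = 0 and min(of var1-var3) = 0 then new_var = 0;
    else new_var = .;   /* left missing instead of 99 */
    datalines;
0 0 0
1 0 -1
0 -1 0
0 1 0
;
run;

/* N counts non-missing values, SUM counts the 1s, MEAN is the share of 1s */
proc means data=flags n sum mean;
    var new_var;
run;
```

For these rows PROC MEANS should report N=3, Sum=2 and a mean of about 0.667, because the third row is missing and is left out of all three statistics.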
@dchanson wrote:
I didn't know there were other missing values other than . for numeric. This is great news! Thank you! I am learning so much and just on my first post! I appreciate it so much!
One of the advantages of special missing values is that you can use different values for specific meanings, and with a custom format you can even get that information displayed.
proc format library=work;
   invalue qcode
      "Yes"        = 1
      "No"         = 0
      "Don't Know" = .D
      "Refused"    = .R
      other        = _error_
   ;
   value qcode
      1  = "Yes"
      0  = "No"
      .D = "Don't Know"
      .R = "Refused"
   ;
run;

data example;
   infile datalines dlm=',' truncover;
   informat q1 - q2 qcode.;
   input q1 q2;
datalines;
Yes,No
Don't Know,Yes
No,Refused
No,Yes
Yes,Maybe
;
run;

proc freq data=example;
run;

proc print data=example;
   format q1 q2 qcode.;
run;
The above reads inline data from the DATALINES block, but reading a text file would be similar: the question codes are read directly into numeric values or special missing values. The INVALUE option other=_error_ generates a message in the log about invalid data when an unexpected value is encountered, which is VERY useful if you have an explicit list of acceptable values in case something corrupts the data source. The resulting value assigned is simple missing (.), so later on you can at least tell that the recorded value was unexpected. There are lots of variations on this theme.
The PROC FREQ output shows that special missing values are not included in the counts by default.
PROC PRINT shows how to display the original values using a complementary format.
And if you want to show counts including the special missing:
Proc freq data=example; tables q1 q2 / missing; format q1 q2 qcode.; run;
Here the / missing option says that I want the missing values included in the resulting counts. A fair number of procedures have an option to display or use missing values, and when it is combined with the format, each level of missing is treated as its own group in the output, just like the counts for Yes and No.
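The same approach could be applied to new_var from the original question. The sketch below stores the special missing value .N where 99 would have gone and labels it with a format. The choice of .N, the format name nvfmt, the label text, the data set name FLAGS_SM and the test rows are all assumptions made for illustration.

```sas
proc format;
    value nvfmt
        1  = "Any 1"
        0  = "All 0"
        .N = "Only 0 or -1";
run;

data flags_sm;
    input var1 var2 var3;
    if whichn(1, of var1-var3) then new_var = 1;
    else if max(of var1-var3) = 0 and min(of var1-var3) = 0 then new_var = 0;
    else new_var = .N;   /* special missing instead of 99 */
    datalines;
0 0 0
1 0 -1
0 -1 0
0 1 0
;
run;

/* MISSING makes .N appear as its own row in the table */
proc freq data=flags_sm;
    tables new_var / missing;
    format new_var nvfmt.;
run;
```

Because .N is still a missing value, sums and means of new_var ignore it, while the table with / missing continues to show how many records fell into that category.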