I am trying to derive a quality-of-life score from the participants' responses. I used the if-then statements below to assign the scores. There were two observations where the statements did not work correctly. For example, even though physical1=1, physical1_qol was shown as 25 instead of 100. In both observations there were also missing values in the physical_qol variables even though the physical values were not missing.
if physical1=1 then physical1_qol=100;if physical1=2 then physical1_qol=75;if physical1=3 then physical1_qol=50;if physical1=4 then physical1_qol=25;if physical1=5 then physical1_qol=0;
if physical2=1 then physical2_qol=100;if physical2=2 then physical2_qol=75;if physical2=3 then physical2_qol=50;if physical2=4 then physical2_qol=25;if physical2=5 then physical2_qol=0;
if physical3=1 then physical3_qol=100;if physical3=2 then physical3_qol=75;if physical3=3 then physical3_qol=50;if physical3=4 then physical3_qol=25;if physical3=5 then physical3_qol=0;
if physical4=1 then physical4_qol=100;if physical4=2 then physical4_qol=75;if physical4=3 then physical4_qol=50;if physical4=4 then physical4_qol=25;if physical4=5 then physical4_qol=0;
if physical5=1 then physical5_qol=100;if physical5=2 then physical5_qol=75;if physical5=3 then physical5_qol=50;if physical5=4 then physical1_qol=25;if physical5=5 then physical5_qol=0;
if physical6=1 then physical6_qol=100;if physical6=2 then physical6_qol=75;if physical6=3 then physical6_qol=50;if physical6=4 then physical6_qol=25;if physical6=5 then physical6_qol=0;
if physical7=1 then physical7_qol=100;if physical7=2 then physical7_qol=75;if physical7=3 then physical7_qol=50;if physical7=4 then physical7_qol=25;if physical7=5 then physical7_qol=0;
if physical8=1 then physical8_qol=100;if physical8=2 then physical8_qol=75;if physical8=3 then physical8_qol=50;if physical8=4 then physical8_qol=25;if physical8=5 then physical8_qol=0;
Thank you
Format your code better and the mistake might become more visible.
First put only one statement per line.
Consider the first 5 statements.
if physical1=1 then physical1_qol=100;
if physical1=2 then physical1_qol=75;
if physical1=3 then physical1_qol=50;
if physical1=4 then physical1_qol=25;
if physical1=5 then physical1_qol=0;
Now a human can scan them and make sure you haven't used different variable names or values somewhere that would explain how a value of PHYSICAL1=1 would be confused with a value of PHYSICAL1=4.
But you could also just code the conversion as a formula instead. This one could be:
physical1_qol=100-25*(physical1-1);
Or perhaps less obviously:
physical1_qol=125-25*physical1;
Both of these are much less typing (and so much less chance of typos).
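As a quick sanity check, the two formulas can be compared against the mapping in the if/then statements. This is only a sketch: the variable names qol_a and qol_b are made up for the check, and the results are written to the log rather than to a data set.

```sas
/* Compare both formulas for the five valid response codes. */
data _null_;
  do physical1 = 1 to 5;
    qol_a = 100 - 25*(physical1 - 1);   /* first form  */
    qol_b = 125 - 25*physical1;         /* second form */
    put physical1= qol_a= qol_b=;       /* writes one line per code to the log */
  end;
run;
```

Note that, unlike the if/then statements, a formula also produces a result for values outside 1 to 5 (for example 6 gives -25), so it may be worth checking the source values first.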
Another way to reduce the typing is to use arrays. For that you probably should change the target variable names so the counter is at the END of the name so it can be used more easily in variable lists.
data want;
  set have;
  array raw physical1-physical8;
  array qol physical_qol1-physical_qol8;
  do index=1 to dim(raw);
    qol[index]=125-25*raw[index];
  end;
  drop index;
run;
Let's first create some sample have data:
data have(drop=_:);
  array physical{8} 8;
  do _i=1 to 4;
    do _k=1 to dim(physical);
      physical[_k]=rand('integer',1,6);
      if _i=2 and _k=1 then physical[_k]=1.00000001;
      if _i=1 and _k=2 then physical[_k]=.;
    end;
    output;
  end;
run;
Using this sample have data there are several ways to recode your values. Taking your approach to populate a new variable with if/then/ELSE statements using array processing leads to less code that's easier to maintain.
data want_1;
  set have;
  array physical_in{*} physical1 - physical8;
  array physical_qol{8} 8;
  do i=1 to dim(physical_in);
    if physical_in[i]=1 then physical_qol[i]=100;
    else if physical_in[i]=2 then physical_qol[i]=75;
    else if physical_in[i]=3 then physical_qol[i]=50;
    else if physical_in[i]=4 then physical_qol[i]=25;
    else if physical_in[i]=5 then physical_qol[i]=0;
  end;
  keep physical_qol:;
run;
NB: When creating numbered variables, always put the number at the end of the name because this makes it much easier to reference such variables later on (example: keep physical_qol:;).
If you just need the recoded values for display/printing then creating and applying a format is all you need.
proc format;
value physical
1=100
2=75
3=50
4=25
5=0
other=.
;
run;
proc print data=have;
format physical: physical.;
run;
Instead of using if/then/else statements you can also use an informat for recoding values. This again can make maintenance much easier.
proc format;
invalue physical
1=100
2=75
3=50
4=25
5=0
other=.
;
run;
data want_2;
  set have;
  array physical_in{*} physical1 - physical8;
  array physical_qol{8} 8;
  do i=1 to dim(physical_in);
    physical_qol[i]=input(physical_in[i],physical.);
  end;
  keep physical_qol:;
run;
And now to the issue you actually raised. To investigate which source values lead to unexpected missings, you could use code like the one below.
proc freq data=have;
table physical1 - physical8 /missing missprint;
format physical1 - physical8 best32.;
run;
Depending on the findings for your real data, you can then amend your code for recoding values. Possibly you didn't consider another possible source value, or there is some unexpected fractional value and you need to first round()/floor() your source values, or with a format/informat use the fuzz option, or ....
I haven't suggested simply multiplying your source values by 25 because I wanted to propose approaches that will work for any mapping of source to target values.
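The two remedies for fractional source values mentioned above could look roughly like this. It is a sketch under assumptions: the data set check, its test values, the format name physfz and the fuzz width of 0.001 are all invented for illustration, and the right tolerance depends on your data.

```sas
/* Format with a fuzz factor: values within 0.001 of a range value still match. */
proc format;
  value physfz (fuzz=0.001)
    1='100'
    2='75'
    3='50'
    4='25'
    5='0'
    other='.'
  ;
run;

data check;
  input physical1;

  /* Option 1: round to the nearest integer before recoding. */
  physical1_r = round(physical1);
  if physical1_r in (1,2,3,4,5) then qol_round = 125 - 25*physical1_r;

  /* Option 2: let the fuzzy format do the matching, then read the label back as a number. */
  qol_fuzz = input(put(physical1, physfz.), 32.);
  datalines;
1
1.00000001
4
6
.
;
run;

proc print data=check;
  format physical1 best32.;   /* show the full precision of the source value */
run;
```

With either option the value 1.00000001 should be treated as 1, while 6 and the missing value stay missing in the recoded variables.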
@Patrick Note that informats convert text into values, so you cannot use them to convert numbers into numbers. You will first need to convert the numbers into text. Or you could just use the format you created before to convert the numbers into text and then use the normal numeric informat to convert that text into a number.
physical_qol[i]=input(put(physical_in[i],physical.),32.);
@Tom True, and using input(put(...)) is certainly the cleanest way of doing this.
Using the numeric informat directly on a numeric variable will create a log note like NOTE: Numeric values have been converted to character values at the places given by: but you still end up with a numeric variable and the desired recoded values.
proc format;
invalue test 1=100;
run;
data test;
num_have=1;
num_want=input(num_have,test.);
run;
Hi,
At first glance there seems to be an inconsistency here:
if physical5=4 then physical1_qol=25;
shouldn't it be
if physical5=4 then physical5_qol=25;
?
- Cheers -
Depending on the other variables in your dataset, I would transpose it to a long layout, which would then make the transformation step extremely simple.
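A rough sketch of what the long layout could look like, assuming the wide data set is called have and has no existing key, so a row identifier (here the hypothetical variable id) is added first. The names have_id, long, long_qol and item are also made up for this example.

```sas
/* Add a row identifier so each observation can be traced after transposing. */
data have_id;
  set have;
  id = _n_;   /* hypothetical key; use your real participant ID if you have one */
run;

/* One row per participant and item: id, item (original variable name), physical. */
proc transpose data=have_id
               out=long(rename=(col1=physical))
               name=item;
  by id;
  var physical1-physical8;
run;

/* The recode is now a single statement instead of one block per variable. */
data long_qol;
  set long;
  if physical in (1,2,3,4,5) then physical_qol = 125 - 25*physical;
run;
```

Values outside 1 to 5, including missing values, leave physical_qol missing in this sketch.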
Thank you all for the suggestion. I changed the if-then statement to if-then/else, then it worked.
if physical1=1 then physical1_qol=100;
else if physical1=2 then physical1_qol=75;
else if physical1=3 then physical1_qol=50;
else if physical1=4 then physical1_qol=25;
else if physical1=5 then physical1_qol=0;
With mutually exclusive conditions an ELSE adds clarity and improves performance a bit but it should not have any impact on which condition becomes true - which should only ever be one.
Adding the ELSE might mask existing issues like the one in the code you shared.
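To illustrate, here is a minimal sketch of how the typo spotted earlier in the thread (the physical5=4 statement assigning to physical1_qol) can produce both reported symptoms. The data set name demo and the single test observation are made up for illustration; the if statements are taken from the question.

```sas
data demo;
  physical1 = 1;
  physical5 = 4;

  if physical1=1 then physical1_qol=100;
  if physical5=3 then physical5_qol=50;
  if physical5=4 then physical1_qol=25;   /* typo from the question: should be physical5_qol */
  if physical5=5 then physical5_qol=0;
run;

proc print data=demo;
run;
```

For this observation physical1_qol is first set to 100 and then overwritten with 25, while physical5_qol is never assigned and stays missing. Correcting the target variable to physical5_qol removes both problems, with or without ELSE.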