Hello! I'm writing code to analyze results from a multiple choice survey. My code is below. It works fine, but I'm pretty sure there is a better way of doing this than writing out hcp_score=hcp_score+1 a bunch of times. Does anyone have any suggestions? Some survey questions have multiple answer options with different point values, so I can't just sum the variables. Thank you!
hcp_score=0;
if hcp_screen=1 then hcp_score=hcp_score+1;
else if hcp_screen=2 then hcp_score=hcp_score+0.5;
if hcp_home=1 then hcp_score=hcp_score+1;
else if hcp_home=2 then hcp_score=hcp_score+0.5;
if hcp_list=1 then hcp_score=hcp_score+1;
if hcp_cohort=1 then hcp_score=hcp_score+1;
if hcp_restricted=1 then hcp_score=hcp_score+1;
if hcp_singlefac=1 then hcp_score=hcp_score+1;
if hcp_edu_covid=1 then hcp_score=hcp_score+0.25;
if hcp_edu_sick=1 then hcp_score=hcp_score+0.25;
if hcp_edu_ip=1 then hcp_score=hcp_score+0.25;
if hcp_edu_change in (1,9) then hcp_score=hcp_score+0.25;
You could have an array of 10 variables, and a corresponding array of values to add (i.e. an array of 1's and 0.25, etc.), as in:
data want (drop=v);
set have;
array check1 {*} hcp_screen hcp_home hcp_list hcp_cohort hcp_restricted
hcp_singlefac hcp_edu_covid hcp_edu_sick hcp_edu_ip
hcp_edu_change ;
array value1 {10} _temporary_ (6*1,4*0.25) ;
hcp_score=0;
do v=1 to dim(check1);
if check1{v}=1 then hcp_score=hcp_score+value1{v} ;
end;
array check2 {*} hcp_screen hcp_home;
do v=1 to dim(check2);
if check2{v}=2 then hcp_score=hcp_score+0.5;
end;
if hcp_edu_change=9 then hcp_score=hcp_score+0.25;
run;
The above has 3 loops, one for each value to look for: 1, 2, and 9 (yes, the check for 9 is not actually a loop, but think of it as a loop over one variable).
And if you want to reduce it to a single loop, make a two-dimensional array, as in:
data want2 (drop=v);
set have;
array check {*} hcp_screen hcp_home hcp_list hcp_cohort hcp_restricted
hcp_singlefac hcp_edu_covid hcp_edu_sick hcp_edu_ip hcp_edu_change ;
array values {9,10} _temporary_
(6*1, 4*0.25 /*row 1, results for variable=1 */
,2*0.5,8*. /*row 2, results for variable=2 */
,60*. /*rows 3-8*/
,9*.,0.25 ) /*row 9*/ ;
hcp_score=0;
do v=1 to 10;
hcp_score=sum(hcp_score,values{check{v},v}); /* sum() ignores the missing cells; a plain + would set the score to missing */
end;
run;
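The latter technique works only if the values to be searched for are positive integers within the array's row bounds (1 through 9 here), because each value is used directly as an array subscript. A missing or out-of-range response would cause an "Array subscript out of range" error.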
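If responses can be missing or fall outside 1-9, a more defensive variant of the single-loop lookup might look like the sketch below. It is only an illustration: the data set name want2_safe is made up, and filling the unused cells with 0 and guarding the subscript with IN (1:9) are assumptions, not part of the reply above.

```sas
data want2_safe (drop=v);   /* want2_safe is a hypothetical name */
  set have;
  array check {*} hcp_screen hcp_home hcp_list hcp_cohort hcp_restricted
                  hcp_singlefac hcp_edu_covid hcp_edu_sick hcp_edu_ip hcp_edu_change;
  /* 9 x 10 lookup table: row = response value, column = variable.
     Cells that earn no points hold 0 rather than a missing value. */
  array values {9,10} _temporary_
    ( 6*1, 4*0.25     /* row 1: points when variable=1 */
    , 2*0.5, 8*0      /* row 2: points when variable=2 */
    , 60*0            /* rows 3-8: no points */
    , 9*0, 0.25 );    /* row 9: points when variable=9 */
  hcp_score=0;
  do v=1 to dim(check);
    /* only integer responses 1..9 are valid subscripts; anything else is skipped */
    if check{v} in (1:9) then hcp_score=hcp_score+values{check{v},v};
  end;
run;
```

Because every cell is numeric and the subscript is checked first, a missing answer simply contributes nothing instead of stopping the step.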
Thank you, this is very helpful!
You could use the fact that SAS considers TRUE = 1 and FALSE = 0 to write a single scoring expression:
hcp_score =
(hcp_screen=1) * 1
+ (hcp_screen=2) * 0.5
+ ...
+ (hcp_edu_change in (1,9)) * 0.25;
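Spelled out for the ten variables in the original question, the expression could look something like the following. This is a sketch based on the point values in the question's IF statements; the data set name want3 and the grouping of the 0.25-point items are my own choices.

```sas
data want3;   /* want3 is a hypothetical name */
  set have;
  /* each comparison evaluates to 1 when true and 0 when false */
  hcp_score =
      (hcp_screen=1)*1 + (hcp_screen=2)*0.5
    + (hcp_home=1)*1   + (hcp_home=2)*0.5
    + (hcp_list=1) + (hcp_cohort=1) + (hcp_restricted=1) + (hcp_singlefac=1)
    + 0.25*( (hcp_edu_covid=1) + (hcp_edu_sick=1) + (hcp_edu_ip=1)
           + (hcp_edu_change in (1,9)) );
run;
```

A comparison against a missing response is simply false, so missing answers add 0 rather than making the whole score missing.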
"Efficiency" may depend on how many variables share similar values.
Similar calculations for multiple variables often indicate that an array approach might work. The block of comparisons to single values could be done a couple of ways.
Here is one:
data trial;
  set have;
  hcp_score=0;
  if hcp_screen=1 then hcp_score=hcp_score+1;
  else if hcp_screen=2 then hcp_score=hcp_score+0.5;
  if hcp_home=1 then hcp_score=hcp_score+1;
  else if hcp_home=2 then hcp_score=hcp_score+0.5;
  if hcp_edu_change in (1,9) then hcp_score=hcp_score+0.25;
  /* below are the SINGLE value comparisons */
  array vars hcp_list hcp_cohort hcp_restricted hcp_singlefac
             hcp_edu_covid hcp_edu_sick hcp_edu_ip;
  /* this array holds the COMPARISON values for the variables IN ORDER */
  array vals {7} _temporary_ (1,1,1,1,1,1,1);
  /* this has the score additions */
  array sc {7} _temporary_ (1,1,1,1, 0.25,0.25,0.25);
  do i=1 to dim(vars);
    if vars[i]=vals[i] then hcp_score=hcp_score + sc[i];
  end;
  drop i;
run;
The only change shown is for the single-value comparisons. The first array has the variables you need to compare; the second array, vals, contains the values the variables are tested against; and the third, sc, has the amount to add to the score. The variables, comparison values, and score additions must be listed in the same order.
You might see right away that if I have to add 10 more variables, I add them to the VARS list, then add the value to compare to vals, then the score to sc. The do loop, which runs over the number of elements in the vars list, takes care of all of the conditional additions to the score total.
Note that this is really simple for single values. You could use it for multiple values by placing the variable on the list twice, with the corresponding values for comparison and the corresponding score additions. That is just a tad harder to see right away.
If the test code above works as expected, then you could try
data trial;
  set have;
  hcp_score=0;
  array vars hcp_list hcp_cohort hcp_restricted hcp_singlefac
             hcp_edu_covid hcp_edu_sick hcp_edu_ip
             hcp_screen hcp_screen hcp_home hcp_home
             hcp_edu_change hcp_edu_change;
  /* this array holds the COMPARISON values for the variables IN ORDER */
  array vals {13} _temporary_ (1,1,1,1,1,1,1,1,2,1,2,1,9);
  /* this has the score additions */
  array sc {13} _temporary_ (1,1,1,1, 0.25,0.25,0.25,1,0.5,1,0.5, 0.25,0.25);
  do i=1 to dim(vars);
    if vars[i]=vals[i] then hcp_score=hcp_score + sc[i];
  end;
  drop i;
run;
Note that I just added the multi-value comparison variables to the end of the list with the corresponding comparison and score values, and adjusted the size of the temporary arrays to match.
One of the drawbacks of this approach is that a mismatch between the number of values and the number of variables will likely cause the error "Array Subscript out of range", along with one or more warnings about "partial array initialization" (not enough values) or "Too many values for initialization of the array".
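One way to catch a size mismatch early is to compare the array dimensions on the first observation and stop with a clear log message. The sketch below does that for the seven single-value variables; the data set name trial_checked, the message text, and the choice to score the multi-value variables with a Boolean expression are all assumptions for illustration. Note that it only compares the declared sizes; the initialization warnings still need to be read in the log.

```sas
data trial_checked (drop=i);   /* trial_checked is a hypothetical name */
  set have;
  array vars hcp_list hcp_cohort hcp_restricted hcp_singlefac
             hcp_edu_covid hcp_edu_sick hcp_edu_ip;
  array vals {7} _temporary_ (7*1);          /* comparison values, in order */
  array sc   {7} _temporary_ (4*1, 3*0.25);  /* score additions, in order */

  /* sanity check, done once: all three arrays must be the same size */
  if _n_=1 and (dim(vars) ne dim(vals) or dim(vars) ne dim(sc)) then do;
    put "ERROR: vars, vals and sc must have the same number of elements.";
    stop;
  end;

  /* multi-value variables scored directly (true=1, false=0) */
  hcp_score = (hcp_screen=1) + 0.5*(hcp_screen=2)
            + (hcp_home=1)   + 0.5*(hcp_home=2)
            + 0.25*(hcp_edu_change in (1,9));

  /* single-value variables scored through the arrays */
  do i=1 to dim(vars);
    if vars[i]=vals[i] then hcp_score=hcp_score+sc[i];
  end;
run;
```

If the check fails, the step stops before writing any rows, so an empty output data set together with the message in the log points at the array definitions.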