megsredl
Obsidian | Level 7

Hello! I'm writing code to analyze results from a multiple choice survey. My code is below. It works fine, but I'm pretty sure there is a better way of doing this than writing out hcp_score=hcp_score+1 a bunch of times. Does anyone have any suggestions? Some survey questions have multiple answer options with different point values, so I can't just sum the variables. Thank you!

 

hcp_score=0;
	if hcp_screen=1 then hcp_score=hcp_score+1;
		else if hcp_screen=2 then hcp_score=hcp_score+0.5;
	if hcp_home=1 then hcp_score=hcp_score+1;
		else if hcp_home=2 then hcp_score=hcp_score+0.5;
	if hcp_list=1 then hcp_score=hcp_score+1;
	if hcp_cohort=1 then hcp_score=hcp_score+1;
	if hcp_restricted=1 then hcp_score=hcp_score+1;
	if hcp_singlefac=1 then hcp_score=hcp_score+1;
	if hcp_edu_covid=1 then hcp_score=hcp_score+0.25;
	if hcp_edu_sick=1 then hcp_score=hcp_score+0.25;
	if hcp_edu_ip=1 then hcp_score=hcp_score+0.25;
	if hcp_edu_change in (1,9) then hcp_score=hcp_score+0.25;
Reeza
Super User
It looks like those are weights applied to each variable when it equals 1 (or 2 in the case of screen/home, or 1 or 9 in the case of edu_change)?

megsredl
Obsidian | Level 7
Hi! Sorry, the scoring wasn't really clear. Basically each question is worth one point. There are a few questions like hcp_screen and hcp_home that are always/sometimes/never questions, so "sometimes" is given 0.5 points. The last four questions are sub-questions within hcp_edu, which is worth one point total. The weird coding with hcp_edu_change is because there is an N/A option.
mkeintz
PROC Star

You could have an array of 10 variables, and a corresponding array of values to add (i.e. an array of 1's and 0.25, etc.), as in:

 

data want (drop=v);
  set have;
  array check1 {*} hcp_screen hcp_home hcp_list hcp_cohort hcp_restricted 
                   hcp_singlefac hcp_edu_covid hcp_edu_sick hcp_edu_ip
                   hcp_edu_change ;
  array value1 {10} _temporary_ (6*1,4*0.25) ;

  hcp_score=0;
  do v=1 to dim(check1);
    if check1{v}=1 then hcp_score=hcp_score+value1{v} ;
  end;

  array check2 {*} hcp_screen hcp_home;
  do v=1 to dim(check2);
    if check2{v}=2 then hcp_score=hcp_score+0.5;
  end;

  if hcp_edu_change=9 then hcp_score=hcp_score+0.25;
run;

The above has 3 loops, one for each value to look for: 1, 2, and 9 (yes, the check for 9 is not actually a loop, but think of it as a loop over one variable).
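To try the step above, a small test dataset might look like the sketch below. The variable names come from the original question, but the response values are invented, and coding "no" as 0 and "never" as 3 is an assumption about the survey.

```sas
/* Hypothetical test data: names from the question, values made up.      */
/* Scores expected from the original IF/ELSE logic: 7, 1.5 and 0.        */
data have;
  input hcp_screen hcp_home hcp_list hcp_cohort hcp_restricted
        hcp_singlefac hcp_edu_covid hcp_edu_sick hcp_edu_ip hcp_edu_change;
  datalines;
1 1 1 1 1 1 1 1 1 1
2 2 0 0 0 0 1 0 0 9
3 3 0 0 0 0 0 0 0 0
;
run;
```

Running the scoring step against this HAVE and printing hcp_score is a quick way to confirm that each of the three passes (for 1, 2, and 9) contributes what you expect.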

 

And if you want to reduce it to a single loop, make a two-dimensional array, as in:

data want2 (drop=v);
  set have;
  array check {*} hcp_screen hcp_home hcp_list hcp_cohort hcp_restricted 
                   hcp_singlefac hcp_edu_covid hcp_edu_sick hcp_edu_ip hcp_edu_change ;

  array values {9,10} _temporary_
       (6*1, 4*0.25    /*row 1, results for variable=1 */
       ,2*0.5,8*.      /*row 2, results for variable=2 */
       ,60*.           /*rows 3-8*/
       ,9*.,0.25 )     /*row 9*/ ;

  hcp_score=0;
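  do v=1 to 10;
    /* SUM() skips the missing cells; the range test avoids a subscript error */
    if 1<=check{v}<=9 then hcp_score=sum(hcp_score,values{check{v},v});
  end;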
run;

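The latter technique works only if the values to be searched for are positive integers within the row range of the array (here 1 through 9), because they are used directly as array subscripts.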
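One caveat worth checking with a lookup table that contains missing cells: in SAS the + operator returns missing when either operand is missing, whereas the SUM function ignores missing arguments. A minimal sketch of the difference (the variable names here are made up for illustration):

```sas
data _null_;
  score_plus = 1 + .;       /* + propagates the missing value        */
  score_sum  = sum(1, .);   /* SUM() ignores the missing argument    */
  put score_plus= score_sum=;   /* log shows: score_plus=. score_sum=1 */
run;
```

So a running total that adds table cells with + can turn missing as soon as one looked-up cell is missing; accumulating with SUM(), or filling the unused cells with 0, avoids that.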

megsredl
Obsidian | Level 7

Thank you, this is very helpful!

PGStats
Opal | Level 21

You could use the fact that SAS considers TRUE = 1 and FALSE = 0 to write a single scoring expression:

 

hcp_score =
   (hcp_screen=1) * 1
+ (hcp_screen=2) * 0.5
+ ...
+ (hcp_edu_change in (1,9)) * 0.25;
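For reference, a complete version of this idea might look like the sketch below. The terms are taken from the scoring rules in the original question; the one-row HAVE dataset and the output name WANT3 are hypothetical and only there to make the step runnable.

```sas
/* Hypothetical one-row test dataset; the values are invented */
data have;
  input hcp_screen hcp_home hcp_list hcp_cohort hcp_restricted
        hcp_singlefac hcp_edu_covid hcp_edu_sick hcp_edu_ip hcp_edu_change;
  datalines;
2 1 1 0 0 1 1 0 1 9
;
run;

data want3;
  set have;
  /* Each comparison evaluates to 1 (true) or 0 (false) */
  hcp_score =
      (hcp_screen=1) + 0.5*(hcp_screen=2)
    + (hcp_home=1)   + 0.5*(hcp_home=2)
    + (hcp_list=1) + (hcp_cohort=1) + (hcp_restricted=1) + (hcp_singlefac=1)
    + 0.25*( (hcp_edu_covid=1) + (hcp_edu_sick=1) + (hcp_edu_ip=1)
             + (hcp_edu_change in (1,9)) );
run;
```

For the sample row this gives 0.5 + 1 + 1 + 1 + 0.25 + 0.25 + 0.25 = 4.25. Because a comparison against a missing value is simply false (0), a missing response adds nothing rather than making the total missing.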

PG
ballardw
Super User

"Efficiency" may depend on how many variables share similar values.

Similar calculations for multiple variables often indicate that an array approach might work. So the block of comparisons to single values could be done a couple of ways.

Here is one:

data trial;
   set have;
   hcp_score=0;
   if hcp_screen=1 then hcp_score=hcp_score+1;
   else if hcp_screen=2 then hcp_score=hcp_score+0.5;
   if hcp_home=1 then hcp_score=hcp_score+1;
   else if hcp_home=2 then hcp_score=hcp_score+0.5;
   if hcp_edu_change in (1,9) then hcp_score=hcp_score+0.25;
   /* below are the SINGLE value comparisons*/
   array vars hcp_list hcp_cohort hcp_restricted hcp_singlefac 
              hcp_edu_covid hcp_edu_sick hcp_edu_ip ;
   /* this array holds the COMPARISON values for the variables
      IN ORDER*/
   array vals {7} _temporary_ (1,1,1,1,1,1,1);
   /* this has the score additions*/
   array sc   {7} _temporary_ (1,1,1,1, 0.25,0.25,0.25);
   do i=1 to dim(vars);
      if vars[i]=vals[i] then hcp_score=hcp_score + sc[i];
   end;
   drop i;
run;

The only change shown is for the single-value comparisons. The first array has the variables you need to compare; the second array, vals, contains the values that the variables are tested against for equality; and the third, sc, has the amount to add to the score. The variables, values, and score additions must be listed in the same order.

You might see right offhand that if I have to add 10 more variables, I add them to the VARS list, then add the values to compare, then the scores. The DO loop over the number of elements in the vars list takes care of all of the conditional additions to the score total.

 

Note that this is really simple for single values. You could use it for multiple values by placing the variable on the list twice, with the corresponding values for comparison and the corresponding score additions. That is just a tad harder to see right away.

If the test code above works as expected, then you could try:

data trial;
   set have;
   hcp_score=0;
   array vars hcp_list hcp_cohort hcp_restricted hcp_singlefac 
              hcp_edu_covid hcp_edu_sick hcp_edu_ip 
              hcp_screen hcp_screen
              hcp_home hcp_home
              hcp_edu_change hcp_edu_change
  ;
   /* this array holds the COMPARISON values for the variables
      IN ORDER*/
   array vals {13} _temporary_ (1,1,1,1,1,1,1,1,2,1,2,1,9);
   /* this has the score additions*/
   array sc   {13} _temporary_ (1,1,1,1, 0.25,0.25,0.25,1,0.5,1,0.5, 0.25,0.25);
   do i=1 to dim(vars);
      if vars[i]=vals[i] then hcp_score=hcp_score + sc[i];
   end;
   drop i;
run;

Note that I just added the multi-value comparison variables to the end of the list, with the corresponding comparison and score values, and adjusted the size of the temporary arrays to match.

One drawback of this approach is that a mismatch between the number of variables and the number of values will likely cause the error "Array subscript out of range", along with one or more warnings about "partial array initialization" (not enough values) or "Too many values for initialization of the array".
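One possible safeguard is to compare the array sizes with DIM() before scoring. This is only a sketch under assumptions: the one-row HAVE dataset, the output name TRIAL_CHECKED, and the message text are hypothetical, and only the four one-point variables are used to keep it short.

```sas
/* Hypothetical one-row test dataset; the values are invented */
data have;
  input hcp_list hcp_cohort hcp_restricted hcp_singlefac;
  datalines;
1 0 1 1
;
run;

data trial_checked;
  set have;
  array vars hcp_list hcp_cohort hcp_restricted hcp_singlefac;
  array vals {4} _temporary_ (1,1,1,1);   /* comparison values, in order */
  array sc   {4} _temporary_ (1,1,1,1);   /* score additions, in order   */
  /* On the first row, stop before scoring if the lists differ in length */
  if _n_=1 and (dim(vars) ne dim(vals) or dim(vars) ne dim(sc)) then do;
    put 'ERROR: vars, vals and sc must have the same number of elements.';
    stop;
  end;
  hcp_score=0;
  do i=1 to dim(vars);
    if vars[i]=vals[i] then hcp_score=hcp_score+sc[i];
  end;
  drop i;
run;
```

The check only compares the declared sizes, so it would not replace the initialization warnings, which are issued when the step is compiled; it just turns the out-of-range condition into a clearer message.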
