Hello experts!
I am using the following code to assign scores to Likert scale responses and divvy them up into groups depending on the summed score, but the array I used has made many of the data points for my indicators go missing. Would someone happen to know why this happened? Thank you!!
data want;
set have;
array orig(7) $ norm_full_imm imm_men_take imm_resp_men imm_important imm_safe imm_protect_kids imm_heard_anything;
array kap(7) kap1-kap7;
do i=1 to 7;
if orig(i)='Strongly Agree' then kap(i)=5;
else if orig(i)='Agree' then kap(i)=4;
else if orig(i)='Neutral' then kap(i)=3;
else if orig(i)='Disagree' then kap(i)=2;
else if orig(i)='Strongly Disagree' then kap(i)=1;
end;
Total_KAP=sum(of kap(*));
if 28 < total_kap < 35 then kap_group='28-35';
else if 7 < total_kap < 14 then kap_group='7-14';
else kap_group='15-27';
run;
The first place I would look is at the original data. Most likely there are differences or variations in spelling. Inspect the results from:
proc freq data=have;
tables norm_full_imm imm_men_take imm_resp_men imm_important imm_safe imm_protect_kids imm_heard_anything
/ missing;
run;
That will probably point you in the right direction.
Another possibility is that the variable names are not spelled correctly, so that's another item to examine.
Good luck.
Nothing is missing from the original data, but I have determined that it is only the "strongly disagree" values that have gone missing. Does that shed any more light on the situation? I'm afraid I'm stumped...
Also, you should replace "<" with "<=" in the range checks on TOTAL_KAP. Otherwise, the categorization of several values (e.g. 7) will be incorrect.
Have you considered defining an informat to assign the KAP values and a format to label the groups?
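For reference, here is a minimal sketch of what the informat/format approach could look like. The names LIKERT and KAPGRP are made up for illustration, and the quoted strings are assumed to match the stored values exactly (apart from case and leading blanks), so check them against the actual data first:

```sas
proc format;
  /* Hypothetical informat LIKERT: maps the response text to a score.    */
  /* UPCASE uppercases the input before matching, JUST left-aligns it.   */
  invalue likert (upcase just)
    'STRONGLY AGREE'    = 5
    'AGREE'             = 4
    'NEUTRAL'           = 3
    'DISAGREE'          = 2
    'STRONGLY DISAGREE' = 1
    other               = .;

  /* Hypothetical format KAPGRP: labels ranges of the summed score.      */
  /* Values outside the listed ranges are not relabeled.                 */
  value kapgrp
    7  - 14 = '7-14'
    15 - 27 = '15-27'
    28 - 35 = '28-35';
run;

data want;
  set have;
  array orig(7) $ norm_full_imm imm_men_take imm_resp_men imm_important
                  imm_safe imm_protect_kids imm_heard_anything;
  array kap(7) kap1-kap7;
  do i = 1 to 7;
    kap(i) = input(orig(i), likert.);   /* text -> numeric score */
  end;
  drop i;
  total_kap = sum(of kap(*));
  kap_group = put(total_kap, kapgrp.);  /* numeric total -> group label */
run;
```

A response that matches none of the five strings gets a missing score, which makes spelling variations easy to spot.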
I'm afraid that's a little (and by that I mean very) outside of my understanding of SAS.
No problem, it's OK to use IF/THEN/ELSE logic to assign those values.
But the issue with "<" vs. "<=" is serious: If, for example, all seven questions were answered "Strongly Disagree", TOTAL_KAP would be 7, but KAP_GROUP would be '15-27' with your code.
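To see the boundary behavior, a small self-contained check along these lines may help. The dataset name CHECK is just for illustration, and the middle range is tested explicitly here (rather than left to the final ELSE) so that a missing or out-of-range total is not silently labeled '15-27':

```sas
data check;
  length kap_group $5;
  /* Test the edges of each group */
  do total_kap = 7, 14, 15, 27, 28, 35;
    if      28 <= total_kap <= 35 then kap_group = '28-35';
    else if 15 <= total_kap <= 27 then kap_group = '15-27';
    else if  7 <= total_kap <= 14 then kap_group = '7-14';
    else kap_group = ' ';   /* missing or out-of-range total */
    output;
  end;
run;

proc print data=check;
run;
```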
You say that the "strongly disagree" values have gone missing. This suggests that there could be an issue with upper/lower case or with truncation ('Strongly Disagree' is the longest among the five character values). Please check if the values in question (of NORM_FULL_IMM etc.) are exactly equal to the string 'Strongly Disagree'. In particular, the seven character variables must have length 17 as a minimum.
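One way to check the defined lengths (a sketch, assuming the input dataset is called HAVE as in the original code) is PROC CONTENTS restricted to the seven variables:

```sas
/* Lists type and length of the seven response variables */
proc contents data=have(keep=norm_full_imm imm_men_take imm_resp_men
                             imm_important imm_safe imm_protect_kids
                             imm_heard_anything);
run;
```

If the "Len" column shows less than 17 for any of them, the full text 'Strongly Disagree' cannot be stored in that variable.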
I went ahead and added the "<=" and that's what ended up "fixing" part of it - originally it was more than just the "Strongly Disagree" that were missing; now, it's just those. I notice in my proc print output that it only displays "Strongly Disag." Could that be the problem? I assumed because my code still said "Strongly Disagree" that it didn't matter what the output display was. If so, how would I fix that?
You have to make sure that the IF conditions are met with the actual values of your character variables.
So, if NORM_FULL_IMM='Strongly Disag.', the condition if orig(1)='Strongly Disagree' will not be met. You can modify the IF condition in different ways to be applicable:
if orig(i)='Strongly Disag.' then kap(i)=1;
This would be met for an exact match only.
Alternatively, you could require the variable value to start with "Strongly Dis" by using the colon modifier after the equals sign:
if orig(i)=:'Strongly Dis' then kap(i)=1;
Advantage: This would work even if some variables/observations contained 'Strongly Disagree' and others 'Strongly Disag.' or 'Strongly Disagr.' etc.
(Edit: corrected a typo)
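Put together, the scoring loop with the colon modifier could look like the sketch below. It assumes each response starts with one of the prefixes shown; the prefixes themselves are my choice for illustration and should be checked against the PROC FREQ output:

```sas
data want;
  set have;
  array orig(7) $ norm_full_imm imm_men_take imm_resp_men imm_important
                  imm_safe imm_protect_kids imm_heard_anything;
  array kap(7) kap1-kap7;
  do i = 1 to 7;
    /* =: compares only as many characters as the shorter operand has */
    if      orig(i) =: 'Strongly A'   then kap(i) = 5;
    else if orig(i) =: 'Agree'        then kap(i) = 4;
    else if orig(i) =: 'Neutral'      then kap(i) = 3;
    else if orig(i) =: 'Disagree'     then kap(i) = 2;
    else if orig(i) =: 'Strongly Dis' then kap(i) = 1;
  end;
  drop i;
  total_kap = sum(of kap(*));
run;
```

Note that the comparison is still case-sensitive, so 'strongly disagree' in lower case would not match.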
The colon did the trick! Thank you very much for all of your help!
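You can also specify a length for the character variables in the ARRAY statement. Note that this length only applies to variables that have not already been assigned a length at that point (so the statement would have to come before the SET statement), and it cannot restore values that are already stored in abbreviated form:
array array_name(*) $ 20 list_of_variables_goes_here;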