Hello SAS Community,
I am working on analyzing a multiple response survey question. It has seven possible selections and respondents were allowed to choose as many of the selections as applied to them. Unfortunately, this has left me in a bit of an analysis bind. The variable was sent to me as a character variable, and all the possible selections were concatenated into one long string. So the variable values look something like:
1 010306
2 0204
3 01
4 0502
5 0205
6 01040507
.
.
.
As it is, this variable is not providing me with much information. I would like to create a new variable for analysis, where if the original variable contained lets say 01somewhere in its string it would be counted as 1, 02 would be 2, etc. Is there a way to subset observations into a new variable based on the whether it contains a specific character combination?
As always, I greatly appreciate your help and expertise!
Thanks!
Hi,
Something like:
data have;
id=1;result="010306"; output;
id=2;result="0204"; output;
id=3;result="01"; output;
id=4;result="0502"; output;
id=5;result="0205"; output;
id=6;result="01040507"; output;
run;
data want (drop=i);
set have;
array result_out{7};
do i=1 to 7;
if index(result,"0"||strip(put(i,best.)))>0 then result_out{i}=i;
end;
run;
I would subtr() out each answer, and out put it to a common variable, which means some kind of transponing the data set.
Then counting can be done with whatever tool/procedure you like.
data want;
set have;
no = length(response);
do i =1 to no by 2;
single_response = substr(response,i,2);
output;
end;
keep response_id single_response;
run;
proc sql;
select single_response, count(*) as no_resp
from want
group by single_response
;
quit;
If you a paying a company to administer your survey then I would address this issue with them to provide the data in another format. Many survey software options allow taking a multiple response question and exporting them as Q1_1 through Q1_n where n represents the number of responses as 0/1 coded dichotomous variables to indicate that choice was selected.
If order of response is not important to you, this may be what you are looking for:
data want;
set have;
q1_1 = (index(result,'01') >0);
q1_2 = (index(result,'02') >0);
q1_3 = (index(result,'03') >0);
/* continue obvious pattern*/
run;
If you have more than 10 response categories and get strings like 02031011 you need to add in some code to use boundaries of the 2 characters like
Q1_2 = (mod(index(result,'02'),2)=1);
This type of coding will assign 0 to those questions without responses which I find useful because a mean of the question will give a percent of respondents picking that choice and a sum the number that did plus you can get confidence intervals on the proportions.
If there is a skip pattern involved then only execute the code where the skip says the respondent should have answered the quesion.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.