@smatthy wrote:
Hello,
I am analyzing survey data containing a "check all that apply" question. Unfortunately the output contains all of the data in one cell.
Example question:
What fruit do you normally eat?
- Apple
- Banana
- Orange
Example datapoint:
apple,banana,orange
Datapoints containing more than one entry show up as missing in SAS, and I need to create a new variable to reflect a single entry based on a hierarchy (if q1=apple then fruit=apple; else if q1=banana then fruit=banana; etc.)
Is there a way to do this in SAS, or will I need to make changes in a copy of the Excel dataset?
Thanks
By hierarchy do you mean if apple appears anywhere in the list that would be result, if apple is not present then if banana is present that would be the result, with comparisons in order? Here would be one way:
data example;
infile datalines truncover;
informat string $50.;
input string;
array list{4} $10 _temporary_ ('apple','banana','orange','pear');
length result $ 10;
do i=1 to dim(list);
if index(string,strip(list[i]))>0 then do;
put list[i]=;
result=list[i];
leave;
end;
end;
drop i;
datalines;
apple,banana,orange
banana,orange
banana,orange,apple
pear
apple,pear
;
run;
The number of elements in the list array, the digit in the {} must match the number of elements listed in the parentheses, the order of the values in the parentheses is the "hierarchy" order.
The LEAVE in the do loop says to exit the loop when encountered, which would be when the value of the list item is found in the string.
The STRIP function is because the value of list[I] will be padded to the defined length of the value in the temporary array (I chose 10 partially to demonstrate this). For character values it would be a good idea for the result variable and the list array size to match.
Note that if you have elements that might be part of another, such as 'apple' and 'pineapple' this won't work exactly the same.