BookmarkSubscribeRSS Feed
smatthy
Calcite | Level 5

Hello, 

 

I am analyzing survey data containing a "check all that apply" question. Unfortunately the output contains all of the data in one cell.

 

Example question:

What fruit do you normally eat?

- Apple

- Banana

- Orange

 

Example datapoint:

apple,banana,orange

 

Datapoints containing more than one entry show up as missing in SAS, and I need to create a new variable to reflect a single entry based on a hierarchy (if q1=apple then fruit=apple; else if q1=banana then fruit=banana; etc.)

 

Is there a way to do this in SAS, or will I need to make changes in a copy of the Excel dataset?

 

Thanks

2 REPLIES 2
Reeza
Super User
Create dummy variables. A variable for apple, orange, etc and code as 0/1 for each value.

To do this, create an array with the full list of variables - assuming you know these and then use a loop with SCAN() to set them to 0/1.

Usually the survey tool has an option to export data like this (SurveyMonkey does) that says separate multiple responses or something similar to that. It’s an option to click before you export the data or the data structure.
ballardw
Super User

@smatthy wrote:

Hello, 

 

I am analyzing survey data containing a "check all that apply" question. Unfortunately the output contains all of the data in one cell.

 

Example question:

What fruit do you normally eat?

- Apple

- Banana

- Orange

 

Example datapoint:

apple,banana,orange

 

Datapoints containing more than one entry show up as missing in SAS, and I need to create a new variable to reflect a single entry based on a hierarchy (if q1=apple then fruit=apple; else if q1=banana then fruit=banana; etc.)

 

Is there a way to do this in SAS, or will I need to make changes in a copy of the Excel dataset?

 

Thanks


By hierarchy do you mean if apple appears anywhere in the list that would be result, if apple is not present then if banana is present that would be the result, with comparisons in order? Here would be one way:

data example;
   infile datalines truncover;
   informat string $50.;
   input string;
   array list{4} $10 _temporary_ ('apple','banana','orange','pear');
   length result $ 10;
   do i=1 to dim(list);
      if index(string,strip(list[i]))>0 then do;
         put list[i]=;
         result=list[i];
         leave;
      end;
   end;
   drop i;
datalines;
apple,banana,orange
banana,orange
banana,orange,apple
pear
apple,pear
;
run;

The number of elements in the list array, the digit in the {} must match the number of elements listed in the parentheses, the order of the values in the parentheses is the "hierarchy" order.

 

 

The LEAVE in the do loop says to exit the loop when encountered, which would be when the value of the list item is found in the string.

The STRIP function is because the value of list[I] will be padded to the defined length of the value in the temporary array (I chose 10 partially to demonstrate this). For character values it would be a good idea for the result variable and the list array size to match.

 

Note that if you have elements that might be part of another, such as 'apple' and 'pineapple' this won't work exactly the same.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 834 views
  • 0 likes
  • 3 in conversation