BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TashaChapman14
Obsidian | Level 7

Hey all!

 

I've got some survey data that came out of Qualtrics where the "Check all that apply" options were all put in the same field with commas separating the options.  For example,

 

We had a question asking what topics people were most interested in (select up to 5) and the results came back like this:

Education,Technology
Health care,Human services
Health care,Human services,Justice and law ,Public safety and emergency ,Workforce and the economy
Education,Housing,Human services,Justice and law ,Technology
Agriculture,Education,Environment and natural resources,Health care,Workforce and the economy
Health care,Human services,Technology,Workforce and the economy
Environment and natural resources,Housing,Justice and law ,Transportation and infrastructure,Workforce and the economy
Education,Environment and natural resources,Health care,Justice and law ,Technology
Finance, accounting and procurement,Justice and law ,Tax and fiscal policy,Technology,Workforce and the economy
Education,Environment and natural resources,Housing,Public safety and emergency ,Transportation and infrastructure
Agriculture,Energy,Environment and natural resources,Transportation and infrastructure,Workforce and the economy
Energy,Health care,Human services,Public safety and emergency ,Workforce and the economy
Health care,Justice and law ,Tax and fiscal policy,Technology,Workforce and the economy
Environment and natural resources
Environment and natural resources,Health care,Justice and law ,Transportation and infrastructure,Technology

 

 

Clearly I'd like to split all this data up so that I can get summary counts of each topic.  Any thoughts on how I can do this quickly and easily?  I'm mostly stumped.  I don't even mind if I have a separate data set with one column where each row is a single response.  I'm totally cool with that.  Just don't know how to get there...

 

 

I found one option that sort of looks like what I want, if only I could get it to work for the entire dataset...  If you check out the SAS Documentation for the SCAN function (http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000214639.htm) under Example #2 there's a piece where it uses a do loop to split apart each word in a string.  However, when I tried this with my dataset (see sample code below) it only gave me the data for the first observation.  It's late in the day and I can't figure this one out.  Thoughts?

 

 

data split; set results;
keep word;
do until(word=' ');
      count+1;
      word = scan(Q6, count);
      output;
   end;
run;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

The code you came up with is pretty close to working.  First, insert this statement before the DO loop:

 

count=0;

 

Without it, COUNT just keeps on increasing.  So that gets COUNT to start over for each observation.

 

And second, you would probably be better off changing the OUTPUT statement to be:

 

if word > ' ' then output;

View solution in original post

6 REPLIES 6
Astounding
PROC Star

The code you came up with is pretty close to working.  First, insert this statement before the DO loop:

 

count=0;

 

Without it, COUNT just keeps on increasing.  So that gets COUNT to start over for each observation.

 

And second, you would probably be better off changing the OUTPUT statement to be:

 

if word > ' ' then output;

TashaChapman14
Obsidian | Level 7

So I changed my code to this, as suggested:

 

data split; set results;
keep word;
count=0;
do until(word=' ');
      count+1;
      word = scan(Q6, count, ",");
      if word>' ' then output;
   end;
run;

And it worked perfectly!  Thank you!

 

 

PGStats
Opal | Level 21

You better specify the comma as delimiter

 

data split; 
set results;
length word $40;
do count = 1 to countw(Q6, ",");
      word = scan(Q6, count, ",");
      output;
   end;
run;

 

PG
TashaChapman14
Obsidian | Level 7

Yes!  Thank you!  

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Not far off, so changed your slightly:

data split (drop=i); 
  length word $200;
  set results;
  do i=1 to countw(q6,",");
      word=scan(q6,i);
      output;
  end;
run;
ballardw
Super User

From working with a number of Survey software platforms I might suggest seeing if there are data export options from the survey software and re-exporting. It is not uncommon to have an option for multiple response questions to create multiple dichotomous(yes/no) responses from a similar question (best) or a "one response category variable per selection" format.

 

Since the next step after identifying or splitting up your combined field data should be to get something so you know which response is what (the first response does not mean the same thing for all respondents) re-exporting the data with proper settings may save a lot of headaches, especially since if you have many of these multiple response questions.

 

I will say I much prefer software that doesn't use commas in this particular delimiter role as sometimes the individual response categories may contain one or more commas.

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1023 views
  • 0 likes
  • 5 in conversation