BookmarkSubscribeRSS Feed

How to Analyze a Survey using SAS and Data from Qualtrics

Started ‎02-01-2024 by
Modified ‎01-30-2024 by
Views 1,170

The third in a series of papers addressing how to use SAS to analyze and report on data from survey data.  In this paper I describe how to analyze data with SAS that was collected using Qualtrics.

 

Surveys: Finding the Message in the Tables (2019 paper)

In 2019 I made a presentation at the annual American Society of Public Administration and the SAS Global Conference called Surveys: Finding the Message in the Tables. This paper is posted on my LinkedIn page. In that presentation I make the point that while surveys can generate a lot of tables with numbers, it is often overwhelming to identify the message. I observed many skilled survey researchers and executives spend days going over tables, sometimes copying numbers into spreadsheets, sometimes making graphs hoping that the message would just jump out. It always takes more time than anticipated. In this paper I described a technique for collapsing, ordering and color coding the results, and consolidating subgroups so that the message would be more accessible and apparent.

 

Sometimes a Good Idea Needs Polishing! (2021 paper)

After a few years I wrote a second paper that improved on this approach which you will also find on my LinkedIn page, Sometimes a Good Idea Needs Polishing!. I found 5 problematic areas in my initial approach and this second paper provides solutions. These areas were (1) demographic variables have different numbers of levels, (2) response scales have different numbers of levels, (3) sparse code in SAS does not work with proc frequency, (4) using proc means is also not a good alternative, and (5) you also need analyses that include all of the categories for all of the demographics and response scales. I provide the solution and offer the SAS code to demonstrate its implementation.

 

Adapting  the process for Qualtrics

Fast forward to 2022 we moved from our custom software package called Questionnaire Programming Language (QPL) to conduct surveys using the off-the-shelf package -- Qualtrics. This transition created a need for a new approach to writing the SAS code to analyze the data. While there are other software packages that can be used to analyze data, my expertise was in SAS and so I needed a SAS solution. So the challenge was how to get the survey content and the data structure from Qualtrics into SAS.


This paper demonstrates my solution. I certainly believe there are other approaches that will work. But having conducted numerous surveys using this approach (about 50 currently) and how easy it was to train an inexperienced SAS programmer to use this approach, you might adopt this as your go to approach. One additional feature you will see is that once someone reviews this code, it will be apparent how little review is needed since I am repurposing the text that comes from Qualtrics into SAS code so there is no manual intervention (i.e., text will not have typographical errors nor will the variable names not match the question text). 

 

Qualtrics TSV File

Qualtrics has several options for exporting data, one is called TSV. Once a survey is developed and prior to collecting any data, I download a copy of the tsv file with the selection to use numeric values and to split multi-value variables into columns. I then make a duplicate copy of the first row which has the variable information (e.g., Q2) so that when SAS reads the TSV file it has the content of the variable names within the data set. I then use this SAS code, which will work with any survey. (I refer to data=x that is where you would make reference to your dataset.) I did contact the technical support unit in SAS to see if there is an alternative to this manual cutting and pasting and was told what I did was probably appropriate. I do believe that in the future a script of some sort using AI might avoid this manual step.


The general approach is to (1) transpose the dataset so that you can get the pairs of the variable names and the survey content and (2) write out a variable and question content that would be used for an invocation to a macro for a tabulate or a proc report. (Note: Since it will not be apparent which questions are narrative and would need a proc report all invocations are set with %an1 which refers to a proc tabulate. Those would be changed to %an2 for the proc report. However, if you adopt a variable name convention that narratives have TEXT in the variable name, then the macro %an1 which uses a proc tabulate will be used for numeric questions and %an2 will be used for proc report for narrative questions.)

 

Qualtrics SPSS File

In addition to exporting the TSV file which will have the survey questions and the data, you will also need to export the SPSS sav file in order to get the formats for the response categories for your survey.


First, I present the code using the tsv file then the code that uses the SPSS sav file.


Three observations:

  • My data set had some line feeds so I included a hexadecimal compression on the pt5x variable. (pt5x is the name of the variable which contains the invocations to the macros for the entire survey. You can name this variable anything you want.)
  • You might find the need to check for single and double quotes since I can’t anticipate whether or not your questionnaire text might use quotes.
  • You will have to enter your path where you want to write out the pt5x variable. pt5x will have the invocation to the macro for all of the variables in your survey. The new after the txt will replace the pt5x.txt file if you have to create it more than one time during debugging of your program. And the nodate nonumber will produce a listing of pt5x without any extraneous header.
options source nostimer nocenter pagesize = 60 linesize = 132 noquotelenmax;
options validvarname=v7;
filename survey "path for your tsv data.tsv" encoding="utf-16" termstr=CRLF;
run;

proc import datafile=survey out=x dbms=tab replace;
  getname=yes;
run;

proc transpose data=x out=newone;
  var _all_;
run;

data newone;
  set newone;
  keep name label;
  name=_NAME_;
  label= col2;
  if substr(name,1,1)='Q';
run;

data newone2;
  set newone;
  check=index(name,'TEXT');

  /*Note: I did this in separate string fields so you can how your data set is being manipulated
  and you may need to make adjustments. For the vast majority of the time, I believe no
  adjustments will be needed. */
  if check=0 then
    do;
      pt1="%an1 (var=";
      pt2=strip(pt1)||left(trim(name));
      pt3=strip(pt2)||', title="';
      pt3a=strip(pt3)||left(trim(name))||'. ';
      PT4=strip(pt3a)||left(trim(label));
      PT5=strip(pt4)||'"); run; ';
      pt5x=compress(compbl(pt5),"0D0A"X);
    end;

  if check gt 0 then
    do;
      pt1="%an2 (var=";
      pt2=strip(pt1)||left(trim(name));
      pt3=strip(pt2)||', title="';
      pt3a=strip(pt3)||left(trim(name))||'. ';
      PT4=strip(pt3a)||left(trim(label));
      PT5=strip(pt4)||'"); run; ';
      pt5x=compress(compbl(pt5),"0D0A"X);
    end;
run;

data newfin;
  set newone2;
  keep pt5x;
run;

options nodate nonumber;

proc printto print=' path where you stored the invocations to the macro pt5x.txt' new;
run;

proc report data=newfin noheader;
  column pt5x;
  define pt5x/display pt5x/display width=100 flow;
run;

proc printto;
run;

/*
this is the code to read in the data from the survey. This is the second export from Qualtrics. In
this case you are not modifying the download
*/
filename survey "path where you tsv file is stored—survey_results.tsv" encoding="utf-16" termstr=CRLF;
run;

proc import datafile=survey
  out=survey_results dbms=tab replace;
  getnames=yes;
  datarow=4;
  guessingrows=10000;
run;

 

This is the code to take the SPSS SAV formats and convert them into SAS format syntax.

libname a "path where you will store the formats";

proc import datafile="path of where the save file is stored"
  out=survey_formats dbms=sav replace;
  fmtlib=a.formats;
run;

proc format lib=a.formats;
  select a-zzzzzzzz $a-$zzzzzzzz;
run;

/* Set up a file reference for the transport file */
libname trans xport "path where the sas formats are stored/FORMATS.sas7bcat";

/* Write all the format entries in the format library to CNTLFMT data set and store in transport
format */
proc format library=A cntlout=trans.cntlfmt;
  /*take all the blanks out of the start/end values*/
proc sql;
  create table spss_fmts as
    select fmtname, compress(start) as start, compress(end) as end, type, label
      from trans.cntlfmt;
quit;

run;
/*Use the new dataset to create a format catalog in SAS format*/
proc format library=WORK cntlin=spss_fmts;
run;

/*the following SAS code takes the formats that came from SPSS sav file and reorganizes them
into SAS syntax and ultimately the file use has the proc format with code to say what the
response categories stand for and the format statement that associates each variable with the
appropriate format. The lenx statement was used because some of the formats ended in a
number and SAS cannot process formats with numbers. So using a substr length statement
drops the extra letter for the variable names while maintaining them for the format name*/
data work.spss_fmts;
  set work.spss_fmts;
  lenx=length(fmtname);
  var=substr(fmtname,1,lenx-1);
run;

data make; 
 set work.spss_fmts; 
 keep var start label val list; var=var;
 if var='FINISHE' or var='STATUS' then delete;
 if label="Don't know" then label="Do not know";
 label=left(trim(label));
 if start=1 then do;
  val=';value'||' '||strip(fmtname)||' '||strip(start) ||"="||"'"||strip(label)||"'";
  end=strip(fmtname)||'.';
  list=strip(var)||' '||strip(end);
 end;
 if 2<=start<=5 then do;
   val= strip(start) ||"='"||""||strip(label)||"'";
 end;
 if start=6 then do;
  val= strip(start) ||"='"||""||strip(label)||"';";
 end;
run;

data make1x;
  length pval $65.;
  set make;
  keep pval;
  pval="run; proc format;";

  if _N_=1;
run;

data make1;
  length pval $65.;
  set make;
  keep pval;
  pval=val;
run;

data make2;
  length pval $65.;
  set make;
  keep pval;
  pval='run; data survey_results; set survey_results format';

  if _N_=1;
run;

data make3;
  length pval $65.;
  set make;
  keep pval;

  if list ne ' ';
  pval=list;
run;

data make4;
  length pval $65.;
  set make;
  keep pval;
  pval='; run;';

  if _N_=1;
run;

data use;
  set make1x make1 make2 make3 make4;
run;

options nodate nonumber;

proc printto print='path where you store the content of the formats formats.txt' new;
run;

proc report data=use noheader;
  column pval;
  define pval/display width=100 flow;
run;

proc printto;
run;

 

Step 1: SAS Code to invoke the macro for all of the variables in your survey. I begin with a proc template that specifies the style for reporting. Feel free to change that to your preference. 

proc template;
  define style styles.test;
    parent=styles.minimal;
    style systemtitle from systemtitle /
      font_face=helvetica font_size=3
      font_weight=bold just=l;
    style systemfooter from systemfooter /
      font_face=helvetica font_size=3;
    style header from header/
      font_face=helvetica
      just=l vjust=b;
    style data from data/
      font_face=helvetica;
    style rowheader from rowheader/
      font_face=helvetica;
    style table from table /
      just=left;
  end;
run;

 

 Step 2: The path to write your tables.

ods html body="your path.an1.html" (Title=' survey title')
  headtext="<style> hr {page-break-after:always} thead {display:table-header-group}
  @media all {font {font-size=80%}} </style>" 
  style=styles.test;

 

 Step 3: A macro for the proc tabulate and a macro for the proc report.

%include "path where you stored the formats/formats.txt";
run;

ods html body="path where you want to write the results of the analysis/results.html"
  (Title='Results of Survey')
  headtext="<style> hr {page-break-after:always} thead {display:table-header-group}
  @media all {font {font-size=80%}} </style>" style=styles.test;

%macro an1 (var=var, title=title);

  proc tabulate data=survey_results missing;
    class &var/ order=data;
    tables (&var=''*(pctn<&var>='%'*f=6.1 n='Number of Cases'*f=comma8.) all*n='Total
      Cases'*f=6.0) /
      rts=55 printmiss misstext='0';
    title &title;
  RUN;

%mend an1;

%macro an2 (var=var, title=title);

  proc report data=survey_resutls nowindows spacing=1 pspace=1 split='*' missing headline;
    where &var ne ' ';
    columns &var;
    define &var /display flow 'Comment' format=$8000. style ={just=left cellwidth=8 in};
    title1 &title;
  run;

%mend an2;

%include "path where you stored the invocations to the macro pt5x.txt";
run;

ods html close;
run;
Version history
Last update:
‎01-30-2024 08:53 AM
Updated by:
Contributors

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags