The third in a series of papers addressing how to use SAS to analyze and report on data from survey data. In this paper I describe how to analyze data with SAS that was collected using Qualtrics.
In 2019 I made a presentation at the annual American Society of Public Administration and the SAS Global Conference called Surveys: Finding the Message in the Tables. This paper is posted on my LinkedIn page. In that presentation I make the point that while surveys can generate a lot of tables with numbers, it is often overwhelming to identify the message. I observed many skilled survey researchers and executives spend days going over tables, sometimes copying numbers into spreadsheets, sometimes making graphs hoping that the message would just jump out. It always takes more time than anticipated. In this paper I described a technique for collapsing, ordering and color coding the results, and consolidating subgroups so that the message would be more accessible and apparent.
After a few years I wrote a second paper that improved on this approach which you will also find on my LinkedIn page, Sometimes a Good Idea Needs Polishing!. I found 5 problematic areas in my initial approach and this second paper provides solutions. These areas were (1) demographic variables have different numbers of levels, (2) response scales have different numbers of levels, (3) sparse code in SAS does not work with proc frequency, (4) using proc means is also not a good alternative, and (5) you also need analyses that include all of the categories for all of the demographics and response scales. I provide the solution and offer the SAS code to demonstrate its implementation.
Fast forward to 2022 we moved from our custom software package called Questionnaire Programming Language (QPL) to conduct surveys using the off-the-shelf package -- Qualtrics. This transition created a need for a new approach to writing the SAS code to analyze the data. While there are other software packages that can be used to analyze data, my expertise was in SAS and so I needed a SAS solution. So the challenge was how to get the survey content and the data structure from Qualtrics into SAS.
This paper demonstrates my solution. I certainly believe there are other approaches that will work. But having conducted numerous surveys using this approach (about 50 currently) and how easy it was to train an inexperienced SAS programmer to use this approach, you might adopt this as your go to approach. One additional feature you will see is that once someone reviews this code, it will be apparent how little review is needed since I am repurposing the text that comes from Qualtrics into SAS code so there is no manual intervention (i.e., text will not have typographical errors nor will the variable names not match the question text).
Qualtrics has several options for exporting data, one is called TSV. Once a survey is developed and prior to collecting any data, I download a copy of the tsv file with the selection to use numeric values and to split multi-value variables into columns. I then make a duplicate copy of the first row which has the variable information (e.g., Q2) so that when SAS reads the TSV file it has the content of the variable names within the data set. I then use this SAS code, which will work with any survey. (I refer to data=x that is where you would make reference to your dataset.) I did contact the technical support unit in SAS to see if there is an alternative to this manual cutting and pasting and was told what I did was probably appropriate. I do believe that in the future a script of some sort using AI might avoid this manual step.
The general approach is to (1) transpose the dataset so that you can get the pairs of the variable names and the survey content and (2) write out a variable and question content that would be used for an invocation to a macro for a tabulate or a proc report. (Note: Since it will not be apparent which questions are narrative and would need a proc report all invocations are set with %an1 which refers to a proc tabulate. Those would be changed to %an2 for the proc report. However, if you adopt a variable name convention that narratives have TEXT in the variable name, then the macro %an1 which uses a proc tabulate will be used for numeric questions and %an2 will be used for proc report for narrative questions.)
In addition to exporting the TSV file which will have the survey questions and the data, you will also need to export the SPSS sav file in order to get the formats for the response categories for your survey.
First, I present the code using the tsv file then the code that uses the SPSS sav file.
Three observations:
options source nostimer nocenter pagesize = 60 linesize = 132 noquotelenmax;
options validvarname=v7;
filename survey "path for your tsv data.tsv" encoding="utf-16" termstr=CRLF;
run;
proc import datafile=survey out=x dbms=tab replace;
getname=yes;
run;
proc transpose data=x out=newone;
var _all_;
run;
data newone;
set newone;
keep name label;
name=_NAME_;
label= col2;
if substr(name,1,1)='Q';
run;
data newone2;
set newone;
check=index(name,'TEXT');
/*Note: I did this in separate string fields so you can how your data set is being manipulated
and you may need to make adjustments. For the vast majority of the time, I believe no
adjustments will be needed. */
if check=0 then
do;
pt1="%an1 (var=";
pt2=strip(pt1)||left(trim(name));
pt3=strip(pt2)||', title="';
pt3a=strip(pt3)||left(trim(name))||'. ';
PT4=strip(pt3a)||left(trim(label));
PT5=strip(pt4)||'"); run; ';
pt5x=compress(compbl(pt5),"0D0A"X);
end;
if check gt 0 then
do;
pt1="%an2 (var=";
pt2=strip(pt1)||left(trim(name));
pt3=strip(pt2)||', title="';
pt3a=strip(pt3)||left(trim(name))||'. ';
PT4=strip(pt3a)||left(trim(label));
PT5=strip(pt4)||'"); run; ';
pt5x=compress(compbl(pt5),"0D0A"X);
end;
run;
data newfin;
set newone2;
keep pt5x;
run;
options nodate nonumber;
proc printto print=' path where you stored the invocations to the macro pt5x.txt' new;
run;
proc report data=newfin noheader;
column pt5x;
define pt5x/display pt5x/display width=100 flow;
run;
proc printto;
run;
/*
this is the code to read in the data from the survey. This is the second export from Qualtrics. In
this case you are not modifying the download
*/
filename survey "path where you tsv file is stored—survey_results.tsv" encoding="utf-16" termstr=CRLF;
run;
proc import datafile=survey
out=survey_results dbms=tab replace;
getnames=yes;
datarow=4;
guessingrows=10000;
run;
This is the code to take the SPSS SAV formats and convert them into SAS format syntax.
libname a "path where you will store the formats";
proc import datafile="path of where the save file is stored"
out=survey_formats dbms=sav replace;
fmtlib=a.formats;
run;
proc format lib=a.formats;
select a-zzzzzzzz $a-$zzzzzzzz;
run;
/* Set up a file reference for the transport file */
libname trans xport "path where the sas formats are stored/FORMATS.sas7bcat";
/* Write all the format entries in the format library to CNTLFMT data set and store in transport
format */
proc format library=A cntlout=trans.cntlfmt;
/*take all the blanks out of the start/end values*/
proc sql;
create table spss_fmts as
select fmtname, compress(start) as start, compress(end) as end, type, label
from trans.cntlfmt;
quit;
run;
/*Use the new dataset to create a format catalog in SAS format*/
proc format library=WORK cntlin=spss_fmts;
run;
/*the following SAS code takes the formats that came from SPSS sav file and reorganizes them
into SAS syntax and ultimately the file use has the proc format with code to say what the
response categories stand for and the format statement that associates each variable with the
appropriate format. The lenx statement was used because some of the formats ended in a
number and SAS cannot process formats with numbers. So using a substr length statement
drops the extra letter for the variable names while maintaining them for the format name*/
data work.spss_fmts;
set work.spss_fmts;
lenx=length(fmtname);
var=substr(fmtname,1,lenx-1);
run;
data make;
set work.spss_fmts;
keep var start label val list; var=var;
if var='FINISHE' or var='STATUS' then delete;
if label="Don't know" then label="Do not know";
label=left(trim(label));
if start=1 then do;
val=';value'||' '||strip(fmtname)||' '||strip(start) ||"="||"'"||strip(label)||"'";
end=strip(fmtname)||'.';
list=strip(var)||' '||strip(end);
end;
if 2<=start<=5 then do;
val= strip(start) ||"='"||""||strip(label)||"'";
end;
if start=6 then do;
val= strip(start) ||"='"||""||strip(label)||"';";
end;
run;
data make1x;
length pval $65.;
set make;
keep pval;
pval="run; proc format;";
if _N_=1;
run;
data make1;
length pval $65.;
set make;
keep pval;
pval=val;
run;
data make2;
length pval $65.;
set make;
keep pval;
pval='run; data survey_results; set survey_results format';
if _N_=1;
run;
data make3;
length pval $65.;
set make;
keep pval;
if list ne ' ';
pval=list;
run;
data make4;
length pval $65.;
set make;
keep pval;
pval='; run;';
if _N_=1;
run;
data use;
set make1x make1 make2 make3 make4;
run;
options nodate nonumber;
proc printto print='path where you store the content of the formats formats.txt' new;
run;
proc report data=use noheader;
column pval;
define pval/display width=100 flow;
run;
proc printto;
run;
Step 1: SAS Code to invoke the macro for all of the variables in your survey. I begin with a proc template that specifies the style for reporting. Feel free to change that to your preference.
proc template;
define style styles.test;
parent=styles.minimal;
style systemtitle from systemtitle /
font_face=helvetica font_size=3
font_weight=bold just=l;
style systemfooter from systemfooter /
font_face=helvetica font_size=3;
style header from header/
font_face=helvetica
just=l vjust=b;
style data from data/
font_face=helvetica;
style rowheader from rowheader/
font_face=helvetica;
style table from table /
just=left;
end;
run;
Step 2: The path to write your tables.
ods html body="your path.an1.html" (Title=' survey title')
headtext="<style> hr {page-break-after:always} thead {display:table-header-group}
@media all {font {font-size=80%}} </style>"
style=styles.test;
Step 3: A macro for the proc tabulate and a macro for the proc report.
%include "path where you stored the formats/formats.txt";
run;
ods html body="path where you want to write the results of the analysis/results.html"
(Title='Results of Survey')
headtext="<style> hr {page-break-after:always} thead {display:table-header-group}
@media all {font {font-size=80%}} </style>" style=styles.test;
%macro an1 (var=var, title=title);
proc tabulate data=survey_results missing;
class &var/ order=data;
tables (&var=''*(pctn<&var>='%'*f=6.1 n='Number of Cases'*f=comma8.) all*n='Total
Cases'*f=6.0) /
rts=55 printmiss misstext='0';
title &title;
RUN;
%mend an1;
%macro an2 (var=var, title=title);
proc report data=survey_resutls nowindows spacing=1 pspace=1 split='*' missing headline;
where &var ne ' ';
columns &var;
define &var /display flow 'Comment' format=$8000. style ={just=left cellwidth=8 in};
title1 &title;
run;
%mend an2;
%include "path where you stored the invocations to the macro pt5x.txt";
run;
ods html close;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.