Solved: Calculate most common issues from survey data

megsredl · Posted 01-21-2021 11:48 AM

Hello! I am working with a set of survey data. Each question is saved as a variable, and the answers are coded as 1/0 for yes/no or 1/2/3 for always/sometimes/never. There are around 50 questions. Example data below:

data have;
input name $ q1 q2 q3 q4 q5;
datalines;
FacA 1 0 1 2 2
FacB 0 0 3 2 3
FacC 1 1 2 1 3
;
run;

I'm trying to generate a list of the most common issues identified on the survey, i.e. the most common questions with an answer other than 1. How can I create a table with the question number (variable) and the number of survey responses that identified that as an issue? I am using 9.4.

Thank you for the help!

ballardw · Posted 01-21-2021 01:19 PM

@megsredl wrote:

Thank you! The questions are not weighted and there are not any sampling issues that need to be considered. This code gets a table for each variable, but since I have so many variables it would take a long time to manually order the results. Is there a way to export the data from all of those proc freq tables into one new dataset?

If you want all of the results in a single data set you can use the ODS OUTPUT option

proc freq data=have;

ods output onewayfreqs = yourdatasetname;

table q1-q5;

The output data set will have variable named Table with something like Table Q1 to identify the table request generating the row of data, then a number of additional variables, your variable names with the format used in the freq and another prefixing each of your variables with an F_ . The F_ will be a character version of the formatted value. Not always the prettiest for some purposes.

This might be helpful:

proc freq data=have;
   ods output onewayfreqs=work.summary;
   tables q1-q5;
   format q1-q5 Not_one.;
run;

data toprint;
  set work.summary;
  onerowlabel = coalescec(of F_:);
  if onerowlabel='Not One';
run;

proc print data=toprint;
   var table onerowlabel frequency percent;
run;

The data step uses a function, Coalescec to select the first of the F_ values present on the line. the OF says we're going to use a group of variable ans the F_: colon list says "use all the variables whose names start with F_.

Then the IF keep on the ones that are "not one".

A different format to exclude missing would prevent all the "not one" displays for the other variable rows. But that's hindsight. Exercise for the interested reader: to modify that format to display missing values as " "

View solution in original post

ballardw · Posted 01-21-2021 11:57 AM

@megsredl wrote:

Hello! I am working with a set of survey data. Each question is saved as a variable, and the answers are coded as 1/0 for yes/no or 1/2/3 for always/sometimes/never. There are around 50 questions. Example data below:
data have;
input name $ q1 q2 q3 q4 q5;
datalines;
FacA 1 0 1 2 2
FacB 0 0 3 2 3
FacC 1 1 2 1 3
;
run;
I'm trying to generate a list of the most common issues identified on the survey, i.e. the most common questions with an answer other than 1. How can I create a table with the question number (variable) and the number of survey responses that identified that as an issue? I am using 9.4.

Thank you for the help!

One relatively easy start is a custom format to group the "not one" values and count them.

proc format;
   value Not_one
1 = 'One'
other = 'Not One'
;

proc freq data=have;
   tables q1-q5;
   format q1-q5 Not_one.;
run;

Groups created by formats in this manner are generally supported by report, graphing and analysis procedures.

Question though. Surveys often have weights attached and may come from a complex sample structure. Does that apply to your data?

megsredl · Posted 01-21-2021 12:12 PM

Thank you! The questions are not weighted and there are not any sampling issues that need to be considered. This code gets a table for each variable, but since I have so many variables it would take a long time to manually order the results. Is there a way to export the data from all of those proc freq tables into one new dataset?

ballardw · Posted 01-21-2021 01:19 PM

@megsredl wrote:

Thank you! The questions are not weighted and there are not any sampling issues that need to be considered. This code gets a table for each variable, but since I have so many variables it would take a long time to manually order the results. Is there a way to export the data from all of those proc freq tables into one new dataset?

If you want all of the results in a single data set you can use the ODS OUTPUT option

proc freq data=have;

ods output onewayfreqs = yourdatasetname;

table q1-q5;

The output data set will have variable named Table with something like Table Q1 to identify the table request generating the row of data, then a number of additional variables, your variable names with the format used in the freq and another prefixing each of your variables with an F_ . The F_ will be a character version of the formatted value. Not always the prettiest for some purposes.

This might be helpful:

proc freq data=have;
   ods output onewayfreqs=work.summary;
   tables q1-q5;
   format q1-q5 Not_one.;
run;

data toprint;
  set work.summary;
  onerowlabel = coalescec(of F_:);
  if onerowlabel='Not One';
run;

proc print data=toprint;
   var table onerowlabel frequency percent;
run;

The data step uses a function, Coalescec to select the first of the F_ values present on the line. the OF says we're going to use a group of variable ans the F_: colon list says "use all the variables whose names start with F_.

Then the IF keep on the ones that are "not one".

A different format to exclude missing would prevent all the "not one" displays for the other variable rows. But that's hindsight. Exercise for the interested reader: to modify that format to display missing values as " "

megsredl · Posted 02-02-2021 04:07 PM

Thank you so much for the help, this worked perfectly. I appreciate you taking the time to help!

Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Re: Calculate most common issues from survey data

Registration is open

SAS Training: Just a Click Away