Hello members,
I am seeking the most efficient solution or examples to output percentages of response categories in a tabular format for an opinion survey using the answer categories and specifications further below,. Any feedback or code would be greatly appreciated, especially for the the most and least positive answer categories.
The metric, Score0-100 is calculated by assigning a higher weight to more positive responses. In this way, the higher the Score0-100, the more positive the result. Across the questions with a five-point response scale, the most positive response is assigned a weight of 100, the second most positive response is given a weight of 75, the third most positive response is given a weight of 50, the second least positive response is assigned a weight of 25 and the least positive response category is given a weight of 0. When Score 0-100 is provided for a group, the average score is calculated across respondents in that group.
For example, if there were 20 respondents in a group, the Score0-100 for each person would be summed and divided by 20 to give the average of the Score0-100.
SCORE100 is calculated by removing the DK and NA, and then attributing a score of…
100 to the most positive answer category
75 to the second most positive answer category
50 to the neutral category
25 to the second most negative answer category
0 to the most negative answer category
ods table onewayfreqs=temp;
proc freq data=fakedata;
tables qa1-qa5;
run;
data percents;
/* This data step code from */
/* https://communities.sas.com/t5/SAS-Programming/proc-freq-one-way-table-for-multiple-vars-export-to-data-set/m-p/924730#M363982 */
length variable $32. variable_value $50.;
set temp;
variable=scan(table, 2);
Variable_Value=strip(trim(vvaluex(variable)));
keep variable variable_value frequency percent cum:;
label variable='Variable' variable_value='Variable Value';
run;
proc transpose data=percents out=percents_t prefix=pct_;
by variable;
var percent;
id variable_value;
run;
data high_score;
set percents_t;
high_score=sum(100*pct_01 , 75*pct_02 , 50*pct_03 , 25*pct_04)/100;
run;
proc report data=high_score Split='~';
columns variable pct_01--high_score;
define variable/display "Question";
define pct_01/display "Strongly~Agree~(%)" format=8.0;
define pct_02/display "Somewhat~Agree~(%)" format=8.0;
define pct_03/display "Neither~Agree~nor~Disagree~(%)" format=8.0;
/* I'm lazy, you type the rest */
run;
Sample data would help us with writing code. Please provide the sample data as WORKING data step code (Examples and instructions) and not in any other format. If there are confidentiality issues, the data can be fake as long as they represent the problem and represent the layout of the raw data.
Although, this seems to me to be fairly standard survey problem; is there not already code at your company or university that will do this?
Also many of us will not download attached files here. The proper way to show us something is to make a screen capture and then use the "Insert Photos" icon to include the screen capture in your reply.
Hi Paige,
The raw data is considered confidential, so unfortunately I cannot post it
here. I will upload some fake data.
Also, I do not have the code, it was available at the time that the example table
was produced but has since been deleted (by someone else) to save disk space. 😛
Please supply usable example data which goes into this report.
Here is some more fake data if this helps; apologies it isn't more comprehensive as I am juggling work today. Least positive is assigned a weight of zero....
/*Answer Categories Code
Strongly agree 01
Somewhat agree 02
Neither agree nor disagree 03
Somewhat disagree 04
Strongly disagree 05
Not applicable 06
Don’t know 97
Not stated 99 */
data WORK.Fakedata(label='Opinion Question Data');
infile datalines dsd truncover;
input Obs:1. QA1:$2. QA2:$2. QA3:$2. QA4:$2. QA5:$2.;
datalines;
1 01 05 01 01 01
2 05 01 01 01 01
3 97 06 06 06 06
4 06 06 06 06 99
5 02 05 01 05 01
6 03 02 03 04 06
7 01 02 02 02 02
8 99 01 01 01 01
9 04 01 01 01 01
10 04 01 02 02 03
11 05 01 02 02 03
12 01 04 01 02 02
13 05 04 01 02 02
14 06 04 01 02 05
15 02 06 04 01 02
16 03 06 04 01 02
17 01 01 02 02 02
18 04 01 02 02 02
19 04 01 02 02 02
20 04 06 04 01 02
21 05 01 02 02 03
22 01 04 06 99 02
23 05 04 01 02 97
24 06 04 01 02 05
25 02 06 04 01 02
26 03 06 04 01 02
27 01 01 02 02 02
28 04 01 02 02 02
29 04 01 02 97 99
30 04 06 04 01 02
31 05 01 02 02 03
32 01 04 06 99 02
33 05 04 01 02 02
34 06 04 01 02 05
35 02 06 04 01 02
36 03 06 04 01 02
37 01 01 02 02 02
38 04 01 02 02 02
39 04 01 02 97 99
40 04 06 04 01 02
41 01 05 01 01 01
42 05 01 01 01 01
43 97 06 06 06 06
44 06 06 06 06 06
45 02 05 01 05 01
46 03 02 03 04 06
47 01 02 02 02 02
48 99 01 01 01 01
49 04 01 01 01 01
50 04 01 02 02 03
;
What a shame you didn't give us fake data for two or more questions, we could have given you an answer if the survey had two or more questions.
Here is my answer. I have not programmed the formula for the score of the "Least Positive" as you have not provided a formula, but in any case, you can follow what I did and program that yourself. I also got lazy in PROC REPORT, there are more lines needed but you can again follow what I did and fill them in.
proc freq data=fakedata;
tables qa/noprint out=_counts_;
run;
proc transpose data=_counts_ out=_counts_t prefix=pct_;
var percent;
id qa;
run;
data high_score;
set _counts_t;
high_score=(100*pct_01 + 75*pct_02 + 50*pct_03 + 25*pct_04)/100;
run;
proc report data=high_score Split='~';
columns pct_01--high_score;
define pct_01/display "Strongly~Agree~(%)";
define pct_02/display "Somewhat~Agree~(%)";
define pct_03/display "Neither~Agree~nor~Disagree~(%)";
/* I'm lazy, you type the rest */
run;
Hi, thanks for your solution. Here is some more fake data if this helps; apologies it isn't more comprehensive as I am juggling work today. Least positive is assigned a weight of zero....
/*Answer Categories Code
Strongly agree 01
Somewhat agree 02
Neither agree nor disagree 03
Somewhat disagree 04
Strongly disagree 05
Not applicable 06
Don’t know 97
Not stated 99 */
data WORK.Fakedata(label='Opinion Question Data');
infile datalines dsd truncover;
input Obs:1. QA1:$2. QA2:$2. QA3:$2. QA4:$2. QA5:$2.;
datalines;
1 01 05 01 01 01
2 05 01 01 01 01
3 97 06 06 06 06
4 06 06 06 06 99
5 02 05 01 05 01
6 03 02 03 04 06
7 01 02 02 02 02
8 99 01 01 01 01
9 04 01 01 01 01
10 04 01 02 02 03
11 05 01 02 02 03
12 01 04 01 02 02
13 05 04 01 02 02
14 06 04 01 02 05
15 02 06 04 01 02
16 03 06 04 01 02
17 01 01 02 02 02
18 04 01 02 02 02
19 04 01 02 02 02
20 04 06 04 01 02
21 05 01 02 02 03
22 01 04 06 99 02
23 05 04 01 02 97
24 06 04 01 02 05
25 02 06 04 01 02
26 03 06 04 01 02
27 01 01 02 02 02
28 04 01 02 02 02
29 04 01 02 97 99
30 04 06 04 01 02
31 05 01 02 02 03
32 01 04 06 99 02
33 05 04 01 02 02
34 06 04 01 02 05
35 02 06 04 01 02
36 03 06 04 01 02
37 01 01 02 02 02
38 04 01 02 02 02
39 04 01 02 97 99
40 04 06 04 01 02
41 01 05 01 01 01
42 05 01 01 01 01
43 97 06 06 06 06
44 06 06 06 06 06
45 02 05 01 05 01
46 03 02 03 04 06
47 01 02 02 02 02
48 99 01 01 01 01
49 04 01 01 01 01
50 04 01 02 02 03
;
ods table onewayfreqs=temp;
proc freq data=fakedata;
tables qa1-qa5;
run;
data percents;
/* This data step code from */
/* https://communities.sas.com/t5/SAS-Programming/proc-freq-one-way-table-for-multiple-vars-export-to-data-set/m-p/924730#M363982 */
length variable $32. variable_value $50.;
set temp;
variable=scan(table, 2);
Variable_Value=strip(trim(vvaluex(variable)));
keep variable variable_value frequency percent cum:;
label variable='Variable' variable_value='Variable Value';
run;
proc transpose data=percents out=percents_t prefix=pct_;
by variable;
var percent;
id variable_value;
run;
data high_score;
set percents_t;
high_score=sum(100*pct_01 , 75*pct_02 , 50*pct_03 , 25*pct_04)/100;
run;
proc report data=high_score Split='~';
columns variable pct_01--high_score;
define variable/display "Question";
define pct_01/display "Strongly~Agree~(%)" format=8.0;
define pct_02/display "Somewhat~Agree~(%)" format=8.0;
define pct_03/display "Neither~Agree~nor~Disagree~(%)" format=8.0;
/* I'm lazy, you type the rest */
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.