Solved: How do I get percentages favorable/unfavorable for all questions in a ...

plzsiga · Posted 10-27-2020 01:21 PM

From a data set with 40 schools’ responses to a 9 Likert Scale question survey, (sample below and excel file attached) I am trying to get one excel file with the percentages of the Strongly Disagree/Disagree (1 and 2) and of the Agree/Strongly Agree. (4 and 5). I have tried several options in Proc Freq, Proc Tabulate and Proc Survey Freq, and cannot get those percentages all in one table.

sch	Q1	Q2	Q3	Q4	Q5	Q6	Q7	Q8	Q9
sch1	5	4	5	4	4	4	4	4	4
sch1	5	5	5	5	5	5	5	5	5
sch1	3	2	4	2	3	2	2	2	2
sch1	5	5	5	4	5	4	5	5	5
sch1	5	5	5	5	5	5	5	4	5
sch1	5	5	5	4	5	5	5	5	5
sch2	5	5	5	4	4	4	2	2	4
sch2	1	1	4	4	5	4	1	1	4
sch2	5	5	5	4	4	4	5	5	4
sch2	4	4	4	4	4	4	3	4	4
sch2	4	4	5	4	4	5	4	4	5
sch2	3	2	4	2	4	2	2	4	4
sch3	5	5	5	5	4	2	5	4	5
sch3	2	2	1	3	1	1	1	1	1
sch3	4	4	4	4	4	4	4	4	4
sch3	2	5	4	1	1	1	1	1	1
sch4	4	2	2	5	2	5	2	2	3
sch4	2	2	3	2	4	4	2	4	5
sch4	1	1	2	1	3	2	1	1	4
sch4	5	5	5	4	4	5	5	5	4
sch4	4	4	4	4	4	2	4	4	4
sch5	4	2	4	3	4	4	2	3	4
sch5	4	2	4	2	4	2	2	2	4
sch5	5	4	3	5	4	4	4	5	4

After converting so 1&2 = D, 3 = O and 4 &5 = A, the closest I have come is 9 separate tables for each question as below using a proc freq crosslist with nocol nofreq nopercent.

proc freq data=ParSurFall20aod; tables school*ans1 /crosslist out=aodcrlist1 nocol nofreq nopercent; run;

But when I try to merge those together I lose the percentages for all but the first question.

Table of School by Q1
School	Q1	Row
Percent
Sch1	A	79.17
	D	12.5
	O	8.33
	Total	100
Sch2	A	91.43
	D	5.71
	O	2.86
	Total	100
Sch3	A	75
	D	12.5
	O	12.5
	Total	100
Sch4	A	80.95
	D	11.9
	O	7.14
	Total	100
Sch5	A	85.71
	D	0
	O	14.29
	Total	100

How can I get this table for all 9 questions without manipulating it all in excel every time we do a survey of our schools.

Question	Q1	Q2	Q1	Q2
Choice	1&2	1&2	4&5	4&5
Sch1	12.5	25.0	79.2	68.8
Sch2	5.7	17.1	91.4	74.3
Sch3	12.5	6.3	75.0	87.5
Sch4	11.9	31.0	81.0	59.5
Sch5	0.0	28.6	85.7	71.4

Kurt_Bremser · Posted 10-29-2020 10:36 AM

Sorry, the n is essential for the first transpose to work, and it has to be set in the initial dataset from the automatic variable _N_:

data have;
input sch $ q1-q9;
n = _n_;
datalines;
sch1
5
4
5
4
4
4
4
4
4
sch1
5
5
5
5
5
5
5
5
5
sch1
3
2
4
2
3
2
2
2
2
sch1
5
5
5
4
5
4
5
5
5
sch1
5
5
5
5
5
5
5
4
5
sch1
5
5
5
4
5
5
5
5
5
sch2
5
5
5
4
4
4
2
2
4
sch2
1
1
4
4
5
4
1
1
4
sch2
5
5
5
4
4
4
5
5
4
sch2
4
4
4
4
4
4
3
4
4
sch2
4
4
5
4
4
5
4
4
5
sch2
3
2
4
2
4
2
2
4
4
sch3
5
5
5
5
4
2
5
4
5
sch3
2
2
1
3
1
1
1
1
1
sch3
4
4
4
4
4
4
4
4
4
sch3
2
5
4
1
1
1
1
1
1
sch4
4
2
2
5
2
5
2
2
3
sch4
2
2
3
2
4
4
2
4
5
sch4
1
1
2
1
3
2
1
1
4
sch4
5
5
5
4
4
5
5
5
4
sch4
4
4
4
4
4
2
4
4
4
sch5
4
2
4
3
4
4
2
3
4
sch5
4
2
4
2
4
2
2
2
4
sch5
5
4
3
5
4
4
4
5
4
;

Please post example data as a data step in the future, so we don't get ugly code from the website formatting.

With this dataset and my code, you get this dataset:

sch	q1A	q2A	q2D	q3A	q4A	q4D	q5A	q6A	q6D	q7A	q7D	q8A	q8D	q9A	q9D	q1D	q3D	q5D
sch1	83.3%	83.3%	16.7%	100%	83.3%	16.7%	83.3%	83.3%	16.7%	83.3%	16.7%	83.3%	16.7%	83.3%	16.7%	.	.	.
sch2	66.7%	66.7%	33.3%	100%	83.3%	16.7%	100%	83.3%	16.7%	33.3%	50.0%	66.7%	33.3%	100%	.	16.7%	.	.
sch3	50.0%	75.0%	25.0%	75.0%	50.0%	25.0%	50.0%	25.0%	75.0%	50.0%	50.0%	50.0%	50.0%	50.0%	50.0%	50.0%	25.0%	50.0%
sch4	60.0%	40.0%	60.0%	40.0%	60.0%	40.0%	60.0%	60.0%	40.0%	40.0%	60.0%	60.0%	40.0%	80.0%	.	40.0%	40.0%	20.0%
sch5	100%	33.3%	66.7%	66.7%	33.3%	33.3%	100%	66.7%	33.3%	33.3%	66.7%	33.3%	33.3%	100%	.	.	.	.

To get the horizontal order right, you can add a pattern set of all possible Q/A combinations to the top of the dataset, and filter it out in the end result.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

ChrisNZ · Posted 10-29-2020 01:24 AM

So you want to end up with 3 times as many columns?

>But when I try to merge

I don't see much code, and certainly no merge.

High-Performance SAS Coding - Third Edition

plzsiga · Posted 10-29-2020 09:48 AM

Chris,

Thanks for asking. Yes, I actually want to get 18 columns for each school. 1 for each question with % 1or2 and 1 with %4or5.

This code gets me 9 crosslists with the percentages I need, as in the initial post. When I try to merge all the crosslists, with the code below, I get the output below.

data ghost2;
set ghost;

ans1 = " "; if Q1 ne . and Q1 le 2 then ans1 = "D" ;
else if Q1 ne . and Q1 ge 4 then ans1 = "A" ;
else if Q1 ne . and Q1 eq 3 then ans1 = "O" ;
ans2 = " "; if Q2 ne . and Q2 le 2 then ans2 = "D" ;
else if Q2 ne . and Q2 ge 4 then ans2 = "A" ;
else if Q2 ne . and Q2 eq 3 then ans2 = "O" ;
ans3 = " "; if Q3 ne . and Q3 le 2 then ans3 = "D" ;
else if Q3 ne . and Q3 ge 4 then ans3 = "A" ;
else if Q3 ne . and Q3 eq 3 then ans3 = "O" ;
ans4 = " "; if Q4 ne . and Q4 le 2 then ans4 = "D" ;
else if Q4 ne . and Q4 ge 4 then ans4 = "A" ;
else if Q4 ne . and Q4 eq 3 then ans4 = "O" ;
ans5 = " "; if Q5 ne . and Q5 le 2 then ans5 = "D" ;
else if Q5 ne . and Q5 ge 4 then ans5 = "A" ;
else if Q5 ne . and Q5 eq 3 then ans5 = "O" ;
ans6 = " "; if Q6 ne . and Q6 le 2 then ans6 = "D" ;
else if Q6 ne . and Q6 ge 4 then ans6 = "A" ;
else if Q6 ne . and Q6 eq 3 then ans6 = "O" ;
ans7 = " "; if Q7 ne . and Q7 le 2 then ans7 = "D" ;
else if Q7 ne . and Q7 ge 4 then ans7 = "A" ;
else if Q7 ne . and Q7 eq 3 then ans7 = "O" ;
ans8 = " "; if Q8 ne . and Q8 le 2 then ans8 = "D" ;
else if Q8 ne . and Q8 ge 4 then ans8 = "A" ;
else if Q8 ne . and Q8 eq 3 then ans8 = "O" ;
ans9 = " "; if Q9 ne . and Q9 le 2 then ans9 = "D" ;
else if Q9 ne . and Q9 ge 4 then ans9 = "A" ;
else if Q9 ne . and Q9 eq 3 then ans9 = "O" ;

run;

proc freq data=ghost2; tables sch*ans1 /crosslist out=crlist1 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans2 /crosslist out=crlist2 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans3 /crosslist out=crlist3 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans4 /crosslist out=crlist4 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans5 /crosslist out=crlist5 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans6 /crosslist out=crlist6 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans7 /crosslist out=crlist7 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans8 /crosslist out=crlist8 nocol nofreq nopercent; run;
proc freq data=ghost2; tables sch*ans9 /crosslist out=crlist9 nocol nofreq nopercent; run;

data allcrlist;
merge crlist1 crlist2 crlist3 crlist4 crlist5 crlist6 crlist7 crlist8 crlist9
;
by sch;
run;

sch	ans1	COUNT	PERCENT	ans2	ans3	ans4	ans5	ans6	ans7	ans8	ans9
sch1	A	5	20.83333	A	A	A	A	A	A	A	A
sch1	O	1	4.166667	D	A	D	O	D	D	D	D
sch2	A	6	25	A	A	A	A	A	A	A	A
sch2	D	2	8.333333	D	A	D	A	D	D	D	A
sch2	O	1	4.166667	D	A	D	A	D	O	D	A
sch3	A	2	8.333333	A	A	A	A	A	A	A	A
sch3	D	2	8.333333	D	D	D	D	D	D	D	D
sch3	D	1	4.166667	D	D	O	D	D	D	D	D
sch4	A	4	16.66667	A	A	A	A	A	A	A	A
sch4	D	1	4.166667	D	D	D	D	D	D	D	O
sch4	D	1	4.166667	D	O	D	O	D	D	D	O
sch5	A	3	12.5	A	A	A	A	A	A	A	A
sch5	A	1	4.166667	D	O	D	A	D	D	D	A
sch5	A	1	4.166667	D	O	O	A	D	D	O	A

andreas_lds · Posted 10-29-2020 01:59 AM

Please post the data in usable form and the code you already have.

I would start by transposing the data from wide to long, then "conversion" is assigning one format to one variable. Afterwards proc summary to count per school, question and rating, then ... mmh ... maybe proc tabulate.

Kurt_Bremser · Posted 10-29-2020 04:43 AM

See this:

proc transpose
  data=have
  out=long (
    drop=n
    rename=(_name_=question)
  )
;
by n sch;
var q:;
run;

proc format;
value ans
  1,2 = 'D'
  3 = 'O'
  4,5 = 'A'
;
run;

data long2;
set long;
answer = put(col1,ans.);
drop col1;
run;

proc summary data=long2 nway;
class sch question answer;
output out=sum (drop=_type_);
run;

data sum_long (where=(answer ne 'O'));
do until (last.question);
  set sum;
  by sch question;
  count = sum(count,_freq_);
end;
format percent percent7.2;
do until (last.question);
  set sum;
  by sch question;
  percent = _freq_ / count;
  output;
end;
keep sch question answer percent;
run;

proc transpose data=sum_long out=want (drop=_name_);
by sch;
var percent;
id question answer;
run;

The resulting dataset is quite close to what you want.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

plzsiga · Posted 10-29-2020 10:06 AM

Thanks, Kurt.

That did something, but I had to drop 'n' from the by statement 'n sch' statement to go anywhere. I did get through all the data steps, but all the QA1, QA2,...QD1, QD2,.. report out at 100%.

If I delete (where=(answer ne 'O')), then I get 27 columns with 100%, so if I could get the correct %s, for A,D,O, then i could use a formula, or drop the O columns.

I think something essential is lost in the shift from Long to Long2 (short snip below), but I don't see it.

Data Long									DATa long2
sch	question	_LABEL_	COL1	COL2	COL3	COL4	COL5	COL6	sch	question	_LABEL_	COL2	COL3	COL4	COL5	COL6	answer
sch1	Q1	Q1	5	5	3	5	5	5	sch1	Q1	Q1	5	3	5	5	5	A
sch1	Q2	Q2	4	5	2	5	5	5	sch1	Q2	Q2	5	2	5	5	5	A
sch1	Q3	Q3	5	5	4	5	5	5	sch1	Q3	Q3	5	4	5	5	5	A
sch1	Q4	Q4	4	5	2	4	5	4	sch1	Q4	Q4	5	2	4	5	4	A
sch1	Q5	Q5	4	5	3	5	5	5	sch1	Q5	Q5	5	3	5	5	5	A
sch1	Q6	Q6	4	5	2	4	5	5	sch1	Q6	Q6	5	2	4	5	5	A
sch1	Q7	Q7	4	5	2	5	5	5	sch1	Q7	Q7	5	2	5	5	5	A
sch1	Q8	Q8	4	5	2	5	4	5	sch1	Q8	Q8	5	2	5	4	5	A
sch1	Q9	Q9	4	5	2	5	5	5	sch1	Q9	Q9	5	2	5	5	5	A
sch2	Q1	Q1	5	1	5	4	4	3	sch2	Q1	Q1	1	5	4	4	3	A
sch2	Q2	Q2	5	1	5	4	4	2	sch2	Q2	Q2	1	5	4	4	2	A
sch2	Q3	Q3	5	4	5	4	5	4	sch2	Q3	Q3	4	5	4	5	4	A
sch2	Q4	Q4	4	4	4	4	4	2	sch2	Q4	Q4	4	4	4	4	2	A
sch2	Q5	Q5	4	5	4	4	4	4	sch2	Q5	Q5	5	4	4	4	4	A
sch2	Q6	Q6	4	4	4	4	5	2	sch2	Q6	Q6	4	4	4	5	2	A
sch2	Q7	Q7	2	1	5	3	4	2	sch2	Q7	Q7	1	5	3	4	2	D
sch2	Q8	Q8	2	1	5	4	4	4	sch2	Q8	Q8	1	5	4	4	4	D
sch2	Q9	Q9	4	4	4	4	5	4	sch2	Q9	Q9	4	4	4	5	4	A

Kurt_Bremser · Posted 10-29-2020 10:36 AM

Sorry, the n is essential for the first transpose to work, and it has to be set in the initial dataset from the automatic variable _N_:

data have;
input sch $ q1-q9;
n = _n_;
datalines;
sch1
5
4
5
4
4
4
4
4
4
sch1
5
5
5
5
5
5
5
5
5
sch1
3
2
4
2
3
2
2
2
2
sch1
5
5
5
4
5
4
5
5
5
sch1
5
5
5
5
5
5
5
4
5
sch1
5
5
5
4
5
5
5
5
5
sch2
5
5
5
4
4
4
2
2
4
sch2
1
1
4
4
5
4
1
1
4
sch2
5
5
5
4
4
4
5
5
4
sch2
4
4
4
4
4
4
3
4
4
sch2
4
4
5
4
4
5
4
4
5
sch2
3
2
4
2
4
2
2
4
4
sch3
5
5
5
5
4
2
5
4
5
sch3
2
2
1
3
1
1
1
1
1
sch3
4
4
4
4
4
4
4
4
4
sch3
2
5
4
1
1
1
1
1
1
sch4
4
2
2
5
2
5
2
2
3
sch4
2
2
3
2
4
4
2
4
5
sch4
1
1
2
1
3
2
1
1
4
sch4
5
5
5
4
4
5
5
5
4
sch4
4
4
4
4
4
2
4
4
4
sch5
4
2
4
3
4
4
2
3
4
sch5
4
2
4
2
4
2
2
2
4
sch5
5
4
3
5
4
4
4
5
4
;

Please post example data as a data step in the future, so we don't get ugly code from the website formatting.

With this dataset and my code, you get this dataset:

sch	q1A	q2A	q2D	q3A	q4A	q4D	q5A	q6A	q6D	q7A	q7D	q8A	q8D	q9A	q9D	q1D	q3D	q5D
sch1	83.3%	83.3%	16.7%	100%	83.3%	16.7%	83.3%	83.3%	16.7%	83.3%	16.7%	83.3%	16.7%	83.3%	16.7%	.	.	.
sch2	66.7%	66.7%	33.3%	100%	83.3%	16.7%	100%	83.3%	16.7%	33.3%	50.0%	66.7%	33.3%	100%	.	16.7%	.	.
sch3	50.0%	75.0%	25.0%	75.0%	50.0%	25.0%	50.0%	25.0%	75.0%	50.0%	50.0%	50.0%	50.0%	50.0%	50.0%	50.0%	25.0%	50.0%
sch4	60.0%	40.0%	60.0%	40.0%	60.0%	40.0%	60.0%	60.0%	40.0%	40.0%	60.0%	60.0%	40.0%	80.0%	.	40.0%	40.0%	20.0%
sch5	100%	33.3%	66.7%	66.7%	33.3%	33.3%	100%	66.7%	33.3%	33.3%	66.7%	33.3%	33.3%	100%	.	.	.	.

To get the horizontal order right, you can add a pattern set of all possible Q/A combinations to the top of the dataset, and filter it out in the end result.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

plzsiga · Posted 10-29-2020 10:51 AM

Kurt,

You rock!

Thanks for taking the time to go through this both times.

You have saved me hours of work for each time this survey is collected.

Kurt_Bremser · Posted 10-29-2020 11:10 AM

@plzsiga wrote:

Kurt,

You rock!

Thanks for taking the time to go through this both times.

You have saved me hours of work for each time this survey is collected.

As it should be. Let the code do the work, and SAS is very good at that.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Re: How do I get percentages favorable/unfavorable for all questions in a likert scale survey

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!