It seems like this should be easy, but I don't know how yet. I have data like this
q1 q2 q3 q4
yes no yes yes
no yes yes yes
yes yes no no
Each q is a question in a survey. How do I get a table showing percent yes and percent no, for all the questions, like this:
%yes %no
q1 30% 70%
q2 20% 80%
q3 60% 40%
Thanks
Gene
Hello geneshackman,
Maybe try:
data work.have;
length
id 8
q1-q4 $3;
input
q1-q4 $;
retain
id(0);
id + 1;
datalines;
yes no yes yes
no yes yes yes
yes yes no no
;
data work.want;
set work.have;
retain id(0);
array
q{4} q1-q4;
do question=1 to 4;
answer=q{question};
output;
end;
id + 1;
drop q1-q4;
run;
proc freq
data=work.want;
tables question*answer / nocum nopercent nofreq nocol;
run;
See: Transposing Data for Statistical Analysis :: Base SAS(R) 9.4 Procedures Guide, Second Edition
Thanks very much for the suggestion. Would you be able to explain, perhaps in general, what this is doing?
Hello Gene,
Sure. The first data step creates a SAS data set from the data you have, formatted as above, minus the first row which contains what look like column headings. Variables with the same names as the column headings are created. A variable (id) is added to uniquely identify each row.
The second data step creates a SAS data set containing the data you want, and which can easily be consumed by PROC FREQ. Basically, it creates a row for each unique combination of id, question number (1-4), and answer (yes/no).
PROC FREQ then generates a crosstab (question*answer), and suppresses some output (nocum, etc.). You have a lot of control over what output PROC FREQ generates and also how it is formatted and presented, so if you wanted percentages rounded to the nearest integer you could use the FORMAT= option to specify a format, for example.
For the specific meaning of options "NOCUM", etc., see: TABLES Statement :: Base SAS(R) 9.4 Procedures Guide: Statistical Procedures, Second Edition, or just turn them off to see how the output changes.
Thanks Jason. Just fyi, and in case anyone else is interested, I got another suggestion for another way that works.
data test;
input qu1 $ qu2 $ qu3 $ qu4 $;
datalines;
yes no yes yes
no yes yes yes
yes yes no no
;;
run;
proc transpose data=test out=test1;
var q:;
run;
data test2 (drop=col:);
set test1;
num_yes=count(cats(of col:),'yes');
num_no=count(cats(of col:),'no');
per_yes=num_yes/(num_yes+num_no);
per_no=num_no/(num_yes+num_no);
run;
proc print data=test2 noobs label;
var _name_ per_yes per_no;
label _name_='question' per_yes='%yes' per_no='%no';
format per_yes per_no percent5.0;
run;
Sure, no problem. I think SAS, not Perl, was probably the original "There's more than one way to do it" language.
This may be easy to understand.
data have;
input q1 $ q2 $ q3 $ q4 $;
datalines;
yes no yes yes
no yes yes yes
yes yes no no
;
data want(keep=vars response);
set have;
array q{*} q1 q2 q3 q4;
do i=1 to dim(q);
response=q{i};
vars=vname(q{i});
output;
end;
run;
proc tabulate data=want;
class response vars;
table vars=' ' *rowpctn=' ' all= 'Total'*n=' ',response=' ' all='Total';
run;
Thanks "stat" too. Would you also be willing to explain what this is doing?
Hi,
I am creating two classification variables one based on variable names and other based on their responses to put them in a crosstab to get desired output.
Thanks,
Hi I have a diffrent approch to do the same ,please have a look ; data have ; input q1 $ q2 $ q3 $ q4 $ ; datalines; yes no yes yes no yes yes no yes yes yes no ; run; proc transpose data = have out = want name = quest; var q1 q2 q3 q4 ; run; data want1 ; set want ; by quest ; array q (3) $ col1- col3 ; if first.quest then count_y = 0 ; do i = 1 to 3 ; if q(i) = "yes" then count = 1 ; else count = 0 ; count_y + count ; end; yes_per = (count_y/3)*100 ; no_per = 100- yes_per ; run;
Naveen, your answer looks like it might be good but I can't tell in the format you have it in. Could you post it again with each command or data step in one row? I tried and couldn't quite straighten out your posting. Something to do with my computer perhaps.
thanks
gene
Hi,
Sorry for the late reply, I had this on a SAS session which crashed yesterday when I was working on it, so am posting this late (note that you could replace the yes_results and no_results with a multi level array, but I kept It like this for simplicity:
data have;
attrib q1 q2 q3 q4 format=$10.;
infile cards dlm=',';
input q1 $ q2 $ q3 $ q4 $;
cards;
yes,no,yes,yes
no,yes,yes,yes
yes,yes,no,no
;
run;
data want (drop=i);
set have end=last;
attrib quarter format=best. yes no format=$20.;
array q{4};
array yes_results{4} (0,0,0,0);
array no_results{4} (0,0,0,0);
do i=1 to 4;
if q{i}="yes" then yes_results{i}=yes_results{i}+1;
if q{i}="no" then no_results{i}=no_results{i}+1;
end;
if last then do;
do i=1 to 4;
quarter=i;
yes="%"||strip(put( (yes_results{i} / 4) * 100,best.));
no="%"||strip(put( (no_results{i} / 4) * 100,best.));
output;
end;
end;
run;
Hi this will solve the purpose ....
data have ;
input q1 $ q2 $ q3 $ q4 $ ;
datalines;
yes no yes yes
no yes yes no
yes yes yes no
;
run;
proc transpose data = have out = want name = quest;
var q1 q2 q3 q4 ;
run;
data want1 ;
set want ;
by quest ;
array q (3) $ col1- col3 ;
if first.quest then count_y = 0 ;
do i = 1 to 3 ;
if q(i) = "yes" then count = 1 ; else count = 0 ;
count_y + count ;
end;
yes_per = (count_y/3)*100 ;
no_per = 100- yes_per ;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.