Is there a way to find the frequency of all of the responses of several categorical variables and output this data in a new data set? The possible responses of each categorical variable are as follows:
1: No, strongly disagree
2: No, somewhat disagree
3: Neither agree nor disagree
4: Yes, somewhat agree
5: Yes, strongly agree
Here is a sample data set that I would want to run a proc freq on:
data sample;
input id A1Q1 A2Q1 RFQ1 SE1Q1 SE2Q1 SE3Q1 SE4Q1 I1Q1 I2Q1 I3Q1;
datalines;
1 2 1 1 3 3 4 2 5 3 2
2 1 1 1 1 1 1 1 1 1 1
3 2 3 4 5 1 2 3 4 5 1
4 1 2 3 4 5 5 4 3 2 1
5 1 1 1 1 1 1 1 1 1 1
;
After I do so, I would want to output the frequency of responses for each of these categorical variables in a new data set. Is there a way to do so? How would I do so for the following variables: A1Q1 A2Q1 RFQ1 SE1Q1 SE2Q1 SE3Q1 SE4Q1 I1Q1 I2Q1 I3Q1 I4Q1 all in one new data set?
Sure, here's one way. To display the values as the labels you listed, I suggest creating a format; see the second set of code, which illustrates how to do that.
*Run frequency for tables;
ods table onewayfreqs=temp;
proc freq data=sample;
table A1Q1 A2Q1 RFQ1 SE1Q1 SE2Q1 SE3Q1 SE4Q1 I1Q1 I2Q1 I3Q1;
run;
*Format output;
data want;
length variable $32. variable_value $50.;
set temp;
Variable=scan(table, 2);
Variable_Value=strip(trim(vvaluex(variable)));
keep variable variable_value frequency percent cum:;
label variable='Variable'
variable_value='Variable Value';
run;
*Display;
proc print data=want(obs=20) label;
run;
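The two assignment statements in the DATA step above can be tried in isolation. This is a minimal sketch, assuming that the Table column of the ODS data set holds text such as "Table A1Q1" for each row; the data set name demo and the hard-coded values are made up for illustration.

```sas
* Illustration only. The demo data set and its values are hypothetical;
data demo;
    length table $40 variable $32 variable_value $50;
    A1Q1 = 2;
    table = 'Table A1Q1';
    * SCAN returns the second word of the string, here the variable name;
    variable = scan(table, 2);
    * VVALUEX returns the formatted value of the variable named in its argument;
    variable_value = strip(vvaluex(variable));
    put variable= variable_value=;
run;
```

SCAN pulls the variable name out of the Table text, and VVALUEX then looks up the value of whichever variable that name points to on the current row, so a single statement works for every variable in the table.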
Here's how to create and apply formats:
https://github.com/statgeek/SAS-Tutorials/blob/master/proc_format_example.sas
This does it for age ranges, but you can do single values as well. That should get you started.
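For the single-value case, a minimal sketch might look like the following. It assumes the sample data set from the question already exists; the format name agreefmt is a made-up name (SAS format names cannot end in a digit), and the labels are copied from the question.

```sas
* Hypothetical format name agreefmt. Labels are taken from the question;
proc format;
    value agreefmt
        1 = 'No, strongly disagree'
        2 = 'No, somewhat disagree'
        3 = 'Neither agree nor disagree'
        4 = 'Yes, somewhat agree'
        5 = 'Yes, strongly agree';
run;

* Apply the format to every question variable when running the frequencies;
proc freq data=sample;
    table A1Q1 A2Q1 RFQ1 SE1Q1 SE2Q1 SE3Q1 SE4Q1 I1Q1 I2Q1 I3Q1;
    format A1Q1 A2Q1 RFQ1 SE1Q1 SE2Q1 SE3Q1 SE4Q1 I1Q1 I2Q1 I3Q1 agreefmt.;
run;
```

Adding the same FORMAT statement to the PROC FREQ step in the first solution should make the labels, rather than the numbers 1 to 5, appear in the output.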
Hi @Reeza , thanks for your help. Is there an easier way to do it? This seems rather complicated and not all of the variables show up.
There are other approaches.
1. Transpose to a long format via PROC TRANSPOSE or a data step
2. PROC FREQ for summary statistics same as above
The data is all there; it's just not shown. The display is limited to 20 observations because of the (obs=20) data set option on the PROC PRINT statement.
Hi @Reeza, thanks for your help. However, can't proc freq only output summary statistics for one variable at a time?
If you ran my first solution, look at the intermediate table created after the PROC FREQ step: it holds the values from all of the variables in a single table. That doesn't matter with the transpose approach, though, because you then run a two-way table and the output all goes to a single table.
Your data would look like:
Question Value
Q1 1
Q1 3
Q3 5
Q2 2
Q4 5
Q8 2
Then run a PROC FREQ of question by value to get the desired table in a cleaner format. I think it's the same number of steps overall. You'll still need to add your PROC FORMAT.
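The transpose approach could be sketched roughly as follows. It assumes the sample data set from the question already exists; the data set names long and want_long and the column names Question and Value are placeholders chosen for this example.

```sas
* Sort by id so that BY processing works in the transpose step;
proc sort data=sample;
    by id;
run;

* One row per id and question. _NAME_ holds the question and COL1 the response;
proc transpose data=sample out=long(rename=(_name_=Question col1=Value));
    by id;
    var A1Q1 A2Q1 RFQ1 SE1Q1 SE2Q1 SE3Q1 SE4Q1 I1Q1 I2Q1 I3Q1;
run;

* Two-way table of question by value, written to a single data set;
proc freq data=long noprint;
    table Question*Value / out=want_long outpct;
run;

proc print data=want_long;
run;
```

In want_long, COUNT is the number of responses for each question and value, PERCENT is relative to the whole table, and PCT_ROW (added by the OUTPCT option) is the percentage within each question, which is probably the figure of interest here.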
Hi @Reeza thanks so much for your help. However, is there a source where I can read up more on how to create formats? I somewhat understand the output, but I don't really understand the code used to create it.
Is the code "ods table" interchangeable with the code "ods output"? Is that just to create an ODS Output dataset? Is onewayfreqs the name of the table while temp is the name of the data set in which the table is created?
I am especially confused about these two lines of code:
Variable=scan(table, 2);
Variable_Value=strip(trim(vvaluex(variable)));
Thanks again for your help.