Hi,
I have a problem with proc tabulate that I'm trying to explain below
Q0.What is your gender
1.Male [ ]
2.Female [ ]
Q1.What is (are) your favorite fruit?
1.Apple [ ]
2.Orange [ ]
3.Banana [ ]
4.Melon [ ]
5.Mango [ ]
6.Strawberry [ ]
.....
25.Other
Responders can mark one or more than one or all.
If it is marked 1, if not marked 2 it is in the data.
Each option is located as a separate column in data.(Q1_1=1 /Yes,Q1_1=2 /No).
There are 25 columns for these question(Q1) answers.
I calculate the frequency of the selection of the Apple.(Sum of markers that apple/Total respondents)*100
I can calculate all the options like this.
How can I calculate the first 5 most marked/least first 5 fruits according to sex and total with proc tabulate?
How can I do it?
best regards
You essentially want PROC TABULATE to rank the mean (or sum) of the Q1 responses.
So what exactly do you want to display? Just the 5 highest and 5 lowest Q1's? Or all 25, with the 5 min and 5 max flagged somehow?
I want to display a table like this
The most preferred 5 fruits (%) | Total | The least preferred 5 fruits (%) | ||||||||||
Total | 1 | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 | ||
Total | 100 | .. | ||||||||||
Male | .. | |||||||||||
Female | ... |
After taking rowpctsum and fill the cells
I calculate the frequencies of the individual fruits.
For example apple ; (sum of Apple markers(=1) / Total responders)*100.
In this way, instead of calculating separately but I think, all the fruits must be calculated together.then sorting and should be the most preferred.is it wrong?
In your example, there is no place to name the fruits. Is that intentional?
I gave it as an example.
After 25 fruits were ranked in the markers then Sequence names will be written.
If the orange is most marked in the marked,It will be the first orange
The most preferred 5 fruits (%) | Total | The least preferred 5 fruits (%) | ||||||||||
Total | Orange | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 | ||
Total | 100 | .. | ||||||||||
Male | .. | |||||||||||
Female | ... |
I try to explain with a small data set.
responder_id | Q1_1 | Q1_2 | Q1_3 | Q1_4 | ….. | Q1_25 |
1 | 1 | 1 | 2 | 1 | ….. | ….. |
2 | 2 | 1 | 1 | 1 | ….. | ….. |
3 | 2 | 1 | 1 | 1 | ….. | ….. |
4 | 2 | 2 | 2 | 2 | ….. | ….. |
5 | 1 | 2 | 1 | 1 | ….. | ….. |
6 | 2 | 2 | 2 | 2 | ….. | ….. |
7 | 1 | 1 | 1 | 1 | ….. | ….. |
8 | 1 | 2 | 2 | 1 | ….. | ….. |
9 | 1 | 1 | 2 | 2 | ….. | ….. |
10 | 1 | 2 | 2 | 2 | ….. | ….. |
11 | 2 | 2 | 2 | 2 | ….. | ….. |
COUNTIF(yes=1) | 6 | 5 | 4 | 6 | ….. | ….. |
Percentage of each options | 54.55 | 45.45 | 36.36 | 54.55 | ….. | ….. |
want_% |
28.57143% (=6/21) |
23.80952% (=5/21) |
19.04762%
(=4/21) |
28.57143%
(=6/21) |
COUNTIF(yes=1)=21 | |
Rank | 1 | 3 | 4 | 1 |
After this I will proc tabulate between the gender and most preferred fruits rank
Let's see if this brings you a step closer.
Step 1: Replace all the 2 values with 0. So you have a data set with 1=selected, 0=not selected.
Step 2: PROC MEANS will get you the numbers you need (but not the report that you are asking for). For example:
proc means data=have sum mean;
var q1_1 - q1_25;
output out=stats (drop=_type_ _freq_) mean=q1_mean1 - q1_mean25 sum=q1_sum1 - q1_sum25;
run;
In your output data set, you will have the numbers that correspond to your "COUNTIF" row and your "Percentage of each options" row, although your percentages will be decimal fractions. You may want to multiply by 100 before printing them. So if this is halfway there, you would still need to compute your ranking, your "want_%", and reformat a final report.
PROC TABULATE doesn't do this. Whatever you do for one variable, you do for all in PROC TABULATE. And you can't really rank the results. So the short answer becomes:
Then it's relatively easy to calculate the sum of each fruit variable by gender. But PROC TABULATE doesn't come into play at all.
Provide a small example of the actual data. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create data step code we can run to duplicate your data.
Then show us what you would expect the frequencies from SPSS to look like for that example data.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.