BookmarkSubscribeRSS Feed
erdem_ustun
Obsidian | Level 7

Hi,

I have a problem with proc tabulate that I'm trying to explain below

 

Q0.What is your gender
1.Male [ ]
2.Female [ ]
Q1.What is (are) your favorite fruit?
1.Apple [ ]
2.Orange [ ]
3.Banana [ ]
4.Melon [ ]
5.Mango [ ]
6.Strawberry [ ]
.....
25.Other

Responders can mark one or more than one or all.
If it is marked 1, if not marked 2 it is in the data.
Each option is located as a separate column in data.(Q1_1=1 /Yes,Q1_1=2 /No).
There are 25 columns for these question(Q1) answers.
I calculate the frequency of the selection of the Apple.(Sum of markers that apple/Total respondents)*100
I can calculate all the options like this.
How can I calculate the first 5 most marked/least first 5 fruits according to sex and total with proc tabulate?

How can I do it?
best regards

9 REPLIES 9
mkeintz
PROC Star

You essentially want PROC TABULATE to rank the mean (or sum) of the Q1 responses. 

 

So what exactly do you want to display?   Just the 5 highest and 5 lowest Q1's?  Or all 25, with the 5 min and 5 max flagged somehow?

 

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
erdem_ustun
Obsidian | Level 7

I want to display a table like this

 

  The most preferred  5 fruits (%) Total The least preferred 5 fruits (%)
  Total 1 2 3 4 5   1 2 3 4 5
Total 100   ..                  
Male  ..                      
Female  ...                      

After taking rowpctsum and fill the cells

I calculate the frequencies of the individual fruits.

For example apple ; (sum of Apple markers(=1) / Total responders)*100.

In this way, instead of calculating separately but I think, all the fruits must be calculated together.then  sorting and  should be the most preferred.is it wrong?

mkeintz
PROC Star

In your example, there is no place to name the fruits.  Is that intentional?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
erdem_ustun
Obsidian | Level 7

I gave it as an example.

 

After 25 fruits were ranked in the markers then Sequence names will be written.

If the orange is most marked in the marked,It will be the first orange

 

  The most preferred  5 fruits (%) Total The least preferred 5 fruits (%)
  Total Orange 2 3 4 5   1 2 3 4 5
Total 100   ..                  
Male  ..                      
Female  ...                      
erdem_ustun
Obsidian | Level 7

I try to explain with a small data set.

 

responder_id Q1_1 Q1_2 Q1_3 Q1_4 ….. Q1_25
1 1 1 2 1 ….. …..
2 2 1 1 1 ….. …..
3 2 1 1 1 ….. …..
4 2 2 2 2 ….. …..
5 1 2 1 1 ….. …..
6 2 2 2 2 ….. …..
7 1 1 1 1 ….. …..
8 1 2 2 1 ….. …..
9 1 1 2 2 ….. …..
10 1 2 2 2 ….. …..
11 2 2 2 2 ….. …..
COUNTIF(yes=1) 6 5 4 6 ….. …..
Percentage of each options 54.55 45.45 36.36 54.55 ….. …..
             
want_%

28.57143%

(=6/21)

23.80952%

(=5/21)

19.04762%

(=4/21)

28.57143%

(=6/21)

COUNTIF(yes=1)=21  
Rank 1 3 4 1    

After this I will  proc tabulate between the gender and most preferred fruits rank

Astounding
PROC Star

Let's see if this brings you a step closer.

 

Step 1:  Replace all the 2 values with 0.  So you have a data set with 1=selected, 0=not selected.

 

Step 2:  PROC MEANS will get you the numbers you need (but not the report that you are asking for).  For example:

 

proc means data=have sum mean;

   var q1_1 - q1_25;

   output out=stats (drop=_type_ _freq_) mean=q1_mean1 - q1_mean25 sum=q1_sum1 - q1_sum25;

run;

 

In your output data set, you will have the numbers that correspond to your "COUNTIF" row and your "Percentage of each options" row, although your percentages will be decimal fractions.  You may want to multiply by 100 before printing them.  So if this is halfway there, you would still need to compute your ranking, your "want_%", and reformat a final report.

Astounding
PROC Star

PROC TABULATE doesn't do this.  Whatever you do for one variable, you do for all in PROC TABULATE.  And you can't really rank the results.  So the short answer becomes:

 

  • Reformat Q0 by coming up with one variable that is either "M" or "F".
  • Reformat the fruit questions by changing all the "2" values to "0"

Then it's relatively easy to calculate the sum of each fruit variable by gender.  But PROC TABULATE doesn't come into play at all.

erdem_ustun
Obsidian | Level 7
I could ask shorter
After defining Analyze/multiple responses in Spss, I can get the frequencies.
How can I do this in sas
ballardw
Super User

Provide a small example of the actual data. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create data step code we can run to duplicate your data.

 

Then show us what you would expect the frequencies from SPSS to look like for that example data.

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1890 views
  • 0 likes
  • 4 in conversation