monsterpie
Obsidian | Level 7

I have a table with two variables:

(1) Account type (3 levels, categorical) across the columns.

(2) For each account type, I have listed the top 5 food products and the top 5 food products are different for each account type (see attached image for an example of my table).

 

I'm wondering if it's possible to perform any kind of statistical test of association for these variables (e.g., chi-square or Fisher's exact)? I assumed not because the food product variable changes depending on the account type variable, but I just wanted to be sure. Or would I have to create a table for each account type and its respective food products and then perform a statistical test, such as chi-square?

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

>  I assumed not because the food product variable changes depending on the account type variable, but I just wanted to be sure

 

I think you are correct. The margins for the rows would have to be the same, such as "Fast food", "Alcohol", etc., and the cells would have to contain the proportion of the i_th food item for the j_th column. Then you could test whether the proportions differ among columns.

 

If you want to pursue this, my advice is to look at the top 5 food items OVERALL (and probably an "Other" category so that the column margins add to 100%). Put those top items in the left margin. You can then test whether the proportions for those categories differ among the account types.
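The setup described above could be sketched roughly as follows. This is only an illustration under assumptions: it presumes the data are in long form (one row per observation) in a hypothetical dataset `purchases` with a character variable `food_product` and the variable `account_type`. The five names in the IN list are placeholders; substitute whatever Step 1 shows to be the overall top 5.

```sas
/* Step 1: list food products from most to least frequent overall. */
proc freq data=purchases order=freq;
   tables food_product;
run;

/* Step 2: keep the overall top 5 (placeholder names) and pool the rest as "Other". */
data purchases_top5;
   set purchases;
   length FoodCategory $20;
   if food_product in ('Fast food', 'Candy', 'Alcohol', 'Energy drink', 'Meals')
      then FoodCategory = food_product;
   else FoodCategory = 'Other';
run;

/* Step 3: 6x3 table with column percentages only, plus a chi-square test. */
proc freq data=purchases_top5;
   tables FoodCategory*account_type / chisq norow nopercent;
run;
```

Because every account type now shares the same six row categories, the column percentages sum to 100% and can be compared across columns.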


monsterpie
Obsidian | Level 7

Thank you! That makes sense. A follow-up question then: if I did look at the top 5 food products overall and wanted to run a Fisher's exact test in SAS, would I need to run a separate Fisher's test for each food product (since each food category is its own column in my dataset)?

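proc freq data=dataset1;
   /* requests one two-way table (and one test) per food column crossed with account_type */
   tables (fast_food candy alcohol energy_drink meals)*account_type;
   /* Monte Carlo estimate of the Fisher's exact p-value */
   exact fisher / mc;
run;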

Rick_SAS
SAS Super FREQ

It depends on what you want to test.

 

If you want to test whether the proportions of the Top 5 categories differ by account type, you can use a TABLES statement such as TABLES FoodCategory*account_type. This analysis is on a two-way 6x3 table (assuming there is an "Other" category). If the test rejects the null hypothesis ("no association"), you know that there is a difference, but you don't know which food(s) are responsible.
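A minimal sketch of that single two-way analysis, assuming a hypothetical long-form dataset `food_long` with one row per observation, a variable `FoodCategory` (five top categories plus "Other"), and `account_type`. The seed and sample count are arbitrary illustrative values:

```sas
/* One 6x3 table, one overall test of association. */
proc freq data=food_long;
   tables FoodCategory*account_type / chisq;
   /* Monte Carlo estimate of the Fisher's exact p-value;
      SEED= makes the estimate reproducible, N= sets the number of samples. */
   exact fisher / mc seed=12345 n=10000;
run;
```

Note that this is one call with one categorical food variable, not a separate table for each food column.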

 

If you want to test whether a particular food differs by account type, then you are doing an analysis on a one-way 1x3 table. You probably don't need to create the extra variables. Depending on the structure of your data, you might be able to use a WHERE clause and/or a BY statement.
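As a rough illustration of the WHERE and BY approaches, again assuming a hypothetical long-form dataset `food_long` with variables `FoodCategory` and `account_type` (the value 'Alcohol' is just an example level):

```sas
/* WHERE clause: one-way table of account_type for a single food. */
proc freq data=food_long;
   where FoodCategory = 'Alcohol';
   tables account_type / chisq;   /* one-way chi-square test of equal proportions */
run;

/* BY statement: the same one-way analysis repeated for every food. */
proc sort data=food_long out=food_sorted;
   by FoodCategory;
run;

proc freq data=food_sorted;
   by FoodCategory;
   tables account_type / chisq;
run;
```

By default the one-way chi-square test uses equal proportions as the null hypothesis. If the account types differ in size, the TESTP= option on the TABLES statement lets you supply other null proportions.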

 

 

monsterpie
Obsidian | Level 7
I believe I am trying to test the first option you mentioned, that there is a difference in the proportion of food categories by account type. So just to confirm, I would rerun the Fisher's exact code for each food category (e.g., fast_food*account_type, then alcohol*account_type, etc.)?
Also, I just want to confirm: in the output from these tests, is the p-value I should focus on the "Pr <= P" value from the Monte Carlo estimate for the exact test?
Rick_SAS
SAS Super FREQ

> I believe I am trying to test the first option you mentioned, that there is a difference in the proportion of food categories by account type. So just to confirm, I would rerun the Fisher's exact code for each food category (e.g., fast_food*account_type, then alcohol*account_type, etc.)?

 

No, that is the opposite of what I said. Please re-read my response.

 

Regarding the p-value, see the article "Monte Carlo simulation for contingency tables in SAS."

The p-value for the chi-square test is the value in the row labeled "Pr >= ChiSq".
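For reference, a sketch of a call that produces that row, assuming the same hypothetical `food_long` dataset with `FoodCategory` and `account_type`. The seed is an arbitrary illustrative value:

```sas
proc freq data=food_long;
   tables FoodCategory*account_type / chisq;
   /* Monte Carlo estimate of the exact p-value for the Pearson chi-square test. */
   exact chisq / mc seed=12345;
run;
```

In the "Monte Carlo Estimate for the Exact Test" table, the estimate appears in the "Pr >= ChiSq" row, followed by its confidence limits. With EXACT FISHER / MC, the corresponding row is labeled "Pr <= P".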
