Solved: Re: Fisher's exact test with more than 2 variables?

monsterpie · Posted 02-17-2021 09:17 AM

I have a table with two variables:

(1) Account type (3 levels, categorical) across the columns.

(2) For each account type, I have listed the top 5 food products and the top 5 food products are different for each account type (see attached image for an example of my table).

I'm wondering if it's possible to perform any kind of statistical test of association for these variables (e.g., chi-square or Fisher's exact)? I assumed not because the food product variable changes depending on the account type variable, but I just wanted to be sure. Or would I have to create a table for each account and its respective food products and then perform a statistical test, such as chi-square?

Rick_SAS · Posted 02-17-2021 09:40 AM

> I assumed not because the food product variable changes depending on the account type variable, but I just wanted to be sure

I think you are correct. The margins for the rows would have to be the same, such as "Fast food", "Alcohol", etc., and the cells would have to contain the proportion of the i_th food item for the j_th column. Then you could test whether the proportions differ among columns.

If you want to pursue this, my advice is to look at the top 5 food items OVERALL (and probably add an "Other" category so that the column margins add to 100%). Put those top items in the left margin. You can then test whether the proportions for those categories differ among the account types.

monsterpie · Posted 02-17-2021 09:56 AM

Thank you! That makes sense. A follow-up question then: if I did look at the top 5 food products overall, and I wanted to run a Fisher's exact test in SAS, would I need to run a separate Fisher's test for each food product (since each food category is its own column in my dataset)?

proc freq data=dataset1;
tables (fast_food candy alcohol energy_drink meals)*account_type;
exact fisher/mc;
run;

Rick_SAS · Posted 02-17-2021 10:09 AM

It depends on what you want to test.

If you want to test whether the proportions of Top 5 categories differ by account type, you can use a TABLE statement like (FoodCategory * account_type). This analysis is on a two-way 6x3 table (assuming there is an "Other" category). If the test rejects the null hypothesis ("no association"), you know that there is a difference, but you don't know which food(s) are responsible.
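A minimal sketch of this first analysis, under assumptions that are not confirmed in the thread: each row of dataset1 is one purchase, the five food columns are 0/1 indicators, and at most one of them is 1 per row. The data set name `long`, the category labels, and the seed value are hypothetical.

```sas
/* Assumption: fast_food, candy, alcohol, energy_drink, and meals are 0/1
   indicator columns, with at most one equal to 1 on each row.
   Rows that match none of them are counted as "Other". */
data long;                      /* hypothetical data set name */
   set dataset1;
   length FoodCategory $12;
   if fast_food = 1 then FoodCategory = "Fast food";
   else if candy = 1 then FoodCategory = "Candy";
   else if alcohol = 1 then FoodCategory = "Alcohol";
   else if energy_drink = 1 then FoodCategory = "Energy drink";
   else if meals = 1 then FoodCategory = "Meals";
   else FoodCategory = "Other";
run;

/* One test on the whole 6x3 table (food category by account type) */
proc freq data=long;
   tables FoodCategory*account_type / chisq norow nopercent;
   exact fisher / mc seed=12345;   /* Monte Carlo estimate of the exact p-value */
run;
```

If the indicator columns are not mutually exclusive (for example, one row can be both candy and alcohol), this recoding keeps only the first match and would need to be adapted to the actual data.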

If you want to test whether a particular food differs by account type, then you are doing an analysis on a one-way 1x3 table. You probably don't need to create the extra variables. Depending on the structure of your data, you might be able to use a WHERE clause and/or a BY statement.
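A rough sketch of the WHERE approach, assuming (this is not confirmed in the thread) that fast_food is a 0/1 indicator column and each row of dataset1 is one purchase:

```sas
/* Assumption: fast_food = 1 marks the rows that belong to that food.
   Restrict to those rows, then tabulate account type as a one-way table. */
proc freq data=dataset1;
   where fast_food = 1;
   tables account_type / chisq;   /* one-way chi-square test; null = equal proportions */
run;
```

By default the one-way chi-square test uses equal proportions as the null hypothesis. If the account types differ in overall size, the TESTP= option in the TABLES statement can supply other null proportions.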

monsterpie · Posted 02-17-2021 10:34 AM

I believe I am trying to test the first option you mentioned, that there is a difference in the proportion of food categories by account type. So just to confirm, I would rerun the Fisher's exact code for each food category (e.g., fast_food*account_type, then alcohol*account_type, etc.)?
Also, I just want to confirm that, in the output I get from these results, the p-value I should focus on is the "Pr <= P" value from the Monte Carlo estimate for the exact test?

Rick_SAS · Posted 02-17-2021 10:45 AM

> I believe I am trying to test the first option you mentioned, that there is a difference in the proportion of food categories by account type. So just to confirm, I would rerun the Fisher's exact code for each food category (e.g., fast_food*account_type, then alcohol*account_type, etc.)?

No, that is the opposite of what I said. Please re-read my response.

Regarding the p-value, see the article "Monte Carlo simulation for contingency tables in SAS."

The p-value for the chi-square test is the value next to the row that says "Pr >= ChiSq".
