Fisher's Exact Test

02-01-2017 05:07 PM - edited 02-01-2017 05:25 PM

I am trying to run Fisher's exact test to see whether two categorical variables are associated.

Variable 1: Collection_Center (5 categories)

Variable 2: Education (7 categories)

Because some of the frequencies in the cross-tabulation of these two variables were < 5, I decided to run Fisher's exact test instead of a chi-square test of independence.

The following is the code I put into SAS to run Fisher's exact test:

proc freq data=baseline_Characteristics; tables collection_center*ca_education/fisher; run;

However, I received this warning from SAS:

WARNING: Computing exact p-values for this problem may require much time and memory. Press the system interrupt key to terminate exact computations.

And sure enough it took so long that I had to stop it.

I have two questions regarding this situation:

1. Can anyone please confirm that I am indeed running the correct statistical test?

2. If I am running the correct test why is it taking so long in SAS and what can I do about it? Am I doing something wrong?

Please help!

Accepted Solutions

Solution

07-03-2017
02:12 PM

02-02-2017 07:58 AM

As PGStats states, use the EXACT statement with the MC option and specify the number of Monte Carlo samples with N=.

To answer your questions, you might want to read two articles:

"Monte Carlo simulation for contingency tables in SAS"

Briefly, exact tests for your data will try to compute the proportion of 5x7 tables that have the same row and column sums as the observed table and whose chi-square values are more extreme than the observed chi-square statistic. Exact statistics are intended for small tables and for small sample sizes. The long computation time is probably because your sample size is in the hundreds or more, which means that there are zillions of possible tables that fit the marginal sums.
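
A minimal sketch of that advice, reusing the data set and variable names from the question. The N=, SEED=, and ALPHA= values are illustrative assumptions, not settings recommended anywhere in this thread.

```sas
proc freq data=baseline_Characteristics;
   tables collection_center*ca_education / norow nocol nopercent;
   /* MC asks for a Monte Carlo estimate instead of full enumeration.       */
   /* N= is the number of sampled tables (assumed value).                   */
   /* SEED= fixes the random number stream so the estimate is repeatable.   */
   /* ALPHA=0.01 requests 99% confidence limits for the estimated p-value.  */
   exact fisher / mc n=20000 seed=12345 alpha=0.01;
run;
```

The output reports the estimated p-value together with its confidence limits. If the limits are too wide to support a conclusion, a larger N= should narrow them at the cost of run time.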

All Replies

02-01-2017 05:45 PM

If your sample is too small for the statistical tests you'd like to run, you either:

1. Find a test that accounts for small sample sizes

2. Report descriptive statistics with the caveat that the sample size is too small to determine any significance or generalize.

Fisher's exact test is the correct test when chi-square is not appropriate, but perhaps there's another option. I don't see anything incorrect in your code.

I'll move this over to the Statistical Procedures forum and others can chime in, who know way more than me on stats topics!
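
One way to check whether chi-square is appropriate is to look at the expected cell counts rather than the observed ones. The sketch below reuses the data set and variable names from the question and assumes only the standard EXPECTED and CHISQ options of the TABLES statement.

```sas
proc freq data=baseline_Characteristics;
   /* EXPECTED prints the expected count for each cell under independence. */
   /* CHISQ requests the asymptotic chi-square tests for comparison.       */
   tables collection_center*ca_education / expected chisq norow nocol nopercent;
run;
```

If many cells show expected counts below 5, the asymptotic chi-square p-value is questionable and an exact or Monte Carlo test is the safer choice.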

02-01-2017 06:15 PM - edited 02-01-2017 06:15 PM

Can you please try one of the following:

tables var1*var2 / chisq exact;

or

tables var1*var2 / exact;

02-01-2017 07:17 PM

I think it would help to provide example data that duplicates the behavior. Only the category variables would be needed. The instructions here will show how to create datastep code from an existing data set that you can paste here in a code box so we can test code or identify problems with the data.

How many records are involved?
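
As a rough illustration of what such example data could look like, here is a sketch that enters cell counts directly. The data set name `example_counts`, the category labels, and every count are hypothetical placeholders, not the poster's data.

```sas
/* Hypothetical cell counts: one row per combination of the two variables. */
data example_counts;
   input collection_center $ ca_education $ count;
   datalines;
C1 E1 4
C1 E2 2
C1 E3 1
C2 E1 1
C2 E2 3
C2 E3 5
;
run;

proc freq data=example_counts;
   weight count;   /* treat each row as COUNT observations */
   tables collection_center*ca_education / norow nocol nopercent;
   exact fisher;   /* fast here because the table and sample are small */
run;
```

The total frequency shown in the crosstabulation output also answers the question about how many records are involved.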

02-01-2017 10:15 PM

You can get a good estimate of Fisher exact test p-value with Monte-Carlo estimation:

```
proc freq data=baseline_Characteristics;
   tables collection_center*ca_education;
   /* Monte Carlo estimate of the Fisher exact p-value from 20000 sampled tables */
   exact Fisher / mc n=20000;
run;
```

PG

02-02-2017 08:54 AM

@PGStats YES THANKS! I did some googling and decided to use Monte Carlo.

What is the n=20000? I haven't seen that in any online examples so far.

Thanks for your help.

02-02-2017 08:56 AM

@Rick_SAS thank you for those resources. That is very helpful!