I am trying to calculate the Fisher's score on 4 variables, but it seems to be taking forever. Could the code be wrong? The log did not show any errors. Here is my code:
proc freq data = &curlib..IRAOut order = data;
tables CurTotal * Basetotal * CurTrans * BaseTrans / fishers;
run;
Is there another way to calculate the Fisher's score? When I used the chisq option, it did not give me the fscore.
Forever, as in minutes, or hours, or days???
How many observations in your data set?
You might want to read the section "Computational Resources" here: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=procstat&docsetTarget=pro...
It was running for several hours so I terminated the program. My data set has 2268 observations.
How many levels do your 4 variables have? With names like CurTotal, Basetotal, CurTrans and BaseTrans, I would be tempted to think that these really aren't categorical variables.
I would be tempted to run something like:
ods select nlevels;
proc freq data=&curlib..IRAOut nlevels;
tables CurTotal Basetotal CurTrans BaseTrans;
run;
Then look at the product of the numbers of levels reported. Your code makes a separate CurTrans*BaseTrans table for each combination of CurTotal and Basetotal in the data, so you may be seeing the effect of creating a very large amount of output: each table has one row for each value of CurTrans and one column for each value of BaseTrans. If you have 10 levels of CurTrans and 10 of BaseTrans, that is a 10 by 10 table with the associated frequencies and percentages. If in addition you have 10 levels of each of your total variables, that is 100 tables of 10 by 10 output, each with its own exact test. If you have more levels, it gets worse.
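As a rough sketch of that multiplication, PROC SQL can count the distinct values and multiply them in one step. The column aliases (n_CurTotal, max_tables, and so on) are made up for illustration; the data set reference is the one from the original post, so &curlib is assumed to be defined already.

```sas
/* Sketch: count distinct non-missing values of each variable, */
/* then multiply them to gauge the size of the request.        */
proc sql;
  select count(distinct CurTotal)  as n_CurTotal,
         count(distinct Basetotal) as n_Basetotal,
         count(distinct CurTrans)  as n_CurTrans,
         count(distinct BaseTrans) as n_BaseTrans,
         /* upper bound on the number of stratum tables */
         calculated n_CurTotal * calculated n_Basetotal as max_tables,
         /* cells in each CurTrans*BaseTrans table */
         calculated n_CurTrans * calculated n_BaseTrans as cells_per_table
  from &curlib..IRAOut;
quit;
```

max_tables is only an upper bound, because a table is produced only for combinations of CurTotal and Basetotal that actually occur in the data.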
And from the PROC FREQ documentation for the FISHER option of the TABLES statement:
Note: PROC FREQ computes exact tests by using fast and efficient algorithms that are superior to direct enumeration. Exact tests are appropriate when a data set is small, sparse, skewed, or heavily tied. For some large problems, computation of exact tests might require a substantial amount of time and memory. Consider using asymptotic tests for such problems. Alternatively, when asymptotic methods might not be sufficient for such large problems, consider using Monte Carlo estimation of exact p-values. You can request Monte Carlo estimation by specifying the MC computation-option in the EXACT statement. See the section Computational Resources for more information.
So you may want to consider the EXACT statement, with its MC computation-option, instead of the option on the TABLES statement.
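A minimal sketch of what that could look like, reusing the table request from the original post. The N=, SEED= and MAXTIME= values are arbitrary choices for illustration, not recommendations.

```sas
/* Sketch: Monte Carlo estimate of the exact p-value instead of */
/* the full exact computation.                                  */
proc freq data=&curlib..IRAOut;
  tables CurTotal*Basetotal*CurTrans*BaseTrans / norow nocol nopercent;
  exact fisher / mc          /* Monte Carlo estimation                  */
                 n=10000     /* number of samples (illustrative)        */
                 seed=12345  /* fixed seed so results are reproducible  */
                 maxtime=300 /* stop after 300 seconds per computation  */
                 ;
run;
```

This still produces one table per combination of CurTotal and Basetotal, so it only helps with the time spent on the test itself, not with the volume of output.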
Fisher scoring is not the same as Fisher's exact test.
You only need Fisher's exact test when the counts in some cells are low, usually less than 5 or 10. Otherwise, the chi-square test will give essentially the same result.
If you're trying to calculate the Fisher score, see https://arxiv.org/abs/1202.3725
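One way to check whether the chi-square test is adequate is to request it together with the expected cell counts. This is only a sketch on the two-way CurTrans*BaseTrans table, using the data set from the original post:

```sas
/* Sketch: chi-square test plus expected counts for a single two-way table. */
proc freq data=&curlib..IRAOut;
  tables CurTrans*BaseTrans / chisq expected norow nocol nopercent;
run;
```

If many cells have expected counts below 5, PROC FREQ prints a warning under the chi-square results that the test may not be valid, and that is the situation where an exact test is worth its cost.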
That's a very different calculation/problem.