My nonparametric students and I stumbled on a small example (n=7) of a data set where Spearman's correlation and Kendall's tau-b both come out to exactly 1.0, which is correct because the data show a perfect monotonic relationship. But PROC FREQ's exact test for Kendall's tau-b printed a p-value of 1.00, which doesn't make sense - the probability of a perfect monotonic ordering just by chance is small. There are no ties in the data. Is there something odd happening in the p-value for Kendall's tau-b using the exact algorithm?
OBS=7? That is too small. I guess there must be zeros in some cells? I would drop it and not test it with PROC FREQ.
"which doesn't make sense - the probability of a perfect monotonic ordering just by chance is small."
That doesn't make any sense. A statistical test is based on some statistical distribution such as the normal or chi-square (Bayesian estimation is another question).
Your data set is so small that the chi-square value under H0 is very small, so almost all chi-square values are greater than it (i.e., P=1).
Xia Keshan
n=7 is certainly too small for the ChiSq value or any asymptotic result, but this is the Exact test based on the permutation distribution. The answer for the two-sided p-value should be 2/(7!), or about 0.0004.
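For anyone who wants to check the 2/(7!) figure by brute force, here is a minimal Python sketch that enumerates the full permutation distribution for n=7 with no ties. It is not what PROC FREQ does internally; the data vector x is a hypothetical stand-in (the ranks 1 through 7), and the helper name kendall_s is made up for illustration.

```python
from itertools import permutations
from math import factorial

def kendall_s(x, y):
    """Concordant minus discordant pairs (assumes no ties in x or y)."""
    s = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            # +1 for a concordant pair, -1 for a discordant pair
            s += 1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
    return s

n = 7
x = list(range(1, n + 1))        # hypothetical stand-in data: ranks 1..7
n_pairs = n * (n - 1) // 2       # 21 pairs
s_obs = kendall_s(x, x)          # perfect monotone relationship, so S = 21
tau_obs = s_obs / n_pairs        # tau = S / number of pairs when there are no ties

# Null distribution: every ordering of y is equally likely under H0.
all_s = [kendall_s(x, p) for p in permutations(x)]
n_perms = len(all_s)

p_right = sum(s >= s_obs for s in all_s) / n_perms           # Pr >= t
p_two_sided = sum(abs(s) >= abs(s_obs) for s in all_s) / n_perms

print(tau_obs, n_perms, p_right, p_two_sided)
```

Only the perfectly increasing and perfectly decreasing orderings reach |tau| = 1, which is where the 2 in the numerator comes from.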
DMohr is right. One can definitely do an exact test with n=7 (or smaller). I even have a table of exact p-values for a wide range of small n values for this statistic. p should be almost 0. I was first thinking that you are looking at a one-sided statistic, for the probability of finding a correlation smaller than 1 (which would give p ~ 1). The probability of finding a larger correlation would then be one minus the displayed p (what you are really interested in). The output does look like you are getting this left-sided p. However, FREQ also gives the two-sided p-value as 1. I have never used PROC FREQ for exact tests, so I have not read the documentation. Perhaps others can comment. I am guessing that the 'exact' algorithm has a problem with this correlation and small n (but one typically only wants the exact p-value when n is small). Interestingly, the EXACT statement seems to work fine for the Spearman correlation. More interestingly, for Spearman, the displayed one-sided p is listed as "Pr >= r", but for Kendall's tau it is "Pr <= t". This supports my claim that you are getting the "other" side alternative.
Yes, I noticed that switch in the labeling also. It's as if, in this one oddball case, the algorithm is picking the wrong side of the distribution for the calculation of the p-value.
Interestingly, if you did not have a perfect monotonic relationship, the exact result is for "Pr >=t" for Kendall tau (the desired direction). It only switches sides when t=1. Maybe you can find something in the documentation about that. If not, it might be worth a message to Technical Support.
I think you are right - in all the previous examples it worked fine. It's only this oddball case. I will send a message on to Technical Support.
If you get a good explanation from Tech Support about the switched side of the test, please let us all know.
Here is what we got back from SAS Technical Support. Hurray, one tiny little speck of mortar added to the brick wall of science.
Hi Donna:
This does appear to be a defect in FREQ. In this particular case (where tauB=1
and ASE=0), PROC FREQ displays the one-sided p-value as the left-sided p-value
(Pr <= t), which is indeed 1 when tauB=1 (obviously, because the range of
tauB is between -1 and 1). But PROC FREQ should display the right-sided p-value
(Pr >= t) when t > 0.
A work-around to get the value of (Pr >= t) for this example is to specify the
POINT option in the EXACT statement. In this example where tauB=1, (Pr >= t)
= (Pr = t), which means that the one-sided p-value (Pr >=t) is identical to
the point probability (Pr = t).
Thank you for bringing this to our attention. I will make sure that it gets
fixed.
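The identity in the workaround, (Pr >= t) = (Pr = t) when tauB=1, can be checked without SAS. The Python sketch below is only an illustration under the assumption of n=7 and no ties, not PROC FREQ's algorithm. It builds the null distribution from the number of discordant pairs (inversions) using the standard counting recurrence; the function name inversion_counts is hypothetical.

```python
from math import factorial

def inversion_counts(n):
    """counts[k] = number of permutations of n items with exactly k inversions."""
    counts = [1]  # a single item has one ordering with zero inversions
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        # Placing the m-th item can add anywhere from 0 to m-1 inversions.
        for k, c in enumerate(counts):
            for add in range(m):
                new[k + add] += c
        counts = new
    return counts

n = 7
counts = inversion_counts(n)     # index k = number of discordant pairs
n_perms = factorial(n)

# With no ties, tau = 1 - 2k/21, so tau = 1 corresponds to k = 0 discordant pairs.
k_obs = 0
p_point = counts[k_obs] / n_perms                # Pr = t
p_right = sum(counts[: k_obs + 1]) / n_perms     # Pr >= t  (k <= k_obs)
p_left = sum(counts[k_obs:]) / n_perms           # Pr <= t  (k >= k_obs)

print(p_point, p_right, p_left)
```

At tau = 1 the right tail contains only the observed ordering, so it equals the point probability 1/7!, while the left tail covers every ordering and is 1, which matches the value PROC FREQ displayed.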