04-02-2015 04:41 PM
My nonparametric students and I stumbled on a small example (n=7) of a data set where Spearman's correlation and Kendall's tau-b both come out to exactly 1.0, which is correct because the data show a perfect monotonic relationship. But PROC FREQ's exact test for Kendall's tau-b printed a p-value of 1.00, which doesn't make sense - the probability of a perfect monotonic ordering just by chance is small. There are no ties in the data. Is there something odd happening in the p-value for Kendall's tau-b using the exact algorithm?
04-03-2015 09:05 AM
OBS=7? That is too small. I guess there must be 0 in some cells? I would drop it and not test it with PROC FREQ.
"which doesn't make sense - the probability of a perfect monotonic ordering just by chance is small."
That doesn't make any sense. A statistical test is based on some statistical distribution like the normal or chi-square (Bayesian estimation is another question).
Your data set is so small that the chi-square value under H0 is very small, so almost all chi-square values are greater than it (i.e. P=1).
04-03-2015 10:00 AM
n=7 is certainly too small for the chi-square value or any asymptotic result, but this is the exact test based on the permutation distribution. The answer for the two-sided p-value should be 2/(7!) = 2/5040, or about 0.0004.
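For anyone who wants to check the 2/(7!) figure by brute force, here is a minimal sketch in Python that enumerates all 7! orderings of y and counts those at least as extreme as the observed one. The data values are made up for illustration (any tie-free, perfectly increasing set gives the same answer), and the two-sided definition used here, |S| >= |S observed|, is one common convention and not necessarily the exact rule PROC FREQ applies.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def kendall_s(x, y):
    """Concordant minus discordant pairs; assumes no ties in x or y."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s


def exact_two_sided_p(x, y):
    """Permutation p-value: share of orderings with |S| >= |S observed|."""
    observed = abs(kendall_s(x, y))
    hits = sum(
        1 for perm in permutations(y) if abs(kendall_s(x, perm)) >= observed
    )
    return Fraction(hits, factorial(len(x)))


# Hypothetical data: n=7, no ties, perfectly increasing.
x = [1, 2, 3, 4, 5, 6, 7]
y = [2.1, 3.4, 3.9, 5.0, 6.2, 8.8, 9.5]

p = exact_two_sided_p(x, y)
print(p, float(p))
```

Only the fully increasing and fully decreasing orderings reach |S| = 21, so the count is 2 out of 5040.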
04-03-2015 10:16 AM
DMohr is right. One can definitely do an exact test with n=7 (or smaller). I even have a table of tabulated exact p values for a wide range of small n values for this statistic. p should be almost 0. I was first thinking that you are looking at a one-sided statistic, for the probability of finding a correlation smaller than 1 (which would give p ~ 1). The probability of finding a larger correlation would then be one minus the displayed p (what you are really interested in). The output does look like you are getting this left-sided p. However, FREQ also gives the two-sided p-value as 1. I have never used PROC FREQ for exact tests, so I have not read the documentation. Perhaps others can comment. I am guessing that the 'exact' algorithm has a problem with this correlation and small n (but one typically only wants the exact p value when n is small). Interestingly, the EXACT statement seems to work fine for the Spearman correlation. More interestingly, for Spearman, the displayed one-sided p is listed as "Pr >= r"; but for Kendall tau, it is "Pr <= t". This supports my claim that you are getting the "other" side alternative.
04-03-2015 10:29 AM
Yes, I noticed that switch in the labeling also. It's as if, in this one oddball case, the algorithm is picking the wrong side of the distribution for the calculation of the p-value.
04-03-2015 10:55 AM
Interestingly, if you did not have a perfect monotonic relationship, the exact result is for "Pr >=t" for Kendall tau (the desired direction). It only switches sides when t=1. Maybe you can find something in the documentation about that. If not, it might be worth a message to Technical Support.
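To make the left-side versus right-side difference concrete, here is a small Python sketch that computes Pr <= t, Pr >= t, and Pr = t from the full permutation distribution for n=7. It uses plain ranks as hypothetical data, once for a perfect ordering and once with a single adjacent swap. This is an independent enumeration, not a reproduction of whatever PROC FREQ does internally.

```python
from fractions import Fraction
from itertools import combinations, permutations


def kendall_s(x, y):
    """Concordant minus discordant pairs; assumes no ties in x or y."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s


def tail_probabilities(x, y):
    """Return (Pr <= t, Pr >= t, Pr = t) under the permutation null."""
    observed = kendall_s(x, y)
    values = [kendall_s(x, perm) for perm in permutations(y)]
    total = len(values)
    left = Fraction(sum(v <= observed for v in values), total)
    right = Fraction(sum(v >= observed for v in values), total)
    point = Fraction(sum(v == observed for v in values), total)
    return left, right, point


x = list(range(1, 8))

# Perfect monotonic ordering (tau-b = 1).
perfect = tail_probabilities(x, [1, 2, 3, 4, 5, 6, 7])

# Hypothetical near-perfect ordering: the last two values swapped.
near = tail_probabilities(x, [1, 2, 3, 4, 5, 7, 6])

print("perfect:", perfect)
print("near:   ", near)
```

With a perfect ordering the left-sided probability is 1 by construction, since every ordering has a statistic no larger than the maximum, while the right-sided probability collapses to the point probability of 1/5040. With one swap the right-sided value is 7/5040, still tiny, which is the direction one wants reported for a positive t.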
04-03-2015 11:43 AM
I think you are right - in all the previous examples it worked fine. It's only this oddball case. I will send a message on to Technical Support.
04-08-2015 12:58 PM
Here is what we got back from SAS Technical Support. Hurray, one tiny little speck of mortar added to the brick wall of science.
does appear to be a defect in FREQ. In this particular case (where tauB=1
and ASE=0), PROC FREQ displays the one-sided p-value as the left-sided p-value
(Pr <= t), which is indeed 1 when tauB=1 (obviously, because the range of
tauB is between -1 and 1). But PROC FREQ should display the right-sided p-value
(Pr >= t) when t > 0.
A work-around to get the value of (Pr >= t) for this example is to specify the
POINT option in the EXACT statement. In this example where tauB=1, (Pr >= t)
= (Pr = t), which means that the one-sided p-value (Pr >= t) is identical to
the point probability (Pr = t).
Thank you for bringing this to our attention. I will make sure that it gets