DMohr
Calcite | Level 5

 
My nonparametric students and I stumbled on a small example (n=7) of a data set where Spearman's correlation and Kendall's tau-b both come out to exactly 1.0, which is correct because the data show a perfect monotonic relationship. But PROC FREQ's exact test for Kendall's tau-b also printed a p-value of 1.00, which doesn't make sense: the probability of a perfect monotonic ordering arising just by chance is small. There are no ties in the data. Is there something odd happening in the p-value for Kendall's tau-b when the exact algorithm is used?

8 REPLIES
Ksharp
Super User

OBS=7? That is too small. I guess there must be zeros in some cells? I would drop it and not test it with PROC FREQ.

"which doesn't make sense - the probability of a perfect monotonic ordering just by chance is small."

That doesn't make any sense. A statistical test is based on some statistical distribution such as the normal or chi-square (Bayesian estimation is another question).

Your data set is so small that the chi-square value under H0 is very small, so almost all chi-square values are greater than it (i.e., P=1).

Xia Keshan

DMohr
Calcite | Level 5

n=7 is certainly too small for the chi-square value or any asymptotic result, but this is the exact test based on the permutation distribution. The two-sided p-value should be 2/(7!).
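For anyone who wants to check the 2/(7!) figure by brute force, here is a minimal Python sketch (not SAS, and not how PROC FREQ computes it) that enumerates all 7! = 5040 permutations. The data values in `x` and `y` are made up for illustration; any untied, perfectly monotone pair of variables gives the same result. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def kendall_tau(x, y):
    """Kendall's tau for untied data: (concordant - discordant) / number of pairs."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        # With no ties the product is never zero, so each pair is +1 or -1.
        s += 1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
    n_pairs = len(x) * (len(x) - 1) // 2
    return Fraction(s, n_pairs)


def exact_kendall_pvalues(x, y):
    """Exact permutation p-values: right-sided Pr(T >= t) and two-sided Pr(|T| >= |t|)."""
    observed = kendall_tau(x, y)
    right = two_sided = 0
    for perm in permutations(y):
        t = kendall_tau(x, perm)
        right += t >= observed
        two_sided += abs(t) >= abs(observed)
    total = factorial(len(y))
    return observed, Fraction(right, total), Fraction(two_sided, total)


# Illustrative data only: strictly increasing in both variables, no ties.
x = [1, 2, 3, 4, 5, 6, 7]
y = [2, 5, 9, 11, 20, 21, 30]

tau, p_right, p_two = exact_kendall_pvalues(x, y)
print(tau, p_right, p_two)  # 1 1/5040 1/2520
```

Only the identity ordering reaches tau = 1 and only the fully reversed ordering reaches tau = -1, so the two-sided count is 2 out of 5040.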

lvm
Rhodochrosite | Level 12

DMohr is right. One can definitely do an exact test with n=7 (or smaller). I even have a table of exact p-values for a wide range of small n for this statistic. p should be almost 0. At first I thought you were looking at a one-sided statistic, for the probability of finding a correlation smaller than or equal to 1 (which would give p = 1). The probability of finding a correlation at least as large (what you are really interested in) would then be the opposite tail. The output does look like you are getting this left-sided p. However, FREQ also gives the two-sided p-value as 1. I have never used PROC FREQ for exact tests, so I have not read the documentation. Perhaps others can comment. I am guessing that the exact algorithm has a problem with this correlation and small n (but one typically only wants the exact p-value when n is small). Interestingly, the EXACT statement seems to work fine for the Spearman correlation. More interestingly, for Spearman, the displayed one-sided p is labeled "Pr >= r", but for Kendall's tau it is "Pr <= t". This supports my claim that you are getting the "other" side alternative.

DMohr
Calcite | Level 5

Yes, I noticed that switch in the labeling also. It's as if, in this one oddball case, the algorithm is picking the wrong side of the distribution for the calculation of the p-value.

lvm
Rhodochrosite | Level 12

Interestingly, if you do not have a perfect monotonic relationship, the exact result is reported as "Pr >= t" for Kendall's tau (the desired direction). It only switches sides when t=1. Maybe you can find something in the documentation about that. If not, it might be worth a message to Technical Support.

DMohr
Calcite | Level 5

I think you are right. In all the previous examples it worked fine; it's only this oddball case. I will send a message on to Technical Support.

lvm
Rhodochrosite | Level 12

If you get a good explanation from Tech Support about the switched side of the test, please let us all know.

DMohr
Calcite | Level 5

Here is what we got back from SAS Technical Support. Hurray, one tiny little speck of mortar added to the brick wall of science.

Hi Donna:

This does appear to be a defect in FREQ. In this particular case (where tauB=1 and ASE=0), PROC FREQ displays the one-sided p-value as the left-sided p-value (Pr <= t), which is indeed 1 when tauB=1 (obviously, because the range of tauB is between -1 and 1). But PROC FREQ should display the right-sided p-value (Pr >= t) when t > 0.

A work-around to get the value of (Pr >= t) for this example is to specify the POINT option in the EXACT statement. In this example where tauB=1, (Pr >= t) = (Pr = t), which means that the one-sided p-value (Pr >= t) is identical to the point probability (Pr = t).

Thank you for bringing this to our attention. I will make sure that it gets fixed.
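To see why the left-sided value is 1 while the right-sided value equals the point probability, here is a rough Python sketch (again not SAS, and not PROC FREQ's algorithm). It assumes untied data, where under the null the number of discordant pairs D is distributed like the number of inversions in a random permutation, and tau = 1 - 2D / C(n,2). The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def inversion_counts(n):
    """counts[k] = number of permutations of n items with exactly k inversions."""
    counts = [1]
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        for k, c in enumerate(counts):
            # Inserting the m-th item can add 0 .. m-1 inversions.
            for extra in range(m):
                new[k + extra] += c
        counts = new
    return counts


def tail_probabilities(n, discordant):
    """Return (Pr(T <= t), Pr(T >= t), Pr(T = t)) for the tau with `discordant` pairs."""
    counts = inversion_counts(n)
    total = factorial(n)
    point = Fraction(counts[discordant], total)
    # Fewer discordant pairs means a larger tau, so the right tail sums the low counts.
    right = Fraction(sum(counts[: discordant + 1]), total)
    left = Fraction(sum(counts[discordant:]), total)
    return left, right, point


# tau = 1 with n = 7 corresponds to zero discordant pairs.
left, right, point = tail_probabilities(7, 0)
print(left, right, point)  # 1 1/5040 1/5040
```

With one discordant pair instead of zero, `tail_probabilities(7, 1)` gives a right-sided value of 7/5040 and a point probability of 6/5040, so the two coincide only at the extreme.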
