For n <= 20, PROC UNIVARIATE uses the exact distribution to compute the significance of the signed rank statistic S, but what exactly happens when there are tied ranks in such a small dataset? I've read all three references (Iman, Conover, and Lehmann), but none of them really explains how ties are handled under the exact distribution for a small sample (n <= 20). An example is below.
Consider the data:
Grp1  Grp2  Diff   Rank
.32   .39   -0.07  3.5
.4    .47   -0.07  3.5
.11   .11    0.00  ---
.47   .43    0.04  1
.32   .42   -0.10  5
.35   .3     0.05  2
.32   .43   -0.11  6
.63   .98   -0.35  8
.5    .86   -0.36  9
.6    .79   -0.19  7
Sum of ranks for positive differences (ri+) = 3
Given that the ranks are {1, 2, 3.5, 3.5, 5, 6, 7, 8, 9}, the only ways to get a sum of ranks less than or equal to 3 are for the set of positive ranks to be one of:
Set Sum
{} = 0
{1} = 1
{2} = 2
{1,2} = 3
So, there are 4 configurations on the left-hand side extreme and 4 on the right. Thus, the p-value should be 8/2^9 = 8/512 = 0.0156. However, SAS reports p=10/512 = 0.0195.
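The count of 4 on each side can be checked by brute force. The sketch below is only an illustration in Python (not SAS code, and not a claim about how PROC UNIVARIATE works internally); the helper name count_at_most is made up for this example. It enumerates all 2^9 sign assignments over the midranks and counts those whose positive-rank sum is at most 3.

```python
from itertools import combinations

def count_at_most(ranks, threshold):
    """Count subsets of `ranks` (positions are distinct) with sum <= threshold."""
    count = 0
    for k in range(len(ranks) + 1):
        for subset in combinations(ranks, k):
            if sum(subset) <= threshold:
                count += 1
    return count

# Midranks from the example: the two |diff| = 0.07 values share rank 3.5,
# and the zero difference is dropped.
midranks = [1, 2, 3.5, 3.5, 5, 6, 7, 8, 9]

lower = count_at_most(midranks, 3)          # subsets {}, {1}, {2}, {1,2}
total = 2 ** len(midranks)                  # 512 equally likely sign patterns
p_two_sided = 2 * lower / total             # doubled by symmetry of the null distribution

print(lower, total, p_two_sided)
```

Under these assumptions the enumeration gives 4 configurations in the lower tail, i.e. 8/512, which is the hand-computed value rather than the one SAS reports.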
Perhaps SAS is treating each {3.5} as either {3} or {4} with ½ probability, so that each tied rank contributes ½ of an extra case. Since there are two ranks of 3.5, each could be {3} with ½ probability, and thus there would be a total of 5 configurations on the left extreme and 5 on the right?
Set Sum
{} = 0 with 100% probability = 1.0 case
{1} = 1 with 100% probability = 1.0 case
{2} = 2 with 100% probability = 1.0 case
{1,2} = 3 with 100% probability = 1.0 case
{3.5} = 3 with 50% probability = 0.5 case
{3.5} = 3 with 50% probability = 0.5 case
---------
5.0 cases
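This guess can also be checked numerically. The sketch below (Python, illustration only; count_at_most is a made-up helper, and nothing here is confirmed as the method SAS actually uses) breaks the tie both ways, giving the two tied values ranks 3 and 4 in either order, and averages the lower-tail counts.

```python
from itertools import combinations

def count_at_most(ranks, threshold):
    """Count subsets of `ranks` (positions are distinct) with sum <= threshold."""
    return sum(
        1
        for k in range(len(ranks) + 1)
        for subset in combinations(ranks, k)
        if sum(subset) <= threshold
    )

# The two tied observations get ranks 3 and 4 in one order or the other,
# each ordering assumed equally likely (probability 1/2).
tie_breaks = [(3, 4), (4, 3)]
counts = [
    count_at_most([1, 2, a, b, 5, 6, 7, 8, 9], 3)
    for a, b in tie_breaks
]

avg_lower = sum(counts) / len(counts)       # average lower-tail count
p_two_sided = 2 * avg_lower / 2 ** 9        # doubled by symmetry

print(counts, avg_lower, p_two_sided)
```

Either way of breaking the tie leaves the ranks as 1 through 9, so the lower tail always holds 5 sets ({}, {1}, {2}, {3}, {1,2}) and the two-sided value is 10/512. That matches the reported p-value, and it is the same number one would get by using the untied exact distribution for n = 9 with the observed S = 3. It is only consistent with the output, not proof of what SAS does.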
If you know how SAS is computing the exact p-value with ties (all possible combinations of the sum of ranks less than or equal to the sum of positive ranks) or know where a 'useful' reference might be, please let me know. Thanks in advance!!
For details of exact test calculation methods, you might have to look in the references within :
PG
This SUGI 1994 paper contains a macro that uses the DATA step to reproduce the WSR test, so you ought to be able to see exactly what happens for tied values.
http://www.sascommunity.org/sugi/SUGI94/Sugi-94-172%20Tian.pdf
Thanks for the paper, Rick, but it doesn't really explain how SAS calculates the exact p-values; it just reads them from a file attached to the macro code (WSRtable.dat). Thanks for your help though!! Have a great day!