For n <= 20, PROC UNIVARIATE uses the exact distribution to compute the significance of S, but what exactly happens when there are tied ranks within the small dataset? I've read all 3 references (Iman, Conover, and Lehmann), but none of them really explains what happens with the exact distribution when ties occur in a small sample (n <= 20). An example is below.

Consider the data:

Grp1   Grp2   Diff    Rank
.32    .39    -0.07   3.5
.4     .47    -0.07   3.5
.11    .11     0.00   ---
.47    .43     0.04   1
.32    .42    -0.10   5
.35    .3      0.05   2
.32    .43    -0.11   6
.63    .98    -0.35   8
.5     .86    -0.36   9
.6     .79    -0.19   7

Sum of ranks for positive differences (ri+) = 3

Given that the ranks are {1, 2, 3.5, 3.5, 5, 6, 7, 8, 9}, the only way to get a sum of ranks that is less than or equal to 3 is for the set of positive ranks to be one of:

Set      Sum
{}     = 0
{1}    = 1
{2}    = 2
{1,2}  = 3

So, there are 4 configurations in the left-hand extreme and, by symmetry, 4 in the right. Thus, the two-sided p-value should be 8/2^9 = 8/512 = 0.0156. However, SAS reports p = 10/512 = 0.0195.

Perhaps SAS is saying that {3.5} is either {3} or {4}, each with ½ probability, so that each tied rank contributes ½ of a case. Since there are two ranks of 3.5, each could be {3} with ½ probability, and thus there would be a total of 5 configurations in the left extreme and 5 in the right?

Set      Sum
{}     = 0   with 100% probability = 1.0 case
{1}    = 1   with 100% probability = 1.0 case
{2}    = 2   with 100% probability = 1.0 case
{1,2}  = 3   with 100% probability = 1.0 case
{3.5}  = 3   with  50% probability = 0.5 case
{3.5}  = 3   with  50% probability = 0.5 case
                                     ---------
                                     5.0 cases

If you know how SAS is computing the exact p-value with ties (all possible combinations of the sum of ranks less than or equal to the sum of positive ranks), or know where a 'useful' reference might be, please let me know.

Thanks in advance!!
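For anyone who wants to check the counting, here is a minimal brute-force sketch in Python (not SAS code). It enumerates all 2^9 sign assignments and counts those whose positive-rank sum is at most 3, once with the tied midranks and once with the untied ranks 1 through 9. The function name `lower_tail_count` and the variable names are made up for illustration. That the untied enumeration gives 10/512 is only a numerical coincidence with the SAS output as far as this sketch can show; whether SAS actually computes it that way is an assumption, not something confirmed here.

```python
from fractions import Fraction
from itertools import combinations


def lower_tail_count(ranks, observed):
    """Count the subsets of `ranks` (taken as the positive ranks)
    whose sum is <= observed. Tied ranks are treated as distinct
    positions, so there are always 2**len(ranks) subsets in total."""
    count = 0
    for k in range(len(ranks) + 1):
        for subset in combinations(ranks, k):
            if sum(subset) <= observed:
                count += 1
    return count


# Midranks from the table above (the zero difference is dropped).
tied_ranks = [1, 2, 3.5, 3.5, 5, 6, 7, 8, 9]
# Hypothetical comparison: the same n with no ties at all.
untied_ranks = list(range(1, 10))
observed = 3  # sum of ranks for positive differences

n = len(tied_ranks)
tied_count = lower_tail_count(tied_ranks, observed)
untied_count = lower_tail_count(untied_ranks, observed)

# Two-sided p-value: double the lower tail, since the null
# distribution is symmetric about n(n+1)/4 in both cases.
p_tied = Fraction(2 * tied_count, 2 ** n)
p_untied = Fraction(2 * untied_count, 2 ** n)

print(tied_count, float(p_tied))      # 4 0.015625
print(untied_count, float(p_untied))  # 5 0.01953125
```

With the midranks, the extra subset {3} does not exist, which is why the tied count is 4 rather than 5.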