<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Derivation of Wilcoxon Signed Rank test p-value for small n in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/792353#M38833</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am having difficulty understanding how SAS derives the p-value for the Wilcoxon Signed Rank test for small n.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I understand how to get S, the test statistic for the Wilcoxon Signed Rank Test (see documentation &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_univariate_details17.htm#procstat_univariate015076" target="_self"&gt;here&lt;/A&gt;). However, I don't understand how to go from S to the p-value. The documentation (&lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_univariate_details17.htm#procstat_univariate015076" target="_self"&gt;link again&lt;/A&gt;) says, for small n, "&lt;SPAN&gt;the significance of&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;S&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;is computed from the exact distribution of&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;S&lt;/SPAN&gt;&lt;SPAN&gt;, where the distribution is a convolution of scaled binomial distributions." What exactly does that mean?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here is an example:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data test3;
input v1
	  v2;
datalines;
2 1
1 0 
3 0 
1 0 
;

data test4;
	set test3;
	diffVar = v1 - v2;
run; 

proc univariate data = test4;
	var diffVar;
run; &lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;We see that S = 5, but why is Pr &amp;gt;= |S| = 0.1250?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 26 Jan 2022 01:37:53 GMT</pubDate>
    <dc:creator>pywils</dc:creator>
    <dc:date>2022-01-26T01:37:53Z</dc:date>
    <item>
      <title>Derivation of Wilcoxon Signed Rank test p-value for small n</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/792353#M38833</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am having difficulty understanding how SAS derives the p-value for the Wilcoxon Signed Rank test for small n.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I understand how to get S, the test statistic for the Wilcoxon Signed Rank Test (see documentation &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_univariate_details17.htm#procstat_univariate015076" target="_self"&gt;here&lt;/A&gt;). However, I don't understand how to go from S to the p-value. The documentation (&lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_univariate_details17.htm#procstat_univariate015076" target="_self"&gt;link again&lt;/A&gt;) says, for small n, "&lt;SPAN&gt;the significance of&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;S&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;is computed from the exact distribution of&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;S&lt;/SPAN&gt;&lt;SPAN&gt;, where the distribution is a convolution of scaled binomial distributions." What exactly does that mean?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here is an example:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data test3;
input v1
	  v2;
datalines;
2 1
1 0 
3 0 
1 0 
;

data test4;
	set test3;
	diffVar = v1 - v2;
run; 

proc univariate data = test4;
	var diffVar;
run; &lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;We see that S = 5, but why is Pr &amp;gt;= |S| = 0.1250?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jan 2022 01:37:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/792353#M38833</guid>
      <dc:creator>pywils</dc:creator>
      <dc:date>2022-01-26T01:37:53Z</dc:date>
    </item>
    <item>
      <title>Re: Derivation of Wilcoxon Signed Rank test p-value for small n</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/792431#M38834</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/380685"&gt;@pywils&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Interesting question. Looking into pp. 129 f. of the book by&amp;nbsp;&lt;SPAN&gt;Lehmann and D’Abrera (&lt;/SPAN&gt;&lt;A tabindex="0" href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_univariate_references.htm#procstat_univariatelehm_e75" target="_blank"&gt;1975&lt;/A&gt;&lt;SPAN&gt;) the documentation refers to, I think the argument (applied to your example) is as follows:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The ranks of the four &lt;FONT face="courier new,courier"&gt;diffVar&lt;/FONT&gt; values 1, 1, 3, 1 are 2, 2, 4, 2, respectively (where 2 is the average rank of the original ranks 1, 2 and 3, which are assigned to the three tied values &lt;FONT face="courier new,courier"&gt;diffVar=1&lt;/FONT&gt;). Under the null hypothesis (of symmetry about 0) the sign of each of the&amp;nbsp;&lt;FONT face="courier new,courier"&gt;diffVar&lt;/FONT&gt; values could be positive or negative with probability 1/2. Due to independence, the 2**4=16 possible sign combinations are equally likely (probability 1/16).&lt;/SPAN&gt;&lt;/P&gt;
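&lt;P&gt;As a rough check of these ranks, here is a minimal sketch using PROC RANK on the absolute differences. The dataset and variable names (&lt;FONT face="courier new,courier"&gt;diffs&lt;/FONT&gt;, &lt;FONT face="courier new,courier"&gt;absDiff&lt;/FONT&gt;, &lt;FONT face="courier new,courier"&gt;absRank&lt;/FONT&gt;) are made up for illustration, and dropping zero differences before ranking reflects my reading of the documentation, so please treat it as an assumption:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Differences v1 - v2 from the example in the question */
data diffs;
  input diffVar @@;
  if diffVar ne 0;          /* assumption: zero differences are excluded */
  absDiff = abs(diffVar);   /* ranks are based on absolute values */
datalines;
1 1 3 1
;

/* TIES=MEAN assigns the average rank to tied values */
proc rank data=diffs out=ranked ties=mean;
  var absDiff;
  ranks absRank;
run;

proc print data=ranked noobs;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For the example data, &lt;FONT face="courier new,courier"&gt;absRank&lt;/FONT&gt; should come out as 2, 2, 4, 2.&lt;/P&gt;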
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The first term in the formula of the Wilcoxon signed-rank test statistic S is the sum of the ranks of &lt;EM&gt;positive&lt;/EM&gt; differences:&lt;/SPAN&gt;&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;STRONG&gt;Positive ranks&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD&gt;None&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;TD&gt;4&lt;/TD&gt;
&lt;TD&gt;2, 2&lt;/TD&gt;
&lt;TD&gt;2, 4&lt;/TD&gt;
&lt;TD&gt;2, 2, 2&lt;/TD&gt;
&lt;TD&gt;2, 2, 4&lt;/TD&gt;
&lt;TD&gt;2, 2, 2, 4&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;STRONG&gt;Probability&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD&gt;1/16&lt;/TD&gt;
&lt;TD&gt;3/16&lt;/TD&gt;
&lt;TD&gt;1/16&lt;/TD&gt;
&lt;TD&gt;3/16&lt;/TD&gt;
&lt;TD&gt;3/16&lt;/TD&gt;
&lt;TD&gt;1/16&lt;/TD&gt;
&lt;TD&gt;3/16&lt;/TD&gt;
&lt;TD&gt;1/16&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example, 3 of the 16 possible sign combinations yield one positive rank 2 and one positive rank 4, hence the probability 3/16. Each of the four ranks 2, 2, 4, 2 contributes its value -- independently -- with probability 1/2 to the sum. So their distributions are Bernoulli distributions (&lt;EM&gt;binomial&lt;/EM&gt; with parameters n=1 and p=1/2), but with a "&lt;EM&gt;scale&lt;/EM&gt;" factor of 2 or 4, respectively. The distribution of a sum of independent random variables is the &lt;EM&gt;convolution&lt;/EM&gt; of their distributions. A simulation of this "convolution of scaled binomial distributions", which yields S after subtracting the expectation 4*(4+1)/4=5, could look like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data sim;
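call streaminit(314159);  /* fixed seed for reproducible results */
array r[4] _temporary_;
do i=1 to 1e6;
  /* each rank enters the sum (positive sign) with probability 1/2 */
  r[1]=2*rand('bern',0.5);
  r[2]=2*rand('bern',0.5);
  r[3]=4*rand('bern',0.5);
  r[4]=2*rand('bern',0.5);
  S=sum(of r[*])-5;  /* subtract the expectation n*(n+1)/4=5 */
  output;
end;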
run;

proc freq data=sim;
tables S;
run;&lt;/CODE&gt;&lt;/PRE&gt;
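&lt;P&gt;If you prefer exact probabilities to simulated frequencies, the same convolution can be computed directly. The following is only a sketch under the assumption that all ranks are integers, as in this example (midranks such as 1.5 would need a rescaled array index). The names &lt;FONT face="courier new,courier"&gt;exact&lt;/FONT&gt;, &lt;FONT face="courier new,courier"&gt;rk&lt;/FONT&gt;, &lt;FONT face="courier new,courier"&gt;p&lt;/FONT&gt; and &lt;FONT face="courier new,courier"&gt;q&lt;/FONT&gt; are my own, and I do not claim that this is how PROC UNIVARIATE does it internally:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data exact;
  array rk[4] _temporary_ (2 2 4 2);  /* ranks of the absolute differences */
  array p[0:10] _temporary_;          /* p[t] = P(sum of positive ranks = t) */
  array q[0:10] _temporary_;          /* work array for one convolution step */
  do t=0 to 10; p[t]=0; end;
  p[0]=1;                             /* start: empty sum with probability 1 */
  do j=1 to 4;
    do t=0 to 10; q[t]=0; end;
    do t=0 to 10;
      if p[t] gt 0 then do;
        q[t]       = q[t]       + 0.5*p[t];  /* rank j gets a negative sign */
        q[t+rk[j]] = q[t+rk[j]] + 0.5*p[t];  /* rank j gets a positive sign */
      end;
    end;
    do t=0 to 10; p[t]=q[t]; end;
  end;
  do t=0 to 10;
    if p[t] gt 0 then do;
      S=t-5;                          /* subtract the expectation 5 */
      prob=p[t];
      output;
    end;
  end;
  keep S prob;
run;

/* two-sided p-value for the observed S=5 */
proc sql;
  select sum(prob) as p_two_sided format=6.4
  from exact
  where abs(S) ge 5;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Dataset &lt;FONT face="courier new,courier"&gt;exact&lt;/FONT&gt; should contain S=-5, -3, -1, 1, 3, 5 with probabilities 1/16, 3/16, 4/16, 4/16, 3/16, 1/16, and the query should return 0.1250.&lt;/P&gt;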
&lt;P&gt;Your observed case "2, 2, 2, 4" yields S=10-5=5 and is the only one with S&amp;gt;=5. By symmetry, the case "None" is the only one with S&amp;lt;=-5. Hence the &lt;EM&gt;two-sided&lt;/EM&gt; p-value is &lt;EM&gt;twice&lt;/EM&gt; the probability of obtaining S=5, i.e. 2*1/16=1/8=0.125.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jan 2022 12:12:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/792431#M38834</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2022-01-26T12:12:22Z</dc:date>
    </item>
    <item>
      <title>Re: Derivation of Wilcoxon Signed Rank test p-value for small n</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/792528#M38839</link>
      <description>&lt;P&gt;When the documentation says "&lt;SPAN&gt;the significance of&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;S&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;is computed from the exact distribution of&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;S&lt;/SPAN&gt;", it means that ALL possible outcomes (here: all combinations of plus and minus signs) are generated. You then count how many of them are at least as extreme as the value you observed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For a smaller sample (4) you can look at this:&lt;/P&gt;
&lt;PRE&gt;data small;
   do i=0,1;
   do j=0,1;
   do k=0,1;
   do l=0,1;
      samp = cats(i,j,k,l);
      num = sum(i,j,k,l);
      output;
   end;
   end;
   end;
   end;
run;

proc freq data=small;
   tables samp num;
run;&lt;/PRE&gt;
&lt;P&gt;Each of the 16 results has the same probability of occurring: 0.0625 (6.25%). So you can look at the totals of the 1s (the + signs in the signed rank test, variable num in the code) and see that the probability of getting 3 or more, as an example, is the sum of the probabilities of getting 3 or 4: 25% + 6.25% = 31.25%. That is, P(num &amp;gt;= 3) = 0.3125 and P(num = 4) = 0.0625.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It is left as an exercise to create the distribution for 5 elements and see what P(num = 5) comes out as.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Yes, there is more efficient coding for larger sample sizes, but the above is very easy for beginners to follow.&lt;/P&gt;
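&lt;P&gt;For what it's worth, one possible sketch of the 5-element case uses a single counter instead of nested loops, treating each value of the counter as a pattern of plus/minus signs. The names &lt;FONT face="courier new,courier"&gt;signs5&lt;/FONT&gt;, &lt;FONT face="courier new,courier"&gt;mask&lt;/FONT&gt; and &lt;FONT face="courier new,courier"&gt;b&lt;/FONT&gt; are just made up for this illustration:&lt;/P&gt;
&lt;PRE&gt;data signs5;
   n = 5;
   /* each mask from 0 to 2**n - 1 encodes one sign pattern in binary */
   do mask = 0 to 2**n - 1;
      num = 0;
      do b = 0 to n - 1;
         /* add bit b of mask: 1 means a + sign, 0 means a - sign */
         num = num + mod(floor(mask / 2**b), 2);
      end;
      output;
   end;
   keep mask num;
run;

proc freq data=signs5;
   tables num;
run;&lt;/PRE&gt;
&lt;P&gt;Only 1 of the 32 equally likely patterns has all five signs positive, so P(num = 5) = 1/32 = 0.03125. Changing n gives the distribution for other sample sizes.&lt;/P&gt;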
</description>
      <pubDate>Wed, 26 Jan 2022 16:35:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/792528#M38839</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-01-26T16:35:17Z</dc:date>
    </item>
    <item>
      <title>Re: Derivation of Wilcoxon Signed Rank test p-value for small n</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/793038#M38867</link>
      <description>&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Fri, 28 Jan 2022 00:14:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Derivation-of-Wilcoxon-Signed-Rank-test-p-value-for-small-n/m-p/793038#M38867</guid>
      <dc:creator>pywils</dc:creator>
      <dc:date>2022-01-28T00:14:57Z</dc:date>
    </item>
  </channel>
</rss>

