pywils
Fluorite | Level 6

Hello,

I am trying to write an algorithm that exactly matches how SAS performs its Wilcoxon signed rank test (ultimately, I would like to do this in R).

I tried to follow SAS’ documentation of the Wilcoxon signed rank test, but my result ended up quite different from SAS’ result. Specifically, I got S = -6 for my test statistic, while SAS reports S = 7.5.

Here is my code. I added a few comments to the SAS code explaining how each step aligns with SAS’ documentation.

If anyone can help me to understand why my algorithm and/or test statistic doesn't match SAS' algorithm / test statistic, I would be very grateful.

/*made up data (actually, the data is scorelines from Celtic Football Club in Scotland, but that doesn't matter)*/
data test;
input v1
	  v2;
datalines;
5 1
1 1
6 0
1 0
1 2
3 0 
5 0 
2 1
3 2
1 0
3 0
1 0 
;

/*Run SAS' Wilcoxon signed rank test*/
proc univariate data = test;
	var v1 v2;
run; 
/*note that here S = 7.5*/

/*derive x_{i}*/
data test2;
	set test;
	diffVar = v1 - v2;
	diffVarAbs = abs(diffVar);  
run; 

/*constant is \mu_{0}*/
proc sql noprint;
	select mean(diffVar) into: constant TRIMMED
	from test2;
quit; 

/*note that no observations have a difference equal to \mu_{0} so I don't have to discard any observations*/

/*derive |x_{i} - \mu_{0}| */
data test3; 
	set test2;
	scaledDiffVar = abs(diffVar - &constant);
run; 

/*derive r_{i}*/
proc rank data = test3 out = test4;
	var scaledDiffVar;
	ranks Finish;
run; 

/* filtering observations such that x_{i} > \mu_{0}*/
data test5;
	set test4; 
	WHERE diffVar > &constant;
run; 

/*constant2 is n(t)*/
proc sql noprint;
	select count(v1) into: constant2 TRIMMED
	from test3;
quit; 

/*constant3 is the sum of r_{i}s*/
proc sql noprint;
	select sum(Finish) into: constant3 TRIMMED
	from test5;
quit; 


/*calculate S; %sysevalf evaluates the arithmetic on the macro variables directly, so no one-observation helper dataset is needed*/
%let constant4 = %sysevalf(&constant3 - (&constant2 * (&constant2 + 1)) / 4);
%put S = &constant4;
/*note that now S = -6*/

 

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @pywils,

 


@pywils wrote:
/*Run SAS' Wilcoxon signed rank test*/
proc univariate data = test;
	var v1 v2;
run; 
/*note that here S = 7.5*/

Note that you perform two Wilcoxon signed rank tests: one for v1 and one for v2. The S=7.5 you quote is the test statistic for v2. What you want, however, is the Wilcoxon signed rank test for the difference v1-v2, with a non-default µ0 (according to your code), i.e.:

proc univariate data=test2 mu0=&constant;
var diffVar;
run; 

Now your S=-6 is confirmed.
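
As a cross-check outside SAS, both numbers can be recomputed by hand. The sketch below is in Python rather than SAS or R, and the helper name `signed_rank_s` is made up for illustration. It assumes the definition used in the PROC UNIVARIATE documentation: S is the sum of the ranks of |x_i - µ0| over the observations with x_i > µ0, minus n_t(n_t+1)/4, where observations equal to µ0 are dropped, n_t is the number that remain, and tied values get average ranks.

```python
def signed_rank_s(values, mu0=0.0):
    """Signed rank statistic S = sum(r_i+ for x_i > mu0) - n_t*(n_t+1)/4."""
    # Drop observations equal to mu0; n_t is the number that remain.
    centered = [x - mu0 for x in values if x != mu0]
    n_t = len(centered)
    # Rank |x_i - mu0| in ascending order, giving tied values their average rank.
    order = sorted(range(n_t), key=lambda i: abs(centered[i]))
    ranks = [0.0] * n_t
    start = 0
    while start < n_t:
        end = start
        while (end + 1 < n_t
               and abs(centered[order[end + 1]]) == abs(centered[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    # Sum the ranks of the observations that lie above mu0.
    positive_rank_sum = sum(r for r, c in zip(ranks, centered) if c > 0)
    return positive_rank_sum - n_t * (n_t + 1) / 4


# The data from the question.
v1 = [5, 1, 6, 1, 1, 3, 5, 2, 3, 1, 3, 1]
v2 = [1, 1, 0, 0, 2, 0, 0, 1, 2, 0, 0, 0]
diff = [a - b for a, b in zip(v1, v2)]
mean_diff = sum(diff) / len(diff)

print(signed_rank_s(v2))               # 7.5  (v2 alone, mu0 = 0)
print(signed_rank_s(diff, mean_diff))  # -6.0 (v1 - v2, mu0 = mean difference)
```

For v2 alone, seven zeros are dropped, so n_t = 5 and S = 15 - 7.5 = 7.5. For the differences, the positive ranks sum to 33 with n_t = 12, so S = 33 - 39 = -6. The hand-written algorithm and PROC UNIVARIATE therefore agree; the two figures simply belong to different variables.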


3 REPLIES
Reeza
Super User
Are you certain PROC UNIVARIATE is what you should be using here? I would have assumed PROC NPAR1WAY would be more appropriate.
Reeza
Super User
Here's a fully worked example that you can test your methods against. You'll have to find the example program (its name is at the bottom of the page) in your installation. I'm not sure where they're being saved online anymore.

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_univariate_examples13.htm