RyanSimmons
Pyrite | Level 9

I am running some simulations in SAS, part of which uses the Wilcoxon rank-sum test to compare two "treatment" groups, implemented with PROC NPAR1WAY. Specifically, I am extracting the Z statistic from the Normal approximation to the Wilcoxon test under a series of specified alternative hypotheses.

 

However, I noticed that, between simulations, the SIGN of the Z statistic was switching from negative to positive, despite the fixed (positive) effect size (and a large sample size). Upon further investigation, it appears that PROC NPAR1WAY is changing the "reference" group against which it calculates the Z statistic. It always chooses the first value of the class variable that appears in the dataset as the group for which the Z statistic is calculated (that is, if the group associated with that first value has a positive effect, it gets a positive Z statistic, and similarly if it has a negative effect).

 

Here is some toy data demonstrating this behavior:

 


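/* Simulate 50 observations per group: A ~ N(0,1), B ~ N(1,1) */
DATA A_ref;
  call streaminit(395);
  do i = 1 to 100;
    if i<=50 then do;
      y = rand('Normal',0,1);
      trt = 'A';
    end;
    else do;
      y = rand('Normal',1,1);
      trt = 'B';
    end;
    output;
  end;
run;

/* Same data, reordered so that group B appears first */
PROC SORT data=A_ref out=B_ref;
  by descending trt;
run;

/* Group A appears first in the data */
PROC NPAR1WAY data=A_ref wilcoxon;
  class trt;
  var y;
run;

/* Group B appears first in the data */
PROC NPAR1WAY data=B_ref wilcoxon;
  class trt;
  var y;
run;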

You can see that the only difference between "A_ref" and "B_ref" is the order in which the treatment groups appear in the dataset.

 

And as you can then see from the PROC NPAR1WAY calls, the results are identical, except for the sign of the Z statistic (and the order in which the groups appear in the box plot, etc.).

 

Now, after viewing the documentation for PROC NPAR1WAY, I can't seem to find a way to set the class order within the proc itself. There is no "order" option in the proc OR class statements. Is there a way to do this that I am missing? Obviously I can manually sort the datasets before the call to make sure they are in the order I want, but this seems a bit clunky, especially since so many other SAS procedures allow for custom ordering of class variables within the proc.
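For reference, the manual-sort workaround mentioned above can be sketched as follows. This is only a minimal sketch, not a built-in ordering option. It relies on the first-appearing-group behavior described above, which the toy data only demonstrates for equal group sizes. The data set names `sorted` and `wil_stats` are made up for illustration, and the variable names `_WIL_`, `Z_WIL`, and `P2_WIL` are what I understand the OUTPUT statement to produce, so check them with PROC CONTENTS on your installation.

```sas
/* Put the groups in a fixed order so that 'A' always appears first */
proc sort data=B_ref out=sorted;
  by trt;
run;

/* Save the Wilcoxon statistics to a data set instead of printing them.
   Z_WIL is assumed to hold the normal-approximation Z statistic. */
proc npar1way data=sorted wilcoxon noprint;
  class trt;
  var y;
  output out=wil_stats wilcoxon;
run;

/* Inspect the saved statistic, Z, and two-sided p-value */
proc print data=wil_stats;
  var _WIL_ Z_WIL P2_WIL;
run;
```

In a simulation with a replicate index, the same idea would presumably extend to sorting by the replicate variable and then `trt`, and adding a BY statement to the PROC NPAR1WAY step.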

2 REPLIES
PaigeMiller
Diamond | Level 26

Why don't you use the absolute value of the Z statistic?

--
Paige Miller
RyanSimmons
Pyrite | Level 9

That actually is what I have been using for now. The main reason I am hesitant to use this going forward is that in the future we may actually be interested in differentiating between "negative" and "positive" results. This matters less for large effect sizes, but for small or null effect sizes we may be interested in situations where, due to random sampling variability, a test statistic indicates a "negative" effect, since the context of our simulation is group sequential analyses with different types of stopping rules.
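One possible way to keep the sign without depending on data order is to rebuild Z for a fixed group from the per-group score sums, which are reported for both groups. The sketch below makes several assumptions: that the ODS table is named `WilcoxonScores` with columns `Class`, `SumOfScores`, `ExpectedSum`, and `StdDevOfSum`, and that the default continuity correction of 0.5 is in effect. The data set names `scores` and `z_for_B` are hypothetical. Compare the result against the printed Z before relying on it.

```sas
/* Capture the per-group Wilcoxon score sums (table and column names assumed) */
ods output WilcoxonScores=scores;
proc npar1way data=A_ref wilcoxon;
  class trt;
  var y;
run;

/* Rebuild Z for group B specifically, whichever group appears first */
data z_for_B;
  set scores;
  where Class = 'B';
  diff = SumOfScores - ExpectedSum;
  /* Continuity correction: shrink the difference by 0.5 toward zero */
  z = (diff - 0.5*sign(diff)) / StdDevOfSum;
  keep Class SumOfScores ExpectedSum StdDevOfSum z;
run;

proc print data=z_for_B;
run;
```

Running the same steps on `B_ref` should give the same `z`, since the calculation is tied to the group label rather than to the order of the rows.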
