SteveDenham
Jade | Level 19

I should probably delete the earlier post, as it did NOT use the FREQ statement. Here is code for a 2-parameter and a 3-parameter Weibull. The 2-parameter model fits the earlier points better; the 3-parameter model fits the later points better (see attached output file).

 

proc reliability data=los;
class treatment;
distribution weibull;
model length_of_stay = treatment ;
probplot length_of_stay = treatment;
freq freq;
run;

proc reliability data=los;
class treatment;
distribution weibull3;
model length_of_stay = treatment ;
probplot length_of_stay = treatment;
freq freq;
run;

The estimated medians are 7.3 for Yes and 7.5 for No under the 3-parameter model, and 10.1 for Yes and 10.0 for No under the 2-parameter model, with substantial overlap of the 95% CIs. I am still mystified as to why the median test in NPAR1WAY comes up significant.
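For reference, a Weibull median follows from the fitted parameters in closed form. This is a sketch that assumes the common parameterization with scale $\alpha$, shape $\beta$ and, for the 3-parameter case only, threshold $\theta$. The symbols are shorthand introduced here, not PROC RELIABILITY output names.

```latex
F(t) = 1 - \exp\!\left[-\left(\frac{t-\theta}{\alpha}\right)^{\beta}\right], \quad t > \theta
\qquad\Longrightarrow\qquad
t_{0.5} = \theta + \alpha\,(\ln 2)^{1/\beta}
```

Setting $\theta = 0$ gives the 2-parameter case.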

 

SteveDenham

 

 

Recep
Quartz | Level 8

Thanks @SteveDenham! I'll treat them as not statistically significantly different. I'm also puzzled by the NPAR1WAY median test results, and I have asked SAS Technical Support about it. Note, though, that I made a typo in my original post above when writing down the p-values of the two NPAR1WAY tests. They should read 0.2012 for the Wilcoxon Two-Sample test and 0.0175 (not 0.175) for the Median Two-Sample test. I'll wait to hear back from SAS and post their reply here.

PGStats
Opal | Level 21

A possible partial explanation for the sensitivity of the median test is illustrated by the cumulative distribution function obtained with:

proc univariate data=los;
class treatment;
var Length_of_stay;
freq freq;
histogram;
cdfplot / Weibull(theta=est) statref=median overlay;
ods output CDFplot=losCDF;
run;

title "Empirical distribution function for Length_of_stay";
proc sgplot data=losCDF;
where CDFx > 0;
series x=ECDFx y=ECDFy / group=class1;
refline 50 / axis=y label="Median" labelloc=inside;
xaxis type=log;
run;

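[Image PGStats_0-1614370911022.png: empirical distribution function of Length_of_stay by treatment group, log-scaled x axis, reference line at the median]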

Note that the greatest difference between the two curves lies almost exactly at the median.

 

This is also found with:

proc npar1way data=los edf plots=none;
class treatment;
freq freq;
var Length_of_stay;
run;

[Image PGStats_1-1614371566785.png: EDF output from PROC NPAR1WAY]

That is, the median happens to be the value at which the two distributions differ the most.
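To see the Wilcoxon and median tests next to the EDF statistics in one run, something like the following should work. It is only a sketch: it assumes the same los data set and variable names used above, and it relies on the WILCOXON, MEDIAN and EDF options of PROC NPAR1WAY to request the analyses discussed in this thread.

```sas
/* Sketch: request the Wilcoxon, median and EDF analyses in one call. */
/* Assumes the los data set with treatment, Length_of_stay and freq.  */
proc npar1way data=los wilcoxon median edf plots=none;
   class treatment;
   freq freq;
   var Length_of_stay;
run;
```

Naming the analyses explicitly limits the output to those three, so the p-values can be compared side by side.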

PG
Recep
Quartz | Level 8

Thanks a lot! I think this explains the two contradictory results: the ranked values of length of stay are very similar across the two groups, hence the non-significant Wilcoxon rank-sum (i.e., Wilcoxon-Mann-Whitney) test, yet the medians differ enough to give a statistically significant Median Two-Sample test. This shows that one should be very careful when interpreting test results! I still wonder, though, what the "Median Two-Sample" test is useful for.
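One way to see why the two tests can disagree is to write both as linear rank statistics that differ only in the score given to each rank. The sketch below follows the form given in the PROC NPAR1WAY documentation as I understand it. The notation is my own shorthand: $R_j$ is the rank of observation $j$ in the pooled sample of size $n$, and $c_j$ is 1 for observations in one of the two groups and 0 otherwise.

```latex
S = \sum_{j=1}^{n} c_j \, a(R_j), \qquad
a_{\text{Wilcoxon}}(R_j) = R_j, \qquad
a_{\text{median}}(R_j) =
\begin{cases}
1 & \text{if } R_j > (n+1)/2 \\
0 & \text{otherwise}
\end{cases}
```

The Wilcoxon scores use the full ordering, so a gap between the two distributions that is confined to a narrow region moves $S$ only slightly. The median scores record only which side of the pooled median each observation falls on, so a gap located right at the median, as in the CDF plot above, shows up directly in the statistic.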

 

By the way, I said above that I would post SAS's response here, so here it is:

 

"Hi Recep,

 

You can find the formulas in the documentation here, if you would like to review them.   

 

Most people use the Wilcoxon rank sum test to compare the medians of two groups of data, however you may need to consult with a statistician regarding your particular analysis and input data.  

 

Sincerely,

XXX 

SAS Technical Support"
