Epi_Stats
Obsidian | Level 7

Hi,

 

I’m running some hypothesis tests on my data to compare the median length of stay (LOS) between 2 groups (treatment and control, categorical variable=treatment (y/n)), using the following code:

 

proc npar1way data=LOS median wilcoxon;
   class treatment;
   var LOS;
run;

 

However, I also want to test for differences in median LOS between the patients who were treated/untreated and then either survived or died (mortality variable), but I can’t figure out how to do this…

 

I know that only one class variable is allowed when using the NPAR1WAY proc, and I get this error when I try to run the following code:

 

proc npar1way data=LOS median wilcoxon;
   class treatment mortality;
   var LOS;
run;

 

But is there another proc I can use to compare median LOS classified by both treatment and mortality variables?

 

I include some example data for context.

 

Appreciate all support and happy to clarify anything that may be unclear.

 

 

data example_data;
	input ID treatment mortality los;
datalines;
1 1 1 96
2 1 1 71
3 1 1 92
4 0 0 99
5 1 0 41
6 0 0 37
7 1 0 17
8 1 1 65
9 0 1 7
10 1 0 12
;
run;

 

ACCEPTED SOLUTION
Ksharp
Super User
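You can make a new variable by combining these two variables in a DATA step:
data LOS;
   set LOS;
   /* one level per treatment/mortality combination, e.g. "1 0" */
   new_class = catx(' ', treatment, mortality);
run;

Then use this new variable in the CLASS statement:
proc npar1way data=LOS median wilcoxon;
   class new_class;
   var LOS;
run;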

Epi_Stats
Obsidian | Level 7

Thank you @Ksharp for your reply.

 

I had tried this but was unsure whether it was correct to combine these variables.

 

When I create the new concatenated variable and re-run the analysis, am I correct that the p-value to read is the one from the "Median One-Way Analysis" output? (In my example above, p=0.0658, so there is no statistically significant difference in median LOS among the treated/untreated who died/survived.)

 

Thank you again

Ksharp
Super User

Yes, I think you are right.

But the median test in PROC NPAR1WAY is not very powerful. Maybe @StatDave knows a better PROC.

StatDave
SAS Super FREQ

If you are willing to assume a distribution for your LOS response, then you can probably get a more powerful test. For a non-negative response like length of stay, a distribution such as the gamma or inverse Gaussian might be reasonable. With the data you show and using the combined predictors, the following finds a strongly significant effect, but you have to be comfortable with the distributional assumption. The LSMEANS statement gives multiple comparisons among the groups. The Mean column gives the estimates on the mean (original) scale.

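proc genmod;   /* no DATA= option: uses the most recently created data set */
   class new_class;
   /* gamma response with log link; TYPE3 tests the overall new_class effect */
   model los = new_class / dist=gamma link=log type3;
   /* ILINK reports estimates on the mean scale; DIFF gives pairwise comparisons */
   lsmeans new_class / ilink diff;
run;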
Epi_Stats
Obsidian | Level 7

Thank you very much - it is really interesting to learn about these different (more powerful) tests!

Epi_Stats
Obsidian | Level 7

Hi @StatDave ,

 

Since the output from PROC GENMOD is means, is it correct to use this proc (instead of NPAR1WAY) to compare the median LOS? Would I still use PROC NPAR1WAY to compute medians for my data when I have ≥2 class variables, and then use PROC GENMOD to test for any difference? Is this correct to do?

SteveDenham
Jade | Level 19

Using the ilink option when the link is a log should result in a location estimate that approximates the median (geometric mean) rather than the expected value, but again it becomes a matter of distributional assumptions, because it is not the same for the gamma, inverse Gaussian or lognormal distributions.

 

If you really, really want to look at differences in medians, you might consider bootstrapping as an approach.

 

SteveDenham
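
A minimal sketch of the bootstrap idea mentioned above, assuming the example_data set from the original question and, for simplicity, comparing median LOS between the two treatment groups only. The data set names (los_sorted, boot, boot_med, wide, diffs, ci), the seed and the number of replicates are illustrative assumptions, not something given in the thread.

```sas
/* STRATA in PROC SURVEYSELECT needs the input sorted by the strata variable */
proc sort data=example_data out=los_sorted;
   by treatment;
run;

/* Resample with replacement within each treatment group (2000 replicates) */
proc surveyselect data=los_sorted out=boot seed=12345
                  method=urs samprate=1 outhits reps=2000;
   strata treatment;
run;

/* Median LOS per replicate and treatment group */
proc means data=boot noprint nway;
   class replicate treatment;
   var los;
   output out=boot_med median=med_los;
run;

/* One row per replicate, with columns trt0 and trt1 */
proc transpose data=boot_med out=wide prefix=trt;
   by replicate;
   id treatment;
   var med_los;
run;

data diffs;
   set wide;
   diff = trt1 - trt0;   /* treated minus untreated median */
run;

/* Percentile bootstrap interval for the difference in medians */
proc univariate data=diffs noprint;
   var diff;
   output out=ci pctlpts=2.5 97.5 pctlpre=ci_;
run;
```

If the interval stored in ci (variables ci_2_5 and ci_97_5) excludes zero, that suggests a difference in medians at roughly the 5% level. The same steps could be applied to any pair of the combined treatment/mortality groups, although with very small groups the bootstrap distribution of a median is coarse.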

StatDave
SAS Super FREQ

GENMOD allows you to estimate the mean of whichever distribution is specified, not the median. The LSMEANS statement I showed provides estimates of the gamma mean at each level of the predictor. If you want to estimate the median, or other quantile, that is what quantile regression is for, which PROC QUANTREG can do.
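
A minimal sketch of the quantile regression approach, assuming the example_data set from the original question and the combined new_class variable from the accepted solution. The data set name los_groups is a hypothetical name chosen for illustration.

```sas
/* Build the combined grouping variable (one level per treatment/mortality pair) */
data los_groups;
   set example_data;
   new_class = catx(' ', treatment, mortality);
run;

/* Median regression: QUANTILE=0.5 models the conditional median of LOS */
proc quantreg data=los_groups;
   class new_class;
   model los = new_class / quantile=0.5;
   test new_class / wald;   /* Wald test of the overall group effect */
run;
```

The parameter estimates are differences in median LOS relative to the reference level of new_class, so they address medians directly rather than means.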

Reeza
Super User

If you have the LOS and other variables, I wonder whether survival analysis is an option as well, though with no censoring it may be equivalent to the GENMOD approach.
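
One possible way to try the survival analysis idea is a nonparametric comparison with PROC LIFETEST, sketched here under the assumption that the example_data set from the original question is used and that every stay is fully observed (no censoring).

```sas
proc lifetest data=example_data;
   time los;                     /* no censoring variable: all stays are complete */
   strata treatment mortality;   /* one stratum per treatment/mortality combination */
run;
```

The output includes quartile estimates (including the median) for each stratum and log-rank and Wilcoxon tests of equality across strata. These tests compare the whole LOS distributions rather than the medians specifically.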

 
