Solved: Re: Compare medians using more than one class variable

Epi_Stats · Posted 04-29-2023 07:40 PM

Hi,

I’m running some hypothesis tests on my data to compare the median length of stay (LOS) between 2 groups (treatment and control, categorical variable=treatment (y/n)), using the following code:

Proc NPAR1WAY data=LOS median wilcoxon;
class treatment;
var LOS;
run;

However, I also want to test for differences in median LOS between the patients who were treated/untreated and then either survived or died (mortality variable), but I can’t figure out how to do this…

I know that only one class variable is allowed in PROC NPAR1WAY, and I get an error when I try to run the following code:

Proc NPAR1WAY data=LOS median wilcoxon;
class treatment mortality;
var LOS;
run;

But is there another proc I can use to compare median LOS classified by both treatment and mortality variables?

I include some example data for context.

Appreciate all support and happy to clarify anything that may be unclear.

data example_data;
	input ID treatment mortality los;
datalines;
1 1 1 96
2 1 1 71
3 1 1 92
4 0 0 99
5 1 0 41
6 0 0 37
7 1 0 17
8 1 1 65
9 0 1 7
10 1 0 12
;
run;

Ksharp · Posted 04-29-2023 11:19 PM

You can make a new variable by combining these two variables in a DATA step:
new_class=catx(' ', treatment, mortality);

Then use this new variable in the CLASS statement:
Proc NPAR1WAY data=LOS median wilcoxon;
class new_class ;
var LOS;
run;
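
For reference, here is a minimal end-to-end sketch of the suggestion above, applied to the example data from the first post. The output dataset name example_combined is made up for illustration. Note that with more than two class levels, the WILCOXON option reports the Kruskal-Wallis test.

```sas
/* Sketch only: example_combined is a hypothetical dataset name. */
data example_combined;
    set example_data;
    /* Combine the two class variables, e.g. treatment=1 and mortality=0 gives '1 0' */
    new_class = catx(' ', treatment, mortality);
run;

/* One-way nonparametric tests across the four treatment-by-mortality groups */
proc npar1way data=example_combined median wilcoxon;
    class new_class;
    var los;
run;
```

With only 10 rows, one of the four groups (untreated, died) contains a single patient, so the results from the example data are for illustration only.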

Epi_Stats · Posted 04-30-2023 06:54 AM

Thank you @Ksharp for your reply.

I had tried this but was unsure whether it was correct to combine these variables.

When I create the new concatenated variable and re-run the analysis, am I correct that it's the p-value from the "Median One-Way Analysis" output that I should read? (In my example above, p=0.0658, so there is no significant difference in median LOS among the treated/untreated who died/survived.)

Thank you again

Ksharp · Posted 04-30-2023 10:38 AM

Yes, I think you are right.

But the median test in PROC NPAR1WAY is not very powerful. Maybe @StatDave knows a better PROC.

StatDave · Posted 04-30-2023 11:38 AM

If you are willing to assume a distribution for your LOS response, then you can probably get a more powerful test. For a non-negative response like length of stay, a distribution like gamma or inverse gaussian might be reasonable. With the data you show and using the combined predictors, the following finds a strongly significant effect - but you have to be comfortable with the distributional assumption. The LSMEANS statement gives multiple comparisons among the groups. The Mean column gives the estimates on the mean (original) scale.

proc genmod;
class new_class;
model los=new_class / dist=gamma link=log type3;
lsmeans new_class / ilink diff;
run;

Epi_Stats · Posted 04-30-2023 03:07 PM

Thank you very much. It is really interesting to learn about these different (more powerful) tests!

Epi_Stats · Posted 05-01-2023 07:03 AM

Hi @StatDave ,

Since the output from PROC GENMOD is means, is it correct to use this proc (instead of NPAR1WAY) to compare median LOS? Or should I still use PROC NPAR1WAY to compute medians when I have ≥2 class variables, and then use PROC GENMOD to test for any difference?

SteveDenham · Posted 05-01-2023 02:46 PM

Using the ilink option when the link is a log should result in a location estimate that approximates the median (geometric mean) rather than the expected value, but again it becomes a matter of distributional assumptions because it is not the same for gamma, inverse gaussian or log normal.

If you really, really want to look at differences in medians, you might consider bootstrapping as an approach.
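
One way the bootstrap idea might look in SAS is sketched below, comparing median LOS between the two treatment groups in the example data. This is an illustration under assumptions, not a definitive recipe: the dataset names (sorted, boot, med, wide, diffs, ci), the seed, and the choice of 1000 replicates are all arbitrary, and it uses a simple percentile interval.

```sas
/* Sketch only: all dataset names, the seed, and reps=1000 are illustrative choices. */
proc sort data=example_data out=sorted;
    by treatment;
run;

/* Resample with replacement within each treatment group, 1000 times */
proc surveyselect data=sorted out=boot seed=12345
                  method=urs samprate=1 outhits reps=1000;
    strata treatment;
run;

/* Median LOS per replicate and treatment group */
proc means data=boot noprint;
    by replicate treatment;
    var los;
    output out=med median=med_los;
run;

/* One row per replicate with columns trt0 and trt1 */
proc transpose data=med out=wide prefix=trt;
    by replicate;
    id treatment;
    var med_los;
run;

data diffs;
    set wide;
    diff = trt1 - trt0;   /* treated median minus untreated median */
run;

/* Percentile bootstrap 95% interval for the difference in medians */
proc univariate data=diffs noprint;
    var diff;
    output out=ci pctlpts=2.5 97.5 pctlpre=p_;
run;

proc print data=ci;
run;
```

If the resulting interval excludes 0, that suggests a difference in medians. With groups as small as those in the example data, the interval will be very unstable.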

SteveDenham

StatDave · Posted 05-01-2023 06:13 PM

GENMOD allows you to estimate the mean of whichever distribution is specified, not the median. The LSMEANS statement I showed provides estimates of the gamma mean at each level of the predictor. If you want to estimate the median, or other quantile, that is what quantile regression is for, which PROC QUANTREG can do.
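
As a rough sketch of what median regression could look like for the example data, the step below models the 0.5 quantile of LOS using both class variables directly. Using main effects only (no interaction) and a Wald test are choices made for this illustration, not something specified in the thread.

```sas
/* Sketch only: median (quantile=0.5) regression on the posted example data. */
proc quantreg data=example_data;
    class treatment mortality;
    model los = treatment mortality / quantile=0.5;
    /* Wald test that the treatment and mortality effects are both zero */
    test treatment mortality / wald;
run;
```

Unlike PROC NPAR1WAY, the CLASS statement here accepts more than one variable, so no combined variable is needed. With only 10 observations, the estimates from the example data should not be taken seriously.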

Reeza · Posted 05-01-2023 03:00 PM

If you have the LOS and other variables, I wonder whether survival analysis is an option as well, although with no censoring it may be equivalent to GENMOD.
