I am running a multilevel ordinal logistic regression model using PROC GLIMMIX. The outcome variable is number of school suspensions (none, 1, 2 or more), with children (level 1) nested in schools (level 2). There are multiple categorical predictor variables in the model as well.
The code I have used is as follows:
proc glimmix data=have method=quad;
   class suspension_cat school_ID child_sex child_esl child_atsi mh_dx different_school siblings
         patage_cat matage_cat mat_maritalstat parent_highestED seifa_quintile school_aria;
   model suspension_cat = child_sex child_esl child_atsi mh_dx different_school siblings
                          patage_cat matage_cat mat_maritalstat parent_highestED
                          seifa_quintile school_aria
         / CL DIST=multi LINK=clogit SOLUTION ODDSRATIO(DIFF=last label);
   random intercept / subject=school_ID type=vc;
run;
The odds ratio estimates I obtain from the above code look like this:
Odds Ratio Estimates
Comparison | Estimate | DF | Lower 95% CL | Upper 95% CL |
male vs female | 9.07 | 14256 | 6.48 | 12.69 |
ESL vs non-ESL | 0.99 | 14256 | 0.63 | 1.57 |
ATSI vs non ATSI | 2.27 | 14256 | 1.64 | 3.14 |
mh_dx vs no_mh_dx | 10.73 | 14256 | 7.18 | 16.03 |
different school vs same school | 1.41 | 14256 | 1.10 | 1.79 |
siblings 4 or more vs 1 | 3.15 | 14256 | 2.15 | 4.61 |
siblings 3 vs 1 | 2.04 | 14256 | 1.42 | 2.92 |
siblings 2 vs 1 | 1.14 | 14256 | 0.83 | 1.59 |
siblings none vs 1 | 1.84 | 14256 | 1.24 | 2.72 |
patage_cat under 20 vs 30 to 39 | 1.38 | 14256 | 0.77 | 2.48 |
patage_cat 20 to 29 vs 30 to 39 | 1.41 | 14256 | 1.04 | 1.92 |
patage_cat 40 and over vs 30 to 39 | 1.01 | 14256 | 0.61 | 1.65 |
matage_cat under 20 vs 30 to 39 | 1.64 | 14256 | 0.99 | 2.71 |
matage_cat 20 to 29 vs 30 to 39 | 1.31 | 14256 | 0.94 | 1.82 |
matage_cat 40 and over vs 30 to 39 | 1.39 | 14256 | 0.63 | 3.09 |
mat_maritalstat unmarried vs married | 1.62 | 14256 | 1.18 | 2.21 |
mat_maritalstat divorced/widowed vs married | 1.63 | 14256 | 0.78 | 3.42 |
parent_highestED high school vs university | 1.47 | 14256 | 1.00 | 2.15 |
parent_highestED vocational vs university | 1.46 | 14256 | 1.01 | 2.11 |
ses_quintile 1 vs 5 | 2.65 | 14256 | 1.58 | 4.45 |
ses_quintile 2 vs 5 | 2.07 | 14256 | 1.22 | 3.50 |
ses_quintile 3 vs 5 | 1.50 | 14256 | 0.87 | 2.58 |
ses_quintile 4 vs 5 | 1.02 | 14256 | 0.56 | 1.83 |
school_aria remote vs metro | 1.59 | 14256 | 0.99 | 2.55 |
school_aria regional vs metro | 1.29 | 14256 | 0.87 | 1.91 |
What I need is to have separate odds ratios for '1 suspension' and '2 or more suspensions' (i.e. all categories of suspension other than the reference category). I know that when running a multinomial logistic regression you can use the 'group= ' option to obtain odds ratios for each level of the DV, and this is what I'm needing for the ordinal logistic regression.
I believe I need to write an 'estimate' statement with the /exp option, but I haven't been able to work out exactly how to write this. Can someone please provide me with some sample syntax?
An odds ratio from a cumulative logistic model evaluates the effect of a predictor on all possible dichotomizations of the response levels formed by splitting the ordered levels into two groups. The model assumes that the effect is the same no matter where the split occurs, which is why you get only a single odds ratio estimate for a continuous predictor, and a single odds ratio for each level comparison of a categorical predictor, as in your output. The odds ratios that you want contrast only two distinct response levels rather than all levels split into two groups, so they do not arise naturally from the ordinal model. They do arise naturally from the nominal logistic model, which uses generalized, rather than cumulative, logits.
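To make the distinction concrete, here is a sketch of the two kinds of odds ratio. The notation is my own and not GLIMMIX output: $\alpha_j$ are the cut-point intercepts, $\beta$ the fixed effects, $u_s$ the random intercept for school $s$, and the response levels are assumed to be ordered none (1), 1 suspension (2), 2 or more (3). The sign convention of the linear predictor is also an assumption and may differ from the one your software reports.

```latex
% Cumulative logit model with a random school intercept (two cut points for three levels)
\[
\operatorname{logit} P(Y \le j \mid x, u_s) = \eta_j = \alpha_j + x^\top \beta + u_s,
\qquad j = 1, 2
\]

% Cumulative odds ratio for predictor k, value a versus value b:
% the cut-point intercept cancels, so the result is the same for j = 1 and j = 2
\[
\frac{\operatorname{odds}(Y \le j \mid x_k = a)}{\operatorname{odds}(Y \le j \mid x_k = b)}
= \exp\{\beta_k (a - b)\}
\]

% Individual category probabilities implied by the model, with F(t) = 1 / (1 + e^{-t})
\[
\pi_1 = F(\eta_1), \qquad
\pi_2 = F(\eta_2) - F(\eta_1), \qquad
\pi_3 = 1 - F(\eta_2)
\]

% Odds ratio for "1 suspension" versus "none", value a versus value b of predictor k
\[
\mathrm{OR}_{2 \text{ vs } 1}
= \frac{\pi_2(a) / \pi_1(a)}{\pi_2(b) / \pi_1(b)},
\qquad
\frac{\pi_2}{\pi_1} = \frac{F(\eta_2)}{F(\eta_1)} - 1
\]
```

Because $F(\eta_2)/F(\eta_1)$ does not reduce to a function of $\beta_k$ alone, the category-specific odds ratio changes with the other covariate values and with $u_s$. That is why the ordinal model does not report it as a single number.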
So, the easiest way to get the odds ratios you want is to change the model to use generalized logits. While it is not impossible to estimate the odds ratios you want from the ordinal model, it would require some work. If you just need the point estimates, you could compute them from the predicted mean values (which are cumulative probabilities) from the OUTPUT statement. But if you also want standard errors for them (and possibly confidence intervals), then you would need to use the NLEstimate macro by writing each odds ratio as a function of the model parameters.
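A minimal sketch of the generalized logit version, reusing the data set and variable names from the question. Several things here are assumptions to check against your own data and the GLIMMIX documentation: that 'none' is the formatted value of the reference category, that GROUP=suspension_cat on the RANDOM statement is the random-effects structure you want (it gives each non-reference response level its own school intercept variance), and that METHOD=QUAD remains feasible for this larger model.

```sas
/* Sketch only: nominal (generalized logit) counterpart of the original model. */
/* Assumption: 'none' is the formatted value of the reference response level.  */
proc glimmix data=have method=quad;
   class suspension_cat school_ID child_sex child_esl child_atsi mh_dx different_school siblings
         patage_cat matage_cat mat_maritalstat parent_highestED seifa_quintile school_aria;
   /* LINK=glogit replaces LINK=clogit; each predictor now gets one odds ratio */
   /* per non-reference response level instead of a single cumulative one.     */
   model suspension_cat(ref='none') = child_sex child_esl child_atsi mh_dx different_school
                                      siblings patage_cat matage_cat mat_maritalstat
                                      parent_highestED seifa_quintile school_aria
         / cl dist=multinomial link=glogit solution oddsratio(diff=last label);
   /* Assumption: separate random school intercepts for each response logit. */
   random intercept / subject=school_ID group=suspension_cat type=vc;
run;
```

This model has roughly twice as many fixed-effect parameters as the ordinal one, so expect longer run times and wider confidence limits for the sparser response level. The odds ratios it produces compare each of '1 suspension' and '2 or more' against 'none' and ignore the ordering of the categories.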