Hi all,
I am trying to determine whether there is an association between an unordered categorical predictor (occupation) and an ordered categorical outcome (relatedness). The variables are coded as follows:
Occupation - 0 = farmer, 1 = fisher, 2 = baker
Relatedness - 0 = low, 1 = medium, 2 = high
I am trying to perform the regression using the following code:
proc logistic data = mydata;
class occupation(ref = '0') relatedness(ref = '0');
model relatedness = occupation;
run;
The model converges, and the score test for the proportional odds assumption is not significant. However, I am a bit confused about how to interpret the odds ratio estimates in the output. I have 0 (farmer) as the reference occupation; the reported OR for fishers is 0.672 and the OR for bakers is 1.189. Is this OR the same across the low, medium, and high levels of relatedness, or am I off base? Any help would be appreciated.
First, when you fit an ordinal response model, always examine the Response Profile table to be sure that your response levels are in logically ascending or descending order. Otherwise, the results are meaningless. Note in the log that your reference level setting for the response variable is ignored, since this is an ordinal response. In general, do not specify the response variable in the CLASS statement. By default, you will be modeling the probability of lower response levels. If you want to model the probability of higher levels, specify the DESCENDING option in parentheses following the response variable in the MODEL statement (model relatedness(desc) = occupation;). Now, concerning the odds ratio estimates: each one is the ratio of the odds of a lower response when comparing two of your occupation levels. So, your 1.189 odds ratio means that the odds of lower relatedness are higher for bakers than for farmers, since it is greater than 1.
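To connect this to the original question of whether the OR is the same across the relatedness levels, here is a sketch of the model being fit. The notation is mine, not SAS output: Y is relatedness, k is the occupation level, and I am assuming the default (ascending) response order with farmer (0) as the reference.

```latex
% Cumulative (proportional odds) logits for a 3-level response: two cut points, j = 0 and j = 1
\[
\log\frac{P(Y \le j \mid \text{occupation}=k)}{P(Y > j \mid \text{occupation}=k)}
  = \alpha_j + \beta_k, \qquad j \in \{0, 1\}, \quad \beta_0 = 0
\]
% The intercepts \alpha_j cancel in the ratio, so the odds ratio does not depend on j
\[
\mathrm{OR}_{k \text{ vs } 0}
  = \frac{\operatorname{odds}(Y \le j \mid k)}{\operatorname{odds}(Y \le j \mid 0)}
  = e^{\beta_k} \quad \text{for both } j = 0 \text{ and } j = 1
\]
```

Only the intercepts change with the cut point, so under the proportional odds assumption a single odds ratio per occupation comparison applies both to "low versus medium or high" and to "low or medium versus high".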
That question is not consistent with an ordinal response. As discussed in this note, the model on an ordinal response models a set of logits, each of which divides the set of response levels into two groups. This is done by moving the division between the ordered levels progressively higher. What you seem to want is to contrast a pair of response levels (2 and 1), not involving the other level (0). If that is what you want, then you want a nominal, not an ordinal, model. That requires specifying the LINK=GLOGIT option in the MODEL statement. A generalized logit model contrasts each nonreference response level with a single reference level, which by default is the last level in the response ordering. With the DESCENDING option alone, the reference would therefore be level 0, and the logits would contrast level 2 with level 0 and level 1 with level 0. To contrast level 2 with level 1, make level 1 the reference with the REF= response option. One logit then contrasts level 2 with level 1, and the other contrasts level 0 with level 1. The odds ratios reported for the 2 versus 1 logit are what you want. Check the Response Profile table to confirm which level is used as the reference. In summary:
proc logistic data=mydata;
class occupation(ref="0");
model relatedness(ref="1") = occupation / link=glogit;
run;
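For comparison with the ordinal model, the generalized logit model can be sketched as follows. Again the notation is mine: r stands for whichever response level is the reference (the Response Profile table reports it), and k is the occupation level with farmer (0) as the reference.

```latex
% Generalized (nominal) logits: one logit per nonreference response level j
\[
\log\frac{P(Y = j \mid \text{occupation}=k)}{P(Y = r \mid \text{occupation}=k)}
  = \alpha_j + \beta_{jk}, \qquad j \ne r, \quad \beta_{j0} = 0
\]
% Each logit has its own occupation effects, so the odds ratios differ by logit
\[
\mathrm{OR}^{(j \text{ vs } r)}_{k \text{ vs } 0} = e^{\beta_{jk}}
\]
```

Unlike the proportional odds model, the occupation effects here carry the index j, so you get a separate set of odds ratios for each pair of response levels being contrasted.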