Hello Everybody!
We ran PROC LOGISTIC with gender coded female = 0, male = 1 and the reference level set to 0. We got an odds ratio of 0.40, and it is significant at the 95% confidence level. Can we interpret this as females having a 60% decrease in the odds of being symptomatic, given that they tested COVID-19 positive?
Thanks,
Brian
Here's the code we used:
proc logistic data=out.final descending;
  class gender (ref='0') othervars (ref='0') / param=ref;
  model sympto = gender othervars
        / selection=stepwise slentry=0.3 slstay=0.35 lackfit;
run;
where gender was coded as female = 0. If the resulting odds ratio for gender is 0.4, does that mean females have lower odds of being symptomatic than males? It's not clear to me what the (ref='0') / param=ref part of the code is doing.
Thanks in advance for any guidance
Hi @BTAinRVA
The PARAM=REF option requests that SAS use reference cell coding, and REF='0' makes the value 0 (females) the reference group (baseline level) for comparisons.
In practice, if you get a significant OR = 0.40 with this method, it means that there is a 60% decrease in the odds of the event (being symptomatic) compared to males, all other parameters being constant. As this can be hard to interpret, you can set REF='1' to get an OR > 1.
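A minimal sketch of the REF=1 suggestion, assuming the same data set and variable names as in the question; only the reference level for gender changes. Because a two-level comparison is simply inverted, the reported odds ratio should become 1/0.40 = 2.5 (provided the stepwise selection keeps the same effects), and the "Odds Ratio Estimates" table should label it "0 vs 1" instead of "1 vs 0", which shows which group is in the numerator.

```sas
/* Same model as in the question, but with gender=1 (male) as the reference. */
/* The odds ratio for gender is then reported as "0 vs 1" (female vs. male). */
proc logistic data=out.final descending;
  class gender (ref='1') othervars (ref='0') / param=ref;
  model sympto = gender othervars
        / selection=stepwise slentry=0.3 slstay=0.35 lackfit;
run;
```

Checking the "x vs y" label in the output is the safest way to confirm the direction of the comparison before interpreting the estimate.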
If you had used the default method, which is effect coding (PARAM=EFFECT), with REF=0, each parameter estimate would instead measure the difference between the effect at that level (0 or 1) and the average effect of males and females combined.
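For a two-level variable such as gender, the difference between the two codings can be written out as follows. This is only a sketch: the symbols x, α and β are notation introduced here for illustration, not names from the SAS output, and female (0) is assumed to be the reference level.

```latex
% Reference coding (PARAM=REF): x = 1 for male, x = 0 for female
\operatorname{logit} P(\text{sympto}=1) = \alpha + \beta x
\quad\Rightarrow\quad
\mathrm{OR}_{\text{male vs. female}} = e^{\beta}

% Effect coding (PARAM=EFFECT): x = +1 for male, x = -1 for female
\operatorname{logit} P(\text{sympto}=1) = \alpha + \beta x
\quad\Rightarrow\quad
\mathrm{OR}_{\text{male vs. female}} = e^{\beta - (-\beta)} = e^{2\beta}
```

The parameter estimate differs between the two codings, but the odds ratio for male vs. female is the same quantity in both cases.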
Best,
Hey ed_sas_member,
Thanks for the reply! So, in a similar vein, if the OR had been, say, 1.6, we would conclude that females have 60% higher odds of being symptomatic than males, correct?
Hi @BTAinRVA
Yes absolutely
Sorry, @ed_sas_member, I think it's rather vice versa (but your explanation is fine otherwise): The original OR=0.4 is "male vs. female" because "female" (gender=0) has been defined as the reference level. (The "1 vs. 0" should also appear in the "Odds Ratio Estimates" table of PROC LOGISTIC output.) So, controlling for othervars, females have 2.5 (=1/0.4) times higher odds of being symptomatic than males (assuming that, e.g., sympto=1 means "symptomatic" vs. sympto=0). With OR=1.6 males would have 1.6 times higher odds than females.
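The arithmetic behind these statements, written out as a short worked example (the subscripts are only labels for the two directions of comparison):

```latex
\mathrm{OR}_{\text{male vs. female}}
  = \frac{\text{odds}_{\text{male}}}{\text{odds}_{\text{female}}} = 0.4
  \quad\Rightarrow\quad (0.4 - 1)\times 100\% = -60\%
  \ \text{(males: 60\% lower odds)}

\mathrm{OR}_{\text{female vs. male}}
  = \frac{1}{0.4} = 2.5
  \quad\Rightarrow\quad (2.5 - 1)\times 100\% = +150\%
  \ \text{(females: 150\% higher odds)}

\text{With } \mathrm{OR}_{\text{male vs. female}} = 1.6:\quad
  \mathrm{OR}_{\text{female vs. male}} = \frac{1}{1.6} = 0.625
  \ \text{(females: 37.5\% lower odds)}
```

Note that a 60% decrease in one direction is not a 60% increase in the other: percentage changes in odds are not symmetric under inversion.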
Also, just to underline the good point "all other parameters being constant": It's important to note that the model controls for othervars. Theoretically it's possible to observe odds ratios <1 (male vs. female) in, say, both categories othervars=0 and othervars=1 of a dichotomous model variable, but nevertheless a "crude" odds ratio >1 overall (Simpson's paradox).
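To illustrate that last point, here is a small sketch with purely hypothetical counts (the data set name simpson, the variable count and all numbers are invented for illustration and are not from the original data). By hand, the male vs. female odds ratio is (2*85)/(18*15) = 0.63 for othervars=0 and (60*6)/(40*14) = 0.64 for othervars=1, yet the crude odds ratio is (62*91)/(58*29) = 3.35.

```sas
/* Hypothetical cell counts: othervars, gender (0=female, 1=male), */
/* sympto (1=symptomatic), and the number of subjects in the cell. */
data simpson;
  input othervars gender sympto count;
  datalines;
0 1 1 2
0 1 0 18
0 0 1 15
0 0 0 85
1 1 1 60
1 1 0 40
1 0 1 14
1 0 0 6
;
run;

/* Crude model: gender only (othervars ignored). */
proc logistic data=simpson descending;
  class gender (ref='0') / param=ref;
  model sympto = gender;
  freq count;
run;

/* Adjusted model: gender controlling for othervars. */
proc logistic data=simpson descending;
  class gender (ref='0') othervars (ref='0') / param=ref;
  model sympto = gender othervars;
  freq count;
run;
```

The first model should report a "1 vs 0" odds ratio above 1 for gender, while the second should report one below 1, because in these made-up counts males are concentrated in the higher-risk othervars=1 group.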
Good catch! I misspoke because I am more used to coding male=0.
Sorry for the confusion.