Statistical Procedures

BTAinRVA · Posted 06-19-2020 09:27 AM

Hello Everybody!

So we ran PROC LOGISTIC with gender coded female = 0, male = 1 and the reference level set to 0. We got an odds ratio of 0.40, and it is significant at the 95% level of confidence. Can we interpret this as females having a 60% decrease in the odds of being symptomatic, given that they tested COVID-19 positive?

Thanks,

Brian

BTAinRVA · Posted 06-19-2020 11:18 AM

Here's the code we used:

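/* DESCENDING: model the higher value of sympto as the event */
proc logistic data=out.final descending;
    /* PARAM=REF: reference-cell coding, with level '0' as the baseline */
    class gender (ref='0') othervars (ref='0') / param=ref;
    model sympto = gender othervars
        / selection = stepwise
          slentry   = 0.3
          slstay    = 0.35
          lackfit;
run;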

where gender was coded as female = 0. If the resulting odds ratio for gender is 0.4, does that mean females have lower odds of being symptomatic than males? It's not clear to me what the (ref = '0') and / param=ref parts of the code are doing.

Thanks in advance for any guidance

ed_sas_member · Posted 06-19-2020 12:09 PM

Hi @BTAinRVA

The PARAM=REF option requests that SAS use reference cell coding, and REF=0 defines the value 0 (<=> females) as the reference group (baseline level) for comparison.

In practice, if you get a significant OR = 0.40 with this method, it means that there is a 60% decrease in the odds of the event (<=> being symptomatic) compared to males, all other parameters being constant. As this can be hard to interpret, you can define REF=1 to get an OR > 1.
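
As a rough sketch of what switching the reference level looks like, the call below reuses the dataset and variable names from the posted code and only changes the gender reference to '1'. Leaving out the stepwise options is an assumption made here for brevity, not part of the original analysis. With REF='1', the "Odds Ratio Estimates" table should list gender as "0 vs 1", and for the same fitted model that estimate is the reciprocal of the "1 vs 0" value.

```sas
/* Sketch only: same variables as the posted code, male (gender=1) as reference. */
/* Stepwise selection is omitted here for brevity (an assumption of this sketch). */
proc logistic data=out.final descending;
    class gender (ref='1') othervars (ref='0') / param=ref;
    model sympto = gender othervars;
run;
```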

In practice:

SAS would have created the following dummy variable: Male (1) -> 1; Female (0) -> 0
So the logit for males = beta0 + beta1*(1) and the logit for females = beta0 + beta1*(0)
And so OR = exp(beta1), which can be easily deduced
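
To make the reference-coding arithmetic concrete, here is a minimal DATA step sketch. The coefficient values are made up for illustration (beta1 is chosen so that exp(beta1) = 0.4); they are not estimates from the thread's model.

```sas
/* Hypothetical coefficients, for illustration only */
data _null_;
    beta0 = -0.5;        /* made-up intercept */
    beta1 = log(0.4);    /* made-up gender coefficient, so exp(beta1) = 0.4 */

    logit_male   = beta0 + beta1*1;   /* design value 1 for gender=1 */
    logit_female = beta0 + beta1*0;   /* design value 0 for the reference level */

    /* The difference of the logits is beta1, so the OR is exp(beta1) */
    or_male_vs_female = exp(logit_male - logit_female);
    put or_male_vs_female=;
run;
```

Note that the comparison is "level 1 vs. reference level 0", that is, male vs. female under the coding used in this thread.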

If you had used the default method, which is effect coding (PARAM=EFFECT), and defined REF=0, each parameter estimate would have measured the difference between the effect at that level (0 or 1) and the average effect of males and females combined.

In practice,

SAS would have created the following design variable: Male (1) -> 1; Female (0) -> -1
So the logit for males = beta0 + beta1*(1) and the logit for females = beta0 + beta1*(-1)
And so OR = exp(2*beta1), which is not as easily deduced
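
The relationship between the two codings can be checked with a short DATA step. This is only a sketch: the target odds ratio of 0.4 is borrowed from the thread, and the coefficients are back-calculated from it rather than estimated from data.

```sas
/* Sketch: which coefficient each coding implies for the same odds ratio */
data _null_;
    or_target = 0.4;                      /* OR from the thread (male vs. female) */

    beta1_ref    = log(or_target);        /* reference coding: OR = exp(beta1)   */
    beta1_effect = log(or_target) / 2;    /* effect coding:    OR = exp(2*beta1) */

    or_from_ref    = exp(beta1_ref);      /* recovers or_target */
    or_from_effect = exp(2*beta1_effect); /* recovers or_target */

    put beta1_ref= beta1_effect= or_from_ref= or_from_effect=;
run;
```

So the parameter estimate is half as large under effect coding, while the odds ratio itself is the same under either coding.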

Best,

BTAinRVA · Posted 06-19-2020 12:27 PM

Hey ed_sas_member,

Thanks for the reply! So, in a similar vein, if the OR had been, say, 1.6, we would conclude that females have a 60% increase in the odds of being symptomatic compared to males, correct?

ed_sas_member · Posted 06-19-2020 12:44 PM

Hi @BTAinRVA

Yes absolutely

FreelanceReinh · Posted 06-19-2020 01:07 PM

Sorry, @ed_sas_member, I think it's rather vice versa (but your explanation is fine otherwise): The original OR=0.4 is "male vs. female" because "female" (gender=0) has been defined as the reference level. (The "1 vs. 0" should also appear in the "Odds Ratio Estimates" table of PROC LOGISTIC output.) So, controlling for othervars, females have 2.5 (=1/0.4) times higher odds of being symptomatic than males (assuming that, e.g., sympto=1 means "symptomatic" vs. sympto=0). With OR=1.6 males would have 1.6 times higher odds than females.
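
The reciprocal and the percent changes can be verified with a small DATA step. This is just the arithmetic on the OR of 0.4 quoted in the thread; the variable names are made up for illustration.

```sas
/* Arithmetic sketch on the reported OR (male vs. female) */
data _null_;
    or_male_vs_female = 0.4;
    or_female_vs_male = 1 / or_male_vs_female;          /* 2.5 */

    pct_change_male   = (or_male_vs_female - 1) * 100;  /* about -60: males have 60% lower odds    */
    pct_change_female = (or_female_vs_male - 1) * 100;  /* about +150: females have 2.5x the odds  */

    put or_female_vs_male= pct_change_male= pct_change_female=;
run;
```

In other words, the "60% decrease" applies to males relative to females, not the other way around.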

Also, just to underline the good point "all other parameters being constant": It's important to note that the model controls for othervars. Theoretically it's possible to observe odds ratios <1 (male vs. female) in, say, both categories othervars=0 and othervars=1 of a dichotomous model variable, but nevertheless a "crude" odds ratio >1 overall (Simpson's paradox).
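
Here is a small sketch of such a situation. The counts are entirely made up to produce the effect and have nothing to do with the data in this thread. By hand, the male vs. female odds ratio works out to about 0.44 within each level of othervars, while the crude odds ratio is about 7.8.

```sas
/* Made-up counts illustrating Simpson's paradox (not real data) */
data simpson;
    input othervars gender sympto count;
    datalines;
1 1 1 80
1 1 0 20
1 0 1 9
1 0 0 1
0 1 1 1
0 1 0 9
0 0 1 20
0 0 0 80
;
run;

proc freq data=simpson;
    weight count;
    /* Odds ratio within each level of othervars */
    tables othervars*gender*sympto / relrisk nopercent nocol;
    /* Crude odds ratio, ignoring othervars */
    tables gender*sympto / relrisk nopercent nocol;
run;
```

The reversal happens because, in these made-up counts, males are concentrated in the othervars=1 group, where almost everyone is symptomatic.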

ed_sas_member · Posted 06-19-2020 01:33 PM

Hi @FreelanceReinh

Good catch! I misspoke because I am more used to coding male=0.

Sorry for the confusion
