Hi,
I have the following distribution and odds ratios for an event (taking up a certain product).
From my understanding of odds ratios, they should be interpreted as:
1) Those in NoCreditCard are more likely to take up the product than those in WithCreditCard_WithTxn
2) Those in WithCreditCard_WithTxn are more likely to take up the product than those in WithCreditCard_NoTxn
But why does my actual event distribution (observed data) show that only 19% of those with "No Credit Card" take up the product, which is the lowest of all groups? Did I read the odds ratios wrongly?
Group | Base Distribution | Event |
No Credit Card | 34% | 19% |
WithCreditCard_NoTxn | 55% | 23% |
WithCreditCard_WithTxn | 11% | 31% |

Comparison | Point Estimate |
NoCreditCard vs WithCC_WithTxn | 1.196 |
WithCC_NoTxn vs WithCC_WithTxn | 0.839 |
Thanks.
Mei.
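A quick sanity check on the numbers above, as a minimal sketch: it assumes the "Event" column is the take-up rate within each group, and it computes unadjusted odds ratios only (it does not reproduce whatever model produced the point estimates). The names `event_rate`, `odds`, and `raw_odds_ratio` are made up for this illustration.

```python
# Event (take-up) rates per group, copied from the table in the post.
event_rate = {
    "NoCreditCard": 0.19,
    "WithCC_NoTxn": 0.23,
    "WithCC_WithTxn": 0.31,
}
REFERENCE = "WithCC_WithTxn"


def odds(p: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


# Unadjusted odds ratio of each group against the reference group.
raw_odds_ratio = {
    group: odds(rate) / odds(event_rate[REFERENCE])
    for group, rate in event_rate.items()
    if group != REFERENCE
}

for group, value in raw_odds_ratio.items():
    print(f"{group} vs {REFERENCE}: {value:.3f}")
```

Both unadjusted ratios come out below 1 (roughly 0.52 and 0.66), which matches the direction of the observed rates. A model-based estimate can differ in size, for example if the model contains other inputs, but a reported value above 1 for NoCreditCard is worth questioning.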
What parameterization method did you use? I suggest including your code and the relevant output directly as well.
Hi,
I am using EM 12.1, and this is the setting for logistic regression.
What are your options for Input Coding?
By default, SAS uses a parameterization that is not what a standard textbook teaches. I think the relevant setting is Input Coding, but I'm not 100% sure.
Looks like it is Input Coding. I changed the setting to "GLM" instead of "Deviation" (the default), and the odds ratios changed to this:

Comparison | Point Estimate |
NoCreditCard vs WithCC_WithTxn | 0.766 |
WithCC_NoTxn vs WithCC_WithTxn | 0.839 |

Thank you. I need to do some research on input coding, as I'm not quite sure how it works.
Reference (REF) coding is the most common. It basically creates dummy variables for your categorical variables.
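To make the difference between the two codings concrete, here is a rough sketch in plain Python. It assumes a model with the credit-card group as its only input, so the fitted logits equal the observed ones; the real model in this thread may well contain other inputs, so the numbers are illustrative only. All variable names are hypothetical, and nothing here calls SAS.

```python
import math

# Observed take-up rates from the thread; a one-input model is assumed.
rates = {"NoCreditCard": 0.19, "WithCC_NoTxn": 0.23, "WithCC_WithTxn": 0.31}
REFERENCE = "WithCC_WithTxn"

# Log-odds (logit) of the event in each group.
logit = {k: math.log(p / (1.0 - p)) for k, p in rates.items()}
others = [k for k in rates if k != REFERENCE]

# Reference (dummy) coding: coefficient = group logit minus reference logit.
ref_beta = {k: logit[k] - logit[REFERENCE] for k in others}

# Deviation (effect) coding: coefficient = group logit minus the
# unweighted mean of all group logits.
mean_logit = sum(logit.values()) / len(logit)
dev_beta = {k: logit[k] - mean_logit for k in others}
# The reference group's implied effect is minus the sum of the others.
dev_beta_reference = -sum(dev_beta.values())

# exp(coefficient) under each coding, and the odds ratio against the
# reference group recovered from the deviation-coded coefficients.
ref_or = {k: math.exp(ref_beta[k]) for k in others}
dev_exp = {k: math.exp(dev_beta[k]) for k in others}
recovered = {k: math.exp(dev_beta[k] - dev_beta_reference) for k in others}

for k in others:
    print(f"{k}: exp(ref)={ref_or[k]:.3f}  "
          f"exp(dev)={dev_exp[k]:.3f}  recovered={recovered[k]:.3f}")
```

Under reference coding, exp(coefficient) is directly the odds ratio against the reference group. Under deviation coding, exp(coefficient) compares a group with the average over all groups, so a comparison with the reference group needs the difference of two effects, as in `recovered`.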
Hi Reeza,
Sorry, coming back to this. Now that I understand the difference with reference (REF) coding, I have a question about odds ratios.
Say I am trying to predict the likelihood of buying ice cream (event = 1, non-event = 0) using logistic regression.
I have only one variable, "Gender" (2 values: Male/Female), so the formula would look like this:
logit(p) = β0 + β1*female
Odds ratio for female with reference to male (A) = odds(female) / odds(male)
Next, I added "TodayWeather" (2 values: Sun/Rain):
logit(p) = β0 + β1*female + β2*SUN
Q1) Why does the odds ratio (A) for female with reference to male change? Shouldn't my odds ratio for female still be the same as (A) above, since it is still comparing female and male (i.e. the change in the odds of buying ice cream when female increases by one unit)?
Q2) Say (A) above is < 1. Is it possible that once I add "TodayWeather", the odds ratio becomes > 1? Why?
Thank you again.
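For reference, the two models above can be written out as follows. This is a sketch assuming reference coding with female = 1, male = 0 and SUN = 1 for sun, 0 for rain; the primes on the second model's coefficients are notation added here to stress that they are different quantities from those in the first model.

```latex
\begin{aligned}
\text{Model 1:}\quad & \operatorname{logit}(p) = \beta_0 + \beta_1\,\text{female}
  \;\Rightarrow\;
  \mathrm{OR}_{\text{female vs male}}
  = \frac{e^{\beta_0 + \beta_1}}{e^{\beta_0}} = e^{\beta_1} \\[4pt]
\text{Model 2:}\quad & \operatorname{logit}(p) = \beta_0' + \beta_1'\,\text{female} + \beta_2'\,\text{SUN}
  \;\Rightarrow\;
  \mathrm{OR}_{\text{female vs male}\,\mid\,\text{weather}}
  = \frac{e^{\beta_0' + \beta_1' + \beta_2'\,\text{SUN}}}{e^{\beta_0' + \beta_2'\,\text{SUN}}} = e^{\beta_1'}
\end{aligned}
```

In the second model the odds ratio compares female and male at the same weather, so it is a conditional quantity, and nothing forces it to equal the unconditional one from the first model.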
@okla wrote:
Q1) Why does the odds ratio (A) for female with reference to male change? Shouldn't my odds ratio for female still be the same as (A) above, since it is still comparing female and male (i.e. the change in the odds of buying ice cream when female increases by one unit)?
Think of it as a linear regression: when you add a new variable, the parameters change. Since the parameters have changed, the odds ratio changes too.
Q2) Say (A) above is < 1. Is it possible that once I add "TodayWeather", the odds ratio becomes > 1? Why?
Yes, that sounds like Simpson's paradox:
https://en.wikipedia.org/wiki/Simpson%27s_paradox
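A small numeric illustration of that reversal, using entirely hypothetical counts (they are not from this thread and were chosen only to produce the effect). It compares the pooled female-vs-male odds ratio with the odds ratios within each weather level, rather than fitting a logistic regression.

```python
# Hypothetical counts: (buyers, non_buyers) for each gender and weather.
counts = {
    ("female", "sun"): (81, 6),
    ("female", "rain"): (192, 71),
    ("male", "sun"): (234, 36),
    ("male", "rain"): (55, 25),
}


def odds_ratio(female, male):
    """Odds ratio of buying, female vs male, from (buyers, non_buyers)."""
    f_buy, f_no = female
    m_buy, m_no = male
    return (f_buy / f_no) / (m_buy / m_no)


def pooled(gender):
    """Sum buyers and non-buyers over both weather levels for a gender."""
    buy = sum(b for (g, _), (b, _n) in counts.items() if g == gender)
    no = sum(n for (g, _), (_b, n) in counts.items() if g == gender)
    return buy, no


# Ignoring weather (analogous to the gender-only model).
marginal_or = odds_ratio(pooled("female"), pooled("male"))
# Holding weather fixed (analogous to adjusting for weather).
sun_or = odds_ratio(counts[("female", "sun")], counts[("male", "sun")])
rain_or = odds_ratio(counts[("female", "rain")], counts[("male", "rain")])

print(f"ignoring weather: {marginal_or:.3f}")
print(f"sun only:         {sun_or:.3f}")
print(f"rain only:        {rain_or:.3f}")
```

With these counts the pooled odds ratio is below 1, while it is above 1 on both sunny and rainy days. The reversal arises because, in this made-up data, most of the male customers are observed on sunny days (when everyone buys more) and most of the female customers on rainy days.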
Hi Reeza, thanks for your speedy reply. I really appreciate it. I think I found my answer. 🙂
Hi @okla,
I'm glad you found your answer! If one of the replies was the exact solution to your problem, can you "Accept it as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.
Thanks!
Anna