pink_poodle
Barite | Level 11

These are theoretical questions, each tied to a scenario, about the meaning of the intercept in the logistic procedure.

 

Scenario 1:

I am working with a binary response model. There is only one categorical predictor. The predictor has levels 1 and 2, and the param method is reference cell coding. The intercept (0.4961) is a logit probability of the event at the reference level (level 2). The estimate (-0.2626) is the difference between the logits of level 1 and the reference level (level 2). Now I re-run the code with the noint option, which suppresses the intercept. Did I just make the probability of the event at the reference level zero? The estimate becomes 0.2335 = 0.4961 - 0.2626, suggesting a positive answer, but perhaps there is a way to retrieve the 0.4961 number from the output.

Question 1: Is there a way to retrieve that 0.4961 number (logit probability of the event at the reference level) from the output generated with the noint option?

 

Scenario 2:

I added a second, continuous predictor to the previous model.

Without the noint option, how does that affect the meaning of the intercept? Is the intercept now the sum of a contribution from the continuous predictor and the logit probability of the event at the reference level?

 

Scenario 3: 

I ran the model from the second scenario with the noint option. 

What is the meaning of the estimate for the level 1 category now? Is it still the difference between its logit and the logit of the reference level? Is there a way to retrieve the logit of the reference level from the output?

 

Any thoughts are greatly appreciated.

1 ACCEPTED SOLUTION
FreelanceReinh
Jade | Level 19

Hi @pink_poodle,


@pink_poodle wrote:

(...)

Scenario 1:

(...) The intercept (0.4961) is a logit probability of the event at the reference level (level 2).


I'd rather say it is an estimate of that logit probability.



Scenario 1: (...) Now I re-run the code with the noint option, which suppresses the intercept. Did I just make the probability of the event at the reference level zero?


No. NOINT sets the logit of that probability to zero, not the probability itself. (Hence, that probability is now assumed to be 1/2.)


Question 1: Is there a way to retrieve that 0.4961 number (logit probability of the event at the reference level) from the output generated with the noint option?


No, there isn't. The logit is now assumed to be 0.
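
As a rough check of the arithmetic in Scenario 1, the following Python sketch (not SAS output; the helper name `expit` is an illustrative choice) recomputes the NOINT estimate from the two numbers quoted in the question and converts a logit of 0 back to a probability. It assumes, as the question reports, a model with one two-level predictor, so that the fitted logit for level 1 is the same with and without the intercept.

```python
import math

def expit(z):
    """Inverse of the logit: maps a log odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Estimates quoted in the question (model WITH intercept).
intercept = 0.4961       # estimated logit at the reference level (level 2)
level1_effect = -0.2626  # estimated logit(level 1) - logit(level 2)

# Fitted logit for level 1; this is the single estimate reported under NOINT.
level1_logit = intercept + level1_effect

# Under NOINT the reference-level logit is fixed at 0 rather than estimated.
noint_level2_prob = expit(0.0)

print(round(level1_logit, 4))  # 0.2335
print(noint_level2_prob)       # 0.5
```

Only the level 1 logit survives in the NOINT output; the reference-level logit has been replaced by the assumed value 0, so 0.4961 cannot be rebuilt from 0.2335 alone.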


Scenario 2:

I added a second, continuous predictor to the previous model.

Without the noint option, how does that affect the meaning of the intercept? Is the intercept now the sum of a contribution from the continuous predictor and the logit probability of the event at the reference level?


Now, the intercept estimates the logit probability of the event for a subject with the first predictor being at its reference level and the second predictor being equal to zero.


Scenario 3: 

I ran the model from the second scenario with the noint option. 

What is the meaning of the estimate for the level 1 category now? Is it still the difference between its logit and the logit of the reference level?


Yes, in both models (with and without intercept) this number estimates the difference between those logits, i.e., the log odds ratio of level 1 vs. level 2, everything else (i.e. the value of the second predictor) being the same. However, if the true intercept is not zero and the NOINT option is used inappropriately, the number in question will likely overestimate or underestimate that log odds ratio.


Scenario 3: 

(...) Is there a way to retrieve the logit of the reference level from the output? 


Similar to question 1, the logit of the reference level, the second predictor being equal to zero, is now assumed to be 0 and nothing else can be derived from the estimates in the output.


pink_poodle
Barite | Level 11

@FreelanceReinh ,

 

Thank you for a wonderful reply. This makes a difficult concept much clearer. Could you please clarify two quotes:

Yes, in both models (with and without intercept) this number estimates the difference between those logits, i.e., the log odds ratio of level 1 vs. level 2, everything else (i.e. the value of the second predictor) being the same.

Do you mean everything else being zero, as in your answer for Scenario 2 ("the first predictor being at its reference level and the second predictor being equal to zero")? In that case, coefficient_1 would be the same for these two models (D_level1 is the design variable for level 1 of the categorical predictor):

(1) logit(predicted probability of event) = intercept + coefficient_1*D_level1

(2) logit(predicted probability of event) = intercept + coefficient_1*D_level1 + coefficient_2*continuous_predictor

 

But, in fact, coefficient_1 changes when the continuous predictor is added, and this happens in the model with the intercept as well. How would you explain that?

However, if the true intercept is not zero and the NOINT option is used inappropriately, the number in question will likely overestimate or underestimate that log odds ratio.

Is NOINT only appropriate for a model that contains only continuous predictors? It seems that with the addition of categorical factors, the intercept becomes a necessity, because it has a meaning. For example, it is not appropriate for me to change the predicted probability of the reference level (level 2) from 0.3785 to 0.5000, as I did by using the NOINT option on model (1). How would you tell if a true intercept is not zero? When would you say it is inappropriate to use the NOINT option?

 

Thank you very much for your help.

FreelanceReinh
Jade | Level 19

@pink_poodle: You're welcome.


@pink_poodle wrote:

Yes, in both models (with and without intercept) this number estimates the difference between those logits, i.e., the log odds ratio of level 1 vs. level 2, everything else (i.e. the value of the second predictor) being the same.

Do you mean everything else being zero, like you answered for Scenario 2 ("the first predictor being at its reference level and the second predictor being equal to zero")?


No, I meant that the logit difference is calculated at the same level of the second predictor, regardless of what level this may be. Look at your equation (1): let D_level1 be 1 (let's call this equation "1a"), then change it to 0 (equation "1b"), and finally subtract equation 1b from equation 1a. You then see that the right-hand side equals coefficient_1, while the left-hand side is the difference of two logits: the logit pertaining to level 1 minus the logit pertaining to level 2 of the first predictor. You also see that this doesn't change if you include the intercept in the model: the intercept cancels out when subtracting 1b from 1a.

Now do the same with equation (2). To obtain the analogous result, the continuous predictor must not change; it has to be "the same" in equations "2a" and "2b". Otherwise the difference "2a − 2b" would yield "coefficient_1 plus coefficient_2 times the difference between the two values of continuous_predictor" on the RHS and not just "coefficient_1" as desired. But it doesn't matter what the common value of the continuous predictor in 2a and 2b is, because it will cancel out in all cases: x − x = 0.
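
The cancellation can also be checked numerically. The sketch below is plain Python with made-up coefficient values (b0, b1 and b2 are hypothetical, not estimates from the data in this thread). It evaluates equation (2) at D_level1 = 1 and D_level1 = 0 for several common values of the continuous predictor, once with a nonzero intercept and once with the intercept set to 0.

```python
def logit_p(d_level1, x, b0, b1, b2):
    """Right-hand side of equation (2): b0 + b1*D_level1 + b2*x."""
    return b0 + b1 * d_level1 + b2 * x

# Hypothetical coefficients, chosen only for illustration.
b1, b2 = -0.25, 0.03

same_x_diffs = []
for b0 in (0.5, 0.0):            # nonzero intercept, then intercept fixed at 0
    for x in (0.0, 10.0, 55.0):  # common value of the continuous predictor
        # "2a - 2b": same x in both equations, only D_level1 changes.
        diff = logit_p(1, x, b0, b1, b2) - logit_p(0, x, b0, b1, b2)
        same_x_diffs.append(diff)

# If x differs between the two equations, b2 * (difference in x) is left over.
mixed_x_diff = logit_p(1, 20.0, 0.5, b1, b2) - logit_p(0, 10.0, 0.5, b1, b2)

print(same_x_diffs)  # every entry equals b1, up to floating-point rounding
print(mixed_x_diff)  # b1 + b2 * (20 - 10), up to floating-point rounding
```

This only illustrates how the coefficient is interpreted within a given model; it says nothing about how the estimates change when the model is refitted with or without the intercept.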

 


But, in fact, coefficient_1 changes with the addition of the continuous predictor. Also with the intercept, coefficient_1 changes with the addition of the continuous predictor. How would you explain that?


Adding a new predictor changes the model. Hence, all coefficients are newly estimated considering the additional information introduced by the values of the new predictor (and their relationship to the other predictor values and the values of the dependent variable). So, it's quite common that coefficient_1 changes in this situation, especially if the new predictor is correlated with the existing predictor.

 


Is NOINT only appropriate for a model that contains only the continuous predictors?


Even if all predictors were continuous you would need a good reason for using the NOINT option. I think I've very rarely used it in practice with PROC LOGISTIC (although I did a lot of logistic regressions over the years), rather with linear models, e.g. in PROC REG. Not surprisingly: There are examples of linear regression models where the conclusion "all [predictors] xi are zero ==> [dependent variable] y must be zero" is plausible. It's more difficult to contrive a real-world example of a logistic regression model where the conclusion "all xi are zero ==> the probability of the event must be 1/2" makes sense.

 


How would you tell if a true intercept is not zero?  When would you say it is inappropriate to use the NOINT option?

As mentioned, in most cases it's quite natural to expect that the (true) intercept is not necessarily zero. So, you would omit the NOINT option and let the maximum-likelihood method do its work without that restriction. Suppose the estimate of the intercept turns out to be strikingly close to zero (unlike the other estimated coefficients) and, thinking about the model, you realize that the above conclusion ("... probability ... must be 1/2") is sensible based on subject-matter considerations. Then you may want to consider a no-intercept model. As a rule, if you're analyzing data from a planned experiment, you will use the type of model that has been defined in advance.

pink_poodle
Barite | Level 11

Thank you, I really liked the explanation about using the NOINT option. 

Ok, coefficient1 is always the difference between logit of level 1 and logit of the reference level (level2). So then, when the continuous predictor joins the model, looking at just one equation, is the intercept no longer purely the logit probability at the reference level of the categorical predictor?

Equation: logit(p) = b0 + b1*(D_level1) + b2*x

FreelanceReinh
Jade | Level 19

@pink_poodle wrote:

So then, when the continuous predictor joins the model, looking at just one equation, is the intercept no longer purely the logit probability at the reference level of the categorical predictor?

Equation: logit(p) = b0 + b1*(D_level1) + b2*x


Correct. You'd need the additional condition x=0 for this interpretation (see my reply to "Scenario 2"). However, zero could be a totally impossible value for x. In this case, the resulting logit(p)=b0 under the assumptions x=0 and D_level1=0 would at least be a purely hypothetical value and likely even nonsense because this extrapolation would be inadmissible.
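
To illustrate why b0 needs the condition x=0, here is a small Python sketch with hypothetical coefficients and a hypothetical predictor (think of x as an age observed only between 40 and 80; none of these values come from this thread). At the reference level (D_level1=0) the equation reduces to logit(p) = b0 + b2*x, which equals b0 only at x=0.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
b0, b2 = -3.0, 0.05

def reference_logit(x):
    """logit(p) at the reference level (D_level1 = 0): b0 + b2*x."""
    return b0 + b2 * x

def expit(z):
    """Inverse of the logit: maps a log odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# x = 0 lies outside the assumed observed range (40 to 80), so the first
# row is an extrapolation; the remaining rows lie inside the range.
for x in (0.0, 40.0, 60.0, 80.0):
    z = reference_logit(x)
    print(f"x={x:5.1f}  logit={z:7.3f}  p={expit(z):.3f}")
```

Within the assumed range the reference-level logit differs from b0 at every x, so b0 by itself describes no subject that could actually occur in such data.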
