div44
Calcite | Level 5

Hello,

 

I am fitting a logistic regression with a binary dependent variable. One of my independent variables is continuous and has an inverted-U-shaped relationship with the dependent variable. Since the association is not linear, I cannot figure out how to include this variable in the model. One option is to categorize the continuous variable, but I want to avoid that.

 

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
StatDave
SAS Super FREQ

Adding an EFFECT statement that defines a spline effect for your independent variable is certainly one possibility if a simple polynomial model form (squared, cubed, etc.) isn't adequate. Another easy to implement approach is to use a Generalized Additive Model in either PROC GAM or the newer (available in SAS 9.4 TS1M3) PROC GAMPL. See the examples of using these procedures in the SAS/STAT User's Guide.
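
To make the "simple polynomial model form" concrete, here is a minimal sketch in Python rather than SAS. It is an illustration only and uses none of the SAS procedures named in this thread. It simulates data whose event probability peaks in the middle of the predictor range, then fits logit(p) = b0 + b1*z + b2*z^2 by Newton-Raphson. The simulated data, the coefficient values, and every function name are hypothetical.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def solve(a, b):
    # Solve the linear system a x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_logistic(rows, y, iters=25, tol=1e-8):
    # Newton-Raphson for maximum likelihood logistic regression.
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * xi[i]
                for j in range(k):
                    hess[i][j] += w * xi[i] * xi[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical simulation: z is a centered, scaled predictor (for example, age),
# and the true log odds are an inverted parabola with its peak at z = 0.
random.seed(1)
TRUE_B0, TRUE_B2 = 0.5, -0.8
z = [random.uniform(-3.0, 3.0) for _ in range(2000)]
y = [1 if random.random() < sigmoid(TRUE_B0 + TRUE_B2 * v * v) else 0 for v in z]

# Design matrix columns: intercept, linear term, squared term.
b0, b1, b2 = fit_logistic([[1.0, v, v * v] for v in z], y)
peak = -b1 / (2.0 * b2)  # location of the maximum log odds when b2 < 0
print(f"b0={b0:.2f} b1={b1:.2f} b2={b2:.2f} peak at z={peak:.2f}")
```

A negative coefficient on the squared term is what produces the inverted-U in the log odds, and -b1/(2*b2) estimates where the risk is highest. If the curve is not symmetric around its peak, a quadratic may fit poorly, which is where splines or an additive model become attractive.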


9 REPLIES
Reeza
Super User

 


@div44 wrote:

One of my independent variables is continuous and has an inverted-U-shaped relationship with the dependent variable. Since the association is not linear, I cannot figure out how to include this variable in the model.

 

Please clarify in detail: which assumption of the logistic regression model are you concerned about, and why do you think your data do not meet it?

 

div44
Calcite | Level 5

Hello Reeza,

 

I am mainly concerned with the non-linear association between my dependent and independent variables. The independent variable shows an inverted-U shape when plotted against the dependent variable. For right-skewed data (cost data, in general), a log transformation is commonly used to model costs as independent variables; however, I am unaware of any comparable transformation for data that are U-shaped or inverted-U-shaped.

 

A classic example I can think of is a disease that affects the middle-aged population the most, the elderly population next, and the young population the least. If age were my independent variable, it would show an inverted-U-shaped relationship with the outcome.

 

I hope this is clear.

 

Thank you

Reeza
Super User

Is that an assumption for logistic regression? I don't believe it is.

pbwn
Calcite | Level 5

I think dose response is often modelled using logistic regression, so you might check that literature. For example, it discusses biphasic dose response, which I guess can be an inverted-U shape?

Reeza
Super User

The assumption is that the log odds of the outcome are linearly related to your variable, not that the two variables themselves are linearly related. So after converting to log odds, what does the relationship look like?

http://www.statisticssolutions.com/assumptions-of-logistic-regression/

Ksharp
Super User

Check the EFFECT statement of PROC LOGISTIC.

You can use a spline effect to fit the nonlinear relationship.

Calling @Rick_SAS
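
For readers unfamiliar with what a spline effect adds to a model, the sketch below (in Python, for illustration only) builds one common kind of spline basis: the truncated power basis for a cubic spline. This is not a claim about which basis or knots PROC LOGISTIC uses; the function name and the knot locations are hypothetical.

```python
def cubic_spline_basis(x, knots):
    """Truncated power basis for a cubic spline.

    Returns x, x^2, x^3, followed by one (x - k)_+^3 term per knot k,
    where (t)_+ means max(t, 0).
    """
    row = [x, x ** 2, x ** 3]
    row += [max(x - k, 0.0) ** 3 for k in knots]
    return row

# Hypothetical knot locations, e.g. ages at which the curve may change shape.
knots = [35.0, 50.0, 65.0]
for age in (30.0, 50.0, 70.0):
    print(age, cubic_spline_basis(age, knots))
```

Each returned row replaces the single predictor column in the regression. Because every knot term is zero to the left of its knot, the fitted curve can bend differently in each interval while staying smooth, so an inverted-U (symmetric or not) can be captured without categorizing the variable.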

stat_sas
Ammonite | Level 13

Hi,

 

You will see a U-shaped pattern because you are plotting a continuous variable against a binary variable, which takes only two values. A more useful plot for detecting a nonlinear relationship in logistic regression is a plot of the empirical logits.
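
As a rough illustration of what an empirical logit plot is based on, here is a minimal Python sketch (not SAS). It sorts the data by the predictor, splits it into equal-count bins, and computes log((events + 0.5) / (non-events + 0.5)) in each bin. The 0.5 adjustment is one common convention that keeps the logit finite; the toy data and all names are hypothetical.

```python
import math

def empirical_logits(x, y, n_bins=5):
    """Return a list of (mean of x in bin, empirical logit) for equal-count bins."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_bins
    out = []
    for b in range(n_bins):
        # The last bin absorbs any leftover observations.
        chunk = pairs[b * size:] if b == n_bins - 1 else pairs[b * size:(b + 1) * size]
        n = len(chunk)
        events = sum(v for _, v in chunk)
        mean_x = sum(u for u, _ in chunk) / n
        # Adding 0.5 to both counts avoids log(0) when a bin has 0 or n events.
        logit = math.log((events + 0.5) / (n - events + 0.5))
        out.append((mean_x, logit))
    return out

def outcome(age):
    # Toy deterministic rule: 16 of 20 events for ages 40-59, 4 of 20 elsewhere.
    in_middle = 40 <= age < 60
    multiple_of_5 = age % 5 == 0
    return int(in_middle != multiple_of_5)

ages = list(range(20, 80))
y = [outcome(a) for a in ages]
result = empirical_logits(ages, y, n_bins=3)
for mean_age, logit in result:
    print(f"mean age {mean_age:.1f}: empirical logit {logit:.3f}")
```

Plotting the bin means against the empirical logits shows the shape on the log-odds scale. A roughly straight line suggests a linear term is adequate; a hump, as in this toy data, suggests a squared term or a spline.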

pbwn
Calcite | Level 5

They couldn't possibly obtain a U shape if they were plotting against a dichotomous variable; I would assume they are plotting against logit(y).
