div44
Calcite | Level 5

Hello,

 

I am fitting a logistic regression with a binary dependent variable. However, one of my independent variables is continuous and has an inverted-U-shaped relationship with my dependent variable. Since the association is not linear, I cannot figure out how to incorporate this variable in the model. One option is to categorize the continuous variable, but I want to avoid that.

 

Thank you.

1 ACCEPTED SOLUTION
StatDave
SAS Super FREQ

Adding an EFFECT statement that defines a spline effect for your independent variable is certainly one possibility if a simple polynomial model form (squared, cubed, etc.) isn't adequate. Another easy-to-implement approach is to fit a generalized additive model with either PROC GAM or the newer PROC GAMPL (available beginning in SAS 9.4 TS1M3). See the examples of using these procedures in the SAS/STAT User's Guide.
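As a rough sketch of the polynomial and generalized additive options mentioned above: the data set name `mydata`, the binary response `y` (coded 1 for the event), and the predictor `age` are all hypothetical placeholders, not names from the original question.

```sas
/* Hypothetical names: data set MYDATA, binary response Y (1 = event), */
/* continuous predictor AGE. Substitute your own.                      */

/* Option 1: quadratic term, so the log odds can rise and then fall with AGE. */
proc logistic data=mydata;
   model y(event='1') = age age*age;
run;

/* Option 2: generalized additive model. The shape of the AGE effect */
/* is estimated from the data with a penalized spline term.          */
proc gampl data=mydata;
   model y(event='1') = spline(age) / dist=binary;
run;
```

If the quadratic coefficient in the first model is negative and significant, that is consistent with an inverted-U in the log odds. PROC GAM accepts a similar MODEL statement with `dist=binomial`; check the User's Guide for the options available in your release.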

9 REPLIES
Reeza
Super User

 


@div44 wrote:

However, one of my independent variables is continuous and has an inverted-U-shaped relationship with my dependent variable. Since the association is not linear, I cannot figure out how to incorporate this variable in the model.

 

Please clarify in detail: which assumption of the logistic regression model are you concerned with, and why do you think your data do not meet it?

 

div44
Calcite | Level 5

Hello Reeza,

 

I am mainly concerned with the non-linear association between my dependent variable and independent variable. The independent variable shows an inverted-U shape when plotted against the dependent variable. For right-skewed data (cost data in general), a log transformation is commonly used when modeling costs as independent variables; however, I am unaware of any such transformation for data that are U-shaped or inverted-U shaped.

 

A classic example I can think of is a disease that affects the middle-aged population the most, then the elderly population, and the young population the least. If age were my independent variable, it would show an inverted-U-shaped relationship.

 

I hope this is clear.

 

Thank you

Reeza
Super User

Is that an assumption for logistic regression? I don't believe it is.

pbwn
Calcite | Level 5

I think dose response is often modelled using logistic regression, so you might check that literature; e.g., they speak about biphasic dose response, which I guess can be an inverted-U shape?

Reeza
Super User

The log odds of the outcome need to be linearly related to your variable, not the raw outcome itself. So after converting to log odds, what does the relationship look like?

http://www.statisticssolutions.com/assumptions-of-logistic-regression/
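To make the assumption concrete, here is a sketch in notation that does not appear in the thread ($p$ for the event probability, $x$ for the continuous predictor). The first line is the usual linear-in-the-logit form; the second shows how a squared term would allow an inverted-U in the log odds.

```latex
% Standard assumption: the log odds are linear in x
\operatorname{logit}(p) = \log\frac{p}{1-p} = \beta_0 + \beta_1 x

% Quadratic extension: with \beta_2 < 0 the log odds rise, peak, and fall
\operatorname{logit}(p) = \beta_0 + \beta_1 x + \beta_2 x^2,
\qquad \text{peak at } x = -\frac{\beta_1}{2\beta_2}
```

The model stays linear in the parameters, so it is still an ordinary logistic regression; only the functional form in $x$ changes.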

Ksharp
Super User

Check the EFFECT statement of PROC LOGISTIC.

You can use a spline effect to fit the nonlinear relationship.

Calling @Rick_SAS
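A minimal sketch of what such a spline effect might look like, assuming a hypothetical data set `mydata` with a binary response `y` (coded 1 for the event) and a continuous predictor `age`. The choice of a restricted cubic spline with five percentile-based knots is only an illustration, not a recommendation from the thread.

```sas
/* Hypothetical names: MYDATA, Y (1 = event), AGE. */
proc logistic data=mydata;
   /* Restricted (natural) cubic spline basis for AGE,       */
   /* with 5 knots placed at percentiles of AGE.             */
   effect age_spl = spline(age / naturalcubic
                                 basis=tpf(noint)
                                 knotmethod=percentiles(5));
   /* The spline effect enters the model like any other effect. */
   model y(event='1') = age_spl;
run;
```

The individual spline coefficients are hard to interpret on their own, so the fit is usually judged from a plot of predicted probabilities against the predictor rather than from the parameter table.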

stat_sas
Ammonite | Level 13

Hi,

 

You will get a U-shaped pattern because you are plotting a continuous variable against a binary variable, which has only two values. A useful plot for detecting a nonlinear relationship is a plot of the empirical logits in logistic regression.
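One way such an empirical logit plot could be produced is sketched below. The names `mydata`, `y` (coded 0/1), and `age` are hypothetical, and the use of ten equal-sized bins and the 0.5 adjustment are common conventions assumed here rather than anything specified in the thread.

```sas
/* Hypothetical names: MYDATA, Y (0/1), AGE. */

/* 1. Split AGE into 10 roughly equal-sized groups. */
proc rank data=mydata groups=10 out=ranked;
   var age;
   ranks age_bin;
run;

/* 2. Count events and trials, and take the mean AGE, within each group. */
proc means data=ranked noprint nway;
   class age_bin;
   var y age;
   output out=bins sum(y)=events n(y)=trials mean(age)=mean_age;
run;

/* 3. Empirical logit; adding 0.5 avoids log(0) in groups with no events or all events. */
data bins;
   set bins;
   elogit = log((events + 0.5) / (trials - events + 0.5));
run;

/* 4. A roughly straight line supports linearity in the logit; an arch suggests an inverted-U. */
proc sgplot data=bins;
   series x=mean_age y=elogit / markers;
run;
```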

pbwn
Calcite | Level 5

They couldn't possibly obtain a U shape if they were plotting against a dichotomous variable; I would assume they are plotting against logit(y).
