Question about proc logistic and correlation between vars of input

Frequent Contributor
Posts: 122

Hi,

I have a question about proc logistic.
I am building a logistic regression model with proc logistic.

All the variables included in the model are associated with the target variable.
Using proc freq or proc discrim, I can see that there is a dependency.

When I explore the variables, I find that some of them are highly correlated.
The variables rango_ant and rango_edad have a correlation of about 0.9.

Do I have to exclude one of them from my model?

This is my model:

proc logistic data=test outmodel=modelo1 plots(only)=roc;
   class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
   model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion;
run;

Do I have to use cross effects? Like this:

proc logistic data=test outmodel=modelo1 plots(only)=roc;
   class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
   model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion rango_edad*rango_ant;
run;

I don't know the effect of correlation in logistic regression.
Can anybody help me? Any help will be greatly appreciated.

Thanks in advance


All Replies
SAS Super FREQ
Posts: 3,839

Re: Question about proc logistic and correlation between vars of input

Posted in reply to juanvg1972

Paul Allison wrote a nice article about this topic. There are also many comments/responses posted to his article.

Frequent Contributor
Posts: 122

Re: Question about proc logistic and correlation between vars of input

Thanks very much, Rick. I don't understand how to apply the conclusions of the article to my case; I think it is a different case.

Can anybody help me?

Solution
‎05-25-2017 10:28 AM
Super User
Posts: 10,213

Re: Question about proc logistic and correlation between vars of input

Posted in reply to juanvg1972

I don't know if the following is right; I read it in the documentation.

Logistic regression is fitted by MLE; therefore, unlike OLS, MLE will automatically take multicollinearity into account and will drop a variable if it is highly correlated with other variables. So maybe you should use SELECTION=STEPWISE to pick the right variables.
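
In PROC LOGISTIC, stepwise selection is requested with the SELECTION= option of the MODEL statement. A minimal sketch using the model from the question; the 0.05 entry and stay thresholds are illustrative assumptions, not values given anywhere in this thread:

```sas
/* Stepwise selection on the model from the question.             */
/* SLENTRY= and SLSTAY= are the significance levels for an effect */
/* to enter and to stay; 0.05 is an assumed, illustrative value.  */
proc logistic data=test outmodel=modelo1 plots(only)=roc;
   class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
   model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion
         / selection=stepwise slentry=0.05 slstay=0.05;
run;
```

The "Summary of Stepwise Selection" table in the output lists which effects entered or were removed at each step.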

Frequent Contributor
Posts: 122

Re: Question about proc logistic and correlation between vars of input

Thanks, that works, using selection=stepwise.
One question: what are MLE and OLS?
Thanks again.
Super User
Posts: 11,810

Re: Question about proc logistic and correlation between vars of input

Posted in reply to juanvg1972

juanvg1972 wrote:
Thanks, that works, using selection=stepwise.
One question: what are MLE and OLS?
Thanks again.

MLE = Maximum Likelihood Estimator

OLS = Ordinary Least Squares
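
As a rough sketch of the difference, each method picks the coefficients by optimizing a different criterion. The notation is an assumption made here for illustration: y_i is the response, x_i the vector of predictors for observation i, and beta the coefficient vector.

```latex
% OLS: choose beta to minimize the residual sum of squares
\hat{\beta}_{\mathrm{OLS}}
  = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^2

% MLE for logistic regression: choose beta to maximize the log-likelihood
\hat{\beta}_{\mathrm{MLE}}
  = \arg\max_{\beta} \sum_{i=1}^{n}
    \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right],
\qquad
p_i = \frac{1}{1 + e^{-x_i^{\top}\beta}}
```

Both criteria depend on the same predictors x_i, so strongly related predictors affect either fit.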

Super User
Posts: 10,213

Re: Question about proc logistic and correlation between vars of input

@ballardw

I am surprised. You also know statistical theory? I think of you as a seasoned SAS programmer.

Respected Advisor
Posts: 2,055

Re: Question about proc logistic and correlation between vars of input

Ksharp wrote:

I don't know if the following is right; I read it in the documentation.

Logistic regression is fitted by MLE; therefore, unlike OLS, MLE will automatically take multicollinearity into account and will drop a variable if it is highly correlated with other variables. So maybe you should use SELECTION=STEPWISE to pick the right variables.


I don't agree with this at all. Stepwise has many, many bad properties that make it a poor choice for modeling. I also can't get my head to believe that MLE is better than OLS in the case of multicollinearity, because the problem is actually a problem of logic rather than a problem of estimation method: if the x-variables are confounded, then there is really no logical way to separate the effects of confounded variables into "un-confounded effects".

--
Paige Miller
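
One way to look at the confounding in practice, offered as a minimal sketch and not as a method recommended by anyone in the thread: measure how strongly the two class variables are associated, then fit the model with and without one of them and compare the fit. The data set and variable names come from the question; dropping rango_ant rather than rango_edad is an arbitrary choice made for illustration.

```sas
/* Association between the two class variables.                  */
/* CHISQ prints the chi-square tests together with Cramer's V.   */
proc freq data=test;
   tables rango_edad*rango_ant / chisq;
run;

/* Model with both variables. */
proc logistic data=test;
   class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
   model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion;
run;

/* Same model without rango_ant (arbitrary choice for this sketch). */
proc logistic data=test;
   class cod_posicion nivel_sal rango_edad rango_eval;
   model baja = rango_edad nivel_sal rango_eval cod_posicion;
run;
```

If the AIC in the "Model Fit Statistics" table and the c statistic in the association table barely change between the two fits, the dropped variable is adding little beyond what the other one already carries.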