Programming the statistical procedures from SAS

Question about proc logistic and correlation between vars of input

Frequent Contributor
Posts: 97
Hi,

I have a question about PROC LOGISTIC.
I am building a logistic regression model with PROC LOGISTIC.

All the variables included in the model are associated with the target variable.
Using PROC FREQ or PROC DISCRIM I can see that there is a dependency.

 

While exploring the variables, I found that some of them are highly correlated.
The variables rango_ant and rango_edad have a correlation of about 0.9.

Do I have to exclude one of them from my model?

 

This is my model:

proc logistic data=test outmodel=modelo1 plots(only)=roc;
  class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
  model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion;
run;

 

Do I have to use cross (interaction) effects, like this?

 

proc logistic data=test outmodel=modelo1 plots(only)=roc;
  class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
  model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion rango_edad*rango_ant;
run;

 

I don't know what effect correlation has in logistic regression.
Can anybody help me? Any help will be greatly appreciated.

 

Thanks in advance


All Replies
SAS Super FREQ
Posts: 3,546

Re: Question about proc logistic and correlation between vars of input

Paul Allison wrote a nice article about this topic. There are also many comments/responses posted to his article.

Frequent Contributor
Posts: 97

Re: Question about proc logistic and correlation between vars of input

Thanks very much, Rick. I don't understand how to apply the conclusions of the article to my case; I think it is a different situation.

Can anybody help me?

Solution
‎05-25-2017 10:28 AM
Super User
Posts: 9,769

Re: Question about proc logistic and correlation between vars of input

I don't know whether the following is right; I read it in the documentation.

Logistic regression is fitted by MLE, so unlike OLS, MLE will automatically take multicollinearity into account and will drop a variable if it is highly correlated with other variables. So maybe you should use SELECTION=STEPWISE to pick the right variables.
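
For reference, a minimal sketch of what stepwise selection might look like when added to the model from the question. The SLENTRY= and SLSTAY= thresholds are assumed values chosen only for illustration; nothing in this thread recommends them.

```sas
/* Sketch only: the model from the question with stepwise selection.  */
/* slentry= / slstay= are illustrative assumptions, not advice.       */
proc logistic data=test outmodel=modelo1 plots(only)=roc;
  class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
  model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion
        / selection=stepwise slentry=0.05 slstay=0.05;
run;
```

The selection summary in the output lists which effects entered or were removed at each step, so it shows whether rango_edad and rango_ant both stayed in the model.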

Frequent Contributor
Posts: 97

Re: Question about proc logistic and correlation between vars of input

Thanks, that works, using selection=stepwise.
One question: what are MLE and OLS?
Thanks again
Super User
Posts: 10,854

Re: Question about proc logistic and correlation between vars of input


juanvg1972 wrote:
Thanks, that works, using selection=stepwise.
One question: what are MLE and OLS?
Thanks again

Maximum Likelihood Estimator

Ordinary Least Squares

Super User
Posts: 9,769

Re: Question about proc logistic and correlation between vars of input

@ballardw

That surprises me. You also know statistical theory? I thought you were a seasoned SAS programmer.

 

Trusted Advisor
Posts: 1,665

Re: Question about proc logistic and correlation between vars of input

Ksharp wrote:

I don't know whether the following is right; I read it in the documentation.

Logistic regression is fitted by MLE, so unlike OLS, MLE will automatically take multicollinearity into account and will drop a variable if it is highly correlated with other variables. So maybe you should use SELECTION=STEPWISE to pick the right variables.


I don't agree with this at all. Stepwise selection has many bad properties that make it a poor choice for modelling. I also can't convince myself that MLE is better than OLS in the presence of multicollinearity, because the problem is actually one of logic rather than of estimation method: if the x-variables are confounded, there is no logical way to separate their effects into "un-confounded" effects.
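
As a rough sketch of how the confounding could be examined with the variables from the question: cross-tabulate the two range variables to see how strongly they are associated, then fit the model with only one of them at a time and compare the fit statistics. Using Cramer's V and AIC for this is an assumption made for illustration, not a procedure prescribed anywhere in this thread.

```sas
/* How strongly are the two class variables associated?          */
/* The CHISQ option prints chi-square tests and Cramer's V.      */
proc freq data=test;
  tables rango_edad*rango_ant / chisq;
run;

/* Model with rango_edad only (rango_ant left out). */
proc logistic data=test;
  class cod_posicion nivel_sal rango_edad rango_eval;
  model baja = rango_edad nivel_sal rango_eval cod_posicion;
run;

/* Model with rango_ant only (rango_edad left out). */
proc logistic data=test;
  class cod_posicion nivel_sal rango_ant rango_eval;
  model baja = rango_ant nivel_sal rango_eval cod_posicion;
run;
```

If the two reduced models have similar values in the "Model Fit Statistics" table, the data alone cannot say which variable carries the effect, and the choice has to come from subject-matter knowledge.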
