Question about proc logistic and correlation betwe...

05-24-2017 02:31 PM

Hi,

I have a question about proc logistic.

I am building a logistic regression model with proc logistic.

All the vars included in the model are associated with the target variable.

Using proc freq or proc discrim I can see that there is a dependency.

When I explore the vars, I find that some of them are highly correlated.

The vars rango_ant and rango_edad have a correlation of about 0.9.
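Since both vars are listed in the class statement of the model, a crosstab is one way to quantify how strongly they are associated. The following is only a minimal sketch, assuming the data set test and the two variable names from this post; the chisq option prints the chi-square tests together with Cramer's V.

```sas
/* Sketch: association between the two categorical range variables. */
/* Assumes data set TEST with RANGO_EDAD and RANGO_ANT as in the post. */
proc freq data=test;
  /* CHISQ adds chi-square tests, phi, contingency coefficient and Cramer's V */
  tables rango_edad*rango_ant / chisq;
run;
```

A Cramer's V close to 1 would suggest the two vars carry largely the same information.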

Do I have to exclude one of them from my model?

This is my model:

proc logistic data=test outmodel=modelo1 plots(only)=roc;
  class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
  model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion;
quit;

Do I have to use cross effects (interactions)? Like this:

proc logistic data=test outmodel=modelo1 plots(only)=roc;
  class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
  model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion rango_edad*rango_ant;
quit;

I don't know what effect correlation has in logistic regression.

Can anybody help me? Any help will be greatly appreciated.

Thanks in advance

05-24-2017 02:51 PM

Paul Allison wrote a nice article about this topic. There are also many comments/responses posted to his article.

05-24-2017 03:50 PM

Thanks very much, Rick. I don't understand how to apply the conclusions of the article to my case; I think it is a different case.

Can anybody help me?

05-25-2017
10:28 AM

05-25-2017 09:15 AM

05-25-2017 10:31 AM

Thanks, that works, using selection=stepwise.

One question: what are MLE and OLS?

Thanks again
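For readers following along, the statement might look roughly like the sketch below. The exact call that was run is not shown in this thread, so the slentry= and slstay= values here are illustrative assumptions, not the settings actually used.

```sas
/* Sketch: the model from the original post with stepwise selection. */
/* SLENTRY= and SLSTAY= values are assumptions chosen for illustration. */
proc logistic data=test outmodel=modelo1 plots(only)=roc;
  class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
  model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion
        / selection=stepwise slentry=0.05 slstay=0.05;
run;
```

The selection summary in the output lists which effects entered and which were removed at each step.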

05-25-2017 01:17 PM

juanvg1972 wrote:

Thanks, that works, using selection=stepwise.

One question: what are MLE and OLS?

Thanks again

MLE: Maximum Likelihood Estimator

OLS: Ordinary Least Squares
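To make the difference concrete, here is a short sketch of the two criteria. The notation is an assumption introduced for illustration and does not come from this thread: $y_i$ is the response, $x_i$ the predictor vector, and $\beta$ the coefficients.

```latex
% OLS: choose the coefficients that minimize the sum of squared residuals
\hat{\beta}_{\mathrm{OLS}} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^2

% MLE for logistic regression: choose the coefficients that maximize the log-likelihood
\hat{\beta}_{\mathrm{MLE}} = \arg\max_{\beta} \sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right],
\qquad p_i = \frac{1}{1 + e^{-x_i^{\top}\beta}}
```

Linear regression is usually fitted with the first criterion; proc logistic fits the model with the second.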

05-26-2017 07:59 AM

I am surprised. You also know statistical theory? I think you are a seasoned SAS programmer.

05-26-2017 08:36 AM - edited 05-26-2017 10:23 AM

Ksharp wrote:

I don't know if the following is right; I read it in the documentation.

Logistic is fitted by MLE, so unlike OLS, MLE will automatically take multicollinearity into account and will drop a variable if it has high correlation with other variables. So maybe you should use SELECTION=STEPWISE to pick the right variables.

I don't agree with this at all. Stepwise has many, many bad properties that make it a poor choice for modelling. I also can't get my head to believe that MLE is better than OLS in the case of multicollinearity, because the problem is actually one of logic rather than of estimation method. If the x-variables are confounded, then there is really no logical way to separate their effects into "un-confounded effects".
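One way to see the confounding in practice is to fit the model with and without one of the two correlated vars and compare the output. This is only a rough sketch, assuming the data set and variable names from the original post; dropping rango_ant rather than rango_edad is an arbitrary choice made for illustration.

```sas
/* Model A: both correlated predictors in the model */
proc logistic data=test;
  class cod_posicion nivel_sal rango_edad rango_ant rango_eval;
  model baja = rango_edad rango_ant nivel_sal rango_eval cod_posicion;
run;

/* Model B: the same model without rango_ant (arbitrary choice) */
proc logistic data=test;
  class cod_posicion nivel_sal rango_edad rango_eval;
  model baja = rango_edad nivel_sal rango_eval cod_posicion;
run;
```

If the AIC in the "Model Fit Statistics" table barely changes while the standard errors for rango_edad in "Analysis of Maximum Likelihood Estimates" shrink noticeably in Model B, that suggests the data cannot distinguish the separate contributions of the two vars.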