Suppose an important variable (X1) in a logistic regression model is affected by multicollinearity. I would like to find out how the model's performance changes as the variables correlated with X1 are removed one by one, because X1 has the opposite sign in the final model compared to the bivariate model. I would like to find the variable(s) causing the sign flip. Here are the details:
For example,
1) with X1
Variable | Coefficient |
x2 | 0.6 |
x3 | 0.5 |
x4 | 0.4 |
2)
Proc logistic data= data desc;
Model y= x1 x4 x3 x2;
Run;
Proc logistic data=data desc ;
Model y=x1 x4 x3;
Run;
Proc logistic data=data desc ;
Model y= x1 x4;
Run;
Does anyone know how to automate this sequence of logistic regressions? Thanks a lot.
See this note.
So you wouldn't ever look at a model that is just X4 as a predictor?
model X1=X4;
PROC LOGISTIC lets you perform "best subset selection", which may be a slightly better method than your looping and is certainly less coding. You can request it with the SELECTION=SCORE option in the MODEL statement.
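A minimal sketch of that suggestion, using the data set and variable names from the original post. The BEST=3 value is an arbitrary choice for illustration; it asks for the three highest-scoring models of each size.

```sas
/* Best-subset selection ranked by the score chi-square statistic.
   BEST=3 is an assumed value; change or drop it as needed. */
proc logistic data=data desc;
   model y = x1 x2 x3 x4 / selection=score best=3;
run;
```

As far as I know, this output ranks the candidate subsets by score chi-square but does not print parameter estimates, so to check the sign of X1 you would still need to refit the subsets of interest.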
Calling @Rick_SAS
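If you still want the one-by-one removal from the original post, a macro loop can generate the PROC LOGISTIC calls. This is only a sketch: the macro name, its parameters, and the pe_ and x1_estimates data set names are made up for illustration, and the removal order (x2 first, then x3, then x4) is taken from the three calls in the question. It assumes no other WORK data sets start with pe_.

```sas
/* Hypothetical helper: always keep &keep, and drop the last variable
   in &order on each pass. */
%macro drop_one_by_one(data=, y=, keep=, order=);
   %local n i j vars;
   %let n = %sysfunc(countw(&order, %str( )));
   %do i = &n %to 0 %by -1;
      /* Predictor list: &keep plus the first &i variables in &order */
      %let vars = &keep;
      %do j = 1 %to &i;
         %let vars = &vars %scan(&order, &j, %str( ));
      %end;
      title "Model &y = &vars";
      proc logistic data=&data desc;
         model &y = &vars;
         /* Save the coefficient table for this fit */
         ods output ParameterEstimates=pe_&i;
      run;
   %end;
   title;
   /* Stack the rows for &keep so a sign change is easy to spot;
      source_table records which fit each row came from */
   data x1_estimates;
      length source_table $41;
      set pe_: indsname=src;
      where upcase(Variable) = "%upcase(&keep)";
      source_table = src;
   run;
%mend drop_one_by_one;

%drop_one_by_one(data=data, y=y, keep=x1, order=x4 x3 x2);
```

The last pass (pe_0) fits x1 alone, which gives the bivariate estimate to compare against. The first fit in which the sign of Estimate matches pe_0 points to the variable that was just dropped.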