How to loop a series of variables in logistic regression automatically?

lionking19063 · Posted 02-26-2020 01:14 PM

I suspect multicollinearity is affecting an important variable (X1) in a logistic regression model: X1 has the opposite sign in the final model compared to the bivariate model. I would like to see how the model changes as the variables correlated with X1 are removed one by one, so that I can find the variable(s) causing the sign flip. Here are the details:

Run PROC CORR and sort the correlation coefficients with X1 from high to low.
Loop over the set of variables in a logistic regression, dropping one variable each time (the one most correlated with X1 first).

For example,

1) Correlation with X1

Variable	Correlation coefficient
x2	0.6
x3	0.5
x4	0.4

2)

proc logistic data=data desc;
   model y = x1 x4 x3 x2;
run;

proc logistic data=data desc;
   model y = x1 x4 x3;
run;

proc logistic data=data desc;
   model y = x1 x4;
run;

Does anyone know how to automate this sequence of logistic regressions? Thanks a lot.

StatDave · Posted 02-26-2020 03:48 PM

See this note.
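
For readers without access to the note, here is a minimal sketch of one way the loop in the question could be automated with a macro. The macro name `drop_loop`, its parameters, the data set name `mydata`, and the `pe_` output prefix are all hypothetical. It assumes `vars=` lists the candidate variables from least to most correlated with X1, so that the last one is dropped on each pass.

```sas
/* Hypothetical macro: refits the model, dropping the last variable */
/* in VARS= on each pass. KEEP= is always retained in the model.    */
%macro drop_loop(data=, y=, keep=, vars=);
   %local n i j list;
   %let n = %sysfunc(countw(&vars, %str( )));

   %do i = &n %to 1 %by -1;
      /* Build the list of the first &i variables */
      %let list = ;
      %do j = 1 %to &i;
         %let list = &list %scan(&vars, &j, %str( ));
      %end;

      title "Model: &y = &keep &list";
      proc logistic data=&data descending;
         model &y = &keep &list;
         /* Save the estimates so the sign of X1 can be compared */
         ods output ParameterEstimates=pe_&i;
      run;
   %end;
   title;
%mend drop_loop;

/* Reproduces the three models in the question (assumed names) */
%drop_loop(data=mydata, y=y, keep=x1, vars=x4 x3 x2);
```

Each pass writes its parameter estimates to `pe_3`, `pe_2`, `pe_1`, so the X1 coefficient can be compared across models. Changing the loop bound from 1 to 0 would also fit the X1-only model.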

PaigeMiller · Posted 02-26-2020 03:54 PM

So you wouldn't ever look at a model that is just X4 as a predictor?

model X1=X4;

PROC LOGISTIC does give you the ability to perform "best subset selection", which may be a slightly better method than your looping and certainly takes less coding. You could use the option SELECTION=SCORE in the MODEL statement to get this.
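
A minimal sketch of that suggestion, assuming a data set named `mydata` (hypothetical) with the variables from the question. BEST=2 asks for the two highest-scoring models of each size.

```sas
/* Best subset selection by the score chi-square statistic */
proc logistic data=mydata descending;
   model y = x1 x2 x3 x4 / selection=score best=2;
run;
```

As far as I know, this output ranks the subsets by score chi-square and does not print parameter estimates, so any candidate subset would still need to be refit to check the sign of X1.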

--
Paige Miller

Ksharp · Posted 02-26-2020 11:03 PM

Calling @Rick_SAS
