I am attempting to use stepwise selection to build a parsimonious model from 30 covariates, a dichotomous outcome, and 177 observations. I set SLENTRY=SLSTAY=0.1, and the initial univariate Chi-square scores show 10 variables meeting the entry criterion. However, the two predictors with the largest Chi-square scores each terminate the stepwise process: once entered, each fails the retention criterion (p>0.6), and the output states "Model building terminates because the last effect entered is removed by the Wald statistic criterion". If I exclude these two predictors from the stepwise selection, the procedure runs as expected until no additional predictors meet the entry criterion. I have two questions: 1) Why does a predictor with a very large Chi-square score, and p=0.0007, fail to be retained in the stepwise model? 2) Is it statistically defensible to exclude predictors with large Chi-square scores from the stepwise process and proceed as I have described above? All advice and citations accepted with gratitude.
Stepwise regression is what I call a counter-intuitive method. It adds a variable to the model because it meets some significance criterion, and then it can remove that same variable in the next step (or a later step) because it no longer meets the significance criterion. How can that be? How does that make sense? Why would you want to use such a procedure? How would you explain it to someone?
If you want to hear what people say about it, go to your favorite internet search engine, type in "problems with stepwise regression", and read the results.
What is happening is that when you have correlated predictor variables (as your 30 variables are), the presence of (for example) X7 in the model changes the coefficients of X1-X6. When the coefficients change, the p-values change, and a variable that was significant without X7 in the model can become non-significant when X7 is in the model.
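To make the effect of correlated predictors concrete, here is a minimal sketch in Python (not SAS, and not what PROC LOGISTIC does internally). It simulates 177 observations with two invented, highly correlated predictors, `x1` and `x2`, where only `x2` truly drives the outcome, and compares the Wald p-value for `x1` when it is alone in the model with its p-value when `x2` is also present. All names, the seed, and the effect sizes are assumptions chosen purely for illustration.

```python
import math
import random

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        d = m[c][c]
        m[c] = [v / d for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def fit_logistic(X, y, iters=25):
    """Newton-Raphson logistic regression.

    Returns (coefficients, standard errors). The standard errors come from the
    inverse information matrix evaluated in the last iteration.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, row))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    info[j][k] += w * row[j] * row[k]
        cov = invert(info)
        step = [sum(cov[j][k] * grad[k] for k in range(p)) for j in range(p)]
        beta = [b + s for b, s in zip(beta, step)]
    return beta, [math.sqrt(cov[j][j]) for j in range(p)]

def wald_p(beta, se):
    """Two-sided p-value for the Wald test of a single coefficient."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))

# Simulated data (an assumption for illustration, not the poster's data).
random.seed(1)
n = 177
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [v + 0.3 * random.gauss(0, 1) for v in x2]   # x1 is strongly correlated with x2
y = [1 if random.random() < 1.0 / (1.0 + math.exp(-1.5 * v)) else 0 for v in x2]

# Model 1: intercept + x1 only.
b_alone, se_alone_all = fit_logistic([[1.0, a] for a in x1], y)
p_alone = wald_p(b_alone[1], se_alone_all[1])
se_alone = se_alone_all[1]

# Model 2: intercept + x1 + x2.
b_joint, se_joint_all = fit_logistic([[1.0, a, b] for a, b in zip(x1, x2)], y)
p_joint = wald_p(b_joint[1], se_joint_all[1])
se_joint = se_joint_all[1]

print("x1 alone:   coef=%.3f se=%.3f p=%.2g" % (b_alone[1], se_alone, p_alone))
print("x1 with x2: coef=%.3f se=%.3f p=%.2g" % (b_joint[1], se_joint, p_joint))
```

In a setup like this, the standard error of `x1` inflates once `x2` enters, so its Wald p-value rises even though nothing about `x1` itself has changed. Whether it crosses a particular SLSTAY threshold depends on the data.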
So, what should a conscientious data analyst do? My OPINION is that you should not use any form of stepwise regression (not stepwise, not forward, not backward). Instead, I use Partial Least Squares regression (PROC PLS in SAS) when I have many correlated X variables. In PLS, a variable that is a good predictor remains a good predictor even when other variables are entered into (or removed from) the model. But wait: PROC PLS only works on continuous Y variables; it doesn't handle the logistic case. There is nothing in SAS that will perform Logistic PLS. There is a paper that explains the Logistic PLS algorithm, and I have written a SAS macro that performs Logistic PLS based on this paper. I like the way it works in these situations, but I don't think my employer would want me to share the macro.
So what should you do? Well, I don't know. There is R code that performs Logistic PLS, if that's something that would help.
I did suggest that SAS produce a PROC that performs Logistic PLS, but no one has voted for it 😞
https://communities.sas.com/t5/SASware-Ballot-Ideas/Logistic-version-of-PROC-PLS/idi-p/485503
That’s a splendid response. Thank you. Now I have to convince a client.
If I may pursue this just one more step (poor word choice), only the intercept is in the model when the first predictor is entered, which is immediately removed and the model development terminates.
Can I explain everything that STEPWISE does? No, I can't.
Humor is an excellent explanation. Thanks.
What does the log say?
I would not be surprised to have something that relates to @Reeza's comment about sample size.
The log is silent...
In my opinion, this is a deficiency of the method of stepwise regression, and has nothing to do with sample size.