bmm0628
Calcite | Level 5

Hi! I am using stepwise (and forward) selection just to narrow down the variables for my formal regression analysis. However, a frequent problem I run into is that the final model from stepwise selection contains variables whose parameter estimates have an unexpected sign, which leads to statistically insignificant results. Is there a way to code for this, so that a variable whose parameter estimate has the unexpected sign is taken out of the model and replaced with another?

 

I am using PROC REG for this.

 

Thank you!

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

This is a known drawback of stepwise selection (and of non-stepwise regression as well). It is caused by multi-collinearity among your X variables. Many techniques, such as PROC REG, PROC GLM, PROC LOGISTIC and so on, can be very sensitive to multi-collinearity and can produce coefficients with the wrong sign.
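
To make the mechanism concrete, here is a minimal sketch in plain Python (not SAS, and not anything from the original thread). The six data points are invented for illustration: y is built as x1 + x2 plus a little noise, so both "true" coefficients are +1, and x1 and x2 are nearly identical. The function names are hypothetical helpers, not part of any library.

```python
# Toy data, invented for illustration: y = x1 + x2 + small noise,
# so both "true" coefficients are +1. x1 and x2 are almost identical.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]
noise = [-0.3, 0.3, -0.2, 0.2, -0.3, 0.3]
y = [a + b + e for a, b, e in zip(x1, x2, noise)]


def centered_cross_product(u, v):
    """Sum of (u - mean(u)) * (v - mean(v))."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def ols_two_predictors(x1, x2, y):
    """Least-squares slopes for y = b0 + b1*x1 + b2*x2 (normal equations)."""
    s11 = centered_cross_product(x1, x1)
    s22 = centered_cross_product(x2, x2)
    s12 = centered_cross_product(x1, x2)
    s1y = centered_cross_product(x1, y)
    s2y = centered_cross_product(x2, y)
    det = s11 * s22 - s12 ** 2  # close to zero when x1 and x2 are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # Variance inflation factor, 1 / (1 - r^2); with only two predictors
    # it is the same for both of them.
    vif = s11 * s22 / det
    return b1, b2, vif


b1, b2, vif = ols_two_predictors(x1, x2, y)
slope_x2_alone = centered_cross_product(x2, y) / centered_cross_product(x2, x2)

print(f"x2 alone:  slope = {slope_x2_alone:.3f}")
print(f"x1 and x2: b1 = {b1:.3f}, b2 = {b2:.3f}")
print(f"VIF = {vif:.1f}")
```

On its own, x2 has a clearly positive slope (about 2.05). Fitted together with x1, the coefficients come out at roughly 3.67 and -1.67: their sum is still sensible, but the split between the two near-duplicate variables is unstable, and x2 ends up with the wrong sign. The variance inflation factor of about 309 is the warning sign. In PROC REG, the VIF option on the MODEL statement reports the same diagnostic.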

 

An alternative is PROC PLS, which is surprisingly robust to multi-collinearity (and doesn't generally produce coefficients with the wrong sign). The lead programmer of PROC PLS wrote this paper, in which he fits a model with 1000 highly correlated x variables, skips variable selection entirely, and still gets a useful model from PLS. (Note: the syntax in the paper is very old and doesn't work in the current PROC PLS.)
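
As a rough illustration of why PLS tends to behave better here, the sketch below fits a one-component PLS1 model in its simplest textbook form, in plain Python. It is not a reproduction of PROC PLS, which chooses the number of factors and handles scaling for you. The data are invented: two nearly identical predictors, with y built as x1 + x2 plus a little noise. The helper names are hypothetical.

```python
# Toy data, invented for illustration: y = x1 + x2 + small noise,
# with x1 and x2 almost perfectly correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]
noise = [-0.3, 0.3, -0.2, 0.2, -0.3, 0.3]
y = [a + b + e for a, b, e in zip(x1, x2, noise)]


def center(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_one_component(columns, y):
    """Coefficients of a one-component PLS1 fit on centered, unscaled data."""
    xc = [center(col) for col in columns]
    yc = center(y)
    # Weight vector: each predictor's cross product with y, scaled to length 1.
    cov = [dot(col, yc) for col in xc]
    norm = sum(c * c for c in cov) ** 0.5
    w = [c / norm for c in cov]
    # Score: a single latent variable, a weighted sum of the predictors.
    t = [sum(wj * col[i] for wj, col in zip(w, xc)) for i in range(len(yc))]
    # Regress y on the score, then map back to coefficients on the predictors.
    q = dot(t, yc) / dot(t, t)
    return [q * wj for wj in w]


pls_b1, pls_b2 = pls1_one_component([x1, x2], y)
print(f"one-component PLS: b1 = {pls_b1:.3f}, b2 = {pls_b2:.3f}")
```

With a single component, each coefficient is a positive multiple of that predictor's covariance with y, so it takes the sign of the simple relationship. Here both come out close to +1 (about 1.03 and 1.01). Ordinary least squares on the same data gives one of the two predictors a negative coefficient. With more components that sign guarantee no longer holds exactly, which fits the "doesn't generally" wording above.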

--
Paige Miller
