bmm0628
Calcite | Level 5

Hi! I am using stepwise (and forward) selection just to narrow down the variables that I use in my formal regression analysis. However, a problem I frequently run into is that the final model from stepwise selection includes variables with an unexpected sign, creating statistically insignificant results. Is there a way to code for this, so that if a variable's parameter estimate has the unexpected sign, the variable is taken out of the model and replaced with others?

 

I am using the "proc reg" procedure for this.

 

Thank you!

Accepted Solutions
PaigeMiller
Diamond | Level 26

This is a known drawback of stepwise selection (and of non-stepwise regression as well). It is caused by multi-collinearity among your X variables. Many techniques, such as PROC REG, PROC GLM, PROC LOGISTIC and so on, can be very sensitive to multi-collinearity and can produce coefficients with the wrong sign.
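
One way to check whether multi-collinearity is behind the unexpected signs is to request collinearity diagnostics from PROC REG itself. The following is only a minimal sketch: the data set name WORK.MYDATA, the response y, and the predictors x1-x10 are hypothetical placeholders, not names from the original question.

```sas
/* Hypothetical data set WORK.MYDATA with response y and predictors x1-x10. */

/* Pairwise correlations among the candidate predictors. */
proc corr data=work.mydata;
   var x1-x10;
run;

/* VIF and TOL print variance inflation factors and tolerances;
   COLLIN prints condition indices and variance-decomposition proportions. */
proc reg data=work.mydata;
   model y = x1-x10 / vif tol collin;
run;
quit;
```

Large variance inflation factors or large condition indices would suggest that the predictors are strongly related to one another, which is the situation in which coefficient signs tend to become unreliable.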

 

An alternative is PROC PLS, which is surprisingly robust to multi-collinearity (and generally doesn't produce coefficients with the wrong sign). The lead programmer of PROC PLS wrote this paper, in which he fits a model with 1000 highly correlated X variables, does not bother with the step of variable selection, and still gets a useful model from PLS (Note: the syntax in the paper is very old and doesn't work in the current PROC PLS).
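
As a rough illustration of what a PROC PLS call might look like in current syntax, here is a minimal sketch. The data set WORK.MYDATA, the response y, and the predictors x1-x10 are hypothetical placeholders, and the cross-validation settings are only one reasonable choice, not a recommendation from the paper.

```sas
/* Hypothetical data set WORK.MYDATA with response y and predictors x1-x10. */
/* CV=ONE requests leave-one-out cross validation to help choose the number
   of extracted factors; CVTEST adds a test comparing candidate models,
   with SEED= fixing the random number stream for repeatability. */
proc pls data=work.mydata method=pls cv=one cvtest(seed=12345);
   /* SOLUTION prints the regression coefficients of the final model. */
   model y = x1-x10 / solution;
run;
```

All candidate X variables go into the MODEL statement at once, with no stepwise selection step beforehand.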

--
Paige Miller
