newmkka
Calcite | Level 5

To solve some questions, I used PROC REG with the FORWARD, BACKWARD, and STEPWISE selection options. The question is worded exactly as follows:

"For the forward selection, she specifies significance level for entry equal to 0.3, for the backward selection, she specifies significance level for staying equal to 0.3, whereas for the stepwise selection method she uses default significance levels for entry and staying."

Is there a way to specify the p-value threshold, or do I just run the procedure and read it from the output?

My code:

PROC REG data = work.Datafile ;
  model rses = bfio bfic bfie bfia bfin / SELECTION = FORWARD ;
RUN ;

The result summary is below.

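Forward Summary

Step  Variable Entered  Vars In  Partial R-Square  Model R-Square  C(p)     F Value  Pr > F
1     bfin              1        0.1770            0.1770          43.3071  40.23    <.0001
2     bfic              2        0.1058            0.2828          15.9682  27.43    <.0001
3     bfie              3        0.0536            0.3364          3.1043   14.94    0.0002
4     bfio              4        0.0027            0.3391          4.3596   0.75     0.3885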

Backward Summary

Step  Variable Removed  Vars In  Partial R-Square  Model R-Square  C(p)     F Value  Pr > F
1     bfia              4        0.0013            0.3391          4.3596   0.36     0.5495
2     bfio              3        0.0027            0.3364          3.1043   0.75     0.3885

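Stepwise Summary

Step  Variable Entered  Vars In  Partial R-Square  Model R-Square  C(p)     F Value  Pr > F
1     bfin              1        0.1770            0.1770          43.3071  40.23    <.0001
2     bfic              2        0.1058            0.2828          15.9682  27.43    <.0001
3     bfie              3        0.0536            0.3364          3.1043   14.94    0.0002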

Which model has the best fit? How do I justify that choice?

Thanks in advance.

5 REPLIES
Ksharp
Super User

So did you check the documentation first?

proc reg ALPHA=

The default alpha is 0.05; you can specify it yourself.

SteveDenham
Jade | Level 19

The options on the model statement for entry and staying in are SLE= and SLS=.  The documentation says the default SLE for FORWARD is 0.5, and for STEPWISE 0.15.  For SLS, the default for BACKWARD is 0.10 and for STEPWISE 0.15.
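As a minimal sketch of where these options go (assuming the dataset and variable names from the question; the model labels fwd, bwd, and stepw are only illustrative), each threshold can be written on its own MODEL statement. The stepwise line spells out the documented defaults rather than changing them:

```sas
/* Sketch only: work.Datafile and the variables come from the question. */
/* The labels fwd, bwd, and stepw are hypothetical names.               */
proc reg data=work.Datafile;
  /* Forward: entry threshold 0.3 instead of the 0.5 default */
  fwd:   model rses = bfio bfic bfie bfia bfin / selection=forward sle=0.3;
  /* Backward: staying threshold 0.3 instead of the 0.10 default */
  bwd:   model rses = bfio bfic bfie bfia bfin / selection=backward sls=0.3;
  /* Stepwise: the defaults (0.15 for both) written out explicitly */
  stepw: model rses = bfio bfic bfie bfia bfin / selection=stepwise sle=0.15 sls=0.15;
run;
quit;
```

SLE= and SLS= are short forms of SLENTRY= and SLSTAY=, so either spelling should work.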

Given all of that, these methods result in biased selection, with the standard errors biased small.  There are a variety of methods for selecting "best fit," including adjusted R-squared and various information criteria, but the stepwise approach itself is still flawed.  Search for a paper by Peter Flom regarding the dangers of stepwise (and forward/backward) selection of variables.

Steve Denham

newmkka
Calcite | Level 5

I used the SLENTRY= and SLSTAY= options.

/*FORWARD SELECTION*/
TITLE "FORWARD SELECTION" ;
PROC REG data = work.Datafile ;
  model rses = bfio bfic bfie bfia bfin
        / SELECTION = FORWARD
          SLENTRY = 0.3 ;
RUN ;
TITLE ;

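Step  Variable Entered  Vars In  Partial R-Square  Model R-Square  C(p)     F Value  Pr > F
1     bfin              1        0.1770            0.1770          43.3071  40.23    <.0001
2     bfic              2        0.1058            0.2828          15.9682  27.43    <.0001
3     bfie              3        0.0536            0.3364          3.1043   14.94    0.0002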

/*BACKWARD SELECTION*/
TITLE "BACKWARD SELECTION" ;
PROC REG data = work.Datafile ;
  model rses = bfio bfic bfie bfia bfin
        / SELECTION = BACKWARD
          SLSTAY = 0.3 ;
RUN ;
TITLE ;

Backward elimination

Step  Variable Removed  Vars In  Partial R-Square  Model R-Square  C(p)     F Value  Pr > F
1     bfia              4        0.0013            0.3391          4.3596   0.36     0.5495
2     bfio              3        0.0027            0.3364          3.1043   0.75     0.3885

/*STEPWISE SELECTION*/
TITLE "STEPWISE SELECTION" ;
PROC REG data = work.Datafile ;
  model rses = bfio bfic bfie bfia bfin
        / SELECTION = STEPWISE ;
RUN ;
TITLE ;

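Step  Variable Entered  Vars In  Partial R-Square  Model R-Square  C(p)     F Value  Pr > F
1     bfin              1        0.1770            0.1770          43.3071  40.23    <.0001
2     bfic              2        0.1058            0.2828          15.9682  27.43    <.0001
3     bfie              3        0.0536            0.3364          3.1043   14.94    0.0002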

Can you help me answer these questions?

1.      Which model has the best fit? Please justify your answer.

     ??? Can you help me? Is it the model with fewer steps? The higher R-square?

2.      Do any models reach the same conclusions with regards to regression coefficients? If so, which ones?

     My answer is that FORWARD and STEPWISE reach the same conclusions.

3.      Provide an interpretation of an intercept for the model with the best fit.

    

4.      Provide an interpretation of partial regression coefficients for the model with the best fit.

PGStats
Opal | Level 21

I agree with Steve's comments.  You shouldn't waste your time with antiquated statistical methods.  The only selection criteria available in PROC REG that account for the number of estimated parameters are ADJRSQ and CP.  Model selection, especially when it is based on small datasets, requires a lot of expertise and ... humility.
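For illustration, a hedged sketch of what selecting on those criteria could look like, again assuming the dataset and variables from the question (BEST=5 is an arbitrary choice to keep the listing short):

```sas
/* Sketch only: rank candidate subsets by adjusted R-square and   */
/* print Mallows' C(p) next to each one. BEST=5 is an arbitrary   */
/* cap on how many of the top-ranked models are listed.           */
proc reg data=work.Datafile;
  model rses = bfio bfic bfie bfia bfin / selection=adjrsq cp best=5;
run;
quit;
```

With five candidate predictors there are only 31 possible subsets, so an all-subsets listing like this is cheap to produce, although it does not remove the concerns about data-driven selection raised above.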

PG

SteveDenham
Jade | Level 19

And I wish I could go yell at your stat instructor, who gave you this problem and never considered the drawbacks of these methods.  Yes, we need to know that they exist, but they are just an intensive and frustrating way of making bad decisions.  I don't mind helping with analytic methods and approaches, but answering questions 1 through 4 looks exactly like homework or a take-home final exam, and the answers should be left to the analyst to come up with.

Steve Denham
