Programming the statistical procedures from SAS

Example Data Set For Showing Superiority of LASSO Over Stepwise Selection

New Contributor
Posts: 3

Hello Everyone,

I want to demonstrate that

a) LASSO regression is superior to stepwise selection for variable selection

b) LASSO regression is superior to linear regression for prediction

I would like to use PROC GLMSELECT in SAS 9.3 to illustrate this. Does anyone have a data set and some code to do so?

If you have just the data set but no code, that's fine; I would be glad to write the code myself.

If you have both the data and the code, that would be even better!

Thanks for your help.

Grand Advisor
Posts: 9,446

Re: Example Data Set For Showing Superiority of LASSO Over Stepwise Selection

Compare model fit statistics such as AIC, BIC, and PRESS between the two methods.

I don't understand your second question. In PROC GLMSELECT, LASSO is a variable selection method, not a separate regression procedure.

There are also methods that can do better than LASSO, such as Net-LASSO (elastic net).
Check the documentation.
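
As a rough sketch of that comparison (not taken from the documentation): the DATA step below simulates a hypothetical data set, work.sim, with 20 candidate predictors of which only x1-x3 affect y. The data set name, seed, sample size, and coefficients are all assumptions for illustration. Each PROC GLMSELECT run prints a Fit Statistics table (AIC, AICC, SBC) for its selected model, so the two methods can be compared side by side.

```sas
/* Hypothetical simulated data: 20 candidate predictors, only x1-x3 matter */
data work.sim;
   call streaminit(2718);                 /* arbitrary seed */
   array x{20} x1-x20;
   do id = 1 to 500;
      do j = 1 to 20;
         x{j} = rand('NORMAL');           /* independent N(0,1) predictors */
      end;
      /* true model uses x1-x3 only; noise has standard deviation 2 */
      y = 3*x1 - 2*x2 + 1.5*x3 + rand('NORMAL', 0, 2);
      output;
   end;
   drop j;
run;

/* Stepwise selection based on significance levels */
proc glmselect data=work.sim;
   model y = x1-x20 / selection=stepwise(select=sl);
run;

/* LASSO: follow the whole path, then choose the step with the smallest SBC */
proc glmselect data=work.sim;
   model y = x1-x20 / selection=lasso(choose=sbc stop=none);
run;
```

Because the true predictors are known here, you can also check which method picks up noise variables (x4-x20). Results will depend on the seed and the signal-to-noise ratio, so neither method is guaranteed to win on any single simulated data set.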


New Contributor
Posts: 3

Re: Example Data Set For Showing Superiority of LASSO Over Stepwise Selection

Hi Ksharp,

Sorry - I could have phrased that second question better. Suppose I generate 2 different models:

a) one model is obtained from stepwise selection

b) one model is obtained from LASSO

I want to show that the predictive accuracy of Model B is higher than that of Model A.

As Wikipedia notes, LASSO is intended to enhance the prediction accuracy of the resulting statistical model.

https://en.wikipedia.org/wiki/Lasso_(statistics)

Would you have an example data set that I can use to demonstrate this?

Thanks.

Grand Advisor
Posts: 9,446

Re: Example Data Set For Showing Superiority of LASSO Over Stepwise Selection


Yes. I would expect LASSO to do better than STEPWISE.
After running both methods with PROC GLMSELECT,
check the following fit statistics:

AIC 2172.72685
AICC 2174.27787
SBC 1736.94624
ASE (Train) 24.18515
ASE (Validate) 25.74617
ASE (Test) 22.57297

I would expect LASSO to have smaller values of these statistics than STEPWISE.
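
The three ASE rows appear only when the data are partitioned. A minimal sketch of how to get them, assuming a data set work.mydata with response y and predictors x1-x20 (placeholder names, not from the documentation):

```sas
/* work.mydata, y, and x1-x20 are placeholder names for illustration */
proc glmselect data=work.mydata seed=12345;
   /* random split: about 50% training, 25% validation, 25% test */
   partition fraction(validate=0.25 test=0.25);
   /* stepwise by significance level; final model chosen by validation ASE */
   model y = x1-x20 / selection=stepwise(select=sl choose=validate);
run;

proc glmselect data=work.mydata seed=12345;
   partition fraction(validate=0.25 test=0.25);
   /* follow the full LASSO path; pick the step with the smallest validation ASE */
   model y = x1-x20 / selection=lasso(choose=validate stop=none);
run;
```

The same SEED= value is used in both runs with the intent of reproducing the same random split. ASE (Test) is the number to compare for predictive accuracy, since the test observations are used neither to fit nor to choose the model.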

New Contributor
Posts: 3

Re: Example Data Set For Showing Superiority of LASSO Over Stepwise Selection

Hello Ksharp,

Could you please tell me where you got these statistics?  Did you apply those methods to a data set?  If so, could you please tell me where that data set comes from?

Thanks.

Grand Advisor
Posts: 9,446

Re: Example Data Set For Showing Superiority of LASSO Over Stepwise Selection

The goodness-of-fit statistics I referred to are from the SAS documentation. There are many examples you can work with in the PROC GLMSELECT documentation.