Hello Everyone,
I want to demonstrate that
a) LASSO regression is superior to stepwise selection for variable selection
b) LASSO regression is superior to linear regression for prediction
I would like to use PROC GLMSELECT in SAS 9.3 to illustrate this. Would anyone have a data set and some code to do so?
If you have just the data set but no code, that's fine - I would be glad to write the code myself.
If you have both the data and the code, that would be even better!
Thanks for your help.
Compare model fit statistics such as AIC, BIC, and PRESS between the two methods. I don't understand your second question. LASSO is a variable selection method, not a regression method. There are some methods better than LASSO, like Net-LASSO. Check the documentation.
Hi Ksharp,
Sorry - I could have phrased that second question better. Suppose I generate 2 different models:
a) one model is obtained from stepwise selection
b) one model is obtained from LASSO
I want to show that the predictive accuracy of Model B is higher than that of Model A.
As Wikipedia notes, LASSO is intended to enhance the predictive accuracy of the resulting statistical model.
https://en.wikipedia.org/wiki/Lasso_(statistics)
Would you have an example data set that I can use to demonstrate this?
Thanks.
Yes. LASSO would be better than STEPWISE. After running both methods with PROC GLMSELECT, check the following fit statistics:
AIC 2172.72685
AICC 2174.27787
SBC 1736.94624
ASE (Train) 24.18515
ASE (Validate) 25.74617
ASE (Test) 22.57297
I would expect LASSO to have smaller values of these statistics than STEPWISE.
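For anyone who wants to try the comparison described above, here is a minimal sketch. It is not the data or code behind the statistics quoted in this thread. The data set is simulated, and the names (sim, x1-x20, y), the seed values, the noise level, and the partition fractions are all assumptions chosen for illustration. Both runs use the same SEED= so that the training, validation, and test partitions should match, which makes the ASE (Test) values comparable.

```sas
/* Simulated data (an assumption, not a real data set):        */
/* 20 candidate predictors, but only x1-x3 actually affect y.  */
data sim;
   call streaminit(2025);
   array x{20} x1-x20;
   do i = 1 to 600;
      do j = 1 to 20;
         x{j} = rand('NORMAL');
      end;
      y = 3*x1 - 2*x2 + 1.5*x3 + rand('NORMAL', 0, 5);
      output;
   end;
   drop i j;
run;

/* Model A: stepwise selection based on significance levels.   */
proc glmselect data=sim seed=1;
   partition fraction(validate=0.25 test=0.25);
   model y = x1-x20 / selection=stepwise(select=sl);
run;

/* Model B: LASSO, choosing the step with the smallest         */
/* validation ASE along the whole LASSO path.                  */
proc glmselect data=sim seed=1;
   partition fraction(validate=0.25 test=0.25);
   model y = x1-x20 / selection=lasso(choose=validate stop=none);
run;
```

Compare the Fit Statistics table from each run, in particular ASE (Test), since the test partition is not used for fitting or for choosing the model. Whether LASSO comes out ahead depends on the data, so it is worth repeating the comparison with several seeds or on a real data set before drawing a conclusion.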
Hello Ksharp,
Could you please tell me where you got these statistics? Did you apply those methods to a data set? If so, could you please tell me where that data set comes from?
Thanks.