csetzkorn
Lapis Lazuli | Level 10

Is it possible to use:

selection=ELASTICNET ...

in PROC GLMSELECT so that no feature selection is performed (i.e. all features are 'forced' into the model)?

5 REPLIES
Rick_SAS
SAS Super FREQ

I don't think so. If you want all variables in the model, use SELECTION=NONE to get the OLS estimates. But I don't think you can get the elastic net estimates for the full model.
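
As a minimal sketch of the SELECTION=NONE suggestion (the data set `work.mydata`, the response `y`, and the predictors `x1-x5` are hypothetical placeholders, not names from this thread), the full-model OLS fit might look like this:

```sas
/* Sketch only: work.mydata, y, and x1-x5 are assumed names. */
/* SELECTION=NONE keeps every listed effect in the model,   */
/* so the estimates are ordinary least squares estimates.   */
proc glmselect data=work.mydata;
   model y = x1-x5 / selection=none;
run;
```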

csetzkorn
Lapis Lazuli | Level 10
OK thanks. Wanted to use elastic net as there are correlations between at least 2 variables ... Do you know if, in this case, OLS estimates could still be used purely for prediction rather than interpretation? Thanks.
Rick_SAS
SAS Super FREQ

It depends on how correlated the variables are. Strong correlations can result in large standard errors of the OLS estimates due to the X`X matrix being ill-conditioned. You can use the VIF option in PROC REG to examine whether the X`X matrix is ill-conditioned.
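
A minimal sketch of the VIF check, again assuming a hypothetical data set `work.mydata` with response `y` and predictors `x1-x5`:

```sas
/* Sketch only: work.mydata, y, and x1-x5 are assumed names. */
/* The VIF option adds a variance inflation factor column   */
/* to the parameter estimates table.                        */
proc reg data=work.mydata;
   model y = x1-x5 / vif;
run;
quit;
```

A common rule of thumb treats a VIF above roughly 10 as a sign of problematic collinearity, but that cutoff is a convention rather than a hard threshold.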

If the VIF indicates strong correlations, you might try ridge regression in PROC REG, which is close to the elastic net in that it includes the quadratic (L2) penalty term. That would probably permit the closest comparison.
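
One way this could be set up is sketched below. The data set names, variable names, and the grid of ridge parameters are all assumptions chosen for illustration; a suitable grid depends on the data.

```sas
/* Sketch only: work.mydata, work.ridge_est, y, and x1-x5   */
/* are assumed names; the ridge grid 0 to 0.1 is arbitrary. */
/* RIDGE= fits one ridge regression per listed value and    */
/* OUTEST= stores the estimates; OUTVIF adds the VIFs.      */
proc reg data=work.mydata outest=work.ridge_est outvif
         ridge=0 to 0.1 by 0.01;
   model y = x1-x5;
run;
quit;

/* Rows with _TYPE_='RIDGE' hold the ridge estimates;       */
/* the _RIDGE_ column holds the ridge parameter used.       */
proc print data=work.ridge_est;
run;
```

All predictors stay in the model for every ridge value, which matches the original goal of shrinkage without selection.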

For other options for regression of correlated variables, see a comparison of PLS and PCR (principal component regression).
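
If you want to try both methods on your own data, a rough sketch using PROC PLS follows. The data set and variable names are hypothetical, and leave-one-out cross validation (CV=ONE) is just one of the available validation choices.

```sas
/* Sketch only: work.mydata, y, and x1-x5 are assumed names. */

/* Partial least squares, with leave-one-out cross          */
/* validation to help choose the number of factors.         */
proc pls data=work.mydata method=pls cv=one;
   model y = x1-x5;
run;

/* Principal component regression on the same data,         */
/* for a side-by-side comparison.                           */
proc pls data=work.mydata method=pcr cv=one;
   model y = x1-x5;
run;
```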

csetzkorn
Lapis Lazuli | Level 10
Thanks. Yes ridge is an option.
PaigeMiller
Diamond | Level 26

I would say that PLS is superior to PCR in the case of correlated X variables, because PCR may find components that are not good predictors of the Y variable(s), while PLS will find components of X that predict Y well (if such components exist). Studies show that PLS does well with correlated variables, compared to other methods such as stepwise regression, ridge regression and even PCR.

https://amstat.tandfonline.com/doi/abs/10.1080/00401706.1993.10485033

--
Paige Miller
