Hi Experts,
My PROC GLMSELECT model is selected based on AIC. However, it occasionally picks up a non-significant variable in the final Parameter Estimates table. Is there a better way to improve the "stepwise" selection method than pre-selecting the variables with p<0.05?
Thanks
Hello,
PROC GLMSELECT will occasionally select an input variable whose parameter estimate has a p-value above 5%.
That is not necessarily a bad thing: if the variable improves the overall AIC (or AICC), it is fine to keep it.
In my opinion, there is no way to avoid this entirely. You can only experiment, by trial and error, with the criteria used in the model selection methods (the CHOOSE=, SELECT=, and STOP= options in the MODEL statement).
One or more settings will probably produce a model in which all parameters are significant at the 5% significance level.
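As a rough sketch of that trial-and-error, here are two settings one might compare. The data set name (work.mydata), the response (y), and the predictors (x1-x20) are hypothetical placeholders, and the 0.05 thresholds simply mirror the 5% level discussed above.

```sas
/* Hypothetical data set and variable names: work.mydata, y, x1-x20. */

/* Setting 1: add/remove effects by AICC, stop when AICC no longer improves, */
/* and choose the step with the best AICC.                                  */
proc glmselect data=work.mydata;
   model y = x1-x20 / selection=stepwise(select=aicc stop=aicc choose=aicc);
run;

/* Setting 2: add/remove effects by significance level instead.      */
/* SLE= is the entry threshold and SLS= is the stay threshold.       */
proc glmselect data=work.mydata;
   model y = x1-x20 / selection=stepwise(select=sl sle=0.05 sls=0.05);
run;
```

The second setting is the one more likely to leave only effects below the 5% level, since effects whose significance exceeds SLS= are candidates for removal. Keep in mind that p-values reported after selection are not adjusted for the selection process itself.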
SAS/STAT® 15.3 User's Guide
The GLMSELECT Procedure
Criteria Used in Model Selection Methods
https://go.documentation.sas.com/doc/en/statug/15.3/statug_glmselect_details15.htm
Good luck,
Koen
Thanks Koen for your help!
Since p > n, I tried both LASSO and elastic net, and got inconsistent outputs too.
Hello,
If the number of variables / features is higher than the number of observations / samples, the data are considered high-dimensional and require dimension reduction approaches.
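One possible dimension reduction approach, shown here only as an illustration, is to replace the predictors with a handful of principal components and then run the selection on those. This is an assumption about what might fit here, not a specific recommendation. The names work.train, work.scores, y, and x1-x200 are hypothetical, and the choice of 10 components is arbitrary (it should stay well below the number of observations).

```sas
/* Hypothetical names: work.train, response y, predictors x1-x200. */

/* Step 1: compute the first 10 principal components of the predictors. */
/* OUT= writes the original variables plus the scores Prin1-Prin10.     */
proc princomp data=work.train out=work.scores n=10 noprint;
   var x1-x200;
run;

/* Step 2: run the selection on the component scores instead of */
/* the original high-dimensional predictors.                    */
proc glmselect data=work.scores;
   model y = Prin1-Prin10 / selection=stepwise(select=aicc);
run;
```

The trade-off is interpretability: each component is a linear combination of all the original variables, so the selected model no longer points to individual predictors.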
Koen
Thanks Koen!