Hi Experts,
My PROC GLMSELECT model selection is based on AIC. However, it occasionally picks up a non-significant variable in the final Parameter Estimates table. Is there a better way to improve the "stepwise" selection method than pre-selecting the "p<0.05" variables?
Thanks
Hello,
PROC GLMSELECT will occasionally select an input variable whose parameter estimate has a p-value above 5%.
This is not necessarily a bad thing. If the variable improves the overall AIC (or AICC), keeping it is fine.
In my opinion, there is no way to avoid this entirely. You can only experiment (by trial and error) with the
criteria used in the model selection methods, that is, the CHOOSE=, SELECT=, and STOP= options in the MODEL statement.
One or more settings will probably produce a model in which all parameters are significant at the 5% level.
SAS/STAT® 15.3 User's Guide
The GLMSELECT Procedure
Criteria Used in Model Selection Methods
https://go.documentation.sas.com/doc/en/statug/15.3/statug_glmselect_details15.htm
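As an illustration of the options mentioned above, here is a minimal sketch (not a recommendation) of stepwise selection driven by significance levels instead of AIC. It assumes the Sashelp.Baseball sample data set is available; the choice of regressors and the 0.05 thresholds are only for illustration.

```sas
/* Sketch: stepwise selection where effects enter and leave        */
/* based on significance levels rather than an information         */
/* criterion. Data set and thresholds are illustrative assumptions. */
proc glmselect data=sashelp.baseball plots=none;
   class league division;
   model logSalary = nAtBat nHits nHome nRuns nRBI nBB
                     yrMajor crAtBat crHits crHome crRuns crRbi crBB
                     league division nOuts nAssts nError
         / selection=stepwise(select=sl   /* add/drop effects by p-value */
                              sle=0.05    /* significance level to enter */
                              sls=0.05);  /* significance level to stay  */
run;
```

Replacing `select=sl` with, for example, `select=aic` or adding `choose=` and `stop=` suboptions is the kind of trial and error described above. Keep in mind that p-values reported after selection are not adjusted for the selection process.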
Good luck,
Koen
Thanks Koen for your help!
Since p > n in my data, I also tried both LASSO and elastic net, and got inconsistent outputs too.
Hello,
If the number of variables / features is higher than the number of observations / samples, the data are considered high-dimensional and require dimension reduction approaches.
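One possible way to sketch such a dimension reduction step is principal components followed by a regression on the leading components. Everything in the example is an assumption made for illustration: the simulated data set work.highdim (50 observations, 200 predictors), the variable names, and the choice of 5 components.

```sas
/* Hypothetical high-dimensional data: n = 50 rows, p = 200 predictors */
data work.highdim;
   call streaminit(2024);              /* fixed seed for reproducibility */
   array x{200} x1-x200;
   do id = 1 to 50;
      do j = 1 to 200;
         x{j} = rand('normal');
      end;
      y = 2*x1 - 1.5*x2 + rand('normal');
      output;
   end;
   drop j;
run;

/* Reduce the 200 predictors to 5 principal component scores */
proc princomp data=work.highdim out=work.scores n=5 noprint;
   var x1-x200;
run;

/* Regress the response on the component scores Prin1-Prin5 */
proc glmselect data=work.scores plots=none;
   model y = Prin1-Prin5 / selection=stepwise(select=sl sle=0.05 sls=0.05);
run;
```

If LASSO or elastic net is used with cross validation (for example `choose=cv` with `cvmethod=random(5)`), results can change from run to run because the folds are drawn at random. Fixing the `seed=` option in the PROC GLMSELECT statement makes such runs repeatable.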
Koen
Thanks Koen!