I was eager to try LASSO variable selection for logistic regression, but I can't find an answer to the standardization question. The variables selected by STEPWISE and by LASSO are completely different. I was under the impression that LASSO would select a robust subset of the STEPWISE choice, but that is not what is happening. LASSO does not select any of the dichotomous class variables, which in my dataset are all coded 0/1 and tend to have larger raw beta coefficients. I wonder if it's a matter of scale?
1. Does SAS standardize automatically, or do I need to standardize my variables before I run HPGENSELECT?
2. What about polynomials: do they need to be standardized as well?
3. Do predictors need to be centered?
Thank you.
PROC HPGENSELECT supports a NOCENTER option, which is documented as follows:
NOCENTER
requests that continuous main effects not be centered and scaled internally. (Continuous main effects are centered and scaled by default to aid in computing maximum likelihood estimates.) Parameter estimates and related statistics are always reported on the original scale.
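From this you can infer that, by default, the procedure centers and scales continuous main effects internally, and that you can turn this off with NOCENTER. Note that the quoted text covers only continuous main effects; it says nothing about classification variables or polynomial terms.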
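One way to check whether the internal centering and scaling affects your selection is to run the same LASSO model with and without the option and compare the selected effects. This is only a sketch: the data set and variable names (work.mydata, y, d1, d2, x1-x3) are hypothetical, and it assumes NOCENTER is given on the PROC statement, so check the HPGENSELECT documentation for your release.

```sas
/* Hypothetical names: work.mydata, y (0/1 response),
   d1 d2 (0/1 class variables), x1-x3 (continuous predictors). */

/* Default: continuous main effects are centered and scaled internally;
   estimates are still reported on the original scale. */
proc hpgenselect data=work.mydata;
   class d1 d2;
   model y(event='1') = d1 d2 x1 x2 x3 / distribution=binary;
   selection method=lasso;
run;

/* Same model with the internal centering and scaling turned off
   (assumes NOCENTER is a PROC statement option). */
proc hpgenselect data=work.mydata nocenter;
   class d1 d2;
   model y(event='1') = d1 d2 x1 x2 x3 / distribution=binary;
   selection method=lasso;
run;
```

If the two runs select the same effects, the internal standardization is probably not what drives the difference from STEPWISE.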
Predictors do not need to be centered as long as the model includes an intercept, since the intercept term absorbs any shift in the location of the predictors.
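To see why, the intercept argument can be written out for a single predictor x shifted by a constant c (the symbols are illustrative and not taken from the SAS documentation):

```latex
\beta_0 + \beta_1 x \;=\; (\beta_0 + \beta_1 c) + \beta_1 (x - c)
```

Centering at c leaves the slope \beta_1 unchanged and only moves the intercept to \beta_0 + \beta_1 c, so the linear predictor and the fitted probabilities are identical. With a penalized fit this also assumes that the intercept itself is not penalized.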