Haris
Lapis Lazuli | Level 10

I was very eager to try LASSO variable selection for LOGISTIC regression, but I can't seem to find an answer to the standardization question. The variables selected with STEPWISE and with LASSO are completely different. I was under the impression that LASSO would select a robust subset of the STEPWISE choices, but that is not what is happening. LASSO does not select any of the dichotomous class variables, which in my dataset are all coded 0/1 and tend to have larger raw beta coefficients. I wonder if it's a matter of scale?

 

1. Does SAS standardize automatically, or do I need to standardize my variables before I run HPGENSELECT?

2. What about polynomials--do they need to be standardized as well?

3. Do predictors need to be centered?

 

Thank you.

2 REPLIES
Rick_SAS
SAS Super FREQ

PROC HPGENSELECT supports a NOCENTER option, which is documented as follows:

NOCENTER

requests that continuous main effects not be centered and scaled internally. (Continuous main effects are centered and scaled by default to aid in computing maximum likelihood estimates.) Parameter estimates and related statistics are always reported on the original scale.

 

From this you can infer that the default is to center and scale the continuous main effects, but you can turn off this feature.
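To make "centered and scaled internally, reported on the original scale" concrete, here is a minimal NumPy sketch. It is not SAS code and does not claim to reproduce what HPGENSELECT does internally; it uses ordinary least squares on made-up data as a stand-in, and every name in it (`ols`, `X`, `Z`, `back`) is hypothetical. It only shows the back-transformation that takes coefficients fitted on standardized predictors to the original scale.

```python
import numpy as np

# Synthetic data (assumption): two continuous predictors on very different scales.
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([rng.normal(50.0, 10.0, n), rng.normal(0.0, 0.5, n)])
y = 2.0 + 0.3 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0.0, 1.0, n)

def ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Fit on centered and scaled predictors.
mu, sd = X.mean(axis=0), X.std(axis=0)
Z = (X - mu) / sd
b_std = ols(Z, y)

# Map the standardized coefficients to the original scale.
slopes_orig = b_std[1:] / sd
intercept_orig = b_std[0] - slopes_orig @ mu
back = np.concatenate([[intercept_orig], slopes_orig])

# Fit directly on the raw predictors for comparison.
direct = ols(X, y)
print("back-transformed:", back)
print("direct fit:      ", direct)
```

For an unpenalized fit the two sets of estimates agree, so internal standardization of this kind is a numerical convenience. Whether and how it interacts with the LASSO penalty in HPGENSELECT is not something this sketch shows.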

 

You do not need to center the predictors yourself. The intercept term accounts for the center of the predictors, so centering changes the intercept estimate but not the slopes.
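The point about the intercept can be checked numerically. The following is a minimal sketch, assuming synthetic data and a plain, unpenalized logistic regression fitted by Newton's method in NumPy; it is not SAS code, and `fit_logistic` is a hypothetical helper written for this illustration. It fits the same model on raw and on centered predictors and compares the estimates.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Unpenalized logistic regression by Newton's method.

    Returns [intercept, slopes...]. Hypothetical helper for illustration.
    """
    A = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))   # fitted probabilities
        w = p * (1.0 - p)                        # Newton weights
        grad = A.T @ (y - p)                     # score vector
        hess = A.T @ (A * w[:, None])            # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Synthetic data (assumption): one continuous and one 0/1 predictor.
rng = np.random.default_rng(0)
n = 500
x_cont = rng.normal(50.0, 10.0, n)
x_bin = rng.integers(0, 2, n).astype(float)
X = np.column_stack([x_cont, x_bin])
eta = -5.0 + 0.1 * x_cont + 0.8 * x_bin
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

means = X.mean(axis=0)
raw = fit_logistic(X, y)
centered = fit_logistic(X - means, y)

print("slopes, raw:     ", raw[1:])
print("slopes, centered:", centered[1:])
print("intercept, raw:                 ", raw[0])
print("intercept, centered (shifted):  ", centered[0] - centered[1:] @ means)
```

The slopes match, and the raw intercept equals the centered intercept minus the slopes times the predictor means. This covers centering only; rescaling a predictor does change its slope, which is a separate question from the one answered here.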

 

Haris
Lapis Lazuli | Level 10
Thanks
