Hello fellow SAS users and SAS support,

I have been using HPGENSELECT with LASSO selection for a binary dependent variable, and was hoping for clarification regarding the details of the LASSO penalization method and the resulting coefficients. I will post my SAS code at the end.

My two questions are:

1. When HPGENSELECT has been called with the LASSO option and there are CLASS variables present, does it perform group LASSO optimization, in which the categories of a class variable are either all selected or all set to zero? This is in contrast to regular LASSO, in which some categories might have a non-zero coefficient but others do not; the fact that they belong to a single effect is ignored.

2. When I use the PARAM = GLM option in the CLASS statement, I seem to invoke less-than-full-rank parameterization of the categorical variables. This means that each level of a class variable gets a dummy variable and all dummy variables are entered into the model. This is not estimable for OLS or maximum likelihood, so a reference category is forced, but LASSO can handle overparameterized models. My question is, how does one then interpret the coefficients? Is it done by calculating the contrasts manually? For example, take my screenshot below of the parameter estimates on the log-odds scale. The variable "Location" has only 4 levels in the data, all of which are present in the fitted model. If one were interested in, say, comparing Locations 2 through 4 to Location 1 as a reference category, would you calculate the difference in estimates on the log-odds scale (e.g. 0.026 versus 0.074) and then exponentiate to obtain familiar odds ratios?

Thanks very much for any insight you can provide!

SAS code below, if it helps. Note that this is from SAS version 9.4, SAS/STAT 15.1.

PROC HPGENSELECT data=my_data LASSORHO=.80 LASSOSTEPS=20;
    WHERE location NOTIN (5,6);
    CLASS gender location Physiologic_difficult_AW <many more predictors>
        / param=GLM;
    MODEL Number_attempts =
        gender location Physiologic_difficult_AW <many more predictors>
        / DISTRIBUTION=BINARY;
    SELECTION METHOD=LASSO(CHOOSE=AIC) DETAILS=ALL;
RUN;
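
To make the second question concrete, here is the manual contrast I have in mind, written out as a sketch. I am assuming, purely for illustration, that 0.026 is the estimate for Location 1 and 0.074 is the estimate for Location 2; the symbols \hat\beta_1 and \hat\beta_2 are my own labels for those two log-odds estimates and do not come from the SAS output.

```latex
% Assumption (illustrative only): \hat\beta_1 = 0.026 is Location 1,
% \hat\beta_2 = 0.074 is Location 2, both on the log-odds scale.
\[
\mathrm{OR}_{2\ \mathrm{vs}\ 1}
  = \exp\!\left(\hat\beta_{2} - \hat\beta_{1}\right)
  = \exp(0.074 - 0.026)
  = \exp(0.048)
  \approx 1.049
\]
```

Whether this difference-then-exponentiate step is a valid way to read the LASSO estimates under PARAM=GLM is exactly what I am unsure about.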