Hi All,
I have been working on a lasso logistic regression with a binary response and 20 predictor variables (a mix of categorical and continuous). I have read a lot about using the GLMSELECT procedure, coding the outcome as ±1, and applying a cutoff (usually 0) to the predictions. I also tried the HPGENSELECT procedure, but it throws a format error and no predictor variables are displayed in the results, so I want to try PROC GLMSELECT instead.
My question is: what does the phrase 'applying a cutoff (usually 0) to the predictions' mean? Can anyone shed more light on this approach, for example with sample SAS code where it has been applied?
Thank you for your time and support.
If the outcomes are coded ±1, then the cutoff of 0 is applied to the predicted values to decide whether the regression classifies an observation as –1 or +1: a prediction above 0 is treated as +1, and a prediction below 0 as –1.
Using binary responses in PROC GLMSELECT is not truly a logistic regression.
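To make the cutoff idea concrete, here is a minimal sketch of what it could look like in SAS. It assumes the response dep is coded 0/1 in mybinarydata, as in the question; the names y, p_y, pred_class, the work data sets, and the short predictor list (cov1, cov2 as categorical, c11, c12 as continuous) are placeholders for illustration, not the poster's actual variables. As noted above, this fits a linear model to a ±1 outcome, so it is not a true logistic regression.

```sas
/* Recode the 0/1 response to -1/+1 (assumes dep takes values 0 and 1). */
data work.pm1;
   set mybinarydata;
   if dep = 1 then y = 1;
   else if dep = 0 then y = -1;
run;

/* Lasso selection on the +/-1 outcome; predictor names are placeholders. */
proc glmselect data=work.pm1;
   class cov1 cov2;
   model y = cov1 cov2 c11 c12 / selection=lasso(choose=sbc stop=none);
   output out=work.scored predicted=p_y;
run;

/* Apply the cutoff of 0 to the predicted values. */
data work.classified;
   set work.scored;
   if missing(p_y) then pred_class = .;
   else if p_y > 0 then pred_class = 1;
   else pred_class = -1;
run;

/* Cross-tabulate observed versus predicted class. */
proc freq data=work.classified;
   tables y*pred_class;
run;
```

The PROC FREQ table then shows how often the sign of the prediction agrees with the observed ±1 outcome. A cutoff other than 0 could be substituted in the last DATA step if the two classes are very unbalanced.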
PROC HPGENSELECT does have the Lasso for use with logistic regression (and really for use with many generalized linear models).
"I also tried using HPGENSELECT procedure and somehow it throws the Format error"
We can't help with this minimal explanation. We need to see the log (not just the error message but the entire PROC HPGENSELECT part of the log). Please copy the relevant parts of the log as text, and then paste it into the window that appears when you click on the {i} icon.
Hi,
The SAS code is:
proc hpgenselect data=mybinarydata;
   where dep ne . and treat =a;
   class cov1 cov2 cov3 cov4..........c10;
   model dep(event='1') = cov1 cov2 cov3 cov4..........c20 / dist=binary;
   selection method=Lasso(choose=SBC) details=all;
   performance details;
run;
and the SAS log has the following:
NOTE: The HPGENSELECT procedure is executing in single-machine mode.
ERROR: The analytical component failed to load a format.
Log/Listing results for thread 1.
ERROR: The format xyz could not be loaded for variable cov1.
NOTE: The SAS System stopped processing this step because of errors.
The procedure runs but reports this error, and the final results contain only the intercept. I am not sure what has gone wrong. Any help would be appreciated.
Thanks.
Run PROC CONTENTS on the data set mybinarydata. Is there a format listed for COV1? If so, then you have to make that format available in your code.
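A rough sketch of that check and two possible ways to resolve it follows. The libref mylib, its path, and the data set work.nofmt are hypothetical; where the format xyz is actually stored is not known from the log, so adjust accordingly.

```sas
/* Check which formats are attached to the variables. */
proc contents data=mybinarydata;
run;

/* Option 1: tell SAS where to find the format catalog.         */
/* mylib and its path are hypothetical; use the library that    */
/* actually contains the catalog holding the xyz format.        */
libname mylib 'path-to-format-catalog';
options fmtsearch=(mylib work);

/* Option 2: if the format is not available, work on a copy of  */
/* the data with the format removed from cov1.                  */
data work.nofmt;
   set mybinarydata;
   format cov1;   /* a FORMAT statement with no format name removes it */
run;
```

With option 2, the CLASS levels of cov1 are formed from the raw values instead of the formatted ones, which may change how levels are grouped. If other variables carry formats that cannot be found, they would need the same treatment.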