sas_question123
Calcite | Level 5

Hello. 

 

I am new to LASSO regression. I am using it to select variables, and I then plan to fit (1) a binary and (2) a multinomial regression with the selected variables to see how well they predict the outcomes of interest. Below is the code for the LASSO with a binary outcome:

 

/* LASSO selection for a binary outcome; the model targets P(outcome_binary = 1) */
proc hpgenselect data=dataset;
   model outcome_binary (event="1") =
         var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
         var11 var12 var13 var14 var15 var16 var17 var18 var19 var20
         var21 var22 var23 var24 var25 var26 var27 var28 var29
         / link=logit dist=binary;
   /* AICC both picks the final model (CHOOSE=) and halts the selection (STOP=) */
   selection method=lasso(choose=aicc stop=aicc) details=all;
run;

 

Is it correct? I believe it is, but I would love it if someone more experienced could confirm. Also, to run a LASSO for a 5-level categorical outcome with one level as the reference, would I just need to change the link and distribution to link=GLOGIT dist=multinomial?

 

Some of the models show that only the intercept was selected and nothing else. What might cause that? Thank you!

3 REPLIES
SteveDenham
Jade | Level 19

Last question first (as it is the only one I feel comfortable answering):

 

The intercept-only model would be chosen when none of the candidate models along the LASSO path improves on (that is, has a lower AICC than) the intercept-only model.
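
As background for this answer, here is the small-sample corrected AIC in its commonly cited form. This is only a sketch: n (number of observations) and p (number of estimated parameters) are symbols introduced here, and the exact definition HPGENSELECT uses should be checked in its documentation.

```latex
\mathrm{AICC} = -2\log L + 2p + \frac{2p(p+1)}{n-p-1}
              = -2\log L + \frac{2pn}{n-p-1}
```

Because the penalty term grows faster than 2p when n is small relative to p, each added effect has to raise the log-likelihood by enough to offset it. If no step on the path does so, the intercept-only model keeps the lowest AICC.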

 

SteveDenham

sas_question123
Calcite | Level 5

Thank you! Any tips for changing the model specifications, such as the "choose" or "stop" option? Sample size is about 100.

StatDave
SAS Super FREQ

Yes, that is a reasonable way to use the LASSO method. See Example 3 in this note, which shows the use of HPGENSELECT to perform LASSO selection and discusses shrinkage methods more generally. See also the linked Gunes paper. It discusses LASSO and related methods further, though not in the context of categorical response models. And yes, you can use the LASSO method with a nominal multinomial model by specifying DIST=MULT and LINK=GLOGIT.
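
A minimal sketch of the multinomial version, adapted from the binary code in the original question. The response name outcome_5level and the reference level "1" are hypothetical placeholders, and the REF= response option is assumed to behave here as it does in other SAS modeling procedures, so check it against the HPGENSELECT documentation before relying on it.

```sas
/* Sketch only: outcome_5level and ref="1" are hypothetical placeholders */
proc hpgenselect data=dataset;
   /* Generalized logit: each non-reference level is compared with the reference level */
   model outcome_5level (ref="1") =
         var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
         var11 var12 var13 var14 var15 var16 var17 var18 var19 var20
         var21 var22 var23 var24 var25 var26 var27 var28 var29
         / dist=multinomial link=glogit;
   /* Same LASSO settings as in the binary model */
   selection method=lasso(choose=aicc stop=aicc) details=all;
run;
```

With a generalized logit link, each predictor gets a separate coefficient for every non-reference level, so the number of candidate parameters is several times larger than in the binary model. With a small sample, that makes an intercept-only result more likely.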
