I am doing LASSO using HPGENSELECT. Below is the code that I am using.
PROC HPGENSELECT DATA=library.Bigdata LASSORHO= 0.80 LASSOSTEPS= 50;
PARTITION roleVar=Group(train='Group1' validate='Group2');
CLASS VAR1 VAR2;
MODEL MASLD (descending) = VAR1 VAR2 ...... VAR(X) / DISTRIBUTION=BINARY;
SELECTION METHOD=LASSO (CHOOSE=VALIDATE STOP=None) DETAILS=ALL;
by _imputation_;
ods output ParameterEstimates=data;
run;
quit;
I am facing a couple of problems here. LASSO selects as optimal the model with the lowest ASE, which is always the last step (in this case, the 50th step). If I increase LASSOSTEPS to 100, it still selects the last (100th) step as the optimal model, and it keeps doing that however much I increase LASSOSTEPS.

As I increase LASSOSTEPS, the number of variables selected into the model also increases, and at around LASSOSTEPS=80 SAS selects all 45 variables into the optimal model. So it is basically not doing any further variable selection by the time it reaches step 80, yet the last step (say, the 100th) is still selected as the optimal model. Does anyone know why it is doing that?

Also, the number of variables that SAS selects varies with the values of LASSORHO and LASSOSTEPS. Is there a way to determine the optimal values of these options for my model?

This is my reference: https://www.mwsug.org/proceedings/2017/AA/MWSUG-2017-AA02.pdf
I also use the SAS documentation for HPGENSELECT as a reference. Can anyone help with this? Thank you.