HPGENSELECT - LASSO- LOGISTIC

Contributor
Posts: 73

Hi,

 

I have now updated my SAS/STAT to 14.1, which includes LASSO selection in HPGENSELECT.

 

Has anyone tried to do a logistic regression with HPGENSELECT?

Is it possible?

I am having some problems with the syntax for performing a logistic regression in HPGENSELECT.

 

Thanks for all advice regarding this.

/Thomas

SAS Super FREQ
Posts: 3,304

Re: HPGENSELECT - LASSO- LOGISTIC

Please post the syntax that is giving you the error.  HPGENSELECT supports the DIST=BINARY and DIST=BINOMIAL options for logistic regression.  For example, the following statements work:

 

proc hpgenselect data=sashelp.class; 
   model sex(event="M") = height weight age / dist=binary;
   selection method=lasso;
run;
Contributor
Posts: 73

Re: HPGENSELECT - LASSO- LOGISTIC

Many thanks!

I had to add dist=binary, then it worked!

 

However, one additional question. If you only write selection=lasso, what is the default method for variable selection?

Is cross-validation included in HPGENSELECT with Lasso?

 

Thanks

Thomas

SAS Super FREQ
Posts: 3,304

Re: HPGENSELECT - LASSO- LOGISTIC

The HPGENSELECT documentation is online and answers all of these questions. Look at the SELECTION statement to see various defaults.

 

I don't understand your question about "the default method for variable selection." The LASSO method IS a variable-selection method, so the default method is LASSO.  If you are talking about the SELECT= option, that option is not valid for LASSO.

 

Yes, you can use the PARTITION statement in conjunction with LASSO to do cross validation.

Contributor
Posts: 73

Re: HPGENSELECT - LASSO- LOGISTIC

Thanks again!

 

Sorry, I was not clear in my previous question. Different methods (AIC, BIC, cross-validation) can be used to select an optimal value of the regularization parameter in LASSO.

I have seen some code examples where selection=LASSO(choose=sbc).

If you don't enter anything after LASSO (i.e., no choose option), which criterion does SAS use to select the regularization parameter?

 

Since LASSO is quite new in HPGENSELECT, I have not found any code examples of how to perform cross-validation in this procedure (this is the first time I am performing a LASSO regression).

 

Could this be the correct syntax?

 

proc hpgenselect data=sashelp.class;
   partition fraction(test=0.25 validate=0.25);
   model sex(event="M") = height weight age / dist=binary;
   selection method=lasso;
run;

 

 

/Thomas

 

 

SAS Super FREQ
Posts: 3,304

Re: HPGENSELECT - LASSO- LOGISTIC

If you run the statements that you propose, you will see a note in the log that says "ERROR: The TEST partition is not available for the LASSO method."   You can use the VALIDATE= option to compute the AIC, AICC, BIC, and ASE statistics on the validation data.


bollibompa wrote:

If you don't enter anything after LASSO (ie no choose option), which model does SAS use to estimate the regularization parameter?


Your question is answered in the documentation of the SELECTION statement, which I encourage you to read: "If you specify METHOD=LASSO and you do not specify either the CHOOSE= or STOP= option, then the model in the last LASSO step is chosen as the selected model."  

 

In my opinion, you should probably choose a CHOOSE= criterion. If you are going to specify a validation set, presumably you will want to use CHOOSE=VALIDATE.
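
As a rough sketch of that suggestion, the statements below reuse the sashelp.class example from earlier in the thread, drop the TEST= fraction that LASSO rejects, and add CHOOSE=VALIDATE. The 0.25 validation fraction is only an illustrative value carried over from the question, not a recommendation.

```sas
/* Sketch: LASSO logistic regression with a holdout validation set */
proc hpgenselect data=sashelp.class;
   /* Hold out about 25% of the rows for validation; TEST= is omitted
      because the TEST partition is not available for LASSO */
   partition fraction(validate=0.25);
   model sex(event="M") = height weight age / dist=binary;
   /* Use the validation data to choose among the LASSO steps */
   selection method=lasso(choose=validate);
run;
```

Note that sashelp.class has only 19 observations, so the validation partition here is very small; on real data a larger sample would be needed for the choice to be stable.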

 

By the way, if you add the DETAILS=ALL option to the SELECTION statement, then the output contains additional information that might help clarify what LASSO is doing at each step.
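
For instance, a minimal sketch of that option applied to the earlier sashelp.class example might look like this (the placement of DETAILS=ALL as a plain SELECTION statement option, with no slash, is my reading of the statement syntax; check the documentation if the log complains):

```sas
/* Sketch: request step-by-step output for the LASSO selection */
proc hpgenselect data=sashelp.class;
   model sex(event="M") = height weight age / dist=binary;
   /* DETAILS=ALL asks for additional output about each selection step */
   selection method=lasso details=all;
run;
```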

Contributor
Posts: 73

Re: HPGENSELECT - LASSO- LOGISTIC

Thanks again for your support!

/Thomas

SAS Super FREQ
Posts: 3,304

Re: HPGENSELECT - LASSO- LOGISTIC

You are welcome. If you think the question has been answered, please close the thread.
