HPGENSELECT - LASSO - LOGISTIC

11-19-2015 04:35 AM

Hi,

I have now updated my SAS/STAT to 14.1, which includes LASSO selection in HPGENSELECT.

Has anyone tried to do a **logistic** regression with HPGENSELECT?

Is it possible?

I have some problems with the syntax for performing a logistic regression in HPGENSELECT.

Thanks for all advice regarding this.

/Thomas

Accepted Solutions

Solution

07-06-2017 09:01 AM

11-19-2015 04:01 PM

Please post the syntax that is giving you the error. HPGENSELECT supports the DIST=BINARY and DIST=BINOMIAL options for logistic regression. For example, the following statements work:

```
proc hpgenselect data=sashelp.class;
model sex(event="M") = height weight age / dist=binary;
selection method=lasso;
run;
```
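For grouped data, DIST=BINOMIAL is typically paired with the events/trials form of the MODEL statement. The following is a minimal sketch, not taken from the thread: the data set `work.dose_trials` and its variables `dose`, `r`, and `n` are hypothetical, and it assumes your release of HPGENSELECT accepts the events/trials syntax.

```
/* Hypothetical grouped data: r events observed out of n trials at each dose */
data work.dose_trials;
   input dose r n;
   datalines;
1 2 20
2 5 20
3 9 20
4 14 20
5 17 20
;
run;

/* Events/trials response with a binomial distribution (assumed syntax) */
proc hpgenselect data=work.dose_trials;
   model r/n = dose / dist=binomial;
run;
```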

All Replies

11-22-2015 07:30 AM

Many thanks!

I had to add dist=binary, then it worked!

However, one additional question. If you only write selection=lasso, what is the default method for variable selection?

Is cross-validation included in HPGENSELECT with Lasso?

Thanks

Thomas

11-22-2015 12:16 PM

The HPGENSELECT documentation is online and answers all of these questions. Look at the SELECTION statement to see various defaults.

I don't understand your question about "the default method for variable selection." The LASSO method IS a variable-selection method, so the default method is LASSO. If you are talking about the SELECT= option, that option is not valid for LASSO.

Yes, you can use the PARTITION statement in conjunction with LASSO to do cross validation.

11-22-2015 03:19 PM

Thanks again!

Sorry, I was not clear in my previous question. Different methods (AIC, BIC, cross-validation) can be used to select an optimal value of the regularization parameter in LASSO.

I have seen some code examples where selection=LASSO(choose=sbc).

If you don't enter anything after LASSO (ie no choose option), which model does SAS use to estimate the regularization parameter?

Since LASSO is quite new in HPGENSELECT, I have not found any code examples showing how to perform cross-validation in this procedure (this is the first time I am performing a LASSO regression).

Could this be the correct syntax?

```
proc hpgenselect data=sashelp.class;
partition fraction(test=0.25 validate=0.25);
model sex(event="M") = height weight age / dist=binary;
selection method=lasso;
run;
```

/Thomas

11-23-2015 08:37 AM

If you run the statements that you propose, you will see a note in the log that says "ERROR: The TEST partition is not available for the LASSO method." You can use the VALIDATE= option to compute the AIC, AICC, BIC, and ASE statistics on the validation data.

bollibompa wrote: If you don't enter anything after LASSO (ie no choose option), which model does SAS use to estimate the regularization

Your question is answered in the documentation of the SELECTION statement, which I encourage you to read: "If you specify METHOD=LASSO and you do not specify either the CHOOSE= or STOP= option, then the model in the last LASSO step is chosen as the selected model."
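To avoid relying on the last LASSO step, the criterion can be named explicitly as a method option. This is a minimal sketch that assumes CHOOSE=BIC is an accepted keyword in your release (BIC is one of the statistics mentioned above; the SELECTION statement documentation lists the exact keywords):

```
/* Sketch: choose the LASSO step with the best BIC instead of the final step */
proc hpgenselect data=sashelp.class;
   model sex(event="M") = height weight age / dist=binary;
   selection method=lasso(choose=bic);
run;
```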

In my opinion, you should probably choose a CHOOSE= criterion. If you are going to specify a validation set, presumably you will want to use CHOOSE=VALIDATE.

By the way, if you add the DETAILS=ALL option to the SELECTION statement, then the output contains additional information that might help clarify what LASSO is doing at each step.
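Putting these suggestions together, a corrected version of the proposed program might look like the sketch below. It drops the TEST= fraction that triggers the error, keeps a validation partition, selects the step with CHOOSE=VALIDATE, and requests DETAILS=ALL. The 0.25 fraction is carried over from the question, and with only 19 observations in sashelp.class the validation set is very small, so treat this purely as a syntax illustration:

```
/* Sketch: LASSO with a validation partition (no TEST= fraction) */
proc hpgenselect data=sashelp.class;
   partition fraction(validate=0.25);
   model sex(event="M") = height weight age / dist=binary;
   selection method=lasso(choose=validate) details=all;
run;
```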

12-08-2015 02:59 AM

Thanks again for your support!

/Thomas

12-08-2015 08:40 AM

You are welcome. If you think the question has been answered, please close the thread.