Output constraint parameter from PROC GLMSELECT?

Posted 09-28-2015 01:45 PM
I am currently fiddling around with the LASSO, adaptive LASSO, and elastic net methods in PROC GLMSELECT in SAS 9.4.

One issue that I am running up against, however, is that there doesn't seem to be an option for GLMSELECT to actually display the selected value of the regularization/constraint parameter used in these techniques. For all three methods, you can either provide a value for the penalization parameter explicitly or let SAS determine it automatically. For example, for elastic net, if you don't specify a value for L2 (the ridge regression penalty parameter), SAS searches for the optimal value of L2 over a range according to the specified CHOOSE method.

However, as far as I can find, SAS never displays or outputs the value of this constraint parameter. This is problematic for a number of reasons. The SAS documentation notes that you should specify the value of L2 if you have a good estimate of what the constraint parameter should be, but SAS provides no way to obtain such an estimate. And if you want to use the model averaging functionality of GLMSELECT in combination with the elastic net method, you MUST specify a value of L2 (if you don't, SAS returns an error).

Ideally, you would be able to run GLMSELECT once with elastic net to determine an optimal value of L2 to then plug into the model averaging. However, I cannot find anything in the standard output or documentation that makes this possible. So, am I missing something? Is there some way to force SAS to report the explicit values of the regularization/constraint parameters that it necessarily estimates as part of these penalized regression methods? If not, what is a sensible way to determine reasonable values in situations where they must be provided explicitly?

8 REPLIES

This looks like a work in progress to me, but here are some of the steps I would try:

1. Specify the L2SEARCH= option explicitly.

2. Turn on ODS tracing (ODS TRACE ON).

3. Check the log for the names of output tables that may contain the values.

I can't guarantee anything at this point, but that is where I would start.

Steve Denham

Steve,

That's a great idea! Unfortunately, I haven't found anything particularly fruitful following that advice, but it was a good idea, and there is still the possibility that some obscure option I haven't explored yet will sneakily output it to one of those datasets. Hopefully someone at SAS will see this thread and make it more straightforward for us!

Thanks,

Ryan

You can always contact SAS technical support. They will give you an answer in 24 hours.

Thanks, lvm. I did contact SAS technical support, and will update this thread with any solutions.

In the meantime, I have two ideas for how to overcome the problem, but they each have issues of their own:

1) It is possible to use ridge regression in PROC REG. Since the L2= specification in elastic net is a ridge regression parameter, it may be possible to tune the ridge regression in PROC REG and then carry the value over to PROC GLMSELECT. Doing so seems to give reasonable results. However, PROC REG does not have a built-in method for optimizing the ridge parameter; it simply fits the model at each value in a specified range and reports the results. Further, this is fairly ad hoc, since there is no guarantee that the optimal parameter in a ridge regression is equivalent to the optimal L2 parameter in the elastic net setting.

2) Using the R glmnet package, one can run an elastic net regression and cross-validate to get optimal values for the various parameters. I figured this might be a decent way to get initial values to feed into PROC GLMSELECT (because, after all, for a variety of reasons I want to keep the majority of the analysis in SAS). However, the glmnet package uses a different parameterization of the elastic net than PROC GLMSELECT, and trying to "convert" them doesn't result in sensible values.

The R package has one lambda parameter and an alpha parameter that describes the amount of "mixing" (i.e., the weights given to the ridge and LASSO penalties in the elastic net). In theory, lambda*alpha should give you L2 (and lambda*(1-alpha) should give you L1), but the values I get from R in such a fashion and plug into SAS (with alpha=0.5) give me radically different results. Without the ability to check the value of L1 that SAS is using (and with GLMSELECT apparently not allowing you to specify both an L1 and an L2 value), I don't see how to cross-reference.
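
One possible source of the mismatch is scaling. glmnet's documented Gaussian objective is RSS/(2N) + lambda*[alpha*||b||_1 + (1-alpha)/2*||b||_2^2], so alpha weights the LASSO term, (1-alpha)/2 weights the ridge term, and the residual sum of squares is divided by 2N. The sketch below converts glmnet's (lambda, alpha) into penalties on an unscaled objective of the form RSS + L1*||b||_1 + L2*||b||_2^2. That target form is an assumption about what PROC GLMSELECT uses and should be checked against the SAS documentation; the function and argument names are hypothetical. glmnet also standardizes predictors by default, which this conversion does not account for.

```python
def glmnet_to_unscaled_penalties(lam, alpha, n_obs):
    """Convert glmnet's (lambda, alpha) to penalties on an unscaled RSS objective.

    glmnet minimizes:
        RSS / (2 * n_obs) + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2)
    Multiplying through by 2 * n_obs gives the equivalent objective:
        RSS + l1 * ||b||_1 + l2 * ||b||_2^2
    ASSUMPTION: the target procedure uses this unscaled form.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if lam < 0 or n_obs <= 0:
        raise ValueError("lam must be non-negative and n_obs positive")
    l1 = 2.0 * n_obs * lam * alpha      # weight on the sum of |b_j|
    l2 = n_obs * lam * (1.0 - alpha)    # weight on the sum of b_j^2
    return l1, l2


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any real fit.
    l1, l2 = glmnet_to_unscaled_penalties(lam=0.1, alpha=0.5, n_obs=100)
    print(f"L1 = {l1:.4f}, L2 = {l2:.4f}")
```

Under these assumptions the converted penalties grow with the sample size, so plugging lambda*alpha or lambda*(1-alpha) into SAS directly could be off by a factor on the order of N.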

Did you ever find a solution to this problem?

Hello,

Did you ever find a solution? I have the same issue.

Thanks

I don't think one was ever developed, so I will try again (after 5 years). This is based on the use of the ENSCALE option, which rescales the parameters by a factor of (1 + L2). If you fit the naive elastic net (without rescaling) and save the model parameters, then repeat the elastic net with the ENSCALE option and save those parameters, you could match up the included variables and construct the ratio of the betas. From that and a little (actually very little) algebra, you could calculate the final L2 value used. This could be verified by running the same comparison with fixed L2 values.

Just an idea. I haven't even begun to try to actually implement this.

SteveDenham
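
The algebra in that last suggestion can be sketched as follows. If the rescaled coefficient equals (1 + L2) times the naive one, as the post states, then L2 = (rescaled / naive) - 1 for every selected variable. This is a minimal illustration, not a tested SAS workflow: the function name, the dictionary inputs, and the choice to skip a term named "Intercept" are all assumptions, and it only makes sense if both fits chose the same L2 and selected the same variables.

```python
from statistics import median


def estimate_l2_from_enscale(naive, rescaled, exclude=("Intercept",), tol=1e-12):
    """Back out L2 from naive vs. ENSCALE-rescaled coefficients.

    naive, rescaled: dicts mapping variable name -> coefficient estimate.
    ASSUMPTION: rescaled[name] == (1 + L2) * naive[name] for selected terms,
    and the intercept (if present) is not rescaled, so it is skipped.
    Returns (median estimate, per-variable estimates).
    """
    per_variable = {}
    for name, b_naive in naive.items():
        if name in exclude or name not in rescaled:
            continue
        if abs(b_naive) <= tol:
            continue  # variable not selected; the ratio is undefined
        per_variable[name] = rescaled[name] / b_naive - 1.0
    if not per_variable:
        raise ValueError("no common non-zero coefficients to compare")
    return median(per_variable.values()), per_variable


if __name__ == "__main__":
    # Made-up coefficients for illustration only.
    naive = {"Intercept": 1.0, "x1": 2.0, "x2": -0.5, "x3": 0.0}
    rescaled = {"Intercept": 1.0, "x1": 2.5, "x2": -0.625, "x3": 0.0}
    l2, detail = estimate_l2_from_enscale(naive, rescaled)
    print(l2, detail)
```

If the per-variable estimates disagree noticeably, that would suggest the two fits are not comparable (for example, different selected steps) or that the rescaling assumption does not hold as stated.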
