spline variables in hpgenselect - group lasso

Posted 07-20-2018 06:56 PM
(1107 views)

I am running a PROC HPGENSELECT logistic model using lasso selection. I understand HPGENSELECT already uses group lasso methods for class variables. However, I additionally have two variables (the two pieces of a spline: one represents the continuous variable below X and the other represents the continuous variable above X) that I want to keep together. How can I do this, given that they are not part of the same class variable?

- Tags:
- HPGENSELECT
- lasso


Turning them into a class variable is one way to keep them together.

That said, it seems pointless to put two variables in the model where one is (continuous variable < X) and the other is (continuous variable > X). They are not telling you different things; they are telling you the same thing.

--

Paige Miller

Never mind, I forgot that you are using splines here.

--

Paige Miller

Since HPGENSELECT does not support the EFFECT statement, how did you generate the spline effects? Are they from a design matrix? If so, it seems like you can use the INCLUDE= option on the MODEL statement to force them both into the model.

If I am misunderstanding, please post your code so we can see what you are doing.

I generated the splines when I was checking the bivariate associations between my predictors and the dependent variable. I basically just split the X variable into 2 variables and am calling them splines; sorry for the confusion. I cannot use the INCLUDE= option in the MODEL statement because I don't want to force them into the model. I merely want to group them together so that either both or neither end up in the final selected model.

Let's see if I understand this. You had an original variable that I'll call X. You create new variables that are equal to X above or below some cutoff value and zero otherwise, like this:

Z1 = X*(X<0); /* assuming X=0 is cutoff value */

Z2 = X*(X>=0);

Your model includes Z1 and Z2. You want the final model to either include both Z1 and Z2 or include neither?

Am I close? If not, please post your code.

Precisely!

OK, great. The model

model Y = Z1 Z2 x2-x10;

is the same as the model

class C;

model Y = C*X x2-x10;

where C = (X>=0), using the same cutoff as Z1 and Z2.

In the first model (your situation), the Z1 and Z2 variables can enter or leave independently, whereas in the second model the C*X term, which carries both slopes, is either in or out as a unit. So all you need to do is define the binary class variable C and use the C*X term instead of the two continuous variables Z1 and Z2.
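
For reference, here is a minimal sketch of that setup, with X as the original continuous variable. The data set names (work.have, work.model_input) and the event level '1' are assumptions for illustration, and the cutoff at 0 is carried over from the Z1/Z2 definitions earlier in the thread; adjust all of them to your data.

```sas
/* Assumed input: work.have with binary response Y, the original
   continuous variable X, and the other predictors x2-x10.        */
data work.model_input;
   set work.have;
   C = (X >= 0);   /* 1 at or above the cutoff, 0 below it */
run;

proc hpgenselect data=work.model_input;
   class C;
   /* C*X yields one slope for X per level of C, which matches the
      Z1 and Z2 columns, but the two slopes form a single effect.  */
   model Y(event='1') = C*X x2-x10 / distribution=binary;
   selection method=lasso;
run;
```

Because C*X is one effect with two parameters, the selection step should treat the two slopes as a group, so the final model contains both or neither.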
