Segmented multiple regression

02-07-2016 07:21 PM - edited 02-07-2016 09:04 PM

Hi,

I would like to do a segmented multiple regression with fixed effects (with and without interactions). Does SAS have any procedures to handle this task? I've read that PROC NLIN can do a segmented regression, but I can only find tutorials on segmented regression with a single predictor. I would be very grateful if you could point me to any examples on this issue. Links to tutorials would be nice too.

I'm currently working with SAS EG 6.1.

Many thanks,

Mai

Posted in reply to q5pham

02-10-2016 10:02 AM

Segmented regression is well defined for a single predictor. You can specify a number of breakpoints (single points) that separate the domain into disjoint segments on which to run a regression.

When you have more than one explanatory variable, segmented regression is not well defined. For example, with two variables, you can break up the domain into rectangles and solve the regression problem on each rectangle. But you could equally use triangles or regions bounded by squiggly curves.

The way to approach this topic for multivariate regression is to switch from segmented regression to local regression. In local regression, the predicted value at each point p is obtained by solving a regression problem that involves the data points that are close to p. Often a kernel function is used so that points close to p carry more weight than points far from p.

This method is called LOESS for LOcal EStimation. In SAS, it is supported by PROC LOESS. The documentation includes a two-dimensional example that you should look at.

There are other nonparametric regression techniques in SAS, but PROC LOESS seems to be most similar to segmented regression.
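
To make the local-regression idea above concrete, here is a minimal sketch in Python (not SAS, and not how PROC LOESS is implemented). It assumes a single predictor, a tricube kernel, and a fixed bandwidth, whereas LOESS implementations typically choose the neighborhood as a fraction of the nearest data points. The function names and the example data are made up for illustration.

```python
def tricube(u):
    """Tricube kernel: weight 1 at u = 0, falling to 0 once |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_predict(xs, ys, p, bandwidth):
    """Predict y at p from a kernel-weighted straight-line fit to nearby points."""
    w = [tricube((x - p) / bandwidth) for x in xs]
    sw = sum(w)
    if sw == 0:
        raise ValueError("no data points within the bandwidth of p")
    # Weighted means of x and y
    xbar = sum(wi * x for wi, x in zip(w, xs)) / sw
    ybar = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope (closed form for one predictor)
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return ybar + slope * (p - xbar)


# Hypothetical noise-free data with a change of slope at x = 5
xs = [i * 0.5 for i in range(21)]
ys = [x if x <= 5 else 5 + 3 * (x - 5) for x in xs]
print(local_linear_predict(xs, ys, 2.0, 2.0))  # fit uses only points left of the break
print(local_linear_predict(xs, ys, 8.0, 2.0))  # fit uses only points right of the break
```

Because each prediction uses only nearby points, the fit follows each segment without ever being told where the break is.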

Posted in reply to Rick_SAS

02-10-2016 12:38 PM - edited 02-15-2016 02:37 PM

I'll add on to Rick's reply about segmented multiple regression. Aside from partitioning the m-space of the predictor variables into the appropriate number of subdomains, you must also enforce both continuity and smoothness constraints along all points on the (m-1)*(m-2)/2 intersections (hope I have the right number there). These constraints may not be amenable to optimization.

Consider the LOESS approach, where "join contours" can be determined by looking at the gradient and Hessian of the final solution.

Steve Denham

EDIT: Well it wasn't the right number. If there are m predictor variables, the number of intersections is m*(m-1)/2.
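
As a quick check of the corrected formula, m*(m-1)/2 is the number of unordered pairs that can be formed from m items. The short Python sketch below enumerates the pairs for a hypothetical set of five predictor names and compares the count with the formula.

```python
from itertools import combinations


def pair_count(m):
    """Number of unordered pairs among m predictors: m*(m-1)/2."""
    return m * (m - 1) // 2


predictors = ["x1", "x2", "x3", "x4", "x5"]  # hypothetical names
pairs = list(combinations(predictors, 2))
print(len(pairs), pair_count(len(predictors)))
```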

Posted in reply to SteveDenham

02-10-2016 05:00 PM - edited 02-10-2016 05:06 PM

Hi Rick and Steve,

Thank you very much for your posts. I'll study LOESS in more detail. As I have just come across PROC ADAPTIVEREG, I would like to ask for your comments.

According to the SAS guide on ADAPTIVEREG (http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_adaptivereg_...), the LOESS and TPSPLINE procedures are limited to low-dimensional problems. My base model includes 5 continuous and 3 categorical predictor variables. I expect only one break in each of 2 continuous variables, and I assume each segment is linear. In addition, the response variable has linear relationships with the remaining 3 continuous variables. There might also be categorical-by-continuous interactions, which I need to study for a full model (1 categorical * 5 continuous variable interactions).

In my situation, which approach is the most feasible?

Many thanks,

Mai

Posted in reply to q5pham

02-11-2016 07:52 AM

PROC LOESS does not have a CLASS statement, and as you say it is best for low-dimensional problems. I agree that you should look at PROC ADAPTIVEREG, which is a nonparametric routine that can fit flexible models to data.

Now that you've explained your problem more, there might be another approach. You say that you only want piecewise linear functions for two of the variables. Look at using the EFFECT statement to create linear splines (DEGREE=1) for those two variables. IF YOU KNOW THE LOCATIONS for the breakpoints, you can use the KNOTMETHOD=LIST(..) option to specify the locations at which you want those variables split. You would want to use a truncated power basis function (degree=1) for the splines. Notice, however, that in this approach the knot positions are fixed; they are not parameters that are chosen by the procedure.

I've never done this before, but it might work.
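
To show what a degree-1 truncated power basis with a fixed knot does, here is a minimal sketch in Python (not SAS syntax, and not the output of the EFFECT statement). It assumes one predictor and one hypothetical knot at x = 4, and it fits by ordinary least squares through the normal equations. The helper names are invented for this illustration.

```python
def tpf_linear_basis(x, knots):
    """Design row [1, x, (x-k1)+, (x-k2)+, ...] for a degree-1 truncated power basis."""
    return [1.0, x] + [max(0.0, x - k) for k in knots]


def solve_least_squares(rows, ys):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * pv for v, pv in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


knots = [4.0]  # hypothetical, assumed known in advance
xs = [float(i) for i in range(11)]
# Noise-free example: slope +1 before the knot, slope -2 after it
ys = [2 + x - 3 * max(0.0, x - 4.0) for x in xs]
rows = [tpf_linear_basis(x, knots) for x in xs]
intercept, slope, slope_change = solve_least_squares(rows, ys)
print(intercept, slope, slope_change)
```

The coefficient on the (x - knot)+ column is the change in slope at the knot, so the fitted function is continuous and piecewise linear. Additional predictors would enter as extra columns in each design row.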

Posted in reply to q5pham

02-11-2016 08:07 AM

Hi Rick, thank you for your confirmation on LOESS. I will find out how this goes. I hope you can stay with me on the subject. Good night and talk to you soon.

Posted in reply to q5pham

02-16-2016 08:20 PM

Hi Rick and all again,

I have tried ADAPTIVEREG. The interpretation of the model without interactions between variables is straightforward. However, I am having a lot of trouble with the model with interactions (two-way only). The issue is that SAS automatically selects the model using stepwise regression, and it comes up with many unwanted interactions. I have checked the SAS manual but could not find an option that lets me define my own interactions. Have I missed something? If not, can any of you suggest a way to work around this problem? I have reduced the number of basis functions, but I am not sure that is a good approach, and several unwanted interactions still appear in my final model.

Thanks, Mai

Posted in reply to q5pham

02-17-2016 03:44 PM

Would using the KEEP= option in the MODEL statement enable you to look only at the interactions of interest? That fixes those terms in the model and, I would assume, the basis vectors describing them. If they were exhaustive of all the model terms, then I think you would have what you need.

Steve Denham

Posted in reply to SteveDenham

02-17-2016 05:03 PM - edited 02-17-2016 05:07 PM

David, KEEP= does not work with interactions. It only deals with variables. If I create a new variable for my interaction, I'm not sure the breakpoints would behave correctly.

Posted in reply to q5pham

02-18-2016 07:58 AM

What does your current MODEL statement look like? I think if you specifically add the interactions to the MODEL statement, then the KEEP= option should work, but I am not sure.

Steve Denham

Posted in reply to SteveDenham

02-18-2016 06:26 PM

Hi David,

My wanted model is:

Y = continuous1 continuous2 continuous3 continuous4 continuous5 categorical1 categorical2 categorical3 continuous1*categorical1 continuous2*categorical1 continuous3*categorical1 continuous4*categorical1 continuous5*categorical1 continuous1*continuous2

I expect there will be breaks in continuous1 and continuous2. There will be linear dependencies in the remaining 3 continuous variables.

What do you reckon? Thanks, Mai