q5pham
Obsidian | Level 7

Hi,

 

I would like to fit a segmented multiple regression with fixed effects (with and without interactions). Does SAS have any procedures that handle this task? I've read that PROC NLIN can do a segmented regression, but I can only find tutorials on segmented regression with a single predictor. I would very much appreciate it if you could point me to any examples on this issue. Links to tutorials would be nice too.

 

I'm currently working with SAS EG 6.1.

 

Many thanks,

Mai

10 REPLIES
Rick_SAS
SAS Super FREQ

Segmented regression is well defined for a single predictor. You can specify a number of breakpoints (single points) that separate the domain into disjoint segments on which to run a regression.
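For the single-predictor case, a minimal PROC NLIN sketch might look like the following. The data set work.mydata, the variables y and x, and all starting values are hypothetical placeholders rather than anything from the original question. The breakpoint c is estimated as a model parameter.

```sas
/* Sketch only: work.mydata, y, and x are assumed names.            */
/* Replace the starting values with guesses read off a scatter plot. */
proc nlin data=work.mydata;
   parms b0=0 b1=1 b2=-1 c=5;          /* c is the unknown breakpoint */
   /* Continuous piecewise-linear mean:                               */
   /* slope b1 to the left of c, slope b1+b2 to the right of c        */
   if x < c then mean = b0 + b1*x;
   else          mean = b0 + b1*x + b2*(x - c);
   model y = mean;
run;
```

Because the breakpoint enters the model nonlinearly, convergence can depend heavily on the starting values, so it may help to try several values of c.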

 

When you have more than one explanatory variable, segmented regression is not well defined. For example, with two variables, you can break up the domain into rectangles and solve the regression problem on each rectangle. But you can also use triangles or regions bounded by squiggly curves.

 

The way to approach this topic for multivariate regression is to switch from segmented regression to local regression.  In local regression, the predicted value at each point p is obtained by solving a regression problem that involves the data points that are close to p.  Often a kernel function is used so that points close to p carry more weight than points far from p. 

 

This method is called LOESS (short for LOcal regrESSion). In SAS, it is supported by PROC LOESS. The documentation includes a two-dimensional example that you should look at.
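As a rough illustration (this is not the documentation example), a local regression on two predictors could be requested as shown below. The names work.mydata, y, x1, and x2 are assumptions, and the smoothing parameter is an arbitrary starting value.

```sas
/* Sketch only: data set and variable names are hypothetical. */
proc loess data=work.mydata;
   /* SMOOTH= is the fraction of the data used in each local fit;  */
   /* DEGREE=1 requests locally linear (rather than quadratic) fits */
   model y = x1 x2 / smooth=0.5 degree=1;
run;
```

If SMOOTH= is omitted, the procedure chooses a smoothing parameter by its default selection criterion, which may be a better starting point than a hand-picked value.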

 

There are other nonparametric regression techniques in SAS, but PROC LOESS seems to be most similar to segmented regression.

SteveDenham
Jade | Level 19

I'll add on to Rick's reply about segmented multiple regression.  Aside from partitioning the m-space of the predictor variables into the appropriate number of subdomains, you must also enforce both continuity and smoothness constraints along all points on the (m-1)*(m-2)/2 intersections (hope I have the right number there).  These constraints may not be amenable to optimization.

 

Consider the loess approach, where "join contours" can be determined by looking at the gradient and hessian of the final solution.

 

Steve Denham

 

EDIT: Well it wasn't the right number.  If there are m predictor variables, the number of intersections is m*(m-1)/2.

q5pham
Obsidian | Level 7

Hi Rick and Steve,

 

Thank you very much for your posts. I'll study LOESS in more detail. As I have just come across PROC ADAPTIVEREG, I would like to ask for your comments on it.

 

According to the SAS guide on ADAPTIVEREG (http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_adaptivereg_...), the LOESS and TPSPLINE procedures are limited to problems in low dimensions. My base model includes 5 continuous and 3 categorical predictor variables. I expect only one break in each of 2 continuous variables, and I assume each segment is linear. In addition, the response variable should have linear relationships with the remaining 3 continuous variables. There might be categorical-by-continuous interactions, which I need to study for a full model (1 categorical variable interacting with each of the 5 continuous variables).

 

In my situation, which approach is the most feasible to look at?

 

Many thanks,

Mai

Rick_SAS
SAS Super FREQ

PROC LOESS does not have a CLASS statement, and as you say it is best for low-dimensional problems. I agree that you should look at PROC ADAPTIVEREG, which is a nonparametric routine that can fit flexible models to data.
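A minimal PROC ADAPTIVEREG sketch for a problem of this shape might look like the following. The data set work.mydata and the variable names (y, x1 to x5 for the continuous predictors, c1 to c3 for the categorical ones) are hypothetical. The procedure chooses the knots and basis functions itself, so the fitted breaks are not guaranteed to be the ones you expect.

```sas
/* Sketch only: data set and variable names are assumptions. */
proc adaptivereg data=work.mydata;
   class c1 c2 c3;
   /* ADDITIVE restricts the fit to main-effect basis functions.   */
   /* Replacing it with MAXORDER=2 would let the procedure consider */
   /* two-way interactions, but it decides which ones to include.   */
   model y = x1 x2 x3 x4 x5 c1 c2 c3 / additive;
run;
```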

 

Now that you've explained your problem more, there might be another approach. You say that you only want piecewise linear functions for two of the variables. Look at using the EFFECT statement to create linear splines (DEGREE=1) for those two variables. IF YOU KNOW THE LOCATIONS of the breakpoints, you can use the KNOTMETHOD=LIST(..) option to specify where you want those variables split. You would want to use a truncated power basis (degree=1) for the splines. Notice, however, that in this approach the knot positions are fixed, not parameters that are chosen by the procedure.
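A sketch of this idea follows, assuming PROC GLMSELECT as the host procedure (it is one of the procedures that support the EFFECT statement; SELECTION=NONE makes it fit the full model). The data set work.mydata, the variables y and x1 to x5, and the knot values 10 and 20 are placeholders.

```sas
/* Sketch only: names and knot locations are hypothetical. */
proc glmselect data=work.mydata;
   /* Linear spline with one fixed knot per variable.              */
   /* BASIS=TPF(NOINT) gives the columns x and max(x - knot, 0),   */
   /* so the second coefficient is the change in slope at the knot. */
   effect spl1 = spline(x1 / degree=1 basis=tpf(noint) knotmethod=list(10));
   effect spl2 = spline(x2 / degree=1 basis=tpf(noint) knotmethod=list(20));
   model y = spl1 spl2 x3 x4 x5 / selection=none;
run;
```

Adding the DETAILS option inside SPLINE( ) prints the knots that were used, which is a quick check that the basis is the one intended.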

 

I've never done this before, but it might work.

q5pham
Obsidian | Level 7
Hi Rick, Thank you for your confirmation on LOESS. I will find out how this goes. I hope you can stay with me on the subject. Good night and talk to you soon.
q5pham
Obsidian | Level 7

Hi Rick and all again,

 

I have tried PROC ADAPTIVEREG. The interpretation of the model without interactions between variables is straightforward. However, I am having a lot of trouble with the model that has interactions (two-way only). The issue is that SAS automatically selects the model using stepwise regression, and it comes up with quite a few unwanted interactions. I have checked the SAS manual but could not find an option that lets me define my own interactions. Have I missed something? If not, can any of you suggest a way to work around this problem? I have reduced the number of basis functions, but I am not sure that is a good approach, and several unwanted interactions still appear in my final model.

 

Thanks, Mai

SteveDenham
Jade | Level 19

Would using the KEEP= option in the MODEL statement enable you to look only at the interactions of interest?  That fixes those terms in the model, and I would assume the basis vectors describing them.  If they were exhaustive of all the model terms, then I think you would have what you need.

 

Steve Denham

q5pham
Obsidian | Level 7

David, KEEP= does not work with interactions. It only deals with variables. If I create a new variable for my interaction, I'm not sure the break points would behave correctly.

SteveDenham
Jade | Level 19

What does your current MODEL statement look like?  I think if you specifically add the interactions to the MODEL statement, then the KEEP= option should work, but I am not sure.

 

Steve Denham

q5pham
Obsidian | Level 7

Hi David,

 

My wanted model is:

 

Y = continuous1 continuous2 continuous3 continuous4 continuous5
    Categorical1 categorical2 categorical3
    continuous1*Categorical1 continuous2*categorical1 continuous3*categorical1 continuous4*categorical1 continuous5*categorical1
    continuous1*continuous2

 

I expect there will be breaks in continuous1 and continuous2. The relationships with the remaining 3 continuous variables should be linear.

 

How do you reckon? Thanks, Mai
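One way the model above might be written with fixed-knot linear splines is sketched here. This assumes the breakpoints are known in advance and uses PROC GLMSELECT with SELECTION=NONE. The data set name and the knot values 10 and 20 are placeholders; the variable names are the ones listed in the model above. It is a starting point under those assumptions, not a tested solution.

```sas
/* Sketch only: work.mydata and the knot locations are hypothetical. */
proc glmselect data=work.mydata;
   class categorical1 categorical2 categorical3;
   /* Piecewise-linear effects for the two variables with breaks */
   effect spl1 = spline(continuous1 / degree=1 basis=tpf(noint) knotmethod=list(10));
   effect spl2 = spline(continuous2 / degree=1 basis=tpf(noint) knotmethod=list(20));
   model Y = spl1 spl2 continuous3 continuous4 continuous5
             categorical1 categorical2 categorical3
             /* categorical-by-continuous interactions */
             spl1*categorical1 spl2*categorical1
             continuous3*categorical1 continuous4*categorical1
             continuous5*categorical1
             /* continuous1-by-continuous2 interaction, built from the spline columns */
             spl1*spl2
             / selection=none;
run;
```

Because only the listed terms enter the model, no unwanted interactions are introduced. The trade-off is that the break locations must be supplied rather than estimated.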

