Contributor
Posts: 30

# Help with Regression & Interaction term

Hi Everyone,

I am using PROC SURVEYREG on unbalanced panel data. Here is my model:

Y = X1 + X2 + X3 + C + X1*X2 + X1*X3 + X2*X3 + X1*X2*X3 (no intercept)

where:

X1 and X2 are dummy variables

X3 is a continuous variable

C is a set of control variables

If I create percentile groups based on X3 (low, medium, and high) and want to estimate the interaction between X1, X2, and X3, how would I do that? More specifically, do I still include the continuous variable X3 and then add dummy variables for the percentile groups, such as LowX3, MedX3, and HighX3? What about the interaction terms?

The rock

Accepted Solutions
Solution
02-23-2016 06:57 PM
Frequent Contributor
Posts: 140

## Re: Help with Regression & Interaction term

First, I would not categorize a continuous variable into low, medium, and high without a very good reason. Categorizing a continuous variable is nearly always a bad idea. See my blog post "The Perils of Categorizing Continuous Variables".

Second, if you really need to do this, then your continuous variable becomes a categorical one. Since it has 3 levels, it will have 2 dummy codes, not 3.

Third, since PROC SURVEYREG has a CLASS statement, you don't need to create the dummy variables yourself; just list the categorical variables on the CLASS statement.
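
A minimal sketch of what that could look like, assuming X3 is cut into tertiles with PROC RANK and the grouped variable replaces the continuous one. The names `mydata`, `ranked`, `id`, `x3grp`, `c1`, and `c2` are placeholders, and the CLUSTER variable is only a guess at the panel design:

```sas
/* Sketch only: mydata, id, y, x1, x2, x3, c1 and c2 are placeholder names. */

/* Split X3 into tertiles: x3grp = 0 (low), 1 (medium), 2 (high). */
proc rank data=mydata groups=3 out=ranked;
   var x3;
   ranks x3grp;
run;

proc surveyreg data=ranked;
   /* Variables listed on CLASS are dummy coded by the procedure. */
   class x1 x2 x3grp;
   /* Assumption: rows from the same panel unit share an ID. */
   cluster id;
   /* SOLUTION prints the parameter estimates when CLASS is used. */
   model y = x1 x2 x3grp
             x1*x2 x1*x3grp x2*x3grp
             x1*x2*x3grp
             c1 c2 / solution;
run;
```

With `x3grp` on the CLASS statement, its main effect and every interaction it appears in are expanded into the necessary dummy columns automatically, with one level acting as the reference.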

Finally, it's usually not a good idea to have models without an intercept.

All Replies
SAS Employee
Posts: 505

## Re: Help with Regression & Interaction term

PROC SURVEYREG is part of SAS/STAT, so you will probably get a better response by posting in the SAS Statistical Procedures community:

https://communities.sas.com/t5/SAS-Statistical-Procedures/bd-p/statistical_procedures

Contributor
Posts: 30

## Re: Help with Regression & Interaction term

Hi Peter,

I looked over your reply and blog post, and they make perfect sense. The reason I am not using an intercept is that I have too many CLASS variables in the procedure. Another researcher in my field also did not use an intercept.

I want, and would prefer, to use the continuous variable. My biggest problem is interpreting the continuous variable within the interaction term. For example, in the original model, if I get a "positive and significant" coefficient for the X1*X2*X3 term, that would suggest that a large X3 value is beneficial. On the other hand, a "negative and significant" coefficient for the X1*X2*X3 term would suggest that a large X3 value is not beneficial. What about samples with low values of X3? What about samples with mid-range values of X3?

How would you interpret and find the effect of low levels of X3 in the equation?

Thanks,

therock

Frequent Contributor
Posts: 140

## Re: Help with Regression & Interaction term

That interpretation of three-way interactions is too simple. The best way to interpret interactions is graphically. You can also show the predicted Y value at various combinations of the X values.
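
To see why the sign of the X1\*X2\*X3 coefficient alone is not enough, it may help to write the original model out. This is only an illustration: the β symbols are notation introduced here, X1 and X2 are assumed to be coded 0/1, and the control variables are dropped for brevity.

$$Y = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3 + \beta_{123} X_1 X_2 X_3$$

The slope of Y with respect to X3 then depends on the group:

$$\frac{\partial Y}{\partial X_3} = \beta_3 + \beta_{13} X_1 + \beta_{23} X_2 + \beta_{123} X_1 X_2$$

So the slope is $\beta_3$ when $X_1 = X_2 = 0$ and $\beta_3 + \beta_{13} + \beta_{23} + \beta_{123}$ when $X_1 = X_2 = 1$. The three-way coefficient is a difference of differences in slopes, $\beta_{123} = s_{11} - s_{10} - s_{01} + s_{00}$, where $s_{jk}$ is the X3 slope for $X_1 = j$, $X_2 = k$. A positive $\beta_{123}$ can therefore coexist with a negative X3 slope in every group. If the variables go on a CLASS statement instead, the reference levels SAS chooses change which group each coefficient refers to, but the same logic applies.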

I also don't see how the number of class variables changes whether you need an intercept term.

Contributor
Posts: 30

## Re: Help with Regression & Interaction term

Since the other two terms in my interaction are dummy variables, would that make a difference? For example, the first is gender and the second is single-parent status. My continuous variable is the weight of the child, and my dependent variable is school GPA. Given this, I want to know the effect on GPA for a male child of a single parent at low and at high weight. How would you find that?

Frequent Contributor
Posts: 140

## Re: Help with Regression & Interaction term

I would use a three-way interaction if you think it's important, and then make a graph with predicted GPA on the y axis, weight on the x axis, and 4 lines, one for each combination of gender and single-parent status.
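
One possible way to get such a graph, sketched under several assumptions: the data set and variable names (`kids`, `id`, `gpa`, `male`, `single`, `weight`, `gpa_fit`) are placeholders, the CLUSTER variable is a guess at the panel design, and it relies on saving the fit with the STORE statement and plotting it with the EFFECTPLOT statement in PROC PLM, so check that your SAS/STAT release supports both for PROC SURVEYREG:

```sas
/* Sketch only: kids, id, gpa, male, single and weight are placeholder names. */
ods graphics on;

proc surveyreg data=kids;
   class male single;
   /* Assumption: repeated rows for the same child share an ID. */
   cluster id;
   model gpa = male single weight
               male*single male*weight single*weight
               male*single*weight / solution;
   /* Save the fitted model so PROC PLM can plot it. */
   store gpa_fit;
run;

proc plm restore=gpa_fit;
   /* One panel per level of single, one fitted line per level of male, */
   /* with confidence limits for the predicted mean (CLM).              */
   effectplot slicefit(x=weight sliceby=male plotby=single) / clm;
run;
```

This draws the four fitted lines as two panels of two lines each rather than on a single set of axes. Reading the predicted GPA at a low and a high weight along the male, single-parent line gives the comparison asked about above.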

Contributor
Posts: 30

## Re: Help with Regression & Interaction term

How would you code the graph? I have never coded a graph.
Also, I don't want predicted values; I want to know the significance. How would I see the significance on the graph?
Thanks!

Frequent Contributor
Posts: 140

## Re: Help with Regression & Interaction term

Graphs in SAS are their own thing; there's too much to learn to put in one post.

And you ought to be interested in predicted values, but you get significance right from the output. However, significance and p-values are way overrated.

Contributor
Posts: 30

## Re: Help with Regression & Interaction term

I guess that makes sense. I will try the graphing and see what comes out of it. Thanks!
☑ This topic is solved.