therock
Calcite | Level 5

Hi Everyone,

 

I am using PROC SURVEYREG on unbalanced panel data. Here is my model:

 

Y = X1 + X2 + X3 + C + X1*X2 + X1*X3 + X2*X3 + X1*X2*X3 (no intercept)

where:

X1 and X2 are dummy variables

X3 is a continuous variable

C are control variables

 

If I create percentile groups based on X3 (low, medium, and high) and want to find the interaction between X1, X2, and X3, how would I do that? More specifically, do I still include the continuous variable X3 and then add dummy variables for the percentile groups, such as LowX3, MedX3, and HighX3? What about the interaction term?

 

Thanks for your suggestion,

The rock 

1 ACCEPTED SOLUTION

plf515
Lapis Lazuli | Level 10

First, I would not categorize a continuous variable into low, medium and high without very good reason.  Categorizing a continuous variable is nearly always a bad idea.  See my blog post The Perils of Categorizing Continuous Variables.

 

Second, if you really need to do this, then your continuous variable becomes a categorical one. Since it has 3 levels, it will have 2 dummy codes, not 3.

 

Third, since PROC SURVEYREG contains a CLASS statement, you don't need to dummy code things, just put them on the CLASS statement:
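
A minimal sketch of what that might look like, assuming X3 is first split into three groups with PROC RANK. The names mydata, ranked, y, x1, x2, x3, x3grp, c1, c2, and id are placeholders, and clustering on a panel identifier is an assumption about the design, not something stated in the question.

```sas
/* Sketch only: all data set and variable names are placeholders. */

/* Split continuous X3 into three groups: 0 = low, 1 = medium, 2 = high. */
proc rank data=mydata groups=3 out=ranked;
   var x3;
   ranks x3grp;
run;

/* List the categorical predictors on CLASS; SAS builds the dummy codes. */
proc surveyreg data=ranked;
   class x1 x2 x3grp;
   cluster id;   /* assumption: id identifies the panel unit */
   model y = x1 x2 x3grp
             x1*x2 x1*x3grp x2*x3grp
             x1*x2*x3grp
             c1 c2 / solution;
run;
```

With the intercept left in, the three-level x3grp contributes two estimable dummy codes, with the remaining level acting as the reference.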

 

Finally, it's usually not a good idea to have models without an intercept.

9 REPLIES
RobPratt
SAS Super FREQ

PROC SURVEYREG is part of SAS/STAT, so you will probably get a better response by posting in the SAS Statistical Procedures community:

https://communities.sas.com/t5/SAS-Statistical-Procedures/bd-p/statistical_procedures

therock
Calcite | Level 5

Hi Peter,

 

I looked over your reply and blog post, and they make perfect sense. The reason I am not using an intercept is that I have too many CLASS variables in the procedure. Another researcher in my field also did not use an intercept.

 

I would prefer to keep X3 continuous. My biggest problem is interpreting the continuous variable within the interaction term. For example, in the original model, a positive and significant coefficient on X1*X2*X3 would seem to suggest that a large X3 value is beneficial, while a negative and significant coefficient would suggest that a large X3 value is not beneficial. But what about observations with low values of X3? What about mid-range values?

 

How would you interpret and find the effect of low levels of X3 in the equation?

 

Thanks,

therock

plf515
Lapis Lazuli | Level 10

That interpretation of three-way interactions is too simple.  The best way to interpret the meaning of interactions is graphically.  You can also show the predicted Y value at various combinations of the different X values.
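
One way to see why the sign of the three-way coefficient alone is not enough is to write out the slope of X3 implied by the model. This is a sketch under the assumption that X1 and X2 enter as numeric 0/1 dummies; the beta and gamma symbols are notation introduced here, and the intercept term drops out if the model is fit without one.

```latex
% Assumed notation: \beta_* are the fitted coefficients, X_1 and X_2 are 0/1 dummies.
\[
E[Y] = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3
     + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3
     + \beta_{123} X_1 X_2 X_3 + \gamma C
\]
% Slope of X_3, which depends on the values of the two dummies:
\[
\frac{\partial E[Y]}{\partial X_3}
  = \beta_3 + \beta_{13} X_1 + \beta_{23} X_2 + \beta_{123} X_1 X_2
\]
% Slope in each of the four groups (X_1, X_2):
\[
(0,0):\ \beta_3 \qquad
(1,0):\ \beta_3 + \beta_{13} \qquad
(0,1):\ \beta_3 + \beta_{23} \qquad
(1,1):\ \beta_3 + \beta_{13} + \beta_{23} + \beta_{123}
\]
```

So the slope of X3 in the (1,1) group is a sum of four coefficients and can be negative even when the three-way coefficient is positive. Within any one group the slope is the same at low, medium, and high X3; what changes along X3 is the predicted Y, which is why predicted values at chosen X3 values are informative.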

 

I also don't see how the number of class variables changes whether you need an intercept term.

therock
Calcite | Level 5
Since the other two terms in my interaction are dummy variables, would that make a difference? For example, the first term is gender and the second is single-parent status. My continuous variable is the weight of the child, and my dependent variable is school GPA. Given this situation, I want to know the effect on GPA for a male child of a single parent at low and at high weight. How would you find that?
plf515
Lapis Lazuli | Level 10

I would use a three-way interaction if you think it's important, and then make a graph with predicted GPA on the y axis, weight on the x axis, and 4 lines: one for each combination of gender and single parent.
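
A rough sketch of how such a graph might be produced, assuming a data set kids with variables gpa, male (1 = male, 0 = female), single (1 = single parent, 0 = otherwise), weight, and child_id. All of these names and codings are hypothetical. Control variables are left out here so that the predicted values fall on four straight lines; with controls in the model, they would need to be held at fixed values before plotting.

```sas
/* Sketch only: data set name, variable names, and 0/1 codings are assumptions. */
proc surveyreg data=kids;
   class male single;
   cluster child_id;   /* assumption: child_id identifies the panel unit */
   model gpa = male single weight
               male*single male*weight single*weight
               male*single*weight / solution;
   output out=pred predicted=gpa_hat;   /* save predicted GPA per observation */
run;

/* Build one label per gender-by-parent combination. */
data pred;
   set pred;
   length grp $ 40;
   grp = catx(', ', ifc(male = 1, 'male', 'female'),
                    ifc(single = 1, 'single parent', 'two parents'));
run;

/* Sort so each line is drawn left to right along weight. */
proc sort data=pred;
   by grp weight;
run;

proc sgplot data=pred;
   series x=weight y=gpa_hat / group=grp;
   xaxis label='Weight';
   yaxis label='Predicted GPA';
run;
```

Reading across a line shows how predicted GPA changes from low to high weight within one group; comparing lines shows how that change differs between groups.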

therock
Calcite | Level 5
How would you code the graph? I have never coded a graph.
Also, I don't want to find predicted values. I want to know the significance. How would I see the significance on the graph?
Thanks!
plf515
Lapis Lazuli | Level 10

Graphs in SAS are their own thing; there's too much to learn to put in one post.

And you ought to be interested in predicted values, but you get significance right from the output.  However, significance and p-values are way overrated.

therock
Calcite | Level 5
I guess that makes sense. I will try the graphing and see what comes out of it. Thanks!
