## PROC GLM Interaction Confusion


Hi all.

I have two 2-level independent variables: Gender (0 and 1) and Group (0 and 1) and a continuous DV: Score.

For my full model, I ran the following analysis:

    PROC GLM DATA = mydata;
        CLASS Gender Group;
        MODEL Score = Gender Group Gender*Group;
    RUN;

Just out of interest, I calculated an interaction between the two variables myself (using the formula "Interaction = Gender*Group") and ran this model:

    PROC GLM DATA = mydata;
        CLASS Gender Group Interaction; *It makes no difference if "Interaction" is in the class section;
        MODEL Score = Gender Group Interaction;
    RUN;

The weird thing is that these produced different results! The results for the interaction were the same in each, but the individual main effects were very different. The reason that I am interested in this is that with PROC REG, the latter kind of model is the only possible one. And indeed, the results from PROC REG match the results from the second model above, and not the first (more automatic) one.

What gives? Am I calculating the interaction incorrectly?

Best,

Fearghal

Accepted Solution
04-29-2015 10:22 PM

## Re: PROC GLM Interaction Confusion

The asterisk in a MODEL statement does not mean the arithmetic product of the variable values. For CLASS variables, Gender*Group stands for every combination of the class levels, and the procedure generates one dummy variable per combination in the design matrix. In your hand-computed product, the cells (Gender, Group) = (0, 0), (0, 1) and (1, 0) all collapse to Interaction = 0, so only the (1, 1) cell is distinguished. To get the proper dummy variables in your design matrix, use PROC GLMMOD:

    PROC GLMMOD DATA = myData outParm=parms outDesign=myDesign;
        CLASS Gender Group;
        MODEL Score = Gender Group Gender*Group;
    RUN;

Check out the parms and myDesign data sets: parms maps each design column to an effect and level, and myDesign holds the columns themselves. You can then use the myDesign data set with PROC REG, if you wish.
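As a rough sketch of that last step, the design columns can be passed to PROC REG by name. This assumes the default GLMMOD naming (Col1, Col2, ...), that Col1 is the intercept, and that all four Gender-by-Group cells are present so the design has nine columns. Check the parms data set before relying on the Col2-Col9 range.

```sas
/* Sketch only: column names and count are assumptions; verify them first. */
proc print data=parms;          /* shows which ColN belongs to which effect and level */
run;

proc contents data=myDesign;    /* confirms the Col1..ColN variables and Score exist */
run;

proc reg data=myDesign;
   /* PROC REG adds its own intercept, so Col1 is left out.            */
   /* The GLM coding is overparameterized, so REG is expected to note  */
   /* that the model is not full rank and to set some estimates to 0.  */
   model Score = Col2-Col9;
run;
quit;
```

Because of the rank deficiency, the individual column estimates depend on which columns REG zeroes out. The overall fit and predicted cell means are the part that should agree with PROC GLM.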

PG



## Re: PROC GLM Interaction Confusion

Thanks for that really clear response, PG. Given that the asterisk in the GLM syntax doesn't literally mean "the product of", does this mean that, in general, one should not create an interaction term in a regression model as a simple product? I feel like people do this all the time. Or is my situation just a special case where the product and the asterisk syntax in GLM give different results?


## Re: PROC GLM Interaction Confusion

Interactions between continuous variables can be created as simple products (XY = X*Y). Interactions between CLASS variables should not.
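For two-level factors specifically, one way to see the difference is through the coding. With 0/1 coding, the coefficient on Gender in a model that also contains the product is the Gender difference at Group = 0, not the difference averaged over both groups, which is what the Type III main-effect test in PROC GLM addresses. A minimal sketch, assuming Gender and Group are numeric 0/1 as in the original question, recodes them to -1/+1 before taking the product. The names coded, g_eff, gr_eff and gxg are hypothetical.

```sas
/* Sketch only: effect (sum-to-zero) coding for two 2-level factors. */
data coded;
   set mydata;
   g_eff  = 2*Gender - 1;       /* level 0 -> -1, level 1 -> +1 */
   gr_eff = 2*Group  - 1;       /* level 0 -> -1, level 1 -> +1 */
   gxg    = g_eff * gr_eff;     /* product of the effect-coded columns */
run;

proc reg data=coded;
   model Score = g_eff gr_eff gxg;
run;
quit;
```

Under this coding, the squared t statistics from PROC REG should line up with the Type III F tests from the PROC GLM model that uses CLASS Gender Group and Gender*Group. This shortcut applies only to two-level factors; a factor with more levels needs one column per non-reference level, which is what the CLASS statement handles.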

PG


## Re: PROC GLM Interaction Confusion

That's really interesting and good to note. Thanks!


## Re: PROC GLM Interaction Confusion

For more details, check the documentation on GLM parameterization in the SAS/STAT(R) 9.3 User's Guide.

PG
