Hi all.
I have two 2-level independent variables: Gender (0 and 1) and Group (0 and 1) and a continuous DV: Score.
For my full model, I ran the following analysis:
PROC GLM DATA = mydata;
CLASS Gender Group;
MODEL Score = Gender Group Gender*Group;
RUN;
Just out of interest, I calculated an interaction between the two variables myself (using the formula "Interaction = Gender*Group") and ran this model:
PROC GLM DATA = mydata;
CLASS Gender Group Interaction; *It makes no difference if "Interaction" is in the class section;
MODEL Score = Gender Group Interaction;
RUN;
The weird thing is that these produced different results! The results for the interaction were the same in each, but the individual main effects were very different. The reason that I am interested in this is that with PROC REG, the latter kind of model is the only possible one. And indeed, the results from PROC REG match the results from the second model above, and not the first (more automatic) one.
What gives? Am I calculating the interaction incorrectly?
Best,
Fearghal
The asterisk in the interaction syntax doesn't really mean the product of the variable values. For CLASS variables, it implies the combinations of the class variable levels and generates multiple dummy variables in the design matrix. In your calculation of the interaction, the cells (Gender; Group) = (0;0), (0;1) and (1;0) all correspond to Interaction=0. To get the proper dummy variables in your design matrix, use PROC GLMMOD:
PROC GLMMOD DATA = myData outParm=parms outDesign=myDesign;
CLASS Gender Group;
MODEL Score = Gender Group Gender*Group;
RUN;
Check out the parms and myDesign datasets. You can then use the myDesign dataset with proc reg, if you wish.
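For anyone who wants to see this concretely, here is a minimal sketch, not tested against your data. The first step lists the four Gender-by-Group cells next to the hand-made product; the second feeds the GLMMOD design columns to PROC REG. The data set name cells and the column range Col2-Col9 are assumptions: a 2x2 layout with all four cells present should give an intercept in Col1 plus eight dummy columns, but check the parms data set for the actual mapping before relying on it.

```sas
/* Step 1: enumerate the four cells and the hand-made product.     */
/* "cells" is a hypothetical scratch data set for illustration.    */
data cells;
  do Gender = 0 to 1;
    do Group = 0 to 1;
      Interaction = Gender*Group;  /* equals 1 only when both are 1 */
      output;
    end;
  end;
run;

proc print data=cells noobs;
run;

/* Step 2: see which design column belongs to which effect level.  */
proc print data=parms;
run;

/* Step 3: fit the GLMMOD design columns in PROC REG.              */
/* Col1 is the intercept, which PROC REG adds itself.              */
/* Col2-Col9 is an assumption; adjust it to match parms.           */
proc reg data=myDesign;
  model Score = Col2-Col9;
run;
quit;
```

Because the full set of dummy columns is linearly dependent, PROC REG should report that the model is not of full rank and set some estimates to zero. That is expected with this parameterization.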
PG
Thanks for that really clear response, PG. Given that the asterisk in the GLM syntax doesn't literally mean "the product of", does this mean that, in general, one should not create an interaction in a regression model as a simple product? I feel like people do this all the time. Or is my situation just a special case where the product and the asterisk syntax in GLM give different results?
Interactions between continuous variables can be created as simple products (XY = X*Y;). Interactions between CLASS variables should not.
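As a rough sketch of the continuous case: X and Y below are hypothetical continuous predictors (they are not in the original data), and mydata2 is a made-up data set name. With no CLASS statement, the X*Y term in PROC GLM is treated as the ordinary product, so the two models below should give the same fit.

```sas
/* Hypothetical example: X and Y are assumed continuous variables. */
data mydata2;
  set mydata;
  XY = X*Y;              /* hand-made product term */
run;

/* PROC REG needs the product computed beforehand. */
proc reg data=mydata2;
  model Score = X Y XY;
run;
quit;

/* PROC GLM builds the same product itself when X and Y are not    */
/* listed in a CLASS statement.                                    */
proc glm data=mydata2;
  model Score = X Y X*Y;
run;
quit;
```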
PG
That's really interesting and good to note. Thanks!
Check the documentation for GLM parameterization in the SAS/STAT(R) 9.3 User's Guide for more details.
PG