Reshi
Calcite | Level 5

Hi,

I was looking at a coding example in Ramon Littell's book 'SAS for Mixed Models', where he is looking at an interaction between a continuous (hour) and a categorical (drug) variable in the contrast statement. I don't understand why (within each contrast) he first specifies the main effect, e.g. "drug 1 -1 0", before specifying the drug*hour interaction. What is the purpose of this?

contrast_statement.png

Here is another example of someone doing the same thing in an estimate statement (source: SAS Library: How do I handle interactions of continuous and categorical variables?):

estimate_statement.png

Can anyone shed some light on why this is necessary, i.e. why the contrast/estimate statements don't simply contain the interaction of interest, without the main effect?

1 ACCEPTED SOLUTION
PaigeMiller
Diamond | Level 26

This is needed because of the way SAS has chosen to parameterize the model with regard to classification factors (sometimes called the "effects model").

The interaction has to have all of the effects shown, and if you left out the main effect for drug (or diet in the second example), you would be told that the effect is non-estimable. You can have PROC GLM display which functions are estimable by using the E option in the MODEL statement.
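To make this concrete for the first example, here is the effects model written out, assuming both drug and hour are class effects. The symbols are illustrative notation of my own, not taken from the book.

```latex
% Effects model: drug i, hour j, replicate k
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}

% Cell mean for drug i at hour j
\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}

% Drug 1 versus drug 2 at the same hour j
\mu_{1j} - \mu_{2j} = (\alpha_1 - \alpha_2)
                    + \left[ (\alpha\beta)_{1j} - (\alpha\beta)_{2j} \right]
```

The difference of two cell means contains both pieces: the coefficients "drug 1 -1 0" supply the first term and the drug*hour coefficients supply the second. The bracketed interaction term on its own is not a difference of cell means.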

There are other parameterizations of the model (such as the "means model", which you may have learned somewhere and which is used in other software) where this would not be necessary. But SAS doesn't use this parameterization. You are pretty much stuck with the parameterization that SAS gives you, and thus you need to include the main effects in the ESTIMATE statement, as shown by the E option in the MODEL statement.

This reference goes into all the gory details: "Analysis of Messy Data", Milliken and Johnson, Van Nostrand Reinhold, 1984.

--
Paige Miller

4 REPLIES 4
JohnW_
Calcite | Level 5

Try to calculate and/or graph your estimated means by hand using the parameters of your model, with a different line for each level of your main effect. With an interaction, the difference between levels of your main effect depends on the value of your continuous variable. When you have more than two levels of your main effect, you need to specify which two you are contrasting, because you are estimating a different slope for your continuous variable at each level of your main effect.
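As a sketch of that hand calculation, assume a model with one class effect (index i, e.g. diet), one continuous covariate x (e.g. height), and their interaction. The notation is illustrative and not taken from either source.

```latex
% Heterogeneous-slopes model: each level i has its own intercept and slope
E[y \mid i, x] = \mu + \alpha_i + \beta x + \gamma_i x

% Level 1 versus level 2 at a chosen covariate value x_0
E[y \mid 1, x_0] - E[y \mid 2, x_0]
    = (\alpha_1 - \alpha_2) + x_0 \, (\gamma_1 - \gamma_2)
```

The first term is carried by the main-effect coefficients (1 -1 0) and the second by the interaction coefficients (x_0 -x_0 0), which is why the comparison changes as x_0 changes.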

The following pages may be of help to you.  It always helps me to write out the function I am estimating based on its parameterization, and then the components of the two means I am trying to compare.

Positional and Nonpositional Syntax for Coefficients in Linear Functions :: SAS/STAT(R) 13.1 User's ...

Specification of ESTIMATE Expressions :: SAS/STAT(R) 13.1 User's Guide

SteveDenham
Jade | Level 19

Also note that in the first example, the variable hour is being fit as a categorical variable, with multiple levels. It is NOT being treated as a continuous variable. In the second example, height is a continuous variable, and heterogeneous slopes are fit by diet. This results in testing the differences between diets at a low, median, and high value of the continuous covariate.
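A minimal sketch of the heterogeneous-slopes setup follows. The data set name WORK.DIETSTUDY, the response WEIGHT, the assumption of three diet levels, and the height value 60 are all hypothetical placeholders and are not taken from the original example.

```sas
/* Sketch only: WORK.DIETSTUDY, WEIGHT, three diet levels and
   height = 60 are assumed for illustration. */
proc glm data=work.dietstudy;
  class diet;
  /* Separate intercept and slope for each diet.
     E prints the general form of the estimable functions. */
  model weight = diet height diet*height / solution e;
  /* Diet 1 minus diet 2 at height = 60:
     (alpha1 - alpha2) + 60*(gamma1 - gamma2) */
  estimate 'diet 1 vs 2 at height 60'
           diet         1  -1  0
           diet*height 60 -60  0 / e;
run;
quit;
```

Repeating the ESTIMATE statement with a different height value in place of 60 gives the comparison at that value. Omitting the diet coefficients would drop the intercept difference from the comparison.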

You may wish to examine the use of the LSMESTIMATE statement. It avoids this sort of thing where all levels need to be specified.

Steve Denham

Jay4
Calcite | Level 5

Thanks Steve. Will LSMESTIMATE work for continuous variables? It says "Only class variables allowed in this effect."
