JamesLin
Fluorite | Level 6

 

Hello:

 

My question is whether it’s possible to compute LS-means as defined in this SAS algorithm when the design matrix is not in GLM form. In particular, in R, if one feeds that design to model.matrix(), reference (Ref) coding is used by default, so A occupies 2 columns, B occupies 1 column, A*B occupies 2 columns, and C takes 1 column.

 

The meaning of the columns in GLM and Ref coding is quite different. For instance, if we have just one factor with two levels, A1 and A2, then in GLM coding the intercept corresponds to the response averaged over A1 and A2, but in Ref coding the intercept corresponds to the response at A1.

 

Apparently, the column space of GLM and Ref is the same, so I wonder if there is a way to represent each GLM column as a linear combination of Ref columns. In that case, one could take the LSM vector defined in GLM terms and apply it to a model fitted in Ref coding. Likewise, a contrast vector can be set as a difference of two LSM vectors and then estimated in Ref coding.
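A short sketch of why such an adapter should exist, assuming the two design matrices really do span the same column space. The symbols X_G (GLM design), X_R (Ref design), M, L and L_R are introduced here for illustration only and do not come from the SAS documentation. The working direction turns out to be the reverse of the one proposed above: write each Ref column as a combination of GLM columns, X_R = X_G M.

```latex
X_R = X_G M
\;\Longrightarrow\;
X_R \hat\beta_R = X_G \,(M \hat\beta_R),
\qquad\text{so } \tilde\beta_G = M \hat\beta_R
\text{ is one solution of the GLM normal equations.}

\text{For an estimable } L:\qquad
L^\top \tilde\beta_G = L^\top M \hat\beta_R = (M^\top L)^\top \hat\beta_R
\;\Longrightarrow\;
L_R = M^\top L .
```

M is not unique because X_G is rank deficient, but for an estimable L the product L'M is the same for every valid M, since an estimable function takes the same value at every solution of the normal equations.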

 

I know there are some R packages, such as "contrast", that can take a fitted model in Ref coding and a contrast or LS-means specification in string format, like "LSM(A1, B1)" or "A1 vs A2". However, here I am interested in a more specific solution that assumes we have already implemented the SAS algorithm for the LSM vector. I am looking for a GLM->Ref "adapter" of sorts.

 

Thanks,

James

2 REPLIES
StatDave
SAS Super FREQ

Reference parameterization (PARAM=REF option) is just a nonsingular (full-rank) version of GLM parameterization (PARAM=GLM option). The unrestricted parameters (those not constrained to zero) using GLM parameterization are the same as the parameters using reference parameterization and have the same interpretation. In your example, the intercept estimate from both parameterizations is the same and is interpreted as the effect at A2 (by default). And the interpretation of the A1 parameter is the difference in effect of the A1 and A2 levels under both parameterizations. See this section on the available parameterizations for more.

JamesLin
Fluorite | Level 6
One subtlety is that, by default, model.matrix() in R does not use the same reference level as the PARAM=REF option in SAS. For a factor with levels A1 and A2, R uses A1 (the first level) as the reference, while SAS uses A2 (the last level).

So, the question is how to represent each column of GLM matrix as a linear combination of columns from Ref matrix. I suspect one should solve a certain linear system for that (apparently, the answer will depend on what level is used for reference).
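A minimal sketch of one possible adapter, in plain Python, for the one-factor example in this thread. It assumes dummy (reference) coding, where every Ref column is identical to some GLM column, so the matrix M with X_ref = X_glm M is a 0/1 selection matrix and no general linear solve is needed. The same should hold for interaction columns, since a product of two dummies is a cell indicator. The function names glm_to_ref_map and adapt are hypothetical, and the LSM vector is assumed to be estimable.

```python
def glm_to_ref_map(X_glm, X_ref):
    """Return a 0/1 matrix M (p_glm x p_ref) such that X_ref = X_glm @ M.

    Assumption: dummy (reference) coding, so every Ref column is
    identical to one of the GLM columns.
    """
    glm_cols = list(zip(*X_glm))   # columns as tuples
    ref_cols = list(zip(*X_ref))
    M = [[0] * len(ref_cols) for _ in glm_cols]
    for r, col in enumerate(ref_cols):
        g = glm_cols.index(col)    # ValueError if no identical GLM column exists
        M[g][r] = 1
    return M


def adapt(L_glm, M):
    """Map a GLM-coded coefficient vector to Ref coding: L_ref = M' L_glm."""
    return [sum(L_glm[g] * M[g][r] for g in range(len(M)))
            for r in range(len(M[0]))]


# Three observations with factor levels A1, A1, A2.
X_glm = [(1, 1, 0), (1, 1, 0), (1, 0, 1)]   # Intercept, A1, A2
X_ref_r = [(1, 0), (1, 0), (1, 1)]          # Intercept, A2 (A1 is reference, as in R)
X_ref_sas = [(1, 1), (1, 1), (1, 0)]        # Intercept, A1 (A2 is reference, as in SAS)

lsm_a1 = [1, 1, 0]        # GLM-form LS-mean vector for A1
a1_vs_a2 = [0, 1, -1]     # GLM-form contrast A1 - A2

M_r = glm_to_ref_map(X_glm, X_ref_r)
M_sas = glm_to_ref_map(X_glm, X_ref_sas)

print(adapt(lsm_a1, M_r), adapt(a1_vs_a2, M_r))
print(adapt(lsm_a1, M_sas), adapt(a1_vs_a2, M_sas))
```

With A1 as the reference, the LS-mean for A1 reduces to the intercept alone and the A1 - A2 contrast becomes minus the A2 coefficient. With A2 as the reference, the same GLM vectors map to intercept plus the A1 coefficient, and to the A1 coefficient itself. So the converted vector does depend on the reference level, while the GLM-form input stays unchanged.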
