Confusion regarding LS Means Interpretation in GLIMMIX

PhilaRose · Posted 02-20-2015 01:05 PM

Hello SAS Community,

I am new to PROC GLIMMIX and I am struggling with an output interpretation that I assumed would be straightforward. I have a dataset that is looking at reproductive output of bee colonies (RS). In my model, I have a treatment effect (trt) and two continuous covariates (cov1 and cov2). The response variable is lognormally distributed, so I have employed the lognormal distribution in my GLIMMIX procedure (using default link function "id"). My code is below.

proc glimmix order=data data=colony_repro plots = residualpanel;

title 'GLIMMIX for RS w/ cov1 and cov2';

class trt;

model RS = trt cov1 cov2 / dist=lognormal s ddfm = satterthwaite;

lsmeans trt / cl;

The trouble I am having is with my LS MEANS. I took the outputted LS Means for my treatments and associated 95% CLMS and back-transformed them (10^MEAN). Looking at the values in their original scale, they do not seem correct, and the 95% CLMs are enormous.

I ran this same model in Proc GLM with the log10-transformed response variable (log10(RS)). The LS Means are very different and the confidence limits are much smaller. This leads me to believe that I have either coded something incorrectly or that I am not appropriately interpreting my LS Means output (e.g. it's not correct to apply the back-transformation as I did).

Any thoughts on what I am doing wrong? Thank you for your time!

lvm · Posted 02-20-2015 08:29 PM

GLIMMIX uses a base-e log, not a base-10 log as the default link with the lognormal. Thus, you would use e^link to get the mean in a data step.You can also get the inverse link directly:

lsmeans trt / ilink cl;

But your text says that you used an 'id' link, which I assume you mean identity. This is not the default, and your model statement would give you the default link. If you actually used an identity link, then you would not have any inverse link to calculate.

PhilaRose · Posted 02-21-2015 12:24 PM

Thank you for your response! Using base-e fixed the problem. It's always nice when the fix is simple!

I did mean 'identity' for the link function. In my output, this link function is specified even though I did not code a link function specifically, so I assumed it was the default setting. Can you expand upon your statement "if you actually used an identity link then you would not have any inverse link to calculate"?

lvm · Posted 02-21-2015 01:23 PM

I wasn't very clear in my answer about the link. Sorry. The link refers to g(mu) function, where mu is the expected value of Y. In general, the identify link indicates no transformation of the mean, so that g(mu) = mu. Thus, the inverse link also does not require any transformation. But the log-normal distribution is a special case, compared to all the other distributions handled by glimmix. (I wasn't thinking about this).

SAS/STAT(R) 9.3 User's Guide

Note that the documentation puts "log-normal” in quotation marks. This is because this distribution is treated differently from the rest of the distributions. With the log-normal, Z=log(Y) has a normal distribution. So, the procedure (apparently) takes the base-e log of Y and fits the normal distribution to log(Y). And it uses an identity link on mu, where mu is the mean of Z=log(Y), not the mean of Y. The inverse link would just be mu, also; that is, the mean of log(Y). An inverse link function in an estimate statement would not help with this distribution (just gives you the same value as g(mu)). This is different from all other distributions. All other distributions fitted by glimmix are directly for the Y random variable, not a transformation of Y. Take the gamma distribution, which is for right-skewed distributions (just like the log-normal). The distribution is for Y, and the link is log(mu), where mu is the mean of Y (not the mean of log(Y)). An inverse link would give you mu (the mean of Y).

Check out the following. One gets the same answer for a log-normal distribution fit to Y, as for a normal distribution fit to Z = log(Y). This equivalence is only for the log-normal.

data l;

input y @@;

z = log(y);

datalines;

1 2 3 3 3 2 2 1 .5 .1 2 2 2 2 2 2 1 1 1 1 4 5 3 2 2 2

;

proc glimmix data=l;

model y = / s dist=lognormal;

estimate 'intercept' int 1 / cl ilink;

run;

proc glimmix data=l;

model z = / s dist=normal;

estimate 'intercept' int 1 / cl ilink;

run;

If you want a central-type statistic on the scale of the original Y data when using the log-normal, you would use e^mu (where mu is for log(Y)) in a data step (or with an EXP option on an estimate statement). But things get even trickier with log-normal: the back-transformation is really giving you the median of the original Y distribution, not the mean. This statement is just for the log-normal distribution.

Confusion regarding LS Means Interpretation in GLIMMIX

Re: Confusion regarding LS Means Interpretation in GLIMMIX

Re: Confusion regarding LS Means Interpretation in GLIMMIX

Re: Confusion regarding LS Means Interpretation in GLIMMIX