- How to back-transform LSMEAN standard errors from ...

01-19-2012 06:26 PM

Hi,

I'm working with a dataset of litter depth and dry mass. The residuals are normally distributed once depth is natural-log transformed and mass is square-root transformed. I'm including a random block effect in my analysis, so I need to use PROC MIXED.

I know how to back-transform the LS mean estimates themselves, using the equation

mn2 = exp(estimate + (.5 * residual_var) )

for log-transformed data.

I have also read that the following equation should be used to back-transform means for square-root transformed data (is this correct?):

mn2 = estimate^2 + (n-1)s^2/n

But my question is, how do I back-transform the LSMEAN standard errors, for both log- and sqrt-transformed data? I've searched all over, and can't find a clear answer to this question. Some sources even say it can't be done, yet I see it done in the literature so I know there must be a way.

Thanks in advance.

Cheers,

Nicole Michel

01-20-2012 07:55 AM

I replied on SAS-L, and since folks don't always read both, here is that reply as recovered from the archive:

This looks like an opportunity to use PROC GLIMMIX with the LINK= option. For depth use LINK=LOG, and for mass use LINK=POWER(0.5). Then, in the LSMEANS statement, use the ILINK option, and the output will include the estimates and their standard errors on both the transformed and the original scale. The documentation reports that the standard errors on the inverse linked scale are computed by the delta method.
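For readers who want to see what the delta method amounts to here, the following is a sketch of the standard first-order approximation, not a quotation from the SAS documentation. The symbols are assumptions introduced for illustration: $\hat\eta$ is an LS-mean on the link (transformed) scale, $g$ is the link function, and $\lambda$ is the exponent in LINK=POWER($\lambda$).

```latex
% First-order delta method for an inverse-linked estimate
\[
\widehat{\mathrm{SE}}\bigl(g^{-1}(\hat\eta)\bigr)
  \approx \left|\frac{d\,g^{-1}(\eta)}{d\eta}\right|_{\eta=\hat\eta}
  \widehat{\mathrm{SE}}(\hat\eta)
\]

% Log link: g^{-1}(\eta) = e^{\eta}
\[
\widehat{\mathrm{SE}}\bigl(e^{\hat\eta}\bigr) \approx e^{\hat\eta}\,\widehat{\mathrm{SE}}(\hat\eta)
\]

% Power link: g(\mu) = \mu^{\lambda}, so g^{-1}(\eta) = \eta^{1/\lambda}
\[
\widehat{\mathrm{SE}}\bigl(\hat\eta^{1/\lambda}\bigr)
  \approx \frac{1}{|\lambda|}\,|\hat\eta|^{1/\lambda - 1}\,\widehat{\mathrm{SE}}(\hat\eta),
\qquad
\lambda = 0.5:\;\; \widehat{\mathrm{SE}}\bigl(\hat\eta^{2}\bigr) \approx 2\,|\hat\eta|\,\widehat{\mathrm{SE}}(\hat\eta)
\]

% Worked example with made-up numbers: \hat\eta = 2, SE(\hat\eta) = 0.1
\[
\text{log: } e^{2}\times 0.1 \approx 0.739,
\qquad
\text{square root: } 2 \times 2 \times 0.1 = 0.4
\]
```

These expressions give the standard error of the plain inverse-linked value (for example $e^{\hat\eta}$). They do not include any bias-correction term such as the $0.5 \times$ residual variance adjustment in the original question.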

I hope this helps.

Steve Denham

Message was edited by: Steve Denham
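The approach suggested in this reply could be sketched roughly as follows. The data set name `litter` and the variable names `depth`, `mass`, `trt`, and `block` are hypothetical placeholders, since the original post does not give them; this is a minimal illustration rather than the poster's actual code.

```sas
/* Hypothetical names: data set LITTER, fixed effect TRT, random effect BLOCK */

/* Litter depth: log link; DIST= is omitted, so the normal distribution is assumed */
proc glimmix data=litter;
   class trt block;
   model depth = trt / link=log;
   random block;
   lsmeans trt / ilink cl;   /* ILINK adds estimates and SEs on the original scale */
run;

/* Dry mass: square-root link via POWER(0.5) */
proc glimmix data=litter;
   class trt block;
   model mass = trt / link=power(0.5);
   random block;
   lsmeans trt / ilink cl;
run;
```

One point to keep in mind: a link function is applied to the conditional mean of the response, so these models are closely related to, but not identical with, fitting PROC MIXED to log(depth) or sqrt(mass).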

10-14-2014 02:59 PM

Hi Steve,

I was wondering whether a distribution has to be specified in the MODEL statement.

Proc glimmix data=data;

class sub;

model y= x1 x2 / link=power(0.5) dist=;

random sub;

run;

I know that if DIST= is not specified, PROC GLIMMIX assumes the distribution is Gaussian. Since the data are not normal in this case, how do I determine the approximate distribution of the data?

10-15-2014 07:33 AM

No need to specify a distribution, since under the links given the residuals are normally distributed.

Steve Denham

10-15-2014 09:52 AM

Hi Steve,

I apologize for the confusion. This is from the SAS documentation for PROC GLIMMIX:

"If you do not specify a distribution, the GLIMMIX procedure defaults to the normal distribution for continuous response variables". Does that mean normality of the marginal distribution (y) or of the conditional distribution (residuals)?

I contacted a SAS technical support representative about specifying a distribution in this case. I was told to check the histogram in PROC UNIVARIATE, see whether a reasonable distribution can be found, and specify it in the DIST= option, even though I have used the link function POWER(-0.5) (the lambda value of -0.5 was obtained from a Box-Cox transformation).

Below is my code.

%MACRO GLMX1;

%DO I=1 %TO 5;

%LET VAR=%SCAN(&VARS,&I, ' ');

PROC GLIMMIX DATA=HFD.NEW_ECO_DATA NOBOUND PLOTS=RESIDUALPANEL (CONDITIONAL MARGINAL);

CLASS DIET DRUG RAT__ PUP__;

MODEL &VAR=DIET DRUG DIET*DRUG/ SOLUTION DDFM=BW LINK=POWER(-0.5) E;

RANDOM INT / SUB=RAT__;

LSMEANS DIET/CL DIFF ILINK;

LSMEANS DRUG/CL DIFF ILINK;

LSMEANS DIET*DRUG/CL DIFF ILINK;

RUN;

%END;

%MEND GLMX1;

%GLMX1;

Thank you very much.

I suppose it would be the distribution of the residuals. The documentation also mentions the ERROR= keyword:

**DISTRIBUTION=keyword**

**DIST=keyword**

**D=keyword**

**ERROR=keyword**

**E=keyword**

specifies the built-in (conditional) probability distribution of the data.

Regards,

10-15-2014 02:30 PM

I would use the code that you have and not specify a distribution. Use of the link= option is equivalent to pre-transforming the data using the function specified in the link in order to normalize the residuals. (I assume that your Box-Cox derived link was determined from the residuals).

Steve Denham

10-15-2014 06:13 PM

Thanks, Steve. I was also wondering whether, in the fit statistics of a GLIMMIX model, the ratio of the generalized chi-square to its degrees of freedom (Gener. Chi-Square / DF) has to be close to one.

10-16-2014 09:48 AM

It should. What kind of values are you getting?

Steve Denham

10-16-2014 01:12 PM

Hi Steve,

I have several variables; for some the ratio is close to 1 and for others it is not (such as 50). What could be the reason for getting such high values even after transforming to induce normality in the residuals?

Thanks !!!

10-17-2014 12:51 PM

Probably an unaccounted-for source of variability. Is there any possibility of cage-level or room-level effects? Systematic unidirectional "outliers" can also have this effect.

And then, it may be that even after fitting Box-Cox to the residuals, the basic model is missing something that I am not seeing right away.

Steve Denham

10-17-2014 02:54 PM

For normal data (or any distribution with a free scale parameter), Gen. chi-squared/df does not need to be 1. It can be any value. For simple situations (variance component models), this statistic is the same as the residual variance.

10-17-2014 02:57 PM

I am going to stand over in the corner for a while longer, and do some studying. I should have known this, and I didn't. Thanks, Larry.

Steve Denham

10-17-2014 04:24 PM

Hello Ivm,

This paper mentions that a Gen. chi-squared/df greater than 1 means overdispersion for the binomial distribution.

http://www2.sas.com/proceedings/sugi30/196-30.pdf

Does a value greater than 1 mean overdispersion in this case, or something else?

Thanks !!!

10-17-2014 04:33 PM

Your distribution is normal, which *is* in the exponential family. But the binomial and Poisson do not have a free scale (variance) parameter. The normal, gamma, beta, and others do have a scale parameter. This is a HUGE difference, for many reasons. Find the several articles/books by Walter Stroup. Overdispersion is a concept only for distributions without a free scale parameter. My earlier response is correct.
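The distinction drawn here can be written out with the usual generalized linear model variance decomposition. This is a sketch in standard textbook notation for the fixed-effects case, not taken from the thread; the symbols $\phi$ (scale parameter), $v(\mu)$ (variance function), and $X^2$ (Pearson-type chi-square) are introduced for illustration, and the generalized chi-square that GLIMMIX reports for mixed models is computed from the full marginal covariance, so treat this as an analogy.

```latex
% Variance = scale parameter times variance function
\[
\mathrm{Var}(y) = \phi\, v(\mu)
\]

% Distributions with a free scale parameter
\[
\text{Normal: } v(\mu) = 1,\ \phi = \sigma^{2}\ \text{(free)}
\qquad
\text{Gamma: } v(\mu) = \mu^{2},\ \phi\ \text{(free)}
\]

% Distributions whose scale is fixed at 1
\[
\text{Poisson: } v(\mu) = \mu,\ \phi \equiv 1
\qquad
\text{Binomial proportion, } n \text{ trials: } v(\mu) = \frac{\mu(1-\mu)}{n},\ \phi \equiv 1
\]

% The chi-square/df ratio is an estimate of the scale parameter
\[
\frac{X^{2}}{\mathrm{df}}
  = \frac{1}{\mathrm{df}} \sum_{i} \frac{(y_i - \hat\mu_i)^{2}}{v(\hat\mu_i)}
  \;\approx\; \hat\phi
\]
```

When $\phi$ is fixed at 1, a ratio well above 1 signals more variability than the distribution allows, which is what overdispersion means. When $\phi$ is free, as for the normal, the ratio is simply an estimate of that parameter and has no reason to be near 1.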

10-17-2014 04:44 PM

Thanks Ivm and Steve. This forum is really helpful.

Regards,