deleted_user
Not applicable
Hi,

When I run ANOVA with proc mixed (or proc glm) and use lsmeans to compare group means (e.g., each group containing n=12 samples, i.e., balanced data), I get identical standard errors (SE) for every mean.

Based on manual calculation with Excel, I know that the standard deviation (SD) of each group is quite different, and also I remember: SE = SD/(sqrt of sample size n).

If the SDs are different, why does lsmeans give identical SEs for all of the group means?

Thanks!
4 REPLIES
SteveDenham
Jade | Level 19
/soapbox on

I believe you need to study up on some of the basic assumptions of analysis of variance. In particular, the assumption of homogeneity of variance. If you start out assuming that all groups have an equal variance, it is not surprising that the best linear unbiased estimates (the LSMEANS) all have the same estimate of variability (standard error). Further, the calculation of the standard error of any estimate or differences in estimates is based on a single, pooled value--the mean squared error in the case of proc glm, and the combined quadratic form (see the documentation) in proc mixed.
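To make the pooling concrete, here is a minimal Python sketch (not SAS output) of the two calculations side by side. The data are hypothetical, and the model-based standard error is taken to be sqrt(MSE/n), which is the usual one-way, equal-variance result; treat it as an illustration rather than a reproduction of what proc glm or proc mixed does internally.

```python
import math
import statistics

# Hypothetical balanced data: three groups, n = 4 each, with different spreads.
groups = {
    "A": [10.0, 11.0, 9.0, 10.0],
    "B": [20.0, 26.0, 14.0, 20.0],
    "C": [30.0, 31.0, 29.0, 30.0],
}

n_total = sum(len(v) for v in groups.values())
k = len(groups)

# The "Excel" calculation: SD / sqrt(n), using only that group's own data.
per_group_se = {
    g: statistics.stdev(v) / math.sqrt(len(v)) for g, v in groups.items()
}

# Pooled error variance (MSE): within-group sum of squares / (N - k).
sse = sum(
    sum((x - statistics.mean(v)) ** 2 for x in v) for v in groups.values()
)
mse = sse / (n_total - k)

# Model-based SE of each group mean under equal variances: sqrt(MSE / n).
pooled_se = {g: math.sqrt(mse / len(v)) for g, v in groups.items()}

for g in groups:
    print(g, round(per_group_se[g], 3), round(pooled_se[g], 3))
```

The per-group column differs from group to group, while the pooled column is the same for all three, because every group shares one MSE and one n.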

/soapbox off



SteveDenham
deleted_user
Not applicable
Hi Steve, thanks so much for your reply, your comment really helped to shed light on the source(s) of my confusion.

I thought the reason for testing ‘variance homogeneity’ was to ensure that all groups being compared have comparable spread in data points. Also, I thought ‘standard errors’ provide information on how ‘precise’ a given group mean is (i.e., if multiple samples were repeatedly drawn from the same population, about two thirds of these samples would be expected to have mean values between one SE above and below the estimated mean).

Given such compartmentalized understanding of these concepts, it is difficult to fully comprehend your comment on how …it is not surprising to see identical standard errors if the variances are equal... (I'll study up on it. Meanwhile, I welcome any help for me to connect the dots)

Also, I should study up on the method of value pooling and how SAS calculates lsmeans and standard errors.

For now, my question from the original post has evolved into:

Before I conduct ANOVA, I check my data against the ANOVA assumptions, and I know that the groups being compared have variances that are not significantly different (but, of course, not identical). If the variances are not identical, why should the lsmeans of different groups have identical standard errors?

Thanks again Steve!
SteveDenham
Jade | Level 19
Lsmeans are solutions to a system of simultaneous equations constructed so that they are the best linear unbiased estimators of central tendency. They use all of the data collected and adjust for imbalance in numbers between groups. Because we are dealing with a system, it is straightforward (under ordinary least squares estimation) to derive a single estimate of variability, the MSE, that applies to each and every lsmean. The standard error of each lsmean comes from that one value: in a one-way layout it is sqrt(MSE/n), so with the same n in every group the standard errors are identical no matter how much the individual group SDs differ. From the same estimate of variability we can also solve for the standard error of the difference between two means and perform statistical comparisons. This is why standard errors are important: they tell us about the possible distribution of the parameter being estimated, whereas standard deviations tell us about the distribution of the values that were measured. This is a subtle but critical difference.
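As a rough illustration of how one pooled value drives both kinds of standard error, the Python sketch below uses the textbook one-way formulas. The MSE value and the group sizes are made up, and the function names `se_mean` and `se_diff` are just labels for this example, not anything from SAS.

```python
import math

def se_mean(mse, n):
    """SE of one group mean from a pooled error variance."""
    return math.sqrt(mse / n)

def se_diff(mse, n_i, n_j):
    """SE of the difference between two group means."""
    return math.sqrt(mse * (1.0 / n_i + 1.0 / n_j))

mse = 8.0  # hypothetical pooled error variance

# Balanced design: every group has n = 12, so every SE is the same.
balanced = [se_mean(mse, n) for n in (12, 12, 12)]

# Unbalanced design: same MSE, but the SEs now differ through n alone.
unbalanced = [se_mean(mse, n) for n in (12, 8, 6)]

# SE of a pairwise difference in the balanced case.
diff_12 = se_diff(mse, 12, 12)

print(balanced)
print(unbalanced)
print(diff_12)
```

Under this model the only thing that can make two lsmeans have different standard errors is a different n, which is why balanced data always shows identical values.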

Proc mixed moves beyond proc glm to use likelihood based estimation, so that we can accommodate structural assumptions about the error variances and covariances. And in fact, you can model the error in such a way that heterogeneous variances can be accommodated. But you still do NOT get standard deviations. The variability estimates are standard ERRORs of the parameters that provide an optimal fit to the data, given the model and error structure chosen.
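For comparison, here is a Python sketch of what the standard errors look like once each group is allowed its own error variance. It assumes a one-way layout in which each group's variance is estimated by its own sample variance. In proc mixed the analogous model would, as far as I know, be requested with the GROUP= option on the REPEATED statement, but the code below is only hand arithmetic on hypothetical data, not a reproduction of the procedure.

```python
import math
import statistics

# Same hypothetical data as a balanced one-way layout, n = 4 per group.
groups = {
    "A": [10.0, 11.0, 9.0, 10.0],
    "B": [20.0, 26.0, 14.0, 20.0],
    "C": [30.0, 31.0, 29.0, 30.0],
}

# Group-specific error variances (assumption: one variance per group).
group_var = {g: statistics.variance(v) for g, v in groups.items()}

# SE of each mean now uses that group's own variance: sqrt(s_i^2 / n_i).
hetero_se = {g: math.sqrt(group_var[g] / len(v)) for g, v in groups.items()}

def se_diff_unequal(g1, g2):
    """SE of a difference when the two groups have separate variances."""
    return math.sqrt(
        group_var[g1] / len(groups[g1]) + group_var[g2] / len(groups[g2])
    )

for g in groups:
    print(g, round(hetero_se[g], 3))
print("A vs B", round(se_diff_unequal("A", "B"), 3))
```

In this setup the numbers happen to match SD/sqrt(n) for each group, but they are still standard errors of estimated means under a chosen error structure, not descriptions of the spread of the raw data.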

I hope this helps.

SteveDenham
deleted_user
Not applicable
Hi Steve, thank you very much for your prompt and thorough reply.

Encouraged by your help, I’ll try reading up on the issue further and see if I can better understand the concept of lsmeans in relation to standard errors and variance.

I might (likely) come back to post more questions if I get stuck or simply frustrated with the learning process. Thanks Steve, Best regards.
