New Contributor
Posts: 2

# Means and Regressions Coefficients Don't Match Up

Hi everyone,

I feel pretty stupid asking this question, but I can't seem to figure this out. I ran a Poisson regression model with quite a few control variables and SAS didn't seem to have an issue running it.

I'm writing up the analysis and I went to report the means and I noticed that the means from proc means don't make sense based on what the regression coefficients specified. For example the treatment coefficient was positive in the output, but when I calculated the means I noticed that the treatment was actually lower.

I first thought it was pretty odd, but then thought it must have been do to the control variables. That being said is there anyway to get the means and standard deviations while controlling for the variance introduced by the possible confounding variables.

Any help would be greatly appreciated. Thanks in advance!
Posts: 2,655

## Re: Means and Regressions Coefficients Don't Match Up

Which procedure did you use to run the Poisson regression? With that info, we could probably come up with a way to calculate lsmeans and standard errors.

Steve Denham
New Contributor
Posts: 2

## Re: Means and Regressions Coefficients Don't Match Up

I used Proc NLMixed. After your comment, I did a bit more searching and I believe I'll need to use the estimate option -- which I'm attempting to figure out right now.

Thanks!
Regular Contributor
Posts: 171

## Re: Means and Regressions Coefficients Don't Match Up

How did you compute a mean from the NLMIXED ouput? Did you employ code something like the following:

proc nlmixed data=mydata;
eta = b0 + TrtEff*(Trt=1) + b1*x1;
lamba = exp(eta);
model Y ~ Poisson(lambda);
predict lambda out=lambda;
run;

proc means data=lambda;
class Trt;
var lambda;
run;

In the above code, values of lambda are affected not only by the treatment effect, but also by the value of X1. If there are differences across treatment levels in the distribution of X1, then the average value for each of the treatment levels depends not only on the treatment effect but also on the values of X1 within the treatment level.

Suppose that you had the fitted model

eta = 0.7 + 0.15*(Trt=1) + 1*X1

and that

mean(X1 | Trt=0) = 1.25
mean(X1 | Trt=1) = 1.00

Then the mean values of eta would be

mean(eta | Trt=0) = 1 + 0.15*(0) + 1*(1.25) = 2.25
mean(eta | Trt=1) = 1 + 0.15*(1) + 1*(1.00) = 2.00

On the level of the log link, we see that the effect of the treatment is to increase the mean. However, the mean for treatment 1 is lower than the mean for treatment 0 because the mean value of X1 differed across treatments. If we exponentiate the LSmeans obtained for the log-link, we obtain estimates of the LSmeans on the original scale. The positive treatment effect on the log-link scale will be observed on the original scale as well.

So, how can you get something like the LSmean values which are provided by the GLM, MIXED, GLIMMIX, and GENMOD procedures when you fit your regression model employing NLMIXED? You are correct that you need to use an ESTIMATE statement. If you look at the documentation of the LSmean statement from one of the other procedures, you will observe that a mean value is computed for each of the covariates in the model. This mean value is then employed when constructing the least squares mean for each treatment level.

In our little example above, lets assume that there are an equal number of observations for each of our two treatments. Then the mean value of X1 across both treatment levels would be 1.125. We can write the estimate statements:

estimate "LSmean(eta | Trt=0)" b0 + TrtEff*0 + b1*1.125;
estimate "LSmean(eta | Trt=1)" b0 + TrtEff*1 + b1*1.125;

This would produce LSmean estimates as follows:

LSmean(eta | Trt=0) = 1 + 0.15*(0) + 1*(1.125) = 2.125
LSmean(eta | Trt=1) = 1 + 0.15*(1) + 1*(1.125) = 2.275

Effects of all other variables (X1 in our little example) being equal, then the least squares mean for the treated group is now higher than the least squares mean for the untreated group as expected. You could also include the statements:

estimate "LSmean(lambda | Trt=0)" exp(b0 + TrtEff*0 + b1*1.125);
estimate "LSmean(lambda | Trt=1)" exp(b0 + TrtEff*1 + b1*1.125);

I would note that this question would have been better directed to the forum which discusses statistical procedures. Next time, you will do that, right?
Discussion stats
• 3 replies
• 134 views
• 0 likes
• 3 in conversation