Levi_M
Fluorite | Level 6

Hello, 

I ran PROC GLIMMIX first with DIST=NORMAL and then again with DIST=LOGNORMAL, which fits the data much better. I understand why the parameter estimates differ, with the lognormal estimates being much smaller. But I also thought there was a conversion that scales the log results back up so that they are closer to the normal results?

Normal:

[Screenshot: output from the DIST=NORMAL model]

Log:

[Screenshot: output from the DIST=LOGNORMAL model]

Thank you!

 

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

Please provide the complete MODEL statements for each model.

 

I do not know what you mean by a "conversion" from the lognormal model to make it "closer to normal." You can transform the predicted values from the lognormal scale into the data scale by using the EXP function. Is that what you mean?

 

Let me describe the difference between the two models. The first is the familiar linear regression model. The second is a model of log(Y), where Y is the response variable.

 

A model depends not only on the assumed distribution of the errors, but also on the link function.  By default, PROC GLIMMIX uses the identity link for DIST=NORMAL and for DIST=LOGNORMAL. That means that the LOGNORMAL model is the same as modeling

log(Y) = X*beta + epsilon

where Y is the response variable and epsilon is normally distributed with mean 0 and standard deviation sigma.

 

In other words, the following two models are equivalent:

/* Create a log-transformed copy of the response */
data cars;
   set sashelp.cars;
   logMPG = log(mpg_city);
run;

/* Model 1: normal model fit to log(Y) */
title "Model of log(Y) with DIST=NORMAL";
proc glimmix data=cars plots=none;
   model logMPG = weight horsepower / dist=normal solution;
   ods select ParameterEstimates;
run;

/* Model 2: lognormal model fit to Y; the estimates match Model 1 */
title "Model of Y with DIST=LOGNORMAL";
proc glimmix data=cars plots=none;
   model MPG_city = weight horsepower / dist=lognormal solution;
   ods select ParameterEstimates;
run;

Accordingly, you can see that the model with DIST=LOGNORMAL is providing estimates for the LOGARITHM of the response, not the response itself. You can use the EXP function to convert predicted values from the log scale to the data scale.
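
As a minimal sketch of that back-transformation: the OUTPUT statement can save the log-scale predictions, and a DATA step can then exponentiate them. The data set name pred and the variable names logPred and predMPG are illustrative choices, not part of the original post.

```sas
/* Fit the lognormal model and save predictions (on the log scale) */
proc glimmix data=sashelp.cars plots=none;
   model MPG_city = weight horsepower / dist=lognormal solution;
   output out=pred pred=logPred;   /* logPred = predicted log(MPG_city) */
run;

/* Back-transform the predictions to the data scale */
data pred;
   set pred;
   predMPG = exp(logPred);         /* hypothetical name for the back-transformed value */
run;

proc print data=pred(obs=5);
   var MPG_city logPred predMPG;
run;
```

Note that the exponentiated prediction estimates the geometric mean (median) of the response, not its arithmetic mean.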

 


5 REPLIES
SteveDenham
Jade | Level 19

Buried down in the fine print of the documentation are some equations for converting the estimates from the lognormal (log) scale to the original data scale. Note that @Rick_SAS's method of exponentiating results gives the geometric mean, which is likely the best estimate on the original scale. However, it is not the expected value. To get that, you need:

omega = exp(sigma*sigma)

expected value of Y = exp(mu)*sqrt(omega)

Variance of Y = exp(2*mu)*omega*(omega - 1)

 

Note that these are the expected values, which in a right-skewed distribution (like a lognormal) may be much larger than exp(mu). The value exp(mu) is the median of Y; the peak (mode) of the distribution is lower still, at exp(mu - sigma*sigma).
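
To make the arithmetic concrete, here is a small DATA step sketch that evaluates these formulas. The values of mu and sigma2 are made-up placeholders; in practice you would substitute the log-scale predicted value and the residual variance estimate from your own fitted model.

```sas
/* Back-transform log-scale estimates to the data scale */
data backtransform;
   mu     = 3.0;    /* hypothetical log-scale mean */
   sigma2 = 0.04;   /* hypothetical log-scale variance (sigma*sigma) */

   omega     = exp(sigma2);
   geo_mean  = exp(mu);                        /* geometric mean (median) of Y */
   exp_value = exp(mu) * sqrt(omega);          /* expected value of Y */
   var_y     = exp(2*mu) * omega * (omega - 1);  /* variance of Y */
run;

proc print data=backtransform noobs;
run;
```

Because omega is always greater than 1 when sigma2 is positive, exp_value always exceeds geo_mean, and the gap widens as the log-scale variance grows.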

 

SteveDenham

 

marianna16
Calcite | Level 5

I have a further question about PROC GLIMMIX with DIST=LOGNORMAL. Is it possible to get estimates of the differences among groups back on the original data scale?

Many thanks in advance!

Ksharp
Super User

According to this paper (see attachment), you can't do it with PROC GLIMMIX, but you can do it with PROC NLMIXED.
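
The attachment is not reproduced here, so the following is only a sketch of one way PROC NLMIXED could be used, and not necessarily the paper's method. It assumes a two-group comparison with no random effects, uses the lognormal expected value exp(mu + sigma^2/2), and relies on the ESTIMATE statement, which computes standard errors for nonlinear functions of the parameters by the delta method. The data set cars2, the indicator usa, and the parameter names b0, b1, and s2 are all illustrative.

```sas
/* Hypothetical two-group example: USA vs. non-USA cars */
data cars2;
   set sashelp.cars;
   logMPG = log(mpg_city);
   usa    = (Origin = 'USA');      /* 1 = USA, 0 = otherwise */
run;

proc nlmixed data=cars2;
   parms b0=3 b1=0 s2=0.1;         /* starting values (assumed) */
   bounds s2 > 0;
   mu = b0 + b1*usa;               /* group means on the log scale */
   model logMPG ~ normal(mu, s2);

   /* Expected values and their difference on the original MPG scale */
   estimate 'Mean MPG, non-USA'   exp(b0 + s2/2);
   estimate 'Mean MPG, USA'       exp(b0 + b1 + s2/2);
   estimate 'Difference in means' exp(b0 + b1 + s2/2) - exp(b0 + s2/2);
run;
```

A model with additional covariates or random effects would need the corresponding terms added to mu, and the ESTIMATE expressions adjusted to match.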

 

marianna16
Calcite | Level 5

Thank you very much Ksharp!

 
