Hello,
I ran my PROC GLIMMIX first with DIST=NORMAL and then again with DIST=LOGNORMAL, which fits the data much better. I recall why the parameter estimates are different, with the lognormal ones being much smaller. But I also thought there was a conversion that scales the lognormal results back up so that they are closer to the normal ones?
Normal:
Log:
Thank you!
Please provide the complete MODEL statements for each model.
I do not know what you mean by a "conversion" from the lognormal model to make it "closer to normal." You can transform the predicted values from the lognormal scale into the data scale by using the EXP function. Is that what you mean?
Let me describe the difference between the two models. The first is the familiar linear regression model. The second is a model of log(Y), where Y is the response variable.
A model depends not only on the assumed distribution of the errors, but also on the link function. By default, PROC GLIMMIX uses the identity link for DIST=NORMAL and for DIST=LOGNORMAL. That means that the LOGNORMAL model is the same as modeling
log(Y) = X*beta + epsilon
where Y is the response variable and epsilon ~ N(0, sigma).
In other words, the following two models are equivalent:
data cars;
   set sashelp.cars;
   logMPG = log(mpg_city);
run;

title "Model of log(Y) with DIST=NORMAL";
proc glimmix data=cars plots=none;
   model logMPG = weight horsepower / dist=normal solution;
   ods select ParameterEstimates;
run;

title "Model of Y with DIST=LOGNORMAL";
proc glimmix data=cars plots=none;
   model MPG_city = weight horsepower / dist=lognormal solution;
   ods select ParameterEstimates;
run;
Accordingly, you can see that the model with DIST=LOGNORMAL is providing estimates for the LOGARITHM of the response, not the response itself. You can use the EXP function to convert predicted values from the log scale to the data scale.
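As a minimal sketch of that back-transformation, assuming the same Sashelp.Cars model as above: the OUTPUT statement saves the predicted values, which are on the log scale for DIST=LOGNORMAL, and a DATA step then applies EXP. The data set name Pred and the variable names logPred and predMPG are made up for illustration.

```sas
/* Fit the lognormal model and save predictions (log scale) */
proc glimmix data=sashelp.cars plots=none;
   model MPG_city = weight horsepower / dist=lognormal solution;
   output out=Pred pred=logPred;   /* logPred is the linear predictor for log(MPG_city) */
run;

/* Convert predictions from the log scale to the data scale */
data Pred;
   set Pred;
   predMPG = exp(logPred);         /* estimates the median (geometric mean) of MPG_city */
run;

proc print data=Pred(obs=5);
   var MPG_city logPred predMPG;
run;
```

The exponentiated value estimates the median (geometric mean) of the response on the data scale, not its arithmetic mean.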
Buried down in the fine print of the documentation are some equations for converting estimates from the lognormal (log) scale back to the original data scale. Note that @Rick_SAS's method of exponentiating the results gives the geometric mean, which is likely the best estimate in the original space. However, it is not the expected value. To get that, you need:
omega = exp(sigma*sigma)
expected value of Y = exp(mu)*sqrt(omega)
Variance of Y = exp(2*mu)*omega*(omega - 1)
Note that these are expected values, which in a right-skewed distribution (like a lognormal) may be much larger than exp(mu). The value exp(mu) is the median of the distribution and lies somewhat above its peak (the mode, which is exp(mu - sigma*sigma)).
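Here is a small sketch of those formulas in a DATA step. The values of mu and sigma2 are hypothetical and stand in for a predicted value on the log scale and the residual variance on the log scale. I am assuming the variance is taken from the "Residual" row of the covariance parameter estimates, so sigma2 plays the role of sigma*sigma in the formulas above. All variable names are made up for illustration.

```sas
data BackTransform;
   mu     = 3.0;     /* hypothetical predicted value on the log scale        */
   sigma2 = 0.04;    /* hypothetical residual variance on the log scale      */

   omega   = exp(sigma2);                      /* omega = exp(sigma*sigma)   */
   geoMean = exp(mu);                          /* geometric mean (median)    */
   meanY   = exp(mu) * sqrt(omega);            /* expected value of Y        */
   varY    = exp(2*mu) * omega * (omega - 1);  /* variance of Y              */

   put geoMean= meanY= varY=;                  /* write the results to the log */
run;
```

Because omega is greater than 1 whenever sigma2 is positive, meanY always exceeds geoMean, and the gap grows as the log-scale variance grows.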
SteveDenhm
I have a further question related to PROC GLIMMIX with DIST=LOGNORMAL. Is it possible to get estimates of differences (among groups) back on the original scale of the data?
Many thanks in advance!
According to this paper (see attachment), you can't do it with PROC GLIMMIX, but you can get it with PROC NLMIXED.
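A related quantity that PROC GLIMMIX can give you is a ratio rather than a difference: exponentiating a difference of least squares means on the log scale gives a ratio of geometric means on the original scale. The sketch below is only an illustration of that idea, not the method from the paper. It uses Sashelp.Cars with Origin as the grouping variable as a stand-in for your data, and the data set names LogDiffs and Ratios and the new variable names are made up.

```sas
/* Pairwise differences of LS-means, all on the log scale */
proc glimmix data=sashelp.cars plots=none;
   class Origin;
   model MPG_city = Origin weight / dist=lognormal solution;
   lsmeans Origin / diff cl;
   ods output Diffs=LogDiffs;      /* save the table of differences */
run;

/* exp(difference of logs) = ratio of geometric means */
data Ratios;
   set LogDiffs;
   ratio      = exp(Estimate);     /* ratio of geometric means      */
   ratioLower = exp(Lower);        /* lower confidence limit        */
   ratioUpper = exp(Upper);        /* upper confidence limit        */
run;

proc print data=Ratios;
   var Origin _Origin ratio ratioLower ratioUpper;
run;
```

A ratio of 1.10, for example, would mean the geometric mean of the first group is about 10% higher than that of the second. An additive difference of arithmetic means on the original scale is a different quantity and is not what this produces.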
Thank you very much Ksharp!