Hello. I'm using a very simple data set from an article in trying to further my understanding of GLMs. I've input the data using SAS, and I've run both the PROC REG and PROC GENMOD procedures on the data. In the PROC GENMOD procedure, I used a log link with a normal distribution; in the PROC REG procedure, I used the log of the response variable in the model.

My question is, why don't the parameter estimates of the two procedures match? My understanding is that PROC REG uses OLS/WLS to estimate the parameters, whereas PROC GENMOD uses MLE with a Newton-Raphson iterative process for estimation. But I had thought that, when the assumed distribution is normal and the relationship is linear (which, after the log transformation, it is in the GLM, right?), MLE is equal to OLS/WLS.

Here are the resulting parameters from the run:

          REG        GENMOD
    A1    4.623      4.579
    A2    4.688      4.730
    A3    4.654      4.654
    B1    (0.735)    (0.741)
    B2    (0.487)    (0.436)

And here is my code:

    data GLM;
      input Y A1 A2 A3 B1 B2;
      lnY = LOG(Y);
      datalines;
    95 1 0 0 0 0
    115 0 1 0 0 0
    105 0 0 1 0 0
    55 1 0 0 1 0
    45 0 1 0 1 0
    30 1 0 0 1 1
    ;

    proc genmod data=GLM;
      model Y = A1 A2 A3 B1 B2 / dist=normal link=log scale=deviance noint;
      weight Y;
    run;

    proc reg data=GLM;
      model lnY = A1 A2 A3 B1 B2 / noint;
      weight Y;
    run;

As it turns out, if I run GENMOD with an identity link function and run REG using Y instead of lnY, I get the same answer. So, for some reason the transformation from Y to lnY is causing the discrepancy, but mathematically I feel like the answers should still be equal. Any insight that anyone can contribute is greatly appreciated!
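For anyone who wants to cross-check the two fits outside SAS, here is a minimal Python sketch (standard library only) using the data from the post. It rests on two assumptions that are not taken from SAS documentation: that the REG run amounts to weighted least squares on log(Y), and that the GENMOD run with a normal distribution and log link amounts to minimising the weighted squared error on the original Y scale. The function names (`solve`, `wls`, `fit_log_of_response`, `fit_log_link`) are hypothetical helpers for illustration, and the Fisher scoring loop is one common way to fit such a model, not necessarily the exact algorithm GENMOD runs.

```python
import math

# Data copied from the post: Y followed by the indicators A1 A2 A3 B1 B2.
ROWS = [
    (95, 1, 0, 0, 0, 0),
    (115, 0, 1, 0, 0, 0),
    (105, 0, 0, 1, 0, 0),
    (55, 1, 0, 0, 1, 0),
    (45, 0, 1, 0, 1, 0),
    (30, 1, 0, 0, 1, 1),
]
Y = [float(r[0]) for r in ROWS]
X = [[float(v) for v in r[1:]] for r in ROWS]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def wls(x, z, w):
    """Weighted least squares: minimise sum_i w_i * (z_i - x_i . beta)^2."""
    p = len(x[0])
    xtwx = [[sum(wi * xi[j] * xi[k] for xi, wi in zip(x, w)) for k in range(p)]
            for j in range(p)]
    xtwz = [sum(wi * xi[j] * zi for xi, zi, wi in zip(x, z, w)) for j in range(p)]
    return solve(xtwx, xtwz)


def fit_log_of_response(x, y, w):
    """Linear model for log(Y): errors are measured on the log scale."""
    return wls(x, [math.log(v) for v in y], w)


def fit_log_link(x, y, w, iterations=200, tol=1e-12):
    """Log link, normal errors: minimise sum_i w_i * (y_i - exp(x_i . beta))^2.

    Fisher scoring (iteratively reweighted least squares), started from the
    log-scale fit. Errors are measured on the original Y scale.
    """
    beta = fit_log_of_response(x, y, w)
    for _ in range(iterations):
        eta = [sum(b * v for b, v in zip(beta, xi)) for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response and working weights for the log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        ww = [wi * m * m for wi, m in zip(w, mu)]
        new_beta = wls(x, z, ww)
        step = max(abs(a - b) for a, b in zip(new_beta, beta))
        beta = new_beta
        if step < tol:
            break
    return beta


reg_like = fit_log_of_response(X, Y, Y)   # weights equal to Y, as in the post
genmod_like = fit_log_link(X, Y, Y)

for name, a, b in zip(["A1", "A2", "A3", "B1", "B2"], reg_like, genmod_like):
    print(f"{name}  log(Y) fit: {a:8.3f}   log-link fit: {b:8.3f}")
```

If the assumptions above hold, the first column should land close to the REG estimates and the second close to the GENMOD estimates in the table. The only difference between the two functions is the scale on which the squared error is measured: on log(Y) in one case, on Y itself in the other.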