About Jabbawonga

Jabbawonga · ‎05-13-2014

Hello, I'm working through some code in an old program written by someone else, and I came across the following lines: Proc Reg ... (where=(b1))... and Proc Syslin ... (where=(b3))... I've searched the documentation and don't see any explanation regarding "where" statements in Proc Reg or Syslin or what b1 and b3 are (they are not variables in the models). Can anybody help me understand what "where=(b1)" or "where=(b3)" mean? Thanks.

Jabbawonga · ‎04-02-2014

Very helpful, thanks!

Jabbawonga · ‎04-02-2014

Ok, I'm confused now. Here's the exact code I'm running:

    data GLM;
      input Y A1 A2 A3 B1 B2;
      lnY = LOG(Y);
      datalines;
    95 1 0 0 0 0
    115 0 1 0 0 0
    105 0 0 1 0 0
    55 1 0 0 1 0
    45 0 1 0 1 0
    30 1 0 0 1 1
    ;
    run;

    proc genmod data=GLM;
      model Y = A1 A2 A3 B1 B2 / dist=normal link=log noint;
      /* weight Y; */
    run;

    proc reg data=GLM;
      model lnY = A1 A2 A3 B1 B2 / noint;
      /* weight Y; */
    run;

But I'm getting different answers. What am I doing wrong?

Jabbawonga · ‎04-02-2014

Thanks for the reply, Steve. If I run the code with an identity link function and don't log-transform the response in the REG procedure, I get the same answer whether I use weights or not. If I use a log link in GENMOD and log-transform the response in REG, I don't get the same answer, whether I use weights or not. So I had concluded that weights weren't the issue. Did you run the code with a log transform without weights and get the same answer?

Actually, technically speaking, REG uses OLS and GENMOD uses MLE, which is estimated by iteratively reweighted least squares, so perhaps only the first iteration in GENMOD would match the REG answer...?

Jabbawonga · ‎03-31-2014

Hello. I'm using a very simple data set from an article to further my understanding of GLMs. I've input the data using SAS, and I've run both the PROC REG and PROC GENMOD procedures on the data. In the PROC GENMOD procedure, I used a log link with a normal distribution; in the PROC REG procedure, I used the log of the response variable in the model.

My question is, why don't the parameter estimates of the two procedures match? My understanding is that PROC REG uses OLS/WLS to estimate the parameters, whereas PROC GENMOD uses MLE with a Newton-Raphson iterative process for estimation. But I had thought that, when the assumed distribution is normal and the relationship is linear (which, after the log transformation, it is in the GLM, right?), MLE is equal to OLS/WLS.

Here are the resulting parameters from the run:

    Parameter    REG        GENMOD
    A1           4.623      4.579
    A2           4.688      4.730
    A3           4.654      4.654
    B1          (0.735)    (0.741)
    B2          (0.487)    (0.436)

And here is my code:

    data GLM;
      input Y A1 A2 A3 B1 B2;
      lnY = LOG(Y);
      datalines;
    95 1 0 0 0 0
    115 0 1 0 0 0
    105 0 0 1 0 0
    55 1 0 0 1 0
    45 0 1 0 1 0
    30 1 0 0 1 1
    ;

    proc genmod data=GLM;
      model Y = A1 A2 A3 B1 B2 / dist=normal link=log scale=deviance noint;
      weight Y;
    run;

    proc reg data=GLM;
      model lnY = A1 A2 A3 B1 B2 / noint;
      weight Y;
    run;

As it turns out, if I run GENMOD with an identity link function and run REG using Y instead of lnY, I get the same answer. So, for some reason the transformation from Y to lnY is causing the discrepancy, but mathematically I feel like the answers should still be equal. Any insight that anyone can contribute is greatly appreciated!
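One way to see where the discrepancy described above can come from is to write out what each procedure minimizes. The notation here is an assumption introduced for illustration and is not from the original post: x_i is the row of indicator values (A1 A2 A3 B1 B2) for observation i, beta is the coefficient vector, and w_i is the optional weight (1 when no WEIGHT statement is used). This is a sketch of the reasoning, not a statement about the procedures' internal algorithms.

```latex
% REG on lnY: least squares on the log scale
\hat\beta_{\mathrm{REG}}
  = \arg\min_{\beta} \sum_{i} w_i \left( \ln y_i - x_i^{\top}\beta \right)^2

% GENMOD with dist=normal, link=log: the normal likelihood has mean
% \mu_i = \exp(x_i^{\top}\beta), so maximizing it over \beta is least
% squares on the original scale
\hat\beta_{\mathrm{GENMOD}}
  = \arg\min_{\beta} \sum_{i} w_i \left( y_i - \exp(x_i^{\top}\beta) \right)^2

% With an identity link and an untransformed response, both reduce to
\hat\beta
  = \arg\min_{\beta} \sum_{i} w_i \left( y_i - x_i^{\top}\beta \right)^2
```

The first two objectives are different functions of beta, so their minimizers generally differ: the first treats the error as additive in ln Y, the second as additive in Y. The third line is consistent with the identity-link runs agreeing. The one coefficient that matches in the table, A3, belongs to an indicator that is set in only one observation (Y = 105), so both objectives can fit that point exactly, giving ln(105) ≈ 4.654 in each case.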

Online Status	Offline
Date Last Visited	‎09-01-2015 07:11 AM

Where=(b1)

Re: REG vs GENMOD; WLS vs MLE

Re: REG vs GENMOD; WLS vs MLE

Re: REG vs GENMOD; WLS vs MLE

REG vs GENMOD; WLS vs MLE
