05-22-2014 04:02 PM
I am having a problem with PROC GENMOD that I am hoping someone here can help me with.
I am trying to model a variable that is very skewed. I am estimating a GLM in GENMOD with the REPEATED statement to account for repeated observations (note: I have an unbalanced panel). When I use dist=normal and link=log, I get the warning
"The relative Hessian convergence criterion of xxxx is greater than the limit of 0.0001. The convergence is questionable." I get no such warning when I use dist=gamma and link=log. The mean prediction error is considerably smaller with the normal/log specification, but I am not sure whether I should use that model given the warning. The mean prediction error with gamma/log is closer to what I get if I estimate a log-linear model by OLS with the same variables.
Any idea why I am getting this warning, and what the risks of using that model are? The fit stops after about 4 iterations, so raising the maximum number of iterations will not help.
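For concreteness, the two specifications being compared would look roughly like the sketch below. The dataset and variable names (panel, y, x1, x2, subject_id) are hypothetical placeholders, and the working correlation is left at whatever the REPEATED statement uses by default, since the post does not say which TYPE= was chosen.

```sas
/* Hypothetical names: panel (dataset), y (response), x1 x2 (covariates), subject_id (panel unit) */
proc genmod data=panel;
   class subject_id;
   model y = x1 x2 / dist=normal link=log;   /* normal response, log of the mean */
   repeated subject=subject_id;              /* working correlation left at the default */
run;

proc genmod data=panel;
   class subject_id;
   model y = x1 x2 / dist=gamma link=log;    /* gamma response, same link */
   repeated subject=subject_id;
run;
```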
Thanks in advance.
05-22-2014 04:46 PM
The VIFs for all the variables in the model are between 1 and 3, which I don't think is high.
Also note that I don't get the warning if I omit the REPEATED statement.
05-22-2014 05:40 PM
Thanks for your input, but I don't think it will help: raising MAXITER only matters when the fit stops at the default iteration cap, and mine stops after only 4 iterations. I tried setting it to 20 anyway, and it did not help.
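For anyone trying the same thing, a minimal sketch of where the iteration settings go is shown here, using hypothetical dataset and variable names (panel, y, x1, x2, subject_id). As far as I know, the MODEL statement and the REPEATED statement each take their own MAXITER= option, ITPRINT on the MODEL statement prints the iteration history, and the 0.0001 limit quoted in the warning corresponds to the CONVH= option on the MODEL statement; treat those details as assumptions to check against the documentation for your SAS release.

```sas
/* Hypothetical names: panel, y, x1, x2, subject_id */
proc genmod data=panel;
   class subject_id;
   model y = x1 x2 / dist=normal link=log
                     maxiter=20   /* cap on iterations for the initial GLM fit */
                     itprint;     /* print the iteration history */
   repeated subject=subject_id / maxiter=20;  /* separate cap for the GEE iterations */
run;
```

The iteration history at least shows whether the estimates have settled by the fourth iteration or are still moving when the procedure stops.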
05-23-2014 10:32 AM
This would be a more difficult work-around, but you might consider porting your analysis to PROC GLIMMIX, and fitting the GEE there. GLIMMIX affords a variety of methods for optimization that I can't seem to find in GENMOD, through the use of the NLOPTIONS statement. It also allows you to get a better look at the Hessian, and possibly note what might be happening.
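A rough sketch of what that port might look like follows. The dataset and variable names (panel, y, x1, x2, subject_id) are hypothetical, the choice of TECH=NRRIDG and MAXITER=200 is only an illustration of the NLOPTIONS statement and not a recommendation, and the HESSIAN option on the PROC statement is included on the assumption that it is available in your release.

```sas
/* Hypothetical names: panel, y, x1, x2, subject_id */
proc glimmix data=panel hessian;                     /* request the Hessian in the output */
   class subject_id;
   model y = x1 x2 / dist=normal link=log solution;
   random _residual_ / subject=subject_id type=cs;   /* R-side (marginal) covariance */
   nloptions tech=nrridg maxiter=200;                /* pick the optimizer and its iteration cap */
run;
```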
05-23-2014 03:35 PM
Would you happen to know the GLIMMIX equivalent of GENMOD's REPEATED statement for fitting a GLM?
I tried GLIMMIX with just the MODEL statement and got exactly the same answer as the GLM with the same dist and link options. Then I tried the same model with "random _residual_ / group=group_variable;" to account for the repeated observations, and I got the following warning:
WARNING: Pseudo-likelihood update fails in outer iteration -1.
NOTE: Did not converge.
05-23-2014 03:42 PM
Not sure, but let's assume that the repeated measure is called time, that there is a single fixed effect called factor, and that subjects are indexed by subject_id:
proc glimmix data=mydata;   /* mydata is a placeholder dataset name */
class factor time subject_id;
model response_var=factor time factor*time/dist=normal link=log;
random time/residual type=ar(1) subject=subject_id;
run;
I wouldn't go down the road of different covariance structures by group_variable (I called it factor) until I could get a fit of the marginal model over all levels.
06-02-2014 11:28 AM
Note that using DIST=NORMAL and LINK=LOG is not treating the response as lognormal, if that was your intention. To do that, model the log-transformed variable in PROC REG or PROC GLM. With these options in GENMOD, you are assuming that the response is normal and are modeling the log of the mean. If your response is positive, then your gamma model may be the more appropriate model.
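To make the distinction concrete, the three specifications discussed in this thread can be written side by side. This is a sketch for a single observation that ignores the within-subject correlation; the symbols $x_i$, $\beta$, $\varepsilon_i$ and $\sigma^2$ are notation introduced here, not taken from the posts above.

```latex
\begin{align*}
\text{normal, log link:}\quad & Y_i = \exp(x_i^\top\beta) + \varepsilon_i,
  \quad \varepsilon_i \sim N(0,\sigma^2),
  \quad E[Y_i] = \exp(x_i^\top\beta) \\
\text{lognormal:}\quad & \log Y_i = x_i^\top\beta + \varepsilon_i,
  \quad \varepsilon_i \sim N(0,\sigma^2),
  \quad E[Y_i] = \exp\!\left(x_i^\top\beta + \sigma^2/2\right) \\
\text{gamma, log link:}\quad & E[Y_i] = \exp(x_i^\top\beta),
  \quad \operatorname{Var}(Y_i) \propto E[Y_i]^2
\end{align*}
```

In the first model the error is additive with constant variance on the original scale. In the other two the spread grows with the mean, which may be part of why the gamma/log results sit closer to the log-linear OLS fit.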
06-02-2014 12:20 PM
My intent with that code was to model log(Y)=Xb + Zg + error. If I wanted a lognormal, I assume that I would fit Y=log(Xb + Zg + e). I was trying to approximate the OP's work in GENMOD. The one change that should be made in the code is to use type=cs, which is the equivalent of the exchangeable structure in GENMOD (TYPE=EXCH on the REPEATED statement; note that REPEATED defaults to TYPE=IND if no TYPE= is given).
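With that change applied, the GLIMMIX code would read roughly as follows. The dataset name mydata is a placeholder, response_var stands in for the actual response, and factor, time and subject_id are the assumed names from the earlier post.

```sas
/* mydata and response_var are placeholders; factor, time, subject_id as assumed earlier */
proc glimmix data=mydata;
   class factor time subject_id;
   model response_var = factor time factor*time / dist=normal link=log;
   random time / residual type=cs subject=subject_id;   /* compound symmetry within subject */
run;
```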