cd2011
Calcite | Level 5

Hi,

I am having a problem with PROC GENMOD that I am hoping someone here can help me with.

I am trying to model a variable that is very skewed. I am estimating a GLM in GENMOD with the REPEATED statement to account for the repeated observations (note: I have an unbalanced panel). When I use dist=normal and link=log, I get the warning

"The relative Hessian convergence criterion of xxxx is greater than the limit of 0.0001. The convergence is questionable." I get no such warning when I use dist=gamma and link=log. The mean prediction error is significantly smaller when I use the normal/log specification, but I am not sure whether I should be using that model given the warning message. The mean prediction error with gamma/log is closer to what I get if I estimate a log-linear model by OLS with the same variables.

Any idea why I am getting the warning, and are there any risks associated with using that model? The number of iterations is only about 4, so increasing the maximum number of iterations will not help.

Thanks in advance.
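For reference, the two specifications described in the question might look roughly like the sketch below. The post does not include any code, so the data set name panel, the response y, the predictors x1 and x2, and the panel identifier id are all hypothetical, and TYPE=EXCH is an assumed working correlation structure.

```sas
/* Hypothetical names: panel (data set), y (skewed response),
   x1 x2 (predictors), id (panel unit). TYPE=EXCH is an assumption;
   the post does not say which working correlation was used. */

/* Specification 1: normal errors, log link (the fit that raises the warning) */
proc genmod data=panel;
  class id;
  model y = x1 x2 / dist=normal link=log;
  repeated subject=id / type=exch;
run;

/* Specification 2: gamma errors, log link (no warning reported) */
proc genmod data=panel;
  class id;
  model y = x1 x2 / dist=gamma link=log;
  repeated subject=id / type=exch;
run;
```

Only DIST= differs between the two calls. Both model the log of the mean, so any difference in behavior comes from the assumed variance function, not from the link.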

1 ACCEPTED SOLUTION

StatDave
SAS Super FREQ

Note that using DIST=NORMAL and LINK=LOG is not treating the response as lognormal, if that was your intention.  To do that, model the log-transformed variable in PROC REG or PROC GLM.  With these options in GENMOD, you are assuming that the response is normal and are modeling the log of the mean.  If your response is positive, then your gamma model may be the more appropriate model.
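A minimal sketch of the lognormal route described above, assuming a data set named panel with a response y and predictors x1 and x2 (all hypothetical names). It ignores the repeated-measures structure and only illustrates the log transformation step.

```sas
/* Hypothetical names: panel, y, x1, x2 */
data panel_log;
  set panel;
  /* log() is undefined for y <= 0, so those rows keep a missing log_y */
  if y > 0 then log_y = log(y);
run;

/* Ordinary linear model on the log scale: log(y) = Xb + error */
proc glm data=panel_log;
  model log_y = x1 x2 / solution;
run;
quit;
```

Predictions from this model are on the log scale. Exponentiating them gives an estimate of the median of y, not of its mean, which is worth keeping in mind when comparing prediction errors against the GENMOD fits.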

11 REPLIES
stat_sas
Ammonite | Level 13

Hi,

I would suggest checking for multicollinearity among the predictors. It seems that high correlation among some of the predictors may be causing this problem.

Thanks,

cd2011
Calcite | Level 5

The VIFs for all the variables included in the model are between 1 and 3, which I don't think is high.

Also note that I don't get the warning if I omit the REPEATED statement.
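A VIF check like the one mentioned above can be run in PROC REG, since variance inflation factors depend only on the predictors and not on the distribution or link. A minimal sketch, with hypothetical names (panel, y, x1, x2):

```sas
/* Hypothetical names: panel, y, x1, x2.
   VIF and TOL request variance inflation factors and tolerances
   for each predictor in the parameter estimates table. */
proc reg data=panel;
  model y = x1 x2 / vif tol;
run;
quit;
```

PROC REG has no CLASS statement, so categorical predictors would need to be dummy-coded first.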

stat_sas
Ammonite | Level 13

A second option is to increase the maximum number of iterations (maxiter=); right now the fit stops at 4. I hope this will solve the issue.

cd2011
Calcite | Level 5

Thanks for your input, but I don't think it will help: raising maxiter only matters if the fit is stopping at the default iteration cap. The fit currently stops after only 4 iterations, and setting maxiter to 20 did not help.

SteveDenham
Jade | Level 19

This would be a more difficult work-around, but you might consider porting your analysis to PROC GLIMMIX, and fitting the GEE there.  GLIMMIX affords a variety of methods for optimization that I can't seem to find in GENMOD, through the use of the NLOPTIONS statement.  It also allows you to get a better look at the Hessian, and possibly note what might be happening.

Steve Denham
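A rough sketch of what this suggestion could look like. The data set and variable names (panel, y, x1, x2, id) are hypothetical, and TYPE=CS is an assumed covariance structure. TECH=NRRIDG is just one of the optimization techniques that NLOPTIONS accepts, not a recommendation.

```sas
/* Hypothetical names: panel, y, x1, x2, id.
   HESSIAN on the PROC statement asks GLIMMIX to display the
   Hessian of the optimization. */
proc glimmix data=panel hessian;
  class id;
  model y = x1 x2 / dist=normal link=log solution;
  /* R-side (marginal) covariance, playing the role of GENMOD's REPEATED */
  random _residual_ / subject=id type=cs;
  /* Choose the optimizer and raise the iteration limit */
  nloptions tech=nrridg maxiter=200;
run;
```

Swapping TECH= for another technique (for example QUANEW or NEWRAP) is one way to check whether a convergence problem depends on the optimizer or on the model itself.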

cd2011
Calcite | Level 5

Thank you Steve. I will try GLIMMIX and see if I can find a solution.

cd2011
Calcite | Level 5

Hi Steve,

Would you happen to know the GLIMMIX equivalent of GENMOD's REPEATED statement for fitting a GLM?

I tried GLIMMIX with just the MODEL statement and got exactly the same answer as the GLM fit with the same dist and link options. Then I tried the same model with "random _residual_/ group= group_variable ; " to account for the repeated observations and got the following warning:

WARNING: Pseudo-likelihood update fails in outer iteration -1.

NOTE: Did not converge.

Any suggestions?

Thanks.

SteveDenham
Jade | Level 19

Not sure, but let's assume that the repeated measure is called time, that there is a single fixed effect called factor, and that subjects are indexed by subject_id.

Try:

class factor time subject_id;

model response_var=factor time factor*time/dist=normal link=log;

random time/residual type=ar(1) subject=subject_id;

I wouldn't go down the road of different covariance structures by group_variable (I called it factor) until I could get a fit of the marginal over all levels.

Steve Denham

StatDave
SAS Super FREQ

Note that using DIST=NORMAL and LINK=LOG is not treating the response as lognormal, if that was your intention.  To do that, model the log-transformed variable in PROC REG or PROC GLM.  With these options in GENMOD, you are assuming that the response is normal and are modeling the log of the mean.  If your response is positive, then your gamma model may be the more appropriate model.
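The distinction can be written out explicitly. The notation below ($Y_i$ for the response, $x_i$ for the predictor vector, $\beta$ for the coefficients) is introduced here for illustration and does not come from the thread.

```latex
\begin{align*}
\text{DIST=NORMAL, LINK=LOG:}\quad
  & Y_i = \exp(x_i^\top \beta) + \varepsilon_i,
  && \varepsilon_i \sim N(0, \sigma^2),
  && \log E[Y_i] = x_i^\top \beta \\
\text{Lognormal (OLS on } \log Y\text{):}\quad
  & \log Y_i = x_i^\top \beta + \varepsilon_i,
  && \varepsilon_i \sim N(0, \sigma^2),
  && E[Y_i] = \exp\!\left(x_i^\top \beta + \tfrac{\sigma^2}{2}\right) \\
\text{DIST=GAMMA, LINK=LOG:}\quad
  & \log E[Y_i] = x_i^\top \beta,
  && \operatorname{Var}(Y_i) = \mu_i^2 / \nu,
  && \mu_i = E[Y_i]
\end{align*}
```

In the first model the error is additive on the original scale with constant variance, which fits poorly when the response is strongly right-skewed. In the lognormal and gamma models the spread grows with the mean, which is one reason their predictions can resemble each other more than either resembles the normal/log fit.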

SteveDenham
Jade | Level 19

My intent with that code was to model log(Y)=Xb + Zg + error.  If I wanted a lognormal, I assume that I would fit Y=log(Xb + Zg + e).  I was trying to approximate the OP's work in GENMOD.  The one change that should be made in the code is to change to type=cs, which is the equivalent of the exchangeable structure (the default for repeated) in GENMOD.

Steve Denham
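Putting the statements from this exchange together with the type=cs change, a complete call might look like the sketch below. The data set name panel and the response name response_var are hypothetical. The EMPIRICAL option is an addition not mentioned in the thread: it requests sandwich (robust) standard errors, which is what makes the fit GEE-like.

```sas
/* Hypothetical names: panel (data set), response_var (response).
   factor, time and subject_id follow the naming used in the reply above. */
proc glimmix data=panel empirical;
  class factor time subject_id;
  model response_var = factor time factor*time / dist=normal link=log solution;
  /* R-side compound-symmetry structure within each subject; naming time
     keeps observations aligned when the panel is unbalanced */
  random time / residual type=cs subject=subject_id;
run;
```

Note that this uses SUBJECT= rather than GROUP=. GROUP= requests a separate set of covariance parameters for each level of the grouping variable, which can make the model much harder to fit. Also, as far as the GENMOD documentation goes, the working correlation used by REPEATED when TYPE= is omitted is independence (TYPE=IND), so type=cs matches GENMOD only if TYPE=EXCH (or TYPE=CS) was requested there.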

cd2011
Calcite | Level 5

Thanks for all the inputs. Lot to think about.
