Hello. I am running an analysis (PROC GENMOD with GEE) to analyze correlated outcomes -- repeated hospital encounters within a patient. My code gives errors when a certain variable is added to the model, and I'm not sure what is going on. Please see below. Any thoughts as to why there are errors? Is the sample size too small? Any other thoughts? Thanks.
This code is working fine:
proc genmod data = ip;
class id race payor sex urban psych;
model hpsy=race payor sex urban los psych/ dist=bin link=logit;
repeated subject=id/ type=exch covb corrw;
estimate 'black' race -1 1 0/exp;
estimate 'hispanic' race -1 0 1/exp;
run;
This code is NOT working:
proc genmod data = ip;
class id race payor sex urban psych county;
model hpsy=race payor sex urban los psych county/ dist=bin link=logit;
repeated subject=id/ type=exch covb corrw;
estimate 'black' race -1 1 0/exp;
estimate 'hispanic' race -1 0 1/exp;
run;
error message in log:
NOTE: Class levels for some variables were not printed due to excessive size.
NOTE: PROC GENMOD is modeling the probability that hpsy='1'.
WARNING: The negative of the Hessian is not positive definite. The convergence is questionable.
WARNING: The procedure is continuing but the validity of the model fit is questionable.
WARNING: The specified model did not converge.
WARNING: Negative of Hessian not positive definite.
WARNING: The generalized Hessian matrix is not positive definite. Iteration will be terminated.
ERROR: Error in parameter estimate covariance computation.
ERROR: Error in estimation routine.
Accepted Solutions
You have too many levels for COUNTY. I think you should put it in SUBJECT= instead.
Create a combined variable in a DATA step:
id_county=id||county;
Then list it in the CLASS statement (in place of id) and use it as the subject:
repeated subject=id_county/ type=exch covb corrw;
Thank you. That was the problem, the model converged.
Thanks again.
I was having the same problem with a model, and I used your solution which made the model converge without error. But can you explain why it works? My variables were all binary, no values missing, and I'm not sure why I got the error in the first place.
Thanks,
Nadja
@Ksharp wrote:
You have too many levels for COUNTY. I think you should put it on SUBJECT.
Make a variable :
id_county=id||county;
And make it as subject:
repeated subject=id_county/ type=exch covb corrw;
This is no different from an ordinary logistic model: the more variables you add to the model, the sparser the data become (just like adding more dimensions to a multi-way table with a fixed amount of data). When the data become too sparse, some model parameters are actually infinite. Recall that the parameters in a logistic model are related to odds ratios, and an odds ratio in a table with a zero count can be infinite. An iterative estimation method (like maximum likelihood or the GEE algorithm) will not converge in that case.
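One way to check for this kind of sparseness is to cross-tabulate the added CLASS variable against the response and look for empty cells. The following is only a sketch using the data set and variable names from the original post (ip, county, hpsy); the output data set name county_by_hpsy is made up for illustration.

```sas
/* Count observations in every county-by-outcome cell.
   SPARSE keeps zero-count combinations in the OUT= data set. */
proc freq data=ip noprint;
  tables county*hpsy / sparse out=county_by_hpsy;
run;

/* List the empty cells: counties where hpsy is always 0 or always 1.
   These levels are likely candidates for infinite parameter estimates. */
proc print data=county_by_hpsy;
  where count = 0;
run;
```

If this prints many rows, the COUNTY effect probably cannot be estimated as specified, which would be consistent with the Hessian warnings in the log.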
However, moving the added variable from the model to SUBJECT= in the REPEATED statement changes the model and the assumptions you are making. The SUBJECT= effect is not part of the model being estimated - it simply defines which observations in the data are considered correlated as described in this note. Changing the SUBJECT= effect changes which observations are considered correlated and might be incorrect if done without thought.
That said, if COUNTY does not belong in SUBJECT=, and you do want to estimate its effect as part of the model, then you might be able to include it if you otherwise reduce the number of parameters in the model. You could do that in many ways, such as merging levels within County or any other CLASS variable, or of course by removing other variables from the model.
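As an illustration of merging levels, the sketch below pools small counties into a single 'OTHER' level. It assumes county is a character variable; the cutoff of 30 observations and the names county_n, ip_grouped, and county_grp are arbitrary choices for this example, not recommendations. Grouping by total count is only one option, and grouping counties into substantively meaningful regions may be preferable.

```sas
/* Number of observations per county */
proc freq data=ip noprint;
  tables county / out=county_n (keep=county count);
run;

/* Replace small counties with a pooled 'OTHER' level.
   The threshold (30) is an arbitrary illustration. */
proc sql;
  create table ip_grouped as
  select a.*,
         case when b.count < 30 then 'OTHER'
              else a.county
         end as county_grp length=32
  from ip as a
       left join county_n as b
       on a.county = b.county;
quit;
```

The model would then be refit on ip_grouped with county_grp in place of county in both the CLASS and MODEL statements.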
And by the way, if it *is* appropriate to add County in SUBJECT=, then this can be done by simply specifying ID*COUNTY or ID(COUNTY) as appropriate. There is no need to create a new variable that combines ID and COUNTY. Again, this is described in more detail in the note referred to above.
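For example, if encounters are to be treated as correlated within each patient nested in county, the REPEATED statement from the original post might be written as follows. This is a sketch only: COUNTY is kept in the CLASS statement because variables in the SUBJECT= effect must be listed there, and it is left out of the MODEL statement, as in the accepted solution.

```sas
proc genmod data = ip;
  class id race payor sex urban psych county;
  model hpsy = race payor sex urban los psych / dist=bin link=logit;
  /* Clusters are defined by patient within county; no combined variable is needed */
  repeated subject=id(county) / type=exch covb corrw;
  estimate 'black' race -1 1 0 / exp;
  estimate 'hispanic' race -1 0 1 / exp;
run;
```

Whether ID(COUNTY) or ID*COUNTY is the right form depends on how patient IDs are assigned, and whether either is appropriate at all depends on which observations you believe are correlated.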