Re: NPD Hessian

emaguin · Posted 05-04-2020 11:52 AM

I'm running a gee model and getting a NPD Hessian problem. These two models run ok. I have 174 clusters and 1766 records.

* model A3;
proc gee data=couse.co_usevars; class studyid;
model couse1=dsa3A1_234 / dist=binary ;
repeated subject=studyid / modelse corr=exch;
run;

* model B1;
proc gee data=couse.co_usevars; class studyid;
model couse1=dothruse0_12 dothruse1_2 / dist=binary;
repeated subject=studyid / modelse corr=exch;
run;

This one does not.

* model B2;
proc gee data=couse.co_usevars; class studyid;
model couse1=dothruse0_12 dothruse1_2 dsa3A1_234/ dist=binary;
repeated subject=studyid / modelse corr=exch;
run;

As you see, I specified an exchangeable corr structure. If I change to either an independent or unstructured structure, the model estimates. The fit (QIC) is better (smaller) with corr=ind than with corr=un, which seems understandable, kind of. However, on one of the prior models that ran using corr=exch the corr estimate was .26, which suggests to me that corr=ind is not a reasonable assumption.

I know that an NPD problem means that the second derivative matrix is not positive definite, i.e. at least one of its eigenvalues is not greater than zero. What I don't know is if there is any way to locate the problem. Can anyone provide any information and/or guidance? I understand this isn't a SAS problem; it's a data+model problem, so there may be nothing that can be done. However, you all have more experience than I do, so it's worth asking.

Thanks, Gene Maguin

SteveDenham · Posted 05-04-2020 01:13 PM

I generally see NPD hessian problems when I over-parameterize a model - there isn't enough variability or there isn't enough data to fit the model. As these are regressions, and the first two work, I would guess that dsa3A1_234 introduces some strong collinearity with either dothruse0_12 or dothruse1_2. Have you tried using PROC REG to see if there is a concern in this area? It really doesn't matter what the dependent variable is to get this diagnostic. Try this:

proc reg data=couse.co_usevars; 
model couse1=dothruse0_12 dothruse1_2 dsa3A1_234 / collin;
run;
quit;

I'm not quite sure whether you should include studyid in the model. I think it would depend on the pattern of the IVs by studyid: with a binary response, you could get separation issues when all 3 covariates are included in the model. But for the collinearity issue, the output from PROC REG may be helpful.

SteveDenham

Rick_SAS · Posted 05-04-2020 01:45 PM

If you don't know how to interpret the COLLIN output, see my article, "Collinearity in regression: The COLLIN option in PROC REG."

And Steve might be interested in the follow-up article about how to visualize the collinearity diagnostics.

Rick_SAS · Posted 05-04-2020 01:47 PM

Although you are asking about a GEE model, you might like to read the background material and consult the references in the article "Convergence in mixed models: When the estimated G matrix is not positive definite."

emaguin · Posted 05-04-2020 03:49 PM

Thanks to all three of you that replied.

I ran the reg syntax and collinearity doesn't seem to be an issue. The largest condition index is 3.something. In terms of separation, my impression is that it would be harder for that to happen with gee than with glimmix because, since there are no random terms, the coefficients are solved using the entire dataset rather than within each value of the subjects variable for the random terms. I understand that's not exactly true but I think it's a useful shorthand.

I want to come back to the fact that the model solves if corr=ind or un. I have read some descriptions of gee, but I'm not clear on if or how the covariances or correlations of the dependent variables enter into the estimation of the coefficients. The explanation in the documentation is a little too compact for me to decompose.

Thanks, Gene Maguin

SteveDenham · Posted 05-05-2020 08:44 AM

Well, that leaves us with the case that the IND and UN structures don't have a "working" correlation matrix that has to be estimated. If it were my project, I would consider moving to GLIMMIX, and follow the applicable parts of Example 45.12 Fitting a Marginal (GEE-type) Model in the GLIMMIX documentation. This approach also makes available a wider range of covariance structures. Try this code:

proc glimmix data=couse.co_usevars; 
class studyid;
model couse1=dothruse0_12 dothruse1_2 dsa3A1_234/ dist=binary;
random _residual_/ subject=studyid  type=ar(1);
run;

I picked an AR(1) covariance structure as it is the closest to an exchangeable correlation matrix. If this runs into the same NPD G matrix issue, then the answer oddly enough is that you can't fit that many parameters with your existing data - the independent (or in the case of GLIMMIX, the VC) matrix is the best model that will fit the data at hand.
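For comparison, the R-side structure in GLIMMIX that is usually treated as the counterpart of an exchangeable working correlation is compound symmetry (type=cs), since it also assumes one common correlation for every pair of records within a subject. A minimal sketch under that assumption, reusing the data set and variables from this thread. The EMPIRICAL and SOLUTION options are additions for illustration (sandwich standard errors and a printed coefficient table), not part of the code above:

```sas
/* Sketch only: GEE-type marginal model with a compound-symmetry  */
/* (exchangeable-like) R-side structure. EMPIRICAL requests       */
/* sandwich standard errors; SOLUTION prints the fixed effects.   */
proc glimmix data=couse.co_usevars empirical;
  class studyid;
  model couse1=dothruse0_12 dothruse1_2 dsa3A1_234 / dist=binary solution;
  random _residual_ / subject=studyid type=cs;
run;
```

If this version also fails while the independence version fits, that would point toward the common-correlation parameter being the part the data cannot support.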

SteveDenham

MichaelL_SAS · Posted 05-05-2020 11:53 AM

@emaguin, to answer your question about how the correlated dependent variables enter into the estimation of the coefficients, the main source would be in the "generalized Hessian" used in the update step (step 4) of the GEE fitting algorithm described here in the PROC GENMOD documentation. That expression is something like a "weighted" X'X computation, where components of a design row are "weighted" based on the partial derivatives of the link function and also the inverse working covariance matrix for the response variable.
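For reference, the update step in the standard Liang and Zeger formulation of GEE can be written as below. The symbols ($D_i$, $V_i$, $A_i$, $R_i(\alpha)$, $K$ clusters) are shorthand chosen here and may not match the documentation's notation exactly.

```latex
\beta_{r+1} = \beta_r
  + \Big[\sum_{i=1}^{K} D_i^{\top} V_i^{-1} D_i\Big]^{-1}
    \sum_{i=1}^{K} D_i^{\top} V_i^{-1}\,(Y_i - \mu_i),
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta},
\qquad
V_i = \phi\, A_i^{1/2}\, R_i(\alpha)\, A_i^{1/2}
```

The bracketed sum is the generalized Hessian, and $A_i$ is the diagonal matrix of variance functions, $\mu_{ij}(1-\mu_{ij})$ for a binary response. The working correlation $R_i(\alpha)$ enters only through $V_i^{-1}$. With corr=ind, $R_i$ is the identity and nothing has to be inverted beyond a diagonal matrix. With corr=exch, $R_i$ has 1 on the diagonal and $\alpha$ everywhere else, and it becomes singular as $\alpha$ approaches 1 or $-1/(n_i - 1)$ for a cluster of size $n_i$, which is one way the Hessian can stop being positive definite.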

To your question about why the model converges with some working correlation structures and not others, that is hard to say without the data. Based on your description I doubt this is the cause (I mention it here in case someone in the future comes across this thread), but sometimes a model with an exchangeable working correlation structure results in an NPD generalized Hessian due to insufficient variability in the outcome variable within subjects/clusters. This can result in an estimate for the working correlation matrix that is near singular, which then affects the computation of the generalized Hessian. Very often a giveaway for this type of issue is some notes in the log about having to ridge the estimate for the working correlation matrix before the error about the NPD generalized Hessian.
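A rough way to check within-cluster variability is to count how many clusters have an outcome that never changes. This is only a sketch: it assumes couse1 is coded 0/1, and the work.by_cluster data set and the n_obs, p_event and constant variables are names made up for the illustration.

```sas
/* Per-cluster record count and event proportion */
proc means data=couse.co_usevars noprint nway;
  class studyid;
  var couse1;
  output out=work.by_cluster n=n_obs mean=p_event;
run;

/* Flag clusters whose outcome is all 0 or all 1 (assumes 0/1 coding) */
data work.by_cluster;
  set work.by_cluster;
  constant = (p_event in (0, 1));
run;

/* How many clusters are constant, and how are cluster sizes distributed? */
proc freq data=work.by_cluster;
  tables constant n_obs;
run;
```

A large share of constant clusters, or many clusters with only one or two records, would be consistent with the near-singular working correlation described above.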

StatDave · Posted 05-05-2020 03:35 PM

This is probably due to the more complex model making the data too sparse. This happens all the time in ordinary binary logistic regression. In PROC LOGISTIC, a warning is issued about "separation" in such cases. The sparseness causes some model parameters to be infinite as discussed in the Details section of the PROC LOGISTIC documentation. If the predictors are continuous variables then you might need to categorize one or more of them to some degree. If they are categorical, then you could try a stratification approach rather than a model-based approach as shown in this note using PROC FREQ.
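One quick way to look for this kind of sparseness is to cross-tabulate the three predictors against the outcome and look for empty or nearly empty cells. A minimal sketch, assuming the predictors are categorical or take only a few distinct values (the variable names suggest that, but it is a guess):

```sas
/* One row per observed combination of the predictors and outcome. */
/* Predictor combinations that appear with only one value of       */
/* couse1, or with very small counts, indicate sparse cells.       */
proc freq data=couse.co_usevars;
  tables dothruse0_12*dothruse1_2*dsa3A1_234*couse1 / list missing;
run;
```

With continuous predictors this table would have too many rows to be useful, so they would need to be binned first.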
