Hi everybody,
I am trying to run zip model in proc nlmixed (SAS 9.4):
proc nlmixed data=migraine maxiter=10000;
parms b0 =1 b1 = 0 b2= 0 c0 =1 c1 = 0 c2= 0 z0= 0 z1= 0 z2= 0 z3= 0 rho=-0.2 s2u =1 s2v=1;
/*linear predictor for mixture probability of zero part*/
linp_pi = b0 + b1*time+ b2*group*time+ u;
pi = exp(linp_pi)/(1+exp(linp_pi));
/*linear predictor for mean of counts (poisson part)*/
linp_mu = c0 + c1*time+ c2*group*time+ v;
mu = exp(linp_mu);
logsigu2 = z0 + z1*group;
logsigv2 = z2 + z3*group;
s2u=exp(logsigu2);
suv=rho*sqrt(exp(logsigu2)*exp(logsigv2));
s2v=exp(logsigv2);
if attack=0 then ll = log((pi) + (1-pi)*exp(-mu));
else ll = log((1-pi)) + attack*log(mu) - lgamma(attack+1)- mu;
model attack ~ general(ll);
random u v ~ normal([0,0],[s2u,suv,s2v]) subject=id; run;
I am attaching longitudinal data. While running my model for "cov hess, tech=newrap, method= gauss, maxiter, qmax noad, seed=12345", Unfortunately, I always face this warning and have some large SE or no SE .
WARNING1: The final Hessian matrix is full rank but has at least one negative eigenvalue. Second-order
optimality condition violated.
WARNING2: The final Hessian matrix is not positive definite, and therefore the estimated covariance
matrix is not full rank and may be unreliable. The variance of some parameter estimates is
zero or some parameters are linearly related to other parameters.
I changed initial values, method, qmax...again and again. Sometimes I have Error: Optimization cannot be completed. I don't know what to do.
I have a principal question:
How do I find appropriate initial values?
Thank you for your time.
Parameter Estimates | ||||||||
Parameter | Estimate | Standard | DF | t Value | Pr > |t| | 95% Confidence Limits | Gradient | |
b0 | -2.6089 | 0.4103 | 64 | -6.36 | <.0001 | -3.4285 | -1.7893 | 6.81751 |
b1 | -1.5772 | 14.2856 | 64 | -0.11 | 0.9124 | -30.1160 | 26.9617 | 0.071280 |
b2 | -2.5369 | 12.5514 | 64 | -0.20 | 0.8405 | -27.6111 | 22.5373 | 0.078330 |
c0 | 0.7799 | 0.1335 | 64 | 5.84 | <.0001 | 0.5133 | 1.0466 | -25.8045 |
c1 | -0.4147 | 0.09284 | 64 | -4.47 | <.0001 | -0.6002 | -0.2292 | 19.9297 |
c2 | 0.09872 | 0.05501 | 64 | 1.79 | 0.0775 | -0.01118 | 0.2086 | -37.4677 |
z0 | 0.2445 | 2.2688 | 64 | 0.11 | 0.9145 | -4.2881 | 4.7770 | 4.09923 |
z1 | 0.4000 | 1.5060 | 64 | 0.27 | 0.7914 | -2.6087 | 3.4087 | 6.45820 |
z2 | -0.9555 | 1.6735 | 64 | -0.57 | 0.5700 | -4.2988 | 2.3877 | -1.27785 |
z3 | -1.3047 | 1.0613 | 64 | -1.23 | 0.2234 | -3.4249 | 0.8155 | -3.62090 |
rho | 0.07197 | 2.0419 | 64 | 0.04 | 0.9720 | -4.0072 | 4.1512 | 0.76599 |
s2u | 2.8419 | . | 64 | . | . | . | . | 1.73494 |
s2v | 0.02830 | 0.06871 | 64 | 0.41 | 0.6818 | -0.1090 | 0.1656 | -73.7197 |
I don't understand the concern. As I mentioned, the ZIP model seems inappropriate because the simple Poisson model shows no evidence of overdispersion and because of the fitting problems when trying to fit the ZIP model. The Poisson GEE model seems to provide a reasonable fit and the parameter estimates and test results seem consistent with the plot of the fitted model. So, the only code needed to fit a reasonable model is this:
proc genmod;
class group id;
model attack=group group*time / d=p;
repeated subject=id / type=exch;
effectplot;
run;
Hello @Yeganeh ,
You ask: "How do I find appropriate initial values?"
There is no golden rules in terms of specifying starting values for the parameters for PARMS statement in NLMIXED. You can always start with using the default starting values (parameters not listed in the PARMS statement are assigned an initial value of 1).
Using a grid search (possible in PARMS statement) or specifying a guess based upon subject matter knowledge or previous studies might help.
But before going further in that direction, I would like to ask why you turn to PROC NLMIXED for your zip model.
There are several procedures that can deal with zip models (zip = Zero-inflated Poisson regression).
Have you tried :
Maybe your model has a little twist that above procedures cannot accomplish, but then you can maybe use above procedures to generate appropriate starting values for at least some of your parameters.
Kind regards,
Koen
Dear Koen,
Thank you for sharing your valuable advice with me.
When I used proc genmod, parameters of zero model had a high SE.
proc genmod data = migraine;
model attack= Time Group*Time/ dist=zip;
zeromodel Time Group*Time/link = logit;
run;
Also, I ran proc glimmix and proc countreg but, the initial values didn't work correctly.(unable to estimate SE or high SEs)
proc glimmix data = migraine noclprint method=laplace;
class id;
model attack= Time Group*Time / solution dist=poisson;
random intercept / subject = id;
run;
proc countreg data=migraine;
model attack= Time Time *Group / dist=zip;
zeromodel attack~Time Time *Group/ link=logistic;
run;
I appreciate your guidance.
Several comments here... First, I suspect what you want is for the model to allow for the groups to have separate intercepts and separate slopes. If so, then the model would be GROUP GROUP*TIME. If you use that model, there seems to be no real evidence of overdispersion which would be expected if the observed zeros were excessive. This can be seen by fitting the simple Poisson model ignoring the repeated measures.
proc genmod;
class group id;
model attack=group group*time / d=p;
effectplot;
run;
Note that the Pearson/DF and deviance/DF values are less than 1 suggesting no overdispersion. This might be why you have trouble fitting the zero-inflated model - in fact, trying to fit that model (without random effects) fails in GENMOD, and fails even if only an intercept is included in the model for extra zeros.
proc genmod;
class group;
model attack=group group*time / dist=zip;
zeromodel group group*time;
run;
So, probably the best model is the simple Poisson model above. You can account for the repeated measures be adding the REPEATED statement in the above code:
repeated subject=id / type=exch;
Hi,
I'm so grateful for your suggestions. I check it.
I don't understand the concern. As I mentioned, the ZIP model seems inappropriate because the simple Poisson model shows no evidence of overdispersion and because of the fitting problems when trying to fit the ZIP model. The Poisson GEE model seems to provide a reasonable fit and the parameter estimates and test results seem consistent with the plot of the fitted model. So, the only code needed to fit a reasonable model is this:
proc genmod;
class group id;
model attack=group group*time / d=p;
repeated subject=id / type=exch;
effectplot;
run;
Ok, I get it now. That's clear.
Thank you for your immediate reply.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.