I'm trying to develop some skill using proc nlmixed. I like the procedure because you can program ideas directly that you find in papers.
I found an example using negative binomial data. the example was taken from the source in the link below, so a test is to match the results. Unfortunaly, my model does not solve for the alpha parameter, even when I give it the starting value given in my solution. However, if I enter alpha as a constant the model gets solved and all the other parameters are the same as in my example.
I would like to have a strategy for how to handle a case when a parameter will not converge and I don't know before hand what the answer should be.
data koch36;
length Melanoma Area 8 AgeGroup $10 Population AG1 AG2 AG3 AG4 AG5 AG6 8;
infile datalines dsd;
input Melanoma Area AgeGroup $ Population AG1 AG2 AG3 AG4 AG5 AG6;
logpop=log(Population);
datalines;
61,0,<35,2880262,1,0,0,0,0,0
76,0,35-44,564535,0,1,0,0,0,0
98,0,45-54,592983,0,0,1,0,0,0
104,0,54-64,450740,0,0,0,1,0,0
63,0,65-74,270908,0,0,0,0,1,0
80,0,>74,161850,0,0,0,0,0,1
64,1,<35,1074246,1,0,0,0,0,0
75,1,35-44,220407,0,1,0,0,0,0
68,1,45-54,198119,0,0,1,0,0,0
63,1,54-64,134084,0,0,0,1,0,0
45,1,65-74,70708,0,0,0,0,1,0
27,1,>74,34233,0,0,0,0,0,1
; run;
proc nlmixed data=koch36 gconv=0;
parms alpha=0.27586 intercept=0 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0;
y=Melanoma;
t=population;
mu=exp(log(t)+Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+
b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
model y ~ general(llike);
run;
proc nlmixed data=koch36 gconv=0;
parms intercept=0 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0;
alpha=0.27586;
y=Melanoma;
t=population;
mu=exp(log(t)+Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+
b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
model y ~ general(llike);
run;
I think the problem is that your starting point for the parameters for the mean are too far away from the ML estimate.
A way to find starting values is to first run a Poisson model and use these estimate as starting points for the mean parameters. Using only the intercept is often sufficient.
Then run a model the neg-bin model where you introduce the variance only together with intercept.
Then run the full model with start values for contrast parameters=0 and with start values for intercept and alpha = values from previous step.
In the data you show there is so many mean parameters that they almost explain the full variation. So with all these mean parameters I would instead use a poisson model (you see that the alpha parameter is very close to zero).
*find a good starting point for the intercept;
proc genmod data=koch36;
model melanoma=/dist=poisson link=log offset=logpop;
run;
*Find a good starting point for alpha;
proc nlmixed data=koch36(rename=(melanoma=y)) xconv=1E-4;
parms intercept=-9 alpha=1;
mu=population*exp(Intercept);
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
model y ~ general(llike);
run;
*Full model
proc nlmixed data=koch36(rename=(melanoma=y)) xconv=1E-4;
parms intercept=-8.04 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0 alpha=0.6298;
mu=population*exp(Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+
b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
model y ~ general(llike);
run;
Thanks for the reply. You approach of starting with the intercept and then adding variables makes sense to me. However, when I tried it I found that when I added the last variable, a1*alpha, the model still failed.
Code
proc nlmixed data=koch36 gconv=1e-6;
parms intercept=-10.1383 alpha=0.1519 b1=1.7772 b2=1.8468 b3=2.1658 b4=2.3630 b5=2.7629 a1=0;
y=Melanoma;
t=population;
mu=exp(log(t)+Intercept+a1*Area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+
b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
model y ~ general(llike);
run;
Results
Iteration History
Negative
Log Maximum
Iteration Calls Likelihood Difference Gradient Slope
1 4 54.8371019 1.892227 18.9465 -69.3337
2 11 50.3199327 4.517169 53.8336 -101.862
3 18 49.7769185 0.543014 85.3213 -1944.62
4 20 48.4637450 1.313174 59.4465 -19.1750
5 31 46.9618976 1.501847 186.198 -26.8011
6 63 46.4230812 0.538816 660.235 -39.5839
7 93 46.4230108 0.00007 23208.6 -17.0106
8 96 45.8523373 0.570674 686.244 -551.955
9 98 44.2959476 1.55639 647.924 -87.0817
10 101 43.6593495 0.636598 1257.82 -24.5176
11 105 41.1515998 2.50775 481.472 -24.3148
12 108 39.9996760 1.151924 40.7525 -84.5010
13 111 39.3144131 0.685263 279.858 -5.17909
14 114 39.2379332 0.07648 365.242 -0.72632
15 117 39.2199225 0.018011 609.610 -0.03710
16 120 39.2199113 0.000011 546.966 -0.00003
NOTE: GCONV convergence criterion satisfied.
NOTE: Moore-Penrose inverse is used in covariance matrix.
Standard
Parameter Estimate Error DF t Value Pr > |t|
intercept -10.6583 0.09511 12 -112.06 <.0001
alpha 5.958E-9 . 12 . .
b1 1.7974 0.1208 12 14.88 <.0001
b2 1.9131 0.1138 12 16.81 <.0001
b3 2.2418 0.1149 12 19.52 <.0001
b4 2.3657 0.1052 12 22.49 <.0001
b5 2.9447 0.1293 12 22.77 <.0001
a1 0.8195 0.05039 12 16.26 <.0001
Apparently it depend on the starting values;
Using the starting values "parms intercept=-8.04 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0 alpha=0.6298;" then it converges. But, again, I turns out that the alpha parameter converges toward zero so I would prefer using the a poisson model instead. Also, the model is not very robust as you have almost as many parameters as you have observations. The variance will therefor be small as the total variation is almost fully explained by the mean parameters.
I'm still having trouble getting alpha even with the new starting values.
I was able to use similar code with proc mcmc, which as I understand it uses sampling rather than optimization. The mcmc code did estimate an alpha, although it is lower than what was in the pdf that I attached to my first email.
Maybe you just need to see if the model is fitting the data in a reasonable way. So you fit both Poisson and Negative binomial and choose the distribution that best fits the data.
proc mcmc data=koch36 nmc=200000 thin=10 nbi=10000;
parms alpha 0.5 intercept 0 b1 0 b2 0 b3 0 b4 0 b5 0 a1 0 ;
y=Melanoma;
t=population;
prior intercept b1 b2 b3 b4 b5 a1 ~normal(mean=0,var=1E6);
prior alpha~uniform(0,6);
mu=exp(log(t)+Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+
b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log ( alpha)+y*log(mu);
model y ~ general(llike);
run;
Parameter N Mean Deviation 95% HPD Interval
alpha 20000 0.0563 0.0825 1.218E-7 0.2058
intercept 20000 -10.6468 0.2068 -11.0502 -10.2185
b1 20000 1.8008 0.2615 1.2863 2.3530
b2 20000 1.9040 0.2674 1.3644 2.4317
b3 20000 2.2320 0.2630 1.7101 2.7652
b4 20000 2.3818 0.2652 1.8874 2.9467
b5 20000 2.9064 0.2735 2.3431 3.4402
a1 20000 0.8222 0.1590 0.5171 1.1548
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.