BookmarkSubscribeRSS Feed
amoynahan
Calcite | Level 5

I'm trying to develop some skill using proc nlmixed. I like the procedure because you can program ideas directly that you find in papers.

I found an example using negative binomial data. the example was taken from the source in the link below, so a test is to match the results. Unfortunaly, my model does not solve for the alpha parameter, even when I give it the starting value given in my solution. However, if I enter alpha as a constant the model gets solved and all the other parameters are the same as in my example.

I would like to have a strategy for how to handle a case when a parameter will not converge and I don't know before hand what the answer should be.

 

link: https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Negative_Binomial_Re...

 

data koch36;

length Melanoma Area 8 AgeGroup $10 Population AG1 AG2 AG3 AG4 AG5 AG6 8;

infile datalines dsd;

input Melanoma Area AgeGroup $ Population AG1 AG2 AG3 AG4 AG5 AG6;

 

logpop=log(Population);

datalines;

 

61,0,<35,2880262,1,0,0,0,0,0

76,0,35-44,564535,0,1,0,0,0,0

98,0,45-54,592983,0,0,1,0,0,0

104,0,54-64,450740,0,0,0,1,0,0

63,0,65-74,270908,0,0,0,0,1,0

80,0,>74,161850,0,0,0,0,0,1

64,1,<35,1074246,1,0,0,0,0,0

75,1,35-44,220407,0,1,0,0,0,0

68,1,45-54,198119,0,0,1,0,0,0

63,1,54-64,134084,0,0,0,1,0,0

45,1,65-74,70708,0,0,0,0,1,0

27,1,>74,34233,0,0,0,0,0,1

; run;

 

 

 

proc nlmixed data=koch36 gconv=0;

parms alpha=0.27586 intercept=0 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0;

 

y=Melanoma;

t=population;

mu=exp(log(t)+Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+

b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));

llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);

model y ~ general(llike);

run;

proc nlmixed data=koch36 gconv=0;

parms intercept=0 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0;

alpha=0.27586;

 

y=Melanoma;

t=population;

mu=exp(log(t)+Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+

b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));

llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);

model y ~ general(llike);

run;

 

 

 

4 REPLIES 4
JacobSimonsen
Barite | Level 11

I think the problem is that your starting point for the parameters for the mean are too far away from the ML estimate.

 

A way to find starting values is to first run a Poisson model and use these estimate as starting points for the mean parameters. Using only the intercept is often sufficient.

 

Then run a model the neg-bin model where you introduce the variance only together with intercept.

 

Then run the full model with start values for contrast parameters=0 and with start values for intercept and alpha = values from previous step.

 

In the data you show there is so many mean parameters that they almost explain the full variation. So with all these mean parameters I would instead use a poisson model (you see that the alpha parameter is very close to zero).

 

*find a good starting point for the intercept;
proc genmod data=koch36;
  model melanoma=/dist=poisson link=log offset=logpop;
run;

*Find a good starting point for alpha;
proc nlmixed data=koch36(rename=(melanoma=y))  xconv=1E-4;
parms intercept=-9 alpha=1;  
mu=population*exp(Intercept);
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
model y ~ general(llike);
run;

*Full model
proc nlmixed data=koch36(rename=(melanoma=y))  xconv=1E-4;
parms intercept=-8.04 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0 alpha=0.6298; 
mu=population*exp(Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+ 
b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));
llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
model y ~ general(llike);
run;
amoynahan
Calcite | Level 5

Thanks for the reply. You approach of starting with the intercept and then adding variables makes sense to me. However, when I tried it I found that when I added the last variable, a1*alpha, the model still failed.

 

Code

 

proc nlmixed data=koch36 gconv=1e-6;
  parms intercept=-10.1383 alpha=0.1519  b1=1.7772  b2=1.8468 b3=2.1658 b4=2.3630 b5=2.7629 a1=0;  
  y=Melanoma;
  t=population;
  mu=exp(log(t)+Intercept+a1*Area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+
     b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));
  llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log(alpha)+y*log(mu);
  model y ~ general(llike);
run;

 

 

Results

 

Iteration History

Negative

Log Maximum

Iteration Calls Likelihood Difference Gradient Slope

1 4 54.8371019 1.892227 18.9465 -69.3337

2 11 50.3199327 4.517169 53.8336 -101.862

3 18 49.7769185 0.543014 85.3213 -1944.62

4 20 48.4637450 1.313174 59.4465 -19.1750

5 31 46.9618976 1.501847 186.198 -26.8011

6 63 46.4230812 0.538816 660.235 -39.5839

7 93 46.4230108 0.00007 23208.6 -17.0106

8 96 45.8523373 0.570674 686.244 -551.955

9 98 44.2959476 1.55639 647.924 -87.0817

10 101 43.6593495 0.636598 1257.82 -24.5176

11 105 41.1515998 2.50775 481.472 -24.3148

12 108 39.9996760 1.151924 40.7525 -84.5010

13 111 39.3144131 0.685263 279.858 -5.17909

14 114 39.2379332 0.07648 365.242 -0.72632

15 117 39.2199225 0.018011 609.610 -0.03710

16 120 39.2199113 0.000011 546.966 -0.00003

 

NOTE: GCONV convergence criterion satisfied.

NOTE: Moore-Penrose inverse is used in covariance matrix.

 

Standard

Parameter Estimate Error DF t Value Pr > |t|

intercept -10.6583 0.09511 12 -112.06 <.0001

alpha 5.958E-9 . 12 . .

b1 1.7974 0.1208 12 14.88 <.0001

b2 1.9131 0.1138 12 16.81 <.0001

b3 2.2418 0.1149 12 19.52 <.0001

b4 2.3657 0.1052 12 22.49 <.0001

b5 2.9447 0.1293 12 22.77 <.0001

a1 0.8195 0.05039 12 16.26 <.0001

 

 

 

JacobSimonsen
Barite | Level 11

Apparently it depend on the starting values;

Using the starting values  "parms intercept=-8.04 a1=0 b1=0 b2=0 b3=0 b4=0 b5=0 alpha=0.6298;" then it converges. But, again, I turns out that the alpha parameter converges toward zero so I would prefer using the a poisson model instead. Also, the model is not very robust as you have almost as many parameters as you have observations. The variance will therefor be small as the total variation is almost fully explained by the mean parameters.

 

amoynahan
Calcite | Level 5

I'm still having trouble getting alpha even with the new starting values.

 

I was able to use similar code with proc mcmc, which as I understand it uses sampling rather than optimization. The mcmc code did estimate an alpha, although it is lower than what was in the pdf that I attached to my first email.

Maybe you just need to see if the model is fitting the data in a reasonable way. So you fit both Poisson and Negative binomial and choose the distribution that best fits the data.

 

proc mcmc data=koch36 nmc=200000 thin=10 nbi=10000;

 parms alpha 0.5 intercept 0 b1 0 b2 0 b3 0 b4 0 b5 0 a1 0 ;

    y=Melanoma;

    t=population;

  prior intercept b1 b2 b3 b4 b5 a1 ~normal(mean=0,var=1E6);

  prior alpha~uniform(0,6);

  mu=exp(log(t)+Intercept+a1*area+b1*(AgeGroup="35-44")+b2*(AgeGroup="45-54")+b3*(AgeGroup="54-64")+

   b4*(AgeGroup="65-74")+b5*(AgeGroup=">74"));

   llike=lgamma(y+1/alpha)-lgamma(1/alpha)-lgamma(y+1)-log(1+alpha*mu)/alpha-y*log(1+alpha*mu)+y*log    ( alpha)+y*log(mu);

  model y ~ general(llike);

run;

 

Parameter N Mean Deviation 95% HPD Interval

alpha 20000 0.0563 0.0825 1.218E-7 0.2058

intercept 20000 -10.6468 0.2068 -11.0502 -10.2185

b1 20000 1.8008 0.2615 1.2863 2.3530

b2 20000 1.9040 0.2674 1.3644 2.4317

b3 20000 2.2320 0.2630 1.7101 2.7652

b4 20000 2.3818 0.2652 1.8874 2.9467

b5 20000 2.9064 0.2735 2.3431 3.4402

a1 20000 0.8222 0.1590 0.5171 1.1548

 

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1934 views
  • 5 likes
  • 2 in conversation