turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Analytics
- /
- Stat Procs
- /
- Hurdle model with negative binomial

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-13-2011 04:36 PM

Hi,

I am trying to do a hurdle model with a truncated negative binomial

distribution instead of a poisson. I keep getting the error message:

"NOTE: Execution error for observation 1."

I have tried setting the initial parameters to zero, 1 and -1. But I still

get the same message. The code I am running is:

options ls=74 ps=100 nodate nocenter pageno=1;

Data main;

input salinity male female court;

mf = male*female;

sf = salinity*female;

sm = salinity*male;

smf= salinity*male*female;

;

cards;

0 1 1 0

0 1 1 0

0 1 1 29

...more data......

;

proc sort data=main;

by male female salinity;

run;

proc nlmixed data = main;

parms a0 = 0 a1 = 0 a2 =0 a3=0 b0 = 0 b1 = 0 b2 = 0 b3 = 0 b4 = 0 b5 = 0 b6 =

0 b7 = 0 alpha=0;

eta0 = a0 + a1 * male + a2 *female + a3 *mf;

etap = b0 + b1 * male + b2 * female + b3 * mf + b4 * salinity +

b5 * sm + b6 * sf + b7 *smf;

exp_eta0 = exp(eta0);

mu = exp(etap);

p0 = exp_eta0 / (1 + exp_eta0);

m=1/alpha;

if court = 0 then ll = log(p0);

else ll = lgamma(court+m)-lgamma(court+1)-lgamma(m) + court*log(alpha*mu)-

(court+m)*log(1+alpha*mu)- log(1 -( 1 + alpha*mu)**(-m));

model court ~ general(ll);

run;

Any help would be greatly appreciated!!,

Emma

I am trying to do a hurdle model with a truncated negative binomial

distribution instead of a poisson. I keep getting the error message:

"NOTE: Execution error for observation 1."

I have tried setting the initial parameters to zero, 1 and -1. But I still

get the same message. The code I am running is:

options ls=74 ps=100 nodate nocenter pageno=1;

Data main;

input salinity male female court;

mf = male*female;

sf = salinity*female;

sm = salinity*male;

smf= salinity*male*female;

;

cards;

0 1 1 0

0 1 1 0

0 1 1 29

...more data......

;

proc sort data=main;

by male female salinity;

run;

proc nlmixed data = main;

parms a0 = 0 a1 = 0 a2 =0 a3=0 b0 = 0 b1 = 0 b2 = 0 b3 = 0 b4 = 0 b5 = 0 b6 =

0 b7 = 0 alpha=0;

eta0 = a0 + a1 * male + a2 *female + a3 *mf;

etap = b0 + b1 * male + b2 * female + b3 * mf + b4 * salinity +

b5 * sm + b6 * sf + b7 *smf;

exp_eta0 = exp(eta0);

mu = exp(etap);

p0 = exp_eta0 / (1 + exp_eta0);

m=1/alpha;

if court = 0 then ll = log(p0);

else ll = lgamma(court+m)-lgamma(court+1)-lgamma(m) + court*log(alpha*mu)-

(court+m)*log(1+alpha*mu)- log(1 -( 1 + alpha*mu)**(-m));

model court ~ general(ll);

run;

Any help would be greatly appreciated!!,

Emma

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-13-2011 10:52 PM

More than likely, your program is trying to take a log of 0 (including lgamma of 0), or creating infinity. For instance, with alpha=0 (starting value), m=1/alpha will cause big problems (infinity). Also, there are log or lgamma functions that involve alpha (which will also blow up with 0 for alpha). I took your data and added a few more dummy data points. The program ran when I used 1 for the starting value of alpha. But this is still dangerous, because parameter estimates can migrate to 'impossible' regions during the optimization. You should look into putting bounds on the parameters. Check out the syntax of the bounds statement.

(I did not really look at the rest of the code to see if the customized ll is correct).

(I did not really look at the rest of the code to see if the customized ll is correct).

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-14-2011 01:07 PM

The hurdle log-likelihood function does appear sound. As lvm has indicated, computing 1/alpha for alpha initialized to 0 is a problem.

Personally, I prefer not to use bounds statements to control acceptable values for the parameters. Derivatives are usually not defined when parameter estimates are fixed at boundary values. Sometimes, the function itself is not defined when a parameter takes on the boundary value. (Such is the case for the NB distribution when you specify alpha=0. Even though the NB is a Poisson when alpha=0, it is not possible to compute the Poisson density using a negative binomial density function.)

Instead of using boundary constraints, consider parameterizing the model so that the parameter estimates are always in an acceptable range. For instance, in the NB distribution, the parameter alpha must always be positive. If you name log_alpha as a parameter and then compute alpha=exp(log_alpha), then alpha>0 for -infinity

Personally, I prefer not to use bounds statements to control acceptable values for the parameters. Derivatives are usually not defined when parameter estimates are fixed at boundary values. Sometimes, the function itself is not defined when a parameter takes on the boundary value. (Such is the case for the NB distribution when you specify alpha=0. Even though the NB is a Poisson when alpha=0, it is not possible to compute the Poisson density using a negative binomial density function.)

Instead of using boundary constraints, consider parameterizing the model so that the parameter estimates are always in an acceptable range. For instance, in the NB distribution, the parameter alpha must always be positive. If you name log_alpha as a parameter and then compute alpha=exp(log_alpha), then alpha>0 for -infinity

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-14-2011 02:57 PM

Dale makes a good point: it is best to avoid using parameter bounds, when possible. For your model, it should be possible to avoid explicit bounds, using the recommendations from Dale. With some nonlinear models, it can be challenging to do the optimization without using bounds, but always a good goal.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-14-2011 03:14 PM

Thank you all for your advice! I took Dale's advice and the model works. However, I am concerned that the hurdle part of the model is not working. My current code is not very different:

proc nlmixed data = main;

parms a0 = 0 a1 = 0 a2 =0 a3=0 b0 = 0 b1 = 0 b2 = 0 b3 = 0 b4 = 0 b5 = 0 b6 = 0 b7 = 0 log_alpha=0;

eta0 = a0 + a1 * male + a2 *female + a3 *mf;

etap = b0 + b1 * male + b2 * female + b3 * mf + b4 * salinity +

b5 * sm + b6 * sf + b7 *smf;

exp_eta0 = exp(eta0);

alpha = exp(log_alpha);

mu = exp(etap);

p0 = exp_eta0 / (1 + exp_eta0);

m= 1/alpha;

if court = 0 then ll = log(p0);

else ll = lgamma(court+m)-lgamma(court+1)-lgamma(m) + court*log(alpha*mu)-(court+m)*log(1+alpha*mu)- log(1 -( 1 + alpha*mu)**(-m));

model court ~ general(ll);

run;

which is almost identical to what I have before. But my parameter estimates come out like this:

Standard

Parameter Estimate Error DF t Value Pr > |t| Alpha Lower

a0 2.1240 2840.01 132 0.00 0.9994 0.05 -5615.70

a1 2.9631 2808.80 132 0.00 0.9992 0.05 -5553.13

a2 3.2490 2820.91 132 0.00 0.9991 0.05 -5576.80

a3 4.2013 2814.01 132 0.00 0.9988 0.05 -5562.20

b0 2.9783 4.3967 132 0.68 0.4993 0.05 -5.7188

b1 -0.06676 2.4206 132 -0.03 0.9780 0.05 -4.8549

b2 -0.2695 4.1330 132 -0.07 0.9481 0.05 -8.4451

b3 0.1722 2.1445 132 0.08 0.9361 0.05 -4.0698

b4 5.3092 0.3373 132 15.74 <.0001 0.05 4.6420

b5 -2.6839 0.1903 132 -14.10 <.0001 0.05 -3.0604

b6 -5.2397 0.3102 132 -16.89 <.0001 0.05 -5.8533

b7 2.6524 0.1629 132 16.28 <.0001 0.05 2.3301

log_alpha 0.2495 0.2317 132 1.08 0.2834 0.05 -0.2087

So the standard errors for the hurdle part of the model are huge! I can play around with the starting value of log_alpha and get them somewhat better but they are still huge. I also tried changing the hurdle model. Right now it is log((exp(zeromodel)) /(1+(exp(zeromodel))). Based on the SAS for mixed models by Littell et al's hurdle poisson I changed it to -exp(zeromodel) but that did not change the outcome.

-Emma

proc nlmixed data = main;

parms a0 = 0 a1 = 0 a2 =0 a3=0 b0 = 0 b1 = 0 b2 = 0 b3 = 0 b4 = 0 b5 = 0 b6 = 0 b7 = 0 log_alpha=0;

eta0 = a0 + a1 * male + a2 *female + a3 *mf;

etap = b0 + b1 * male + b2 * female + b3 * mf + b4 * salinity +

b5 * sm + b6 * sf + b7 *smf;

exp_eta0 = exp(eta0);

alpha = exp(log_alpha);

mu = exp(etap);

p0 = exp_eta0 / (1 + exp_eta0);

m= 1/alpha;

if court = 0 then ll = log(p0);

else ll = lgamma(court+m)-lgamma(court+1)-lgamma(m) + court*log(alpha*mu)-(court+m)*log(1+alpha*mu)- log(1 -( 1 + alpha*mu)**(-m));

model court ~ general(ll);

run;

which is almost identical to what I have before. But my parameter estimates come out like this:

Standard

Parameter Estimate Error DF t Value Pr > |t| Alpha Lower

a0 2.1240 2840.01 132 0.00 0.9994 0.05 -5615.70

a1 2.9631 2808.80 132 0.00 0.9992 0.05 -5553.13

a2 3.2490 2820.91 132 0.00 0.9991 0.05 -5576.80

a3 4.2013 2814.01 132 0.00 0.9988 0.05 -5562.20

b0 2.9783 4.3967 132 0.68 0.4993 0.05 -5.7188

b1 -0.06676 2.4206 132 -0.03 0.9780 0.05 -4.8549

b2 -0.2695 4.1330 132 -0.07 0.9481 0.05 -8.4451

b3 0.1722 2.1445 132 0.08 0.9361 0.05 -4.0698

b4 5.3092 0.3373 132 15.74 <.0001 0.05 4.6420

b5 -2.6839 0.1903 132 -14.10 <.0001 0.05 -3.0604

b6 -5.2397 0.3102 132 -16.89 <.0001 0.05 -5.8533

b7 2.6524 0.1629 132 16.28 <.0001 0.05 2.3301

log_alpha 0.2495 0.2317 132 1.08 0.2834 0.05 -0.2087

So the standard errors for the hurdle part of the model are huge! I can play around with the starting value of log_alpha and get them somewhat better but they are still huge. I also tried changing the hurdle model. Right now it is log((exp(zeromodel)) /(1+(exp(zeromodel))). Based on the SAS for mixed models by Littell et al's hurdle poisson I changed it to -exp(zeromodel) but that did not change the outcome.

-Emma