Solved: Re: Fitting Nlmixed model for negative binomial model.

Unay13 · Posted 03-30-2018 10:27 AM

Hello,

I need to fit Negative Binomial and Poisson models to my data. Using GENMOD or COUNTREG and specifying the distribution as NB or Poisson, I get the mean as a linear function of the x variables. However, I have my own nonlinear functions, such as the Hoerl and sigmoidal functions, that I need to incorporate into the model.

For example, instead of this form:

The functions I need to use are below, in particular the bottom two, where mu is expressed in terms of E.

Any help would be greatly appreciated.

Ksharp · Posted 03-30-2018 10:38 AM

Did you check the programming statements in PROC GENMOD?

proc genmod;
class car age;
a = _MEAN_;
y = _RESP_;
d = 2 * ( y * log( y / a ) - ( y - a ) );
variance var = a;
deviance dev = d;
model c = car age / link = log offset = ln;
run;
The variables var and dev are dummy variables used internally by the procedure to identify the variance and
deviance functions. Any valid SAS variable names can be used.
Similarly, the log link function and its inverse could be defined with the FWDLINK and INVLINK statements,
as follows:
fwdlink link = log(_MEAN_);
invlink ilink = exp(_XBETA_);

Unay13 · Posted 03-30-2018 10:55 AM

Thanks Ksharp. But I need to feed in the Hoerl and sigmoidal functions and do the NB modeling. By any chance do you have code for NB modeling with user-specified functions?

StatDave · Posted 03-30-2018 02:43 PM

Assuming that you mean you want to specify a certain nonlinear model to fit to a response that is distributed negative binomial, you will need to use PROC NLMIXED. There, you can specify both the log likelihood for the negative binomial and whatever linear or nonlinear model you want. See Note2 at the end of this note which shows the statements needed to define the negative binomial log likelihood in NLMIXED.
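As a rough illustration of that approach, here is a minimal sketch that writes the negative binomial log likelihood explicitly and pairs it with a nonlinear mean. The data set name mydata, the variables x and y, the starting values, and the three-parameter Hoerl form a*b**x*x**c are all assumptions made for the example (the thread does not give the exact functions), and x is assumed to be positive.

```sas
/* Sketch only: mydata, x, y and all starting values are hypothetical. */
proc nlmixed data=mydata;
   parms a=1 b=1 c=0 k=1;          /* starting values are guesses */
   bounds a > 0, b > 0, k > 0;     /* keep the mean and dispersion positive */
   mu = a * (b**x) * (x**c);       /* assumed Hoerl-type mean; requires x > 0 */
   /* negative binomial log likelihood with mean mu and dispersion k */
   ll = lgamma(y + 1/k) - lgamma(1/k) - lgamma(y + 1)
        + y*log(k*mu) - (y + 1/k)*log(1 + k*mu);
   model y ~ general(ll);
run;
```

Any other mean function, such as a sigmoidal curve, could replace the mu= line as long as it stays positive for the parameter values the optimizer visits.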

Unay13 · Posted 03-30-2018 03:06 PM

Your response was really helpful. I found the following code:
proc nlmixed;
parms b0=3, b1=1, k=0.8;
linp = b0 + b1*x;
mu = exp(linp);
p = 1/(1+mu*k);
model y ~ negbin(1/k,p);
run;

How necessary is it to use the PARMS statement? I do not know what values to assign to the parameters, or on what basis to choose them. Also, in the MODEL statement, what does (1/k, p) stand for?
Any help would be greatly appreciated.

StatDave · Posted 03-30-2018 03:30 PM

The PARMS statement often isn't necessary if the default starting values for the parameters are reasonable enough to allow the fitting algorithm to converge to a proper solution. But when there are fitting problems, one often needs to try various other initial values, and the PARMS statement lets you do that. If you have an idea of approximately what the final parameter values should be, such as from the same model fit to previous data, it might be worthwhile to specify them as starting values in the PARMS statement.
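To illustrate trying several starting values at once, the sketch below gives more than one number per parameter in the PARMS statement, in which case NLMIXED builds a grid and starts from the grid point with the best objective function value. The data set name mydata and the grid values themselves are hypothetical and would need to be adapted to the scale of the data.

```sas
/* Sketch only: mydata and the grid values are hypothetical. */
proc nlmixed data=mydata;
   /* several candidate values per parameter define a grid of starting points */
   parms b0 = 0 1 2 3 4,
         b1 = -1 0 1,
         k  = 0.5 1 2;
   linp = b0 + b1*x;
   mu   = exp(linp);
   p    = 1/(1 + mu*k);
   model y ~ negbin(1/k, p);
run;
```

A coarse grid is usually enough, since the goal is only to find a starting point from which the optimizer can proceed.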

1/k and p are the parameters of the negative binomial distribution. The code you show is in the section of the NLMIXED documentation which shows the form of the negative binomial log likelihood function and how those parameters appear in it. The model in that code (linp) is a linear model on the log of the negative binomial mean, mu. The p parameter is related to mu and the dispersion parameter, k, as shown.
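For reference, the relationship between (1/k, p) and the mean can be written out as follows, using the standard negative binomial mass function with parameters n and p. The symbol n is introduced here only for the derivation.

```latex
\begin{aligned}
P(Y=y) &= \frac{\Gamma(y+n)}{\Gamma(n)\,y!}\,p^{n}(1-p)^{y},
  \qquad n=\frac{1}{k},\quad p=\frac{1}{1+k\mu},\\
E[Y] &= \frac{n(1-p)}{p} = \frac{1}{k}\cdot k\mu = \mu,\\
\operatorname{Var}[Y] &= \frac{n(1-p)}{p^{2}} = \mu(1+k\mu) = \mu + k\mu^{2}.
\end{aligned}
```

So with this choice of p, the mean is mu and the variance is quadratic in mu, which is the form usually called NB-2, with k acting as the overdispersion parameter.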

Rick_SAS · Posted 03-30-2018 04:06 PM

Since you are new to PROC NLMIXED, here are two elementary examples. Regarding ways to choose an initial guess for the parameters, see "The method of moments: A smart way to choose initial parameters for MLE"

Unay13 · Posted 03-30-2018 04:50 PM

For the following code, where TOT is one of the variables and not a data set I have defined, can you tell me why I get the warning below?

proc NLMIXED data =SPFU3ST;
parms k=0.8;
Y= 5*365*((MINAADT)**beta_1)* ((MAXAADT)**beta_2)*(EXP(beta_0));
model TOT ~ NEGBIN (1/k, Y);
predict TOT out = TOT;
run;

NOTE: The parameters beta_1, beta_2, beta_0 are assigned the default starting value of 1.0, because
they are not assigned initial values with the PARMS statement.
ERROR: No valid parameter points were found.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TOT may be incomplete. When this step was stopped there were 0
observations and 0 variables.
WARNING: Data set WORK.TOT was not replaced because this step was stopped.

Rick_SAS · Posted 03-31-2018 06:36 AM

I think the warning (and the error before it) is telling you that the model did not converge. Without convergence, there is no model that the procedure can score to produce predicted values. There are several reasons why a model might not converge, but the most common is that it does not fit the data.

StatDave · Posted 04-01-2018 11:50 AM

I assume the model you specify in the Y= statement is the model you want for the negative binomial mean (or maybe the log mean?), not the second parameter of the distribution. If so, then you probably want to have the P= statement like in the code you referred to earlier. Also, you don't want (or obviously, need) to predict the actual response, TOT. You presumably want the predicted response, mu. If that still causes fitting problems, then as I mentioned before you might have to try various starting values using the PARMS statement.

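proc nlmixed;
/* nonlinear model for the negative binomial mean */
mu = 5*365*((MINAADT)**beta_1)* ((MAXAADT)**beta_2)*(EXP(beta_0));
/* convert the mean and the dispersion k to the probability parameter */
p = 1/(1+mu*k);
model y ~ negbin(1/k,p);
/* save the predicted mean, not the observed response */
predict mu out=predmean;
run;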
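Adapted to the data set and variable names from the earlier log, the same idea might look like the sketch below. In that log the mean expression was passed as the second argument of NEGBIN, which must be a probability between 0 and 1. With the default starting values of 1.0 it evaluates to 5*365*MINAADT*MAXAADT*e, which is far outside that range and is a likely reason for "No valid parameter points were found". The starting values here are illustrative guesses only, and MINAADT and MAXAADT are assumed to be positive.

```sas
/* Sketch only: starting values are illustrative, not recommendations. */
proc nlmixed data=SPFU3ST;
   parms beta_0=-10 beta_1=0.5 beta_2=0.5 k=0.8;
   bounds k > 0;                        /* dispersion must stay positive */
   mu = 5*365 * (MINAADT**beta_1) * (MAXAADT**beta_2) * exp(beta_0);
   p  = 1/(1 + mu*k);                   /* probability parameter, between 0 and 1 */
   model TOT ~ negbin(1/k, p);          /* TOT is the observed count */
   predict mu out=predmean;             /* predicted mean for each observation */
run;
```

TOT appears only on the left side of the MODEL statement. The PREDICT statement scores mu, so the output data set holds the fitted means rather than a copy of the response.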

Unay13 · Posted 04-02-2018 09:48 AM

In the above case, TOT is my dependent variable so I was assuming I would have to specify that.

StatDave · Posted 04-02-2018 10:09 AM

No - even in the case of an ordinary regression as would be done in PROC REG you are modeling the mean of Y, not Y, and would look like this in NLMIXED:

proc nlmixed;

mu = b0 + b1*x;

model y ~ normal(mu, s);

run;

But maybe you can avoid NLMIXED altogether. If the model you show is for the mean of your response, Y, then if you use the usual log link for the negative binomial model, your model becomes:

log(mu) = log(5) + log(365) + b1*log(minaadt) + b2*log(maxaadt) + b0
= newb0 + b1*log(minaadt) + b2*log(maxaadt)

where newb0=b0+log(5)+log(365). This can be fit in a generalized linear modeling procedure like GLIMMIX or GENMOD:

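proc genmod;

/* log_minaadt = log(minaadt) and log_maxaadt = log(maxaadt), computed beforehand in a DATA step */
model y = log_minaadt log_maxaadt / dist=negbin;

run;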

which will provide estimates of newb0, b1, and b2.
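A fuller sketch of this route, using the data set and variable names from the earlier log, is shown here. The names spf_log, log_min, log_max, and log_exposure are made up for the example. The predictors enter on the log scale to match the equation for log(mu), and the constant log(5*365) is supplied as an offset so that the intercept estimates b0 directly instead of newb0.

```sas
/* Sketch only: spf_log, log_min, log_max, log_exposure are hypothetical names. */
data spf_log;
   set SPFU3ST;
   log_min      = log(MINAADT);      /* assumes MINAADT > 0 */
   log_max      = log(MAXAADT);      /* assumes MAXAADT > 0 */
   log_exposure = log(5*365);        /* constant term moved into an offset */
run;

proc genmod data=spf_log;
   model TOT = log_min log_max / dist=negbin link=log offset=log_exposure;
run;
```

Leaving out the OFFSET= option gives the same slopes, with the intercept then estimating newb0 = b0 + log(5) + log(365).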

Unay13 · Posted 04-02-2018 10:24 AM

I am sorry, but I do not have a statistics background and I am new to SAS.
The NB model I want to fit with the above function is for my dependent variable, TOT. So is this MODEL statement correct?

model TOT ~ negbin(1/k,p);

Is K the overdispersion parameter?

Is the default NEGBIN function for NB-1 or NB-2? NB-2, I suppose? What can I do to model TOT (the dependent variable) with MINAADT and MAXAADT as independent variables using the equation:
Y= 5*365*((MINAADT)**beta_1)* ((MAXAADT)**beta_2)*(EXP(beta_0));
