NLMIXED Convergence

sfleming · Posted 11-13-2009 05:59 PM

In the November 2009 issue of The American Statistician, there is an article called "Nonlinear Models for Longitudinal Data" by Jan Serroyen, Geert Molenberghs, Geert Verbeke, and Marie Davidian. They present some orange tree data obtained from Applied Regression Analysis (3rd ed.) by Draper and Smith (1998) p559. A DATA statement to read in the data follows.

data orange;
input time @@;
time = time / 100;
do tree = 1 to 5;
input y @@;
y = y / 100;
output;
end;
datalines;
118 30 33 30 32 30
484 58 69 51 62 49
664 87 111 75 112 81
1004 115 156 108 167 125
1231 120 172 115 179 142
1372 142 203 139 209 174
1582 145 203 140 214 177
;
run;

They then try to fit a model (#3 in paper) that uses a logistic function and looks like:

y_ij = (beta1 + b_i) / (1 + exp[-(time_ij - beta2) / beta3) + eta_ij

where b_i ~ N(0,d11) and eta_ij~N(0,sigma2)

They provide the following SAS code to fit this model.

proc nlmixed data=orange qpoints=20;
num = beta1 + b_i;
ex = exp(-(time - beta2)/beta3);
den = 1 + ex;
model y ~ normal(num/den,sigma2);
random b_i ~ normal(0,d11) subject=tree;
predict num/den out=pred;
run;

When I run this code I see the following in the log

NOTE: To assign starting values to parameters, use the PARMS statement. The default starting
value of 1.0 is in effect for all parameters.
NOTE: FCONV convergence criterion satisfied.
NOTE: At least one element of the (projected) gradient is greater than 1e-3.
WARNING: The final Hessian matrix is not positive definite, and therefore the estimated
covariance matrix is not full rank and may be unreliable. The variance of some
parameter estimates is zero or some parameters are linearly related to other
parameters.
NOTE: The data set WORK.PRED has 35 observations and 11 variables.
NOTE: PROCEDURE NLMIXED used (Total process time):
real time 0.59 seconds
cpu time 0.32 seconds

The parameter estimates are:

beta1 1.2557
beta2 1.6872
beta3 1.2839
sigma2 1.11E-12
d11 0.1986

which are no where near the authors estimates of

1.921
7.279
3.481
0.006
0.100

respectively. Even when I try to provide initial parameter estimate that are close these values

parms beta1=2 beta2=7 beta3=3 d11=0.1 sigma2=0.006;

The estimation fails with the following log output:

ERROR: QUANEW Optimization cannot be completed.
WARNING: Optimization routine cannot improve the function value.
NOTE: The data set WORK.PRED has 0 observations and 0 variables.
WARNING: Data set WORK.PRED was not replaced because new file is incomplete.
NOTE: PROCEDURE NLMIXED used (Total process time):
real time 0.56 seconds
cpu time 0.45 seconds

Can anyone spot any coding errors I may have made or any other ways the estimation process could be helped along. I would really like to be able to fit a logistic function in this manner.

Thanks!

Dale · Posted 11-16-2009 04:07 PM

Ah, I like a problem with a simple solution. Sort the data by TREE prior to invoking the NLMIXED procedure. You will then get the results reported by Serroyen et al. Below is a note from the NLMIXED RANDOM statement documentation which discusses this requirement:

NOTE: The input data set must be clustered according to the SUBJECT= variable. One easy way to accomplish this is to sort your data by the SUBJECT= variable prior to calling PROC NLMIXED. PROC NLMIXED does not sort the input data set for you; rather, it processes the data sequentially and considers an observation to be from a new subject whenever the value of its SUBJECT= changes from the previous observation.

NLMIXED Convergence

Re: NLMIXED Convergence