BookmarkSubscribeRSS Feed
sfleming
Calcite | Level 5
In the November 2009 issue of The American Statistician, there is an article called "Nonlinear Models for Longitudinal Data" by Jan Serroyen, Geert Molenberghs, Geert Verbeke, and Marie Davidian. They present some orange tree data obtained from Applied Regression Analysis (3rd ed.) by Draper and Smith (1998) p559. A DATA statement to read in the data follows.

data orange;
input time @@;
time = time / 100;
do tree = 1 to 5;
input y @@;
y = y / 100;
output;
end;
datalines;
118 30 33 30 32 30
484 58 69 51 62 49
664 87 111 75 112 81
1004 115 156 108 167 125
1231 120 172 115 179 142
1372 142 203 139 209 174
1582 145 203 140 214 177
;
run;

They then try to fit a model (#3 in paper) that uses a logistic function and looks like:

y_ij = (beta1 + b_i) / (1 + exp[-(time_ij - beta2) / beta3) + eta_ij

where b_i ~ N(0,d11) and eta_ij~N(0,sigma2)

They provide the following SAS code to fit this model.

proc nlmixed data=orange qpoints=20;
num = beta1 + b_i;
ex = exp(-(time - beta2)/beta3);
den = 1 + ex;
model y ~ normal(num/den,sigma2);
random b_i ~ normal(0,d11) subject=tree;
predict num/den out=pred;
run;

When I run this code I see the following in the log

NOTE: To assign starting values to parameters, use the PARMS statement. The default starting
value of 1.0 is in effect for all parameters.
NOTE: FCONV convergence criterion satisfied.
NOTE: At least one element of the (projected) gradient is greater than 1e-3.
WARNING: The final Hessian matrix is not positive definite, and therefore the estimated
covariance matrix is not full rank and may be unreliable. The variance of some
parameter estimates is zero or some parameters are linearly related to other
parameters.
NOTE: The data set WORK.PRED has 35 observations and 11 variables.
NOTE: PROCEDURE NLMIXED used (Total process time):
real time 0.59 seconds
cpu time 0.32 seconds

The parameter estimates are:

beta1 1.2557
beta2 1.6872
beta3 1.2839
sigma2 1.11E-12
d11 0.1986

which are no where near the authors estimates of

1.921
7.279
3.481
0.006
0.100

respectively. Even when I try to provide initial parameter estimate that are close these values

parms beta1=2 beta2=7 beta3=3 d11=0.1 sigma2=0.006;

The estimation fails with the following log output:

ERROR: QUANEW Optimization cannot be completed.
WARNING: Optimization routine cannot improve the function value.
NOTE: The data set WORK.PRED has 0 observations and 0 variables.
WARNING: Data set WORK.PRED was not replaced because new file is incomplete.
NOTE: PROCEDURE NLMIXED used (Total process time):
real time 0.56 seconds
cpu time 0.45 seconds

Can anyone spot any coding errors I may have made or any other ways the estimation process could be helped along. I would really like to be able to fit a logistic function in this manner.

Thanks!
1 REPLY 1
Dale
Pyrite | Level 9
Ah, I like a problem with a simple solution. Sort the data by TREE prior to invoking the NLMIXED procedure. You will then get the results reported by Serroyen et al. Below is a note from the NLMIXED RANDOM statement documentation which discusses this requirement:

NOTE: The input data set must be clustered according to the SUBJECT= variable. One easy way to accomplish this is to sort your data by the SUBJECT= variable prior to calling PROC NLMIXED. PROC NLMIXED does not sort the input data set for you; rather, it processes the data sequentially and considers an observation to be from a new subject whenever the value of its SUBJECT= changes from the previous observation.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 3286 views
  • 0 likes
  • 2 in conversation