BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Jean-Jacques
Obsidian | Level 7

I am trying to fit a simple exponential decay model to some plant growth data, where exposure to a chemical affects biomass adversely.
I have several years of data. Each year is considered a random realization, and I want the common fixed model.

I am using this program:
proc nlmixed data=growth_response method=FIRO;
parms b1=22 to 40 by 2 b2=0.001 to 0.01 by 0.001  u=-10 to 10 by 1 s2=1 s2u=1;
model biomass ~ normal((b1+u)*exp(-b2*exposure),s2);
random u ~ normal(0,s2u) subject=year;
run;

 

I expect to see b1, b2, s2 and s2u in the table of parameter estimates, and no estimate of u. But instead I get estimates of b1, b2, u, s2 and s2u.
Can someone help understand why this is occurring and maybe suggest an alternate parameterization ?

 

If it can help understand the issue, consider this example from the NLMIXED documentation, which of course estimates a completely different nonlinear model which would not be appropriate for my situation, but demonstrates how no estimate of the random effect u1 gets outputted into the table of parameter estimates.

proc nlmixed data=tree;
parms b1=190 b2=700 b3=350 s2u=1000 s2e=60;
num = b1+u1;
ex = exp(-(day-b2)/b3);
den = 1 + ex;
model y ~ normal(num/den,s2e);
random u1 ~ normal(0,s2u) subject=tree;
run;
:

...which produces estimates of b1, b2, b3, s2e and s2u, but none of u1

 

1 ACCEPTED SOLUTION

Accepted Solutions
StatDave
SAS Super FREQ

You wouldn't need to log transform the response; just use the log link so that the response is still treated as normally distributed, but the specified model is on the log of the response mean. If this is repeated measures data (plants measured repeatedly over time) and you want to estimate the population-level parameters b1 and b2, then the easiest approach would be to fit a GEE model. Given the log-linear model as I wrote it earlier, the parameter estimate for exposure in the results is b2 directly, but the  exponentiated intercept is the estimate of b1. You could specify a different structure of the within-plant correlations by specifying the TYPE= option in the REPEATED statement.

proc gee;
class plant;
model biomass=exposure / link=log;
repeated subject=plant;
estimate 'b1' intercept 1 / exp;
run;

View solution in original post

7 REPLIES 7
PGStats
Opal | Level 21

I don't know much about proc NLMIXED, but I would try removing u from the list of parameters (in the PARMS statement).

PG
Jean-Jacques
Obsidian | Level 7

I certainly never thought it could be that simple! Thank you!

The parms statement is there to provide starting values, and I didn't think it might also function as a request to output estimates of the parameters it includes.
Not including u in the parms statement succeeds in suppressing the output of the estimate of u, but starting values are needed for my estimation to converge. 

Therefore, I  would feel better if someone with NLMIXED knowledge could confirm that whether or not the parms statement includes a given parameter, there is no underlying difference in the estimation, besides the parms statement providing the starting values.

StatDave
SAS Super FREQ

One side thought you might consider ... since the response mean is modeled as the expression you show, if you write the model on the log mean, it is essentially a linear model with random intercepts: log(mean(biomass)) = log(b1+u)-b2*exposure. You could do that by fitting the model in GLIMMIX, similar to the Getting Started example, but with DIST=NORMAL and LINK=LOG.

Jean-Jacques
Obsidian | Level 7

I have definitely considered just using a linearization (in log biomass), and used PROC MIXED, which went a lot better. 
am worried about the backtransformation, especially of the confidence envelope.
If you could suggest a good approach to that issue, I would love to consider it.


As for including u in the parms statement, I just tested adding u=1 in the statement, which I think is supposed to be the SAS default when a parameter starting value is not provided. I got markedly different estimate from simply not including it. 

StatDave
SAS Super FREQ

You wouldn't need to log transform the response; just use the log link so that the response is still treated as normally distributed, but the specified model is on the log of the response mean. If this is repeated measures data (plants measured repeatedly over time) and you want to estimate the population-level parameters b1 and b2, then the easiest approach would be to fit a GEE model. Given the log-linear model as I wrote it earlier, the parameter estimate for exposure in the results is b2 directly, but the  exponentiated intercept is the estimate of b1. You could specify a different structure of the within-plant correlations by specifying the TYPE= option in the REPEATED statement.

proc gee;
class plant;
model biomass=exposure / link=log;
repeated subject=plant;
estimate 'b1' intercept 1 / exp;
run;
StatsMan
SAS Super FREQ

U is not a parameter in your model. S2U is. U is the realization of the random effect from the distribution of U you have defined with your RANDOM statement. No starting values are needed for U. You do need a starting value for its variance, though ... S2U. 

Jean-Jacques
Obsidian | Level 7

Combining PGStats and StatsMan answers gives a full answer to my immediate question, but Stats Dave provided a great answer to my general underlying need.
I did not realize that GLIMMIX and GEE had the link function capability Stats Dave pointed out, and I also appreciate the suggestion that maybe a repeated measure approach would be appropriate.

Since only one response gets to be marked as the solution, I will go with StatsDave, but readers who want to know why PROC NLMIXED generated an estimate of u, and why I shouldn't even have included u in the parms statement will want to look at PGStats and StatsMan answers.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1082 views
  • 7 likes
  • 4 in conversation