BookmarkSubscribeRSS Feed
LarryL
Calcite | Level 5

I am trying to run a proc nlmixed program, to fit a sigmoidal type curve (abscissa = time, ordinate = concentration) to

data for many patients (subjects).  I am initially applying the program to simulated data.   The data at t=0 has a magnitude

of 0 (minconc below), and rises to a magnitude of 100 (maxconc), with the midpt at t=18 data (but has noise added to it).It looks like (it is sorted by simnum and then by patient):

 

simnum patient time igfbp ir ifgbp_shift ir_shift

1 1 0.000000 3.157332 4.392493 0.000000 2.000000
1 1 1.000000 3.323757 -10.545450 0.000000 2.000000
1 1 2.000000 -2.368074 -1.448547 0.000000 2.000000
1 1 3.000000 3.133494 -3.696192 0.000000 2.000000

...

 

 

The program (attached and below) is essentially fitting a simoid over time to the igfbp value for each patient:

proc nlmixed data=simsig cov corr ecov ecorr method=firo tech=congra maxit=1000;
parms midpt = 18;
random maxconc minconc steep tshift ~ normal([100,0,6,2],[25,0,25,0,0,1,0,0,0,2]) subject=patient out=igfbp_random_effects;

if (time-tshift) <= 0 then do;
predconc = minconc;
end;
else do;
predconc = minconc + (maxconc-minconc)*((time-tshift)**steep)/((time-tshift)**steep + midpt**steep);
end;

model igfbp ~ normal(predconc,100);
by simnum;
predict predconc out=predicted_igfbp_concs;
run;

 

 

The program seems to run.  Then,

for each patient I get out the instances of the random variables (ie, maxconc, minconc, steep, and tshift). 

So for the first patient (and first simulation) and random variables maxconc, minconc, steep, and tshift I get:

 

"simnum" "patient" "Random Effect" "Empirical Bayes Estimate" "Standard Error of Prediction" "Degrees of Freedom" "t Value" "Pr > |t|" "Alpha" "Lower Confidence Limit" "Upper Confidence Limit"
1 1 maxconc 0.2452964704 2.0463602486 26 0.119869642 0.9055078792 0.05 -3.961057263 4.4516502035

1 1 minconc -0.049219837 2.4988920525 26 -0.019696664 0.9844357323 0.05 -5.185766015 5.0873263414
1 1 steep 0.6631004289 0.6859789874 26 0.9666483101 0.3426260427 0.05 -0.746949574 2.0731504317
1 1 tshift 0.0960816859 0.4931441808 26 0.1948348772 0.8470366982 0.05 -0.917590695 1.1097540671

 

It also estimates a fixed effect "midpt" of ~15, which is in the ball park of what I used in my simulation (18).

 

What bothers me the most are the values it is picking for the random effects (and the poor fit). Assuming I am interpreting the output correctly, the proc is 1. producing "fits" for each patient which are really terrible 2. don't seem like they are really coming from the distribution I think I am specifying. for example: it seems to be saying that maxconc was 0.24 for patient 1 in simnum 1. and that this value was fairly likely to be pulled from its normal distribution (t value = 0.11, Pr > |t| = .9). But I THINK that I am specifying in the sas program that it should be pulled from a normal distribution with mean 100 and variance 25. So I must not really be specifying what I think I am specifying?

  

I am not as troubled by the bad fit to the data per se.  Since bad fits can happen for various reasons.  But the fact that

it seems to be "happy" generating random instances which I think don't make sense points to a bigger problem with my understanding of what is going on.Smiley Sad

 

thanks for any enlightenment which anyone can provide ...

 

6 REPLIES 6
SteveDenham
Jade | Level 19

Rather than considering the parameters of the sigmoid to be random factors, can you consider them to be fixed, with random errors?  See for example, Example 82.1 One-Compartment Model with Pharmacokinetic Data (SAS/STAT 14.1 documentation), where b1 and b2 are random effects added on to the fixed betas.  I think this will help you avoid much of the problem.

 

The other thing to check is that it looks like your simulated data may have pathological values.  If the minimum is 0, it strikes me as odd to see values less than zero as a response for igfbp.  Also, watch out for scaling problems over time.  if the midpt is at 18, then the last value is likely 36, so that (time-tshift)**steep is 34**6 = 1544804416 , such that you may run into fit problems.

 

Steve Denham

LarryL
Calcite | Level 5

thanks for the quick response and info Steve. 

 

I will take a look at that model, but I am thinking that each subject has different kinetics so I thought my model

made sanse. 

 

As to the negative values, that is because of noise.  I think I am specifying  a model that has

Gaussian noise  (model igfbp = normal(predconc,100)). So even though minconc = 0 (and hence predconc is

also ~0 for small time), noise with a sd of 10

is added to that, hence the possiblity of negative values.

 

As to your midpt concern - I am not sure

I understand what you are getting at.  yes, that value (34**6) can be very large, but I have that

same number in the numerator and denominator, so as time gets larger the ratio should go to 1 and the

entire formula thus approaches maxconc.  Are you thinking there might be numerical issues within

SAS, given how it might do the fit, dividing one large number by the other?

 

do you have any thoughts on my biggest concern - ie, why does SAS seem to be pulling my random

variables (e, maxconc) from a Gaussian with mean zero and variance 1 (?) when

I think I am specifying they

(ie, maxconc) should come from a Gaussian with mean 100 and variance 25?

 

SteveDenham
Jade | Level 19

Gonna jump around some.  I realize that you have the large exponent in both the numerator and denominator, so things should work out, but save yourself some grief and only calculate it once, and then pass that to both the numerator and denominator.

 

 

factor=(time - tshift)**steep;

 

predconc = minconc + (maxconc - minconc)*(factor / (factor + midpt**steep));

 

I still think you should parameterize each of these as fixed effects with a random error (normal with mean=0, and variance=what you have now). 

 

As far as why it seems to be drawing from a N(0,1), I would guess that the solution after optimization is somewhere near that.

 

Steve Denham

LarryL
Calcite | Level 5

you say: "As far as why it seems to be drawing from a N(0,1), I would guess that the solution after optimization is somewhere near that".   I thought that it should NOT be allowed to find the "best" Normal distribution for my random variables.  I THOUGHT that I was specifying the exact distributions it must use when I say:

random maxconc minconc steep tshift ~ normal([100,0,6,2],[25,0,25,0,0,1,0,0,0,2]) subject=patient

 Doesn't that FORCE it to use a N(mean=100, var=25) for maxconc?  And that all it does, for these four random variables, is decide the most likely instances of those variables for each subject?  (it IS finding the best value for "midpt").

And it COULD pick a maxconc ~0,  but it should then

at least report that the chosen value is very improbable (ie, the t value and prob reported should reflect that).  THAT is what

is confusing me the most - it seems like it does not think it is pulling maxconc from a N(100,25) distribution. 

 

It reports for maxconc for patient 1:

simnum" "patient" "Random Effect" "Empirical Bayes Estimate" "Standard Error of Prediction" "Degrees of Freedom" "t Value" "Pr > |t|" "Alpha" "Lower Confidence Limit" "Upper Confidence Limit"
1 1 maxconc 0.2452964704 2.0463602486 26 0.119869642 0.9055078792 0.05 -3.961057263 4.4516502035

 

so it is saying the instance of maxconc has a value of .24,  and this has a t value of 2.04 of being pulled from its

distribution,  and a value this size (or larger)  would happen  11.9% of the time - ie, is fairly likely.  At least that is the

way I am interpreting the output.  What am I missing or wrong about?

SteveDenham
Jade | Level 19

I don't think any forcing is done.  I think those values serve as initial estimates.  To fix the values, you would have to use the BOUNDS statement appropriately.  You can get a start by looking at Example 82.6 Simulated Nested Linear Random-Effects Model (SAS/STAT14.1 documentation), where variance components are bounded to be nonnegative.  If you want fixed values, you would have to bound each parameter above and below.  (NOTE: I may be completely out of it here, as the documentation for the distribution options, at least for the MODEL statement sure look like what you are expecting).

 

If I get some time next week, I will explore this further, using the example datasets and programs.

 

Steve Denham

lvm
Rhodochrosite | Level 12 lvm
Rhodochrosite | Level 12

It really does not make sense to treat the parameters as random effects. You will get extreme shrinkage to the mean, with little relationship with the predictor variable. You can have random variation around the fixed-effect parameters. This is called a random-coefficient model. See the Getting Started example in the User's Guide for a simple case of this where the asymptote has an additional random effect. This approach will be much more productive.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1727 views
  • 1 like
  • 3 in conversation