06-16-2015 12:17 PM
I am running a simulation using LME models and am trying to avoid loops in my code, so I want to use the BY statement to speed things up. The problem I'm facing is that I need to run some ESTIMATE statements which are data dependent. Is there any way I can dynamically tell PROC MIXED the inputs for the ESTIMATE statement when there is a BY statement?
06-16-2015 03:11 PM
Can you give an example of "data dependent"?
Do you mean that the estimate statements are model (or parameter) dependent? For example, if the linear model is
Y = 1 + 2*x1 + gamma*x2;
you could be using GAMMA on the by statement to analyze many models such as GAMMA=0 to 1 by 0.1.
I can see that you might want to include "current value of gamma" in the ESTIMATE statement. I don't think this is possible, but the GLM experts might know a way. I think you would have to include 11 ESTIMATE statements, one for each value of GAMMA, and then use post-processing to pick off the results that apply to each GAMMA value.
06-16-2015 09:30 PM
Thank you for your reply. I'm looking at drug responses (y) as a function of plasma concentrations (c) using a random intercepts model. There are multiple doses given to these subjects and I'm calculating the maximum concentration from each simulated dataset and estimating what the drug effect is at each dose level.
Here's a simplified version of what it looks like with the loops:
%do sim=1 %to &N;
proc mixed data=dsn_&sim;
estimate "dose 1" int 1 c &&dose1_&sim;
estimate "dose 2" int 1 c &&dose2_&sim;
If I were to use a BY statement, I'm not sure if there is any way of telling PROC MIXED what the current BY iteration number is so that I can pull out the macro variables associated with each simulated dataset. If I'm running a large number of simulations, your suggestion of calculating all the possible ESTIMATE statements would end up being too inefficient.
An alternative I have tried is to assume that the inputs for the maximum concentrations are fixed rather than random, and it is way faster than looping. I've actually finished doing all the runs I need using the loops but am interested to see if I can improve on the code. Thanks!
06-17-2015 01:55 PM
I'm confused by your %DO loop. Is N the total number of simulation studies, or the number of samples in a single simulation study?
If N is the number of simulation study parameters, then this is a linear model, so you can rescale the data and always use
ESTIMATE "max conc" intercept 1 c 1;
If N is the number of samples, I don't understand why you are conducting different hypothesis tests for each sample. Each sample is supposed to be drawn from the same population.
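To spell out the rescaling idea above, here is a short derivation. It assumes a random-intercept model with a single maximum concentration per simulated dataset; the symbols (s for the simulation index, M_s for that dataset's maximum concentration, b_i for the subject effect) are my notation, not anything from the original posts.

```latex
% Assumed model for simulated dataset s (subject i, observation j):
\[
  y_{ij} = \beta_0 + b_i + \beta_1 c_{ij} + \varepsilon_{ij}
\]
% Rescale the concentration by that dataset's maximum, c^*_{ij} = c_{ij} / M_s:
\[
  y_{ij} = \beta_0 + b_i + (\beta_1 M_s)\, c^*_{ij} + \varepsilon_{ij}
         = \beta_0 + b_i + \beta_1^{*} c^*_{ij} + \varepsilon_{ij},
  \qquad \beta_1^{*} = \beta_1 M_s
\]
% The mean response at c = M_s is then the same linear combination in every dataset:
\[
  \beta_0 + \beta_1 M_s = \beta_0 + \beta_1^{*} \cdot 1
\]
```

So the coefficients "intercept 1 c 1" on the rescaled variable target the response at the maximum concentration regardless of M_s. Note that this standardizes only one concentration per dataset; with two dose levels, the second one would still need the coefficient M_2/M_1, which is data dependent.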
06-18-2015 10:24 AM
N is the total number of simulations run.
The test in the ESTIMATE statement is the response at the maximum drug concentration, which varies from sample to sample in real life. In a simulation we can input the actual maximum drug concentration from some model, but in practice it is estimated from the individual drug concentrations, so it's more realistic to treat the maximum concentration from a given sample as random.
06-18-2015 12:59 PM
I don't fully understand, but I'd try to put the maximum concentration (M) as a variable in the data, and then look at hypothesis tests that involve M-c or (M-c)/M. Try to standardize the problem so that you can use a single ESTIMATE statement.
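A rough sketch of what that could look like with a BY statement follows. All dataset and variable names here (all_sims, cmax, sim, subject, y, c, M, d, est_by_sim) are hypothetical, and it assumes one maximum concentration M > 0 per simulated dataset. With d = (M - c)/M, the value d = 0 corresponds to c = M, so the intercept of the refit model is the response at the maximum concentration and the ESTIMATE statement no longer depends on the data.

```sas
/* Hypothetical inputs:
   all_sims: stacked simulated data with variables sim, subject, y, c
   cmax:     one row per sim holding that sample's maximum concentration M
   Both are assumed to be sorted by sim. */
data scaled;
  merge all_sims cmax;
  by sim;
  d = (M - c) / M;   /* d = 0 when c equals the maximum concentration */
run;

proc mixed data=scaled;
  by sim;
  class subject;
  model y = d / solution;
  random intercept / subject=subject;
  /* Same coefficients for every BY group: response at c = M */
  estimate "response at Cmax" intercept 1;
  ods output Estimates=est_by_sim;
run;
```

The est_by_sim dataset would then hold one estimate per value of sim. Handling a second dose level in the same fit would need either a second rescaled variable and a separate PROC MIXED call, or some other standardization, which I have not worked out here.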