Hi, I am trying to understand what adjust=simulate does (within the lsmestimate statement in proc mixed and other procs, SAS 9.4). I am comfortable with what it is doing at a high level, but I cannot get my head around how it does it. I have read the reference below and the online help in SAS (also below), but I am finding it hard to grasp the principle of how the adjustment is done. I believe simulations are done, the specified contrasts are calculated for each simulation, and a p-value is obtained from a multivariate t, but I'm unclear on how the simulations are generated and how the adjusted p-value is obtained. A simple explanation or even examples would be great. I have emailed SAS but they have no further info they can provide to me. Thank you.

http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_glm_details2...



It's basically a multivariate generalization of using simulation to estimate the coverage probability of a 1-D confidence interval.

Suppose you want to estimate ONE linear combination of the betas: c`*beta. You also want a CI. The estimate of c`*beta is normally distributed, so the CI will be of the form

c`*beta +/- delta * stderr(c`*beta).

The challenge is to choose delta. For simple estimates (like c=(1 0 0 .. 0)), you can choose delta to be a critical value of the t distribution: delta=t(1-alpha/2, df). The t distribution is used instead of the normal distribution because the standard error is itself estimated from a finite sample.
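To make the 1-D case concrete, here is a minimal sketch in Python (standard library only; it is not what SAS runs internally). It assumes a hypothetical error df of 10 and alpha of 0.05, draws t variates as a standard normal divided by sqrt(chi-square/df), and takes the empirical (1-alpha) quantile of |t| as delta. The seed and number of draws are arbitrary choices. The result should land near the tabulated two-sided critical value, which is about 2.23 for 10 df.

```python
import math
import random

rng = random.Random(12345)   # arbitrary seed, only for reproducibility
df = 10                      # hypothetical error degrees of freedom
alpha = 0.05
n_draws = 20000              # arbitrary number of simulated t values


def draw_t(df, rng):
    """One draw from a t distribution: Z / sqrt(chi-square(df) / df)."""
    z = rng.gauss(0.0, 1.0)
    # chi-square(df) is a Gamma with shape df/2 and scale 2
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)


# Empirical distribution of |t|, sorted so that quantiles can be read off
abs_t = sorted(abs(draw_t(df, rng)) for _ in range(n_draws))

# delta is the empirical (1 - alpha) quantile of |t|
delta_sim = abs_t[math.ceil((1 - alpha) * n_draws) - 1]
print("simulated delta:", round(delta_sim, 3))
```

The point is only that a quantile of simulated |t| values reproduces the usual critical value. The multivariate case does the same thing with max(|t1|, ..., |tk|) in place of |t|.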

Now suppose that you have k linear combinations and you want SIMULTANEOUS CIs. The k estimates are jointly MVN with mean and covariance that can be computed because you are assuming a GLM. So simulate a bunch of values from the appropriate MVN distribution (actually multivariate t) and then use quantiles of the simulated distribution of estimates to find the critical value (delta) that works simultaneously for all the estimates. Make a bunch of draws of the form (t1, t2, ..., tk) and for each draw compute the statistic max(|t1|, ..., |tk|). The collection of those max statistics has an empirical distribution. The Edwards/Berry paper says that you can choose delta to be the (1-alpha) quantile of that empirical distribution.
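The steps above can be sketched in Python (standard library only). This is a simplified illustration under stated assumptions, not SAS's actual implementation: it assumes k=2 contrasts whose estimates have a hypothetical correlation of 0.5, a hypothetical error df of 20, and a fixed number of draws (as I understand it, the procedure chooses its sample size from accuracy settings). Note that no datasets are simulated. Each draw is just one vector (t1, ..., tk) from a multivariate t with mean zero and the correlation matrix of the estimates. The `adjusted_p` helper reflects my reading of the documentation: the adjusted p-value for an observed t is the proportion of simulated maxima that exceed |t|.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L' = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_max_abs_t(corr, df, n_draws, rng):
    """Draw (t1..tk) from a multivariate t and keep max(|t1|, ..., |tk|)."""
    low = cholesky(corr)
    k = len(corr)
    maxima = []
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Correlated normals: z = L * e
        z = [sum(low[i][m] * e[m] for m in range(i + 1)) for i in range(k)]
        # One shared scale per draw, because all k estimates use the same
        # estimated error variance
        s = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
        maxima.append(max(abs(v) / s for v in z))
    return maxima


def adjusted_p(t_obs, max_draws):
    """Proportion of simulated maxima that exceed the observed |t|."""
    return sum(m > abs(t_obs) for m in max_draws) / len(max_draws)


rng = random.Random(2024)           # arbitrary seed
corr = [[1.0, 0.5], [0.5, 1.0]]     # hypothetical correlation of 2 estimates
df, alpha, n_draws = 20, 0.05, 20000

max_draws = sorted(simulate_max_abs_t(corr, df, n_draws, rng))
delta = max_draws[math.ceil((1 - alpha) * n_draws) - 1]

print("simultaneous critical value delta:", round(delta, 3))
print("adjusted p for observed t = 2.3:", adjusted_p(2.3, max_draws))
```

The simultaneous delta comes out larger than the single-contrast t critical value, which is exactly the price paid for covering all k estimates at once.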

* When you say "simulate a bunch of values from the appropriate MVN distribution (actually multivariate t)", do you mean: generate a lot of datasets, say 100 to keep things simple, from a multivariate t whose means match the means of your k contrasts/estimates and whose covariance matches the covariance of your k contrasts/estimates?

* Then you say "use quantiles of the simulated distribution of estimates to find the critical value (delta) that works simultaneously for all the estimates." My guess at what this means is: for each dataset you have simulated, calculate the value of each of the k contrasts, so if k=2 and there are 100 datasets, you have 100 values for each contrast. Then I'm not sure how you find the critical value. You mention later to use the (1-alpha) quantile. So if alpha is 5%, is the 95th ordered value from the 100 values I have for each contrast my critical value?

* Then "Make a bunch of draws of the form (t1, t2, ..., tk) and for each draw compute the statistic max(|t1|, ..., |tk|)." I'm lost here. Am I randomly selecting a value from the 100 values I have for each contrast, so in my case I would have two values, then selecting the max of these and building up a distribution of these maxima? Sorry, I'm not sure what happens now.

What is your statistical background and previous experience with simulation methods in statistics? From your questions, it sounds like you need to start with simpler situations to better understand the main ideas. Try reading about using simulation to estimate the power of a statistical test and Monte Carlo methods for contingency tables in SAS.

Also, what is the application here? Are you merely intellectually curious about the details of the algorithm? Or do you need to implement a similar algorithm in a different situation?
