Dennisky
Quartz | Level 8
Dear all, we recently conducted a repeated measures study. Its purpose is to analyze how a key postoperative indicator (a continuous variable) varies after pediatric cardiac surgery. We measured it three times: at 1 week, 1 month, and 6 months after the operation. The aim is to assess whether this indicator changes over time. Because the indicator does not follow a normal distribution, we carried out the repeated measures analysis with a generalized linear mixed model (GLMM). We now have a question: how do we calculate a sample size for a repeated measures study analyzed with a GLMM? Alternatively, could we use a linear mixed-effects model (LMM), and how would we calculate a sample size for an LMM? Thanks!
1 ACCEPTED SOLUTION

SteveDenham
Jade | Level 19

Think of it this way: for a standard sample size calculation, you need four values - a t value for the alpha level, a t value for the beta level (1-power), a measure of the variance and the difference you want to consider as "significant" (effect size). From those you can solve for a sample size, or for a known sample size, calculate the expected power.
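
To make the four ingredients concrete, here is a minimal Python sketch for the simplest case, a two-sided comparison of two group means. It is an illustration under stated assumptions, not a SAS procedure: it uses normal quantiles in place of the t values (a common large-sample approximation), and the function names `n_per_group` and `power_for_n` are made up for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means.

    Normal-approximation sketch: n = 2 * (z_alpha + z_beta)^2 * sd^2 / delta^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the alpha level
    z_beta = z.inv_cdf(power)            # critical value for beta = 1 - power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

def power_for_n(n, delta, sd, alpha=0.05):
    """Approximate power for a known per-group n (same assumptions)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(delta / (sd * sqrt(2 / n)) - z_alpha)

# Hypothetical inputs: detect a difference of 5 units when the SD is 10.
n = n_per_group(delta=5, sd=10)
print(n, round(power_for_n(n, delta=5, sd=10), 3))
```

The same four quantities appear in both directions: fix the power to solve for n, or fix n to get the expected power.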

 

For a two sample generalized model, you have to consider the dependence of the variance on the mean for almost every distribution, so it gets a bit harder. Still, you can calculate sample size for a given power, or power given a sample size. The %NLest and %NLmeans macros come in handy for this.
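
As a rough illustration of why the mean-variance dependence matters, the sketch below sizes a two-group comparison on the log scale using the delta method. It does not reproduce the %NLest or %NLmeans macros; the function names, the log link, and the example means and coefficient of variation are all assumptions chosen for the example.

```python
from math import ceil, log
from statistics import NormalDist

def n_per_group_log_link(mu1, mu2, var_fn, alpha=0.05, power=0.80):
    """Per-group n for comparing two means on the log scale.

    var_fn(mu) returns Var(Y) at mean mu, so the assumed distribution's
    mean-variance relationship enters the calculation explicitly.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Delta method: Var(log(ybar)) is roughly Var(Y) / (n * mu^2).
    unit_var = var_fn(mu1) / mu1 ** 2 + var_fn(mu2) / mu2 ** 2
    effect = log(mu2) - log(mu1)   # detectable difference on the log scale
    return ceil(z_sum ** 2 * unit_var / effect ** 2)

def poisson_var(mu):
    """Poisson: the variance equals the mean."""
    return mu

def gamma_var(mu, cv=0.5):
    """Gamma: the variance is (cv * mu)^2; cv = 0.5 is an assumed value."""
    return (cv * mu) ** 2

# Hypothetical means of 10 and 12 in the two groups.
print(n_per_group_log_link(10, 12, poisson_var))
print(n_per_group_log_link(10, 12, gamma_var))
```

With identical means, the two distributions give quite different sample sizes, which is the extra difficulty compared with the normal case.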

 

For a repeated measures linear mixed model, there are multiple variance measures, with associated covariances, that need to be taken into account, in addition to the t for alpha, t for beta and detectable difference. The first four sections of the chapter on power determination in SAS for Mixed Models address this approach, which happens to be the simplest I have seen for mixed models. Note that for generalized mixed models, this method remains valid, as it builds the mean-to-variance relationship in, but the detectable difference has to be expressed on the link scale, rather than the original scale, and this is a non-trivial exercise.
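
To show how the covariances enter, here is a small Python sketch for a single within-subject contrast across the three visits. It is not the F-based method from SAS for Mixed Models: it assumes a known compound-symmetry covariance matrix and a normal approximation, and the helper names and numeric inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def compound_symmetry(sd, rho, n_times):
    """Within-subject covariance matrix with equal variances and correlations."""
    return [[sd ** 2 if i == j else rho * sd ** 2 for j in range(n_times)]
            for i in range(n_times)]

def contrast_variance(c, cov):
    """c' * cov * c: variance of the contrast for a single subject."""
    k = len(c)
    return sum(c[i] * cov[i][j] * c[j] for i in range(k) for j in range(k))

def n_subjects(delta, c, cov, alpha=0.05, power=0.80):
    """Subjects needed to detect a contrast of size delta (normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(z_sum ** 2 * contrast_variance(c, cov) / delta ** 2)

# Contrast: 6 months minus 1 week, across visits (1 week, 1 month, 6 months).
c = [-1, 0, 1]
print(n_subjects(5, c, compound_symmetry(sd=10, rho=0.6, n_times=3)))
print(n_subjects(5, c, compound_symmetry(sd=10, rho=0.0, n_times=3)))
```

The stronger the within-subject correlation, the smaller the variance of the change and the fewer subjects are needed, which is why the covariance parameters cannot be ignored.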

 

Consequently, almost every sample size or power determination using a generalized linear mixed model for things like clinical trials uses simulation. This is a case where tools like IML (linked to R or not) or Viya (using R or Python) can use existing, publicly available packages to do the simulation. And if you are a much better programmer than I am, I am sure that a combination of IML and DATA step programming could be constructed as well.
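
The simulation idea can be sketched in plain Python as follows. This is only a skeleton under loud assumptions: gamma responses with a log link and a normal random subject intercept, and, because no GLMM fitter is available in the standard library, a stand-in analysis (a paired large-sample z-type test on the log responses, first versus last visit). In practice the `rejects` step would be replaced by fitting the planned GLMM. All names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def simulate_trial(n, log_means, sd_subject, shape, rng):
    """Simulate n subjects, one gamma response per visit."""
    data = []
    for _ in range(n):
        u = rng.gauss(0.0, sd_subject)       # random subject intercept (link scale)
        row = []
        for eta in log_means:
            mu = math.exp(eta + u)           # conditional mean on the data scale
            row.append(rng.gammavariate(shape, mu / shape))  # gamma with mean mu
        data.append(row)
    return data

def rejects(data, alpha):
    """Stand-in analysis: paired z-type test on log(last) - log(first)."""
    d = [math.log(row[-1]) - math.log(row[0]) for row in data]
    se = stdev(d) / math.sqrt(len(d))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(mean(d) / se) > z_crit

def simulated_power(n, log_means, sd_subject=0.5, shape=4.0,
                    alpha=0.05, n_sims=2000, seed=1):
    """Share of simulated studies in which the stand-in test rejects."""
    rng = random.Random(seed)
    hits = sum(rejects(simulate_trial(n, log_means, sd_subject, shape, rng), alpha)
               for _ in range(n_sims))
    return hits / n_sims

# Hypothetical visit means of 10, 11 and 13 on the original scale.
alt = [math.log(10), math.log(11), math.log(13)]
for n in (30, 60, 90):
    print(n, simulated_power(n, alt))
```

Running the same function with equal visit means gives the empirical type I error rate, which is a useful check that the analysis step behaves as intended before trusting the power estimates.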

 

One workaround would be to consider a single contrast (2 group design) for a narrow inference space (totally fixed effects, nothing repeated) analysis. This will give an estimate of sample size that is too small, as the variance for a broad inference space is generally greater than for a narrow inference space. However, that estimate can be multiplied by a rule-of-thumb factor (I use 2x) to get a conservative estimate of the sample size needed to detect the difference of interest. You could use this estimate in any costing algorithm dependent on sample size to get a maximum probable cost for a study to declare a difference significant at alpha = 0.05 at least 100(1 - beta) percent of the time.
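
The arithmetic of this workaround is simple enough to write down. In the sketch below, the narrow-inference sample size, the cost per subject, and the function names are all hypothetical; only the 2x inflation factor comes from the rule of thumb described above.

```python
from math import ceil

def conservative_n(n_narrow, inflation=2.0):
    """Inflate a narrow-inference (fixed-effects only) sample size."""
    return ceil(n_narrow * inflation)

def max_probable_cost(n_per_group, n_groups=2, cost_per_subject=1500.0,
                      fixed_cost=0.0):
    """Simple linear costing: fixed cost plus a per-subject cost (assumed)."""
    return fixed_cost + n_per_group * n_groups * cost_per_subject

n_narrow = 63                     # hypothetical result of a fixed-effects calculation
n_plan = conservative_n(n_narrow)
print(n_plan, max_probable_cost(n_plan))
```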

 

Steve Denham.

 


REPLIES
SteveDenham
Jade | Level 19

SAS for Mixed Models (3rd ed., Stroup et al.) gives a method for determining sample size for both LMM and GLMM, although doing the GLMM is a bit harder. The idea is to calculate the power for various sample sizes by fixing the denominator degrees of freedom, and then post-processing the resulting F values to get non-centrality estimates. There is also an extensive section on using simulation to calculate power. With power estimates at various points, you can then zoom in and get sample size estimates by re-estimation, or by examination of the power curve. The simulation method does require SAS/IML. There is also a list of references in this section, including some that deal specifically with repeated measures analysis.
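
The "compute power at several sample sizes, then zoom in" step can be sketched generically in Python. The power function used here is only a stand-in (a two-sample normal approximation with hypothetical inputs), not the non-centrality calculation from the book; any function that returns power for a given n, including a simulation-based one, could be plugged in. The search assumes power does not decrease as n grows.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n, delta=5.0, sd=10.0, alpha=0.05):
    """Stand-in power function: two-sample normal approximation."""
    z = NormalDist()
    return z.cdf(delta / (sd * sqrt(2.0 / n)) - z.inv_cdf(1 - alpha / 2))

def smallest_n(power_fn, target=0.80, grid=range(10, 201, 10)):
    """Scan a coarse grid, then step through the bracketing interval.

    Returns the smallest n on or above the first grid point that reaches
    the target power, or None if no grid point reaches it.
    """
    lower = None                      # largest grid point still below target
    for n in grid:
        if power_fn(n) >= target:
            start = grid[0] if lower is None else lower + 1
            for m in range(start, n + 1):   # zoom in one subject at a time
                if power_fn(m) >= target:
                    return m
        lower = n
    return None

# Coarse power curve, then the refined answer.
for n in range(20, 101, 20):
    print(n, round(approx_power(n), 3))
print(smallest_n(approx_power))
```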

 

SteveDenham

Dennisky
Quartz | Level 8

Thank you very much for providing us with very valuable advice and methods as always.

We are also studying this document, although it is difficult to understand.

Has this type of sample size calculation moved entirely to simulation-based power analysis? Are there few direct methods left for calculating the sample size?

 
 
 
 
Dennisky
Quartz | Level 8

Thank you very much for your guidance and advice once again.

This has allowed me to experience the power of SAS once more, particularly its rigorous mathematical system, as well as the expertise of top-level professionals in the SAS community such as yourself.

Honestly, we usually calculate sample sizes using PASS software, but many aspects of this software are not explained clearly, such as the mathematical formulas for complex analysis methods including repeated measures analysis, GLMM, GLM, GEE, mixed effects models, survival analysis, and so on.

Additionally, PASS software generally only provides calculation formulas for power, and hardly ever provides direct sample size calculation formulas. All of this makes it difficult for us to fully understand the process of sample size calculation, and we cannot be completely confident in the calculation results.

SAS is relatively clear in this aspect. We will study your content carefully, and every answer you give benefits us greatly.
