Calcite | Level 5

Hi Everyone,


I am currently learning sample size calculations using SAS and I came across this question:


Under ideal packaging conditions, the concentration of the active ingredient in a vacuum-packed dry powdered product should be independent of storage temperature and humidity at the time of packaging, and there should be no difference between initial and end-of-shelf-life concentrations. An experiment to study the active ingredient's concentration is planned using a 3x2x2 experiment design in temperature, humidity, and age, respectively. The variable levels are: 20°C, 25°C, and 30°C for temperature; 25 and 50 percent for humidity; and 0 and 6 months for age. Determine the number of replicates required to detect an effect size of 30 ppm with 90% power when the standard error of the model is expected to be 20 ppm.


I do not know how to approach this problem, since I do not have any data to use for the sample size calculation. I tried PROC GLMPOWER, but it requires a dataset as input. Is there a way to find the sample size with just the effect size and standard error?


Thanks in advance!


Ammonite | Level 13 VDD

This link may assist you in determining the sample size.


Rhodochrosite | Level 12 sld

The question that you cite is ill-posed and cannot be answered as written:  If you have a 3x2x2 factorial, you have MANY comparisons that might differ by at least 30 ppm. Which pair of the 3 levels of factor A (temperature) differ by at least 30 ppm? Do the levels of factor B (humidity) or C (age) differ by at least 30 ppm? Do comparisons within interactions differ by at least 30 ppm? Do ALL comparisons in the entire model need to differ by at least 30 ppm?
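
To make "MANY comparisons" concrete, here is a small counting sketch. It is in Python rather than SAS purely for illustration, and the variable names are my own; it just enumerates the cells of the 3x2x2 design and the pairwise comparisons the 30 ppm could refer to.

```python
from itertools import combinations, product

# Factor levels taken from the problem statement.
temperature = [20, 25, 30]  # degrees C
humidity = [25, 50]         # percent
age = [0, 6]                # months

# Every treatment combination (cell) in the 3x2x2 factorial.
cells = list(product(temperature, humidity, age))

# Pairwise comparisons among the levels of each main effect.
main_effect_pairs = {
    "temperature": list(combinations(temperature, 2)),
    "humidity": list(combinations(humidity, 2)),
    "age": list(combinations(age, 2)),
}

# Pairwise comparisons among individual cells.
cell_pairs = list(combinations(cells, 2))

print("cells:", len(cells))
for factor, pairs in main_effect_pairs.items():
    print(factor, "level pairs:", len(pairs))
print("cell-vs-cell pairs:", len(cell_pairs))
```

That is 12 cells, 5 main-effect level comparisons, and 66 cell-versus-cell comparisons, before even considering interaction contrasts. The question does not say which of these must differ by 30 ppm.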


Even if your statistical model is not mixed, you can use the GLIMMIX procedure to determine sample size in complex models, by using exemplary datasets. (GLMPOWER also uses exemplary datasets.) An exemplary dataset represents an alternative hypothesis for which you would like to assess power--it is what you think the mean structure of your data will look like, and it replaces actual data in the procedure. Once you understand the concept of exemplary data, you'll see why you do not need an actual set of data.
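
As a rough illustration of what an exemplary dataset looks like, here is a minimal sketch (in Python for illustration only; in practice you would build this in a DATA step). Everything about the means is an assumption on my part: the cell means are expressed as deviations from the nominal concentration, and the only conjectured effect is a 30 ppm drop at 6 months. The function and field names are hypothetical.

```python
from itertools import product

def exemplary_means(age_shift_ppm=-30.0):
    """Return one row per cell with a conjectured mean (NOT real data).

    Means are deviations from the nominal concentration, in ppm.
    Assumption: only age has an effect; temperature and humidity do not.
    """
    rows = []
    for temp_c, humidity_pct, age_months in product([20, 25, 30], [25, 50], [0, 6]):
        mean_ppm = age_shift_ppm if age_months == 6 else 0.0
        rows.append({
            "temp_c": temp_c,
            "humidity_pct": humidity_pct,
            "age_months": age_months,
            "mean_ppm": mean_ppm,
        })
    return rows

rows = exemplary_means()
for row in rows:
    print(row)
```

Each row is a cell of the design with the mean you expect under the alternative hypothesis. A different alternative (say, an age effect that grows with temperature) would simply be a different set of conjectured means.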


For a 3-way factorial, coming up with an exemplary dataset is a nontrivial problem requiring much thought and a good familiarity with the context of the study, because you have to envision the entirety of the 3-factor outcomes of interest (so, really a set of alternative hypotheses: effect of A and effect of B and effect of C, and interaction of A and B, etc.). See PROC GLIMMIX as a Teaching and Planning Tool for Experiment Design for an example of the process.


For your question, I might imagine that the 30 ppm applies to the difference in age. But what if the difference in age is not expected to be the same for all temperatures and humidities? Then you need to assess power for interactions and it gets more complicated.
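
If the 30 ppm really is the age main effect and there are no interactions, a back-of-the-envelope check is possible with the usual normal-approximation formula for comparing two means. This is only a sketch under assumptions that are mine, not the question's: a two-sided alpha of 0.05, the "standard error of the model" read as a residual standard deviation of 20 ppm, and a normal approximation that ignores the degrees of freedom spent estimating that standard deviation. It is in Python for illustration; it is not what PROC GLMPOWER or PROC GLIMMIX computes.

```python
import math
from statistics import NormalDist

def replicates_for_age_effect(delta=30.0, sigma=20.0, power=0.90,
                              alpha=0.05, cells_per_age_level=6):
    """Normal-approximation sample size for a two-level main effect.

    delta: difference between the two age means to detect (ppm)
    sigma: assumed residual standard deviation (ppm)
    cells_per_age_level: 3 temperatures x 2 humidities = 6 cells per age
    Returns (observations needed per age level, replicates per cell).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)
    n_per_level = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    replicates = math.ceil(n_per_level / cells_per_age_level)
    return n_per_level, replicates

n_per_level, replicates = replicates_for_age_effect()
print(f"observations per age level: {n_per_level:.2f}")
print(f"replicates per cell: {replicates}")
```

With these inputs the approximation asks for roughly 9.3 observations per age level, which rounds up to 2 replicates per cell (12 observations per age level, 24 runs in total). Treat that as a starting point only: an exact calculation based on the noncentral F distribution accounts for the error degrees of freedom and can ask for more, and none of this covers the interaction case.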


I hope this helps.


Calcite | Level 5

Thank you! This is really helpful! 


