Designing a Mixture Experiment with PROC OPTEX

In my previous post I discussed how to analyze a mixture experiment. But how do you choose which mixture combinations to run so that you can estimate the model you want to fit? This is where design of experiments helps. In this post I will discuss an approach for creating a designed experiment for a mixture situation.

So what is a mixture situation? A mixture situation is one in which the “independent variables,” or factors, are constrained to add to 100%. For example, suppose I want to make a mixed drink, a Harvey Wallbanger. This drink is made up of three ingredients: orange juice, vodka, and a liqueur. I could make the drink in a small glass or in a large barrel; in other words, the absolute amount of each ingredient does not matter. The only thing that matters is the proportion of each ingredient. So as one proportion goes up, at least one of the others must go down. This constraint of adding to 100% is what sets mixture experiments apart, and it causes some challenges in creating the design and in the analysis.

There are several different types of mixture designs: simplex-centroid, simplex-lattice, and extreme vertices are considered the “classic” mixture designs. However, these designs typically require an unconstrained design space or require too many runs to be of practical use. Therefore, optimal designs are typically used, and they are the focus of this post.

Optimal designs find the best combination of runs according to a given criterion. There are many choices of criterion, such as A-optimality, D-optimality, U-optimality, and S-optimality. Regardless of the criterion chosen, you must supply the form of the model on which the criterion is evaluated and the number of runs that you want in the experiment. For simplicity, I will focus only on D-optimal designs in this post, but you could use whichever optimality criterion you wish.

To get started, PROC OPTEX requires a list of candidate runs. These are possible runs to use in the final design. The algorithm will search through these candidate runs to find the best subset combination of runs to estimate the desired model. The approach is called the row-exchange algorithm since it will exchange one row of the candidate set for a row that is currently in the design. The algorithm continues in this fashion until the best set of runs is found.

As an example, suppose I wish to design an experiment for three mixture variables, A, B, and C. Component A can only be in the range 0.05 to 0.75, Component B has a range of 0.1 to 0.8, and Component C is in the range 0.15 to 0.6. To get started on creating the design, we need to form the candidate set. Typically, a fine grid of points is created that matches the criteria of the design space. This will provide a rich candidate set for PROC OPTEX to find a good design. You can make the grid as fine as you would like, but realize that the finer the grid is, the larger the candidate set, and the longer it may take for PROC OPTEX to run.

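data candidates;
   do a=0.05 to 0.75 by 0.05;      /* allowed range of component A */
      do b=0.1 to 0.8 by 0.05;     /* allowed range of component B */
         c=1-a-b;                  /* forces a+b+c=1 */
         if (0.15 <= c <= 0.6) then output;   /* keep only points within the range of C */
      end;
   end;
run;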

This code creates a good candidate set. Notice that by forcing C=1-A-B, we are guaranteed to get only valid mixtures. However, we also need to check the range of component C before outputting the run so that its constraints are met as well. If our constraints are well-defined, this step may not be necessary, but it is a quick and easy way to ensure that the constraints are consistent. For this situation there are 97 runs in our candidate set.
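One thing to watch with a grid built from fractional DO-loop increments is floating-point rounding: values such as 0.05 are not stored exactly, so a point that sits right on a boundary of C (c=0.15 or c=0.6) can fail the range check. The sketch below is one way to guard against that, assuming it is acceptable to loop over integer steps of 1/20 and do the range check on integers. The data set name candidates_int and the variables i, j, and k are made up for this illustration.

```sas
/* Sketch: build the grid on integer steps of 0.05 (1/20) so that the
   range check on C is an exact integer comparison.
   candidates_int, i, j, and k are hypothetical names. */
data candidates_int;
   do i=1 to 15;                    /* a = 0.05 to 0.75 */
      do j=2 to 16;                 /* b = 0.10 to 0.80 */
         k=20-i-j;                  /* c in twentieths, since a+b+c=1 */
         if 3 <= k <= 12 then do;   /* 0.15 <= c <= 0.60 */
            a=i/20;
            b=j/20;
            c=k/20;
            output;
         end;
      end;
   end;
   drop i j k;
run;

/* Count the candidate points that were kept */
proc sql;
   select count(*) as n_candidates from candidates_int;
quit;
```

In exact arithmetic this grid has 105 points, so a smaller count from the fractional-step loop would suggest that some boundary points were lost to rounding. A design selected from this candidate set may therefore differ from the one shown in this post.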

With the candidate set created, we can now turn our attention to PROC OPTEX. In order to create a design, we need to specify the model that we wish to estimate. This is where knowledge of how to fit a mixture model is needed. For this situation, suppose that we wish to fit the special cubic model, meaning that our model should be Y=b1*A+b2*B+b3*C+b12*A*B+b13*A*C+b23*B*C+b123*A*B*C (let's call this the NOINT model). As discussed in my last post, this model is not really a no-intercept model, and another way to fit the model is to remove one of the main effects. So this model:

Y=b0+b1*A+b2*B+b12*A*B+b13*A*C+b23*B*C+b123*A*B*C (let's call this one the NOME model, for no main effect), which has an intercept, is equivalent to the Scheffé mixture model. The intercept is the estimate for component C.
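The equivalence can be seen by substituting C=1-A-B into the linear C term of the NOINT model and leaving the cross-product terms alone. In the short derivation below, the unprimed b's are the NOINT coefficients, and the primed symbols are introduced here only to mark the re-expressed coefficients on A and B in the NOME form.

```latex
\begin{aligned}
Y &= b_1 A + b_2 B + b_3 C + b_{12}AB + b_{13}AC + b_{23}BC + b_{123}ABC \\
  &= b_1 A + b_2 B + b_3 (1 - A - B) + b_{12}AB + b_{13}AC + b_{23}BC + b_{123}ABC \\
  &= \underbrace{b_3}_{b_0} + \underbrace{(b_1 - b_3)}_{b_1'} A + \underbrace{(b_2 - b_3)}_{b_2'} B
     + b_{12}AB + b_{13}AC + b_{23}BC + b_{123}ABC
\end{aligned}
```

So the intercept of the NOME model equals the NOINT coefficient for component C, while the NOME coefficients on A and B are differences relative to C rather than the NOINT values themselves. The interaction coefficients are unchanged.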

proc optex data=candidates seed=27513 coding=none;
   model a b a*b a*c b*c a*b*c;
   generate n=12 criterion=D;
   output out=dsgn;
run;

Two options were used in the PROC OPTEX statement. The SEED= option makes the design selection reproducible: running the code again with this seed will produce the same design. The CODING=NONE option tells SAS to create the design without applying any coding to the model effects. In many situations you would want to code the effects to make them orthogonal and on the same scale. However, for a mixture situation this coding is not typically used. You could still perform the coding, but the results are not going to be the same as what you would obtain when fitting the model.

The PROC OPTEX step includes a MODEL statement, which specifies the intercept (NOME) model that was discussed earlier. The GENERATE statement requests a design with only 12 runs, chosen by the D-optimality criterion. Finally, the OUTPUT statement names the data set that will contain the final design.


PROC OPTEX shows the ranges for each of the components as well as the ten designs that were considered by the procedure. Considering 10 designs is the default for PROC OPTEX, but you could specify more (or fewer) by using the ITER=n option in the GENERATE statement.
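As a sketch of that option, the step below repeats the same search with 50 tries instead of 10. It assumes that the candidates data set from the earlier DATA step already exists; the output data set name dsgn_iter50 is made up for this illustration.

```sas
/* Sketch: same model and criterion as before, but 50 search tries.
   Assumes WORK.CANDIDATES exists; dsgn_iter50 is a hypothetical name. */
proc optex data=candidates seed=27513 coding=none;
   model a b a*b a*c b*c a*b*c;
   generate n=12 criterion=D iter=50;
   output out=dsgn_iter50;
run;
```

More tries give the search more chances to find a better design, at the cost of a longer run time.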

The output from PROC OPTEX can be seen with a PROC PRINT statement.
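A minimal version of that step, assuming the dsgn data set was created by the OUTPUT statement above:

```sas
/* List the 12 selected runs; assumes WORK.DSGN exists */
proc print data=dsgn;
run;
```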

Or graphically:
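One simple way to draw such a picture is a scatter plot of the selected runs in the A-B plane, since C is determined by 1-A-B. This is only a sketch and assumes that the dsgn data set from above exists; the axis labels and marker settings are choices made for this illustration.

```sas
/* Sketch: plot the selected design points in the A-B plane.
   Assumes WORK.DSGN exists; replicated runs overlap as a single marker. */
proc sgplot data=dsgn;
   scatter x=a y=b / markerattrs=(symbol=CircleFilled size=10);
   xaxis label="Component A" min=0 max=1;
   yaxis label="Component B" min=0 max=1;
run;
```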

Looking at this design, you can see that it seems to cover the design space well. Although only 9 points appear on the graph, the printed design shows why: observations 3 and 4 are the same conditions, as are observations 8 and 9 and observations 11 and 12.
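Replicated runs are easy to miss in a listing, so a quick tabulation can help. The query below is a sketch that assumes the dsgn data set from above; the column name n_runs is made up for this illustration. Grouping on A and B is enough because C follows from 1-A-B.

```sas
/* Sketch: count how many runs fall on each distinct design point.
   Assumes WORK.DSGN exists; n_runs is a hypothetical column name. */
proc sql;
   select a, b, count(*) as n_runs
      from dsgn
      group by a, b
      order by a, b;
quit;
```

Rows with n_runs greater than 1 are the replicated conditions.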

This approach to creating the design makes sense, but there are other ways to consider a mixture model. What if we want to use the NOINT model mentioned above? All of the code would look the same except for the MODEL statement in PROC OPTEX.

proc optex data=candidates seed=27513 coding=none;
   model a b c a*b a*c b*c a*b*c / noint;
   generate n=12 criterion=D;
   output out=dsgn2;
run;

Alternatively you could also specify the model as

MODEL A|B|C / NOINT;

Running this code, printing out the design, and displaying it graphically will show that this model will give you the exact same design as the NOME model.
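One way to check this without comparing listings by eye is to sort both designs and compare them. This is a sketch that assumes dsgn and dsgn2 were created by the two PROC OPTEX steps above; the names dsgn_sorted and dsgn2_sorted are made up for this illustration.

```sas
/* Sketch: put both designs in the same order, then compare them.
   Assumes WORK.DSGN and WORK.DSGN2 exist; the sorted names are hypothetical. */
proc sort data=dsgn  out=dsgn_sorted;
   by a b;
run;

proc sort data=dsgn2 out=dsgn2_sorted;
   by a b;
run;

proc compare base=dsgn_sorted compare=dsgn2_sorted;
   var a b;   /* c is determined by 1-a-b */
run;
```

If the two designs contain the same runs, PROC COMPARE should report no unequal values for A and B.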

Regardless of how you wish to specify your mixture model, PROC OPTEX will find an appropriate mixture design given your optimality criterion and the number of runs.
