Hi all,
I have a question about measures of efficiency in choice designs: in the SAS manual for %choiceff, the examples report both D-Efficiency and Relative D-Efficiency. In those examples, D-efficiency simply equals the number of choice sets. In all other examples I have read, D-efficiency is on a scale of 0-100 where 100 is the orthogonal design. Is that what is known as "relative" D-efficiency? Why does %choiceff draw the distinction between these two measures?
In procedures such as Optex, what is being reported - the "relative" d-efficiency, or the d-efficiency as defined in %choiceff, or a measure that uses some other method of coding?
Thanks!
http://support.sas.com/techsup/technote/mr2010c.pdf
Everything I have ever known on this topic is in this pdf. I tried to make it a basic (free!) text on choice design in SAS. I would say that all efficiencies are relative. The range depends on the coding. For some simple (or toy) examples, I can make efficiency range from 0 to 100. Other times, when things are complicated, we do not know the range.
I saw your maxdiff question. I don't know how to answer it. I would just use the mktbibd macro to make a maxdiff design.
Hi Warren,
Thanks so much for the help! I've reviewed the PDF you sent and I think I can refine my questions.
The first is, do the D-efficiencies reported in SAS (proc optex, and results from %mktex, from %mktbibd, and the output from %choiceff) all use standardized orthogonal coding so that the D-efficiency is equal to
100 × 1/(N_D |(X'X)^(-1) |^(1⁄p) ) ?
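For reference, here is a minimal PROC IML sketch of the formula above, applied to a toy 2x2 full factorial. It is only an illustration of the arithmetic, not what any SAS procedure or macro runs internally. The module name deff and the two hand-coded matrices are made up for this example.

```sas
proc iml;
  /* D-efficiency = 100 / (N * |inv(X'X)|**(1/p)) for a coded design matrix X */
  /* deff is a hypothetical helper written for this example                   */
  start deff(x);
    n = nrow(x);   /* number of runs N                     */
    p = ncol(x);   /* number of parameters incl. intercept */
    return (100 / (n * det(inv(t(x) * x))##(1/p)));
  finish;

  /* Toy 2x2 full factorial; columns: intercept, factor A, factor B */
  x_orth  = {1 -1 -1,
             1 -1  1,
             1  1 -1,
             1  1  1};   /* orthogonal (+1/-1) coding */
  x_dummy = {1 0 0,
             1 0 1,
             1 1 0,
             1 1 1};     /* 0/1 dummy coding of the same four runs */

  e_orth  = deff(x_orth);
  e_dummy = deff(x_dummy);
  print e_orth e_dummy;
quit;
```

With the +1/-1 coding, X'X is 4 times the identity, so the formula gives 100. The same four runs under 0/1 coding give a value of about 40, which shows that the number depends on the coding and not only on the design.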
And the second is, what is the standard way of generating a d-efficient experimental design for the maxdiff profile case (case 2)? It seems that %mktbibd does not have a way to include factor levels (being more suited to case 1); and %choiceff seems designed for discrete choice and/or the multi-profile case (case 3)?
Thanks!
No. There is great flexibility in coding and efficiency calculations, just as there is great flexibility in most everything else in SAS. In materials I write, I try to do things so that we get a 0 to 100 efficiency when possible, but it is not always possible.
Use MktBIBD for maxdiff. There are examples in the documentation that show how to use the design once it comes out as a matrix of integers.
BTW, I retired from SAS last year, so I don't have access to source code or anything else that makes it easy to reply. I cannot provide the level of support that I once did.
Hi, I understand, I'm not trying to be difficult, this is just very hard to wrap my mind around and I appreciate the help!
I don't think that BIBD are actually appropriate for the kind of experiment I'm thinking of. The goal is to have respondents evaluate one profile of factors at a time and to select the best and worst of the factor levels in that specific profile. Most of the literature I have seen mentions fractional factorial designs that are blocked in order to reduce respondent fatigue. I've tried to do this a few ways.
First, using the %mktex macro to create a candidate set and then the %choiceff macro to search the candidate set for an efficient design, following the chair example in the Discrete Choice chapter. This seems ideal because it optimizes the variance matrix for a logit model rather than for a linear model. However, in this example
%choiceff(data=design, /* candidate set of alternatives */
model=class(x1-x5 / sta), /* model with stdzd orthogonal coding */
nsets=6, /* number of choice sets */
maxiter=100, /* maximum number of designs to make */
seed=121, /* random number seed */
flags=3, /* 3 alternatives, generic candidates */
options=relative, /* display relative D-efficiency */
beta=zero) /* assumed beta vector, Ho: b=0 */
the argument "flags=3" specifies that each choice set has three alternatives, and "flags=1" is not allowed. I understand that you can block the resulting dataset using %mktblock; however, %mktblock requires the argument nalts>1 as well.
I have tried using proc optex directly, in which I first create a dataset that includes another column for the block number and use it as the input Candidate dataset, with orthogonal coding, and specifying a model that includes Block, e.g.
proc optex data=Candidates seed=8327 coding=orth;
class F1 F2 F3 Block;
model F1 F2 F3 Block;
generate n=25 method=fedorov;
output out=Design number=dbest;
run;
However, I'm not sure if this is doing anything different from what the ADX interface would do (which I have also tried). Finally, I'm not sure if the D-efficiency from each method of creating a design is comparable. Is there a way to use %choiceff and %mktblock to create the kind of design I described?
Forget about efficiency for a moment and give me some example of what you want. Make it as concrete as you can. If there is anything confidential about what you are doing, then make something up in a different context. At the moment, I am not understanding what you want. You are correct that macros like ChoicEff assume two or more alternatives--they assume a set of alternatives. My best guess is you simply want to use MktEx, but I need to better understand what you are doing. Is it a good old fashioned pre-1990s-style conjoint task with some additional collection of level info? Then definitely use MktEx.
Sure! It's seeming likely that I do need to use mktex. Here are two examples, one from a published paper and one concrete but much simpler example.
The published paper is Franco et al. (2015) at https://doi.org/10.1016/j.jphys.2014.11.001
They estimate preferences for an exercise program. The program has 9 attributes (e.g. time spent on exercise, frequency per week), with 5 levels each. A single choice task consists of one scenario with 9 attribute levels (e.g. 20 minutes, once per week). Respondents choose which factor would make them most likely to participate and which factor would make them least likely to participate in that given program. They used a Bayesian D-efficient design and imposed constraints on some attribute level combinations. The final design included 40 choice tasks blocked into 4 blocks of 10 questions.
For this particular design, I won't have preliminary data from which to construct priors, and no need to impose restrictions. In a consumer choice study of preferences for organic or free range eggs I would have the following attributes and levels.
Organic - yes, no
Free range - cage free, free range, no
Production - local, not local
Cost - $1, 2, 3, 4, 5
The full factorial would have 60 combinations and a saturated design would have 9 runs. Suppose I wanted 24 runs - each run being one scenario - grouped into 4 blocks of 6 runs each, so that each individual would respond to 6 scenarios.
One possible choice task with responses would look like this:
Most preferred    Attribute level    Least preferred
      x           Organic
                  Cage Free
                  Local
                  $1                        x
I ran the code:
%mktruns(2 3 2 5); /* the number of levels for each attribute */
%mktex(2 3 2 5, n=24, seed=1234)
proc print data=design(obs=24); run;
This gives a design with a D-Efficiency of 98.1924 and an Avg Prediction Std Error of 0.6124, and I can see all 24 runs, but the design is not blocked.
Your code looks reasonable to me. Just add a MktBlock step to do the blocking. It does not require a choice design. There are different options for choice designs and linear designs. Off hand, I don't remember what they are. Btw, don't get hung up on the canonical correlations it produces. If I had it to do over again, I would not have included them. They cause needless consternation.
Now, all that said, how are you analyzing the data? Is there some overall statistical model? Alternatively, is this more descriptive? If the latter is true, then I think you are on the right track.
Ok, thanks. Yes, I got MktBlock to work as well by adding %mktblock(data=randomized, nblocks=4). The data should be analyzed much like a DCE, in a random utility framework using logit models. The difference is that the probability that a given best attribute level and a given worst attribute level are chosen is defined as the probability that the difference in utility between them is the greatest, as in Lusk & Briggeman (2009) on Food Values or Flynn et al. (2008) [https://doi.org/10.1186/1471-2288-8-76].
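For anyone following along, the sequence discussed in this thread can be sketched as below. It assumes the %mktex and %mktblock autocall macros are installed, and that %mktex leaves its randomized design in the default data set named randomized. The out= name and the seed passed to %mktblock are arbitrary choices made for this sketch.

```sas
/* Linear design for the egg example: 2 x 3 x 2 x 5 levels in 24 runs */
%mktex(2 3 2 5, n=24, seed=1234)

/* Split the 24 runs into 4 blocks of 6 runs each.        */
/* seed= here is an arbitrary value chosen for the sketch */
%mktblock(data=randomized, nblocks=4, out=blocked, seed=1234)

/* List the blocked design */
proc print data=blocked;
run;
```

Each respondent would then be assigned one block and see its 6 scenarios.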
Hi Warren,
Do you know if I would run into any problems interpreting each run in each block as one choice profile, where the number in each of x1-x4 gives the level of the corresponding factor? I read the %mktroll documentation, and it seems to be mostly for rearranging the output into a choice experiment format with 2 or more alternatives per choice set.
In addition, since the %mktex macro was used to create the choice design, it is not inconsistent with analyzing the data with a multinomial logit or related model, right?
Thanks!
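One way to read each run as a profile is to attach formats that map the level numbers in x1-x4 to the attribute labels. The sketch below is only an illustration: the format names, the assignment of level numbers to labels, and the two rows in blocked_demo are all made up here and are not output from %mktex or %mktblock.

```sas
/* Hypothetical mapping of level numbers to the egg attribute levels */
proc format;
  value orgf  1='Organic'   2='Not organic';
  value rngf  1='Cage free' 2='Free range' 3='Neither';
  value prodf 1='Local'     2='Not local';
  value costf 1='$1' 2='$2' 3='$3' 4='$4' 5='$5';
run;

/* Two made-up rows standing in for a blocked design with x1-x4 */
data blocked_demo;
  input Block Run x1 x2 x3 x4;
  datalines;
1 1 1 1 1 1
1 2 2 3 2 5
;

/* Each printed row is one profile shown to a respondent */
proc print data=blocked_demo noobs;
  format x1 orgf. x2 rngf. x3 prodf. x4 costf.;
run;
```

The same FORMAT statement could be applied to the real blocked data set, provided the level numbering matches the order in which the levels were defined.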