Hi everybody,
I have a question about designing a discrete choice experiment. I have 2 alternatives (Alt1 and Alt2) and also a no-choice option. My alternatives are unlabeled. I have 4 attributes (X1, X2, X3, X4) with their levels mentioned below:
X1: 0=no 1=yes
X2: 0=no 1=yes
X3: 0=no 1=yes
X4: 10% 25% 50% 75%
Based on what I read, I created two different versions. Here are the versions and some results:
Version 1:
%MktRuns(2 2 2 4)
%MktEx (2 2 2 4, n=16)
%choiceff(data=design,
model=class(x1-x4/sta),
nsets=8,
flags=2,
maxiter=60,
seed=123,
options=relative,
beta=zero)
Relative D-eff = 64.90
D-error = 0.19
Version 2:
%MktRuns(2 2 2 4)
%MktEx (8 2**3 4, n=16, seed=123)
%mktlab(data=randomized, vars=Set x1-x4)
proc sort data=Final; by set; run;
proc print; by set; id set; run;
%choiceff(data=final,
init=final(keep=set),
model=class(x1-x4/sta),
nsets=8,
nalts=2,
options=relative,
beta=zero)
%mkteval(data=Best)
Relative D-eff = 79.3
D-error = 0.12
Which one is the better result? Or can I create a better version?
Thank you so much!
Run your code again. This time, change the first n=16 to n=32. Now compare the Final Results table as well as the table with variances and standard errors. You should see the same final efficiency in both approaches. The variances will be different. Which pattern would you like? When I run the first step (with n=32), my variances are more similar for the 4-level factor, so that might be your preference.
You have a small problem. Don't create a fractional factorial design as input. Use a full-factorial: n=32. I'll run your code in a bit and let you know what I find.
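As a minimal sketch of where n=32 comes from: the full factorial contains every combination of the attribute levels, and 2 x 2 x 2 x 4 = 32. The data set name fullfact below is hypothetical and used only for this illustration; in the actual workflow %MktEx creates the candidate set.

```sas
/* Enumerate every combination of the four attributes (2 x 2 x 2 x 4). */
/* fullfact is a hypothetical name used only for this illustration.    */
data fullfact;
do x1 = 1 to 2;
do x2 = 1 to 2;
do x3 = 1 to 2;
do x4 = 1 to 4;
output;
end;
end;
end;
end;
run;

/* Count the candidate alternatives; this matches n=32 in %MktEx. */
proc sql;
select count(*) as n_candidates from fullfact;
quit;
```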
Thank you for the quick answer,
I ran my code again and got the same results in both versions (D-eff=79.37 and D-error=0.15).
As I understand it, there is no problem in determining n as the exact full factorial (n=32).
I edited the final version of the code.
%MktRuns(2 2 2 4)
%MktEx (2 2 2 4, n=32) /*candidate sets*/
proc print; run;
%macro res;
do i = 1 to nalts;
do k = i + 1 to nalts;
if all(x[i,] >= x[k,]) then bad = bad + 1; /* alt i dominates alt k */
if all(x[k,] >= x[i,]) then bad = bad + 1; /* alt k dominates alt i */
end;
end;
%mend;
%choiceff(data=design, /* candidate set of alternatives */
model=class(x1-x4/sta), /* model with stdz orthogonal coding */
nsets=8, /* number of choice sets */
flags=2, /* 2 alternatives, generic candidates */
maxiter=60, /* maximum number of designs to make */
seed=12655, /* random number seed */
options=relative, /* display relative D-efficiency */
beta=zero); /* assumed beta vector, Ho: b=0 */
proc print; var x1-x4; id set; by set; run;
proc format;
value x1f 1='no' 2='yes';
value x2f 1='no' 2='yes';
value x3f 1='no' 2='yes';
value x4f 1='10' 2='25' 3='50' 4='75';
run;
proc print label;
label x1 = 'X1' x2 = 'X2' x3 = 'X3' x4 = 'Price';
format x1 x1f. x2 x2f. x3 x3f. x4 x4f.;
by set; id set; var x:;
run;
proc print data=bestcov label;
title 'Variance-Covariance Matrix';
id __label;
label __label = '00'x;
var x:;
run;
title;
%mktdups(generic, /* duplicate choice sets or duplicate alternatives within choice sets*/
data=best,
factors=x1-x4,
nalts=2)
But I noticed a problem. When I examine the choice sets that were created, I think that some choice sets are not realistic. For example, the choice set shown below;
The first alternative has no good features (no no no) but it has a higher price. The second alternative has all the good features (yes yes yes) but has a low price. What can I do to avoid these choice sets?
Thank you very much again
%macro res;
bad = (all(x[1,] = 1) & all(x[2,] = 2)) +
(all(x[1,] = 2) & all(x[2,] = 1));
%mend;
%choiceff(data=design, /* candidate set of alternatives */
restrictions=res, resvars=x1-x3, /* <===== Use macro =====<<<<<<< */
model=class(x1-x4/sta), /* model with stdz orthogonal coding */
nsets=8, /* number of choice sets */
flags=2, /* 2 alternatives, generic candidates */
maxiter=60, /* maximum number of designs to make */
seed=12655, /* random number seed */
options=relative, /* display relative D-efficiency */
beta=zero); /* assumed beta vector, Ho: b=0 */
When you write a restrictions macro, you have to actually use it. Try this. I changed your macro to prevent constant levels across y/n within each alternative and then specified the restrictions macro name.
Thank you so much. You are very helpful.
I want to ask one more thing. There are also choice sets in the design like the example below.
Good features but a lower price, or vice versa. Should I also write restrictions for these choice sets? When I add restrictions, the D-efficiency score gets lower. If I want to add another restriction for this example, how should I write the restrictions macro?
Thank you
When you add restrictions, as you noticed, efficiency will almost invariably decrease. This is the price you pay for a realistic design. In practice, restrictions are used a lot. Almost every question I got about choice designs during my employment at SAS was about restrictions. I will be away from my computer for the next several hours. I will look at your specific restrictions question when I get home.
I was in a hurry to go somewhere this morning and did not quite get the restrictions right in my previous post. I edited them. Below, I modified the restrictions so it is bad if either alternative has more yesses and a lower price.
%macro res;
bad = (all(x[1,1:3] = 1) & all(x[2,1:3] = 2)) +
(all(x[1,1:3] = 2) & all(x[2,1:3] = 1));
g1 = (x[1,1:3] = 2)[+]; * Yesses in alt 1;
g2 = (x[2,1:3] = 2)[+]; * Yesses in alt 2;
bad = bad + (g1 > g2 & x[1,4] < x[2,4]); * More yesses in 1 and lower price in 1;
bad = bad + (g2 > g1 & x[2,4] < x[1,4]); * More yesses in 2 and lower price in 2;
%mend;
%choiceff(data=design, /* candidate set of alternatives */
restrictions=res, resvars=x1-x4,
model=class(x1-x4/sta), /* model with stdz orthogonal coding */
nsets=8, /* number of choice sets */
flags=2, /* 2 alternatives, generic candidates */
maxiter=60, /* maximum number of designs to make */
seed=12655, /* random number seed */
options=relative, /* display relative D-efficiency */
beta=zero); /* assumed beta vector, Ho: b=0 */
Restrictions are written in IML syntax. If you are going to be doing this often, you should study that syntax.
Examples:
vector = constant ---> results in a vector of zeros (when an element does not equal the constant) and ones (equal)
vector[+] ---> sum of the elements in a vector. In the example, vector results from an expression.
Also note the result of any Boolean (logical) expression is zero (false) or one (true) and hence can be used in an arithmetic expression.
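As a minimal sketch of how these expressions evaluate, the PROC IML step below applies the same restriction logic to one hand-made choice set. The matrix values are an assumption chosen for illustration (alternative 1 is no/no/no at the highest price level, alternative 2 is yes/yes/yes at the lowest), and yes1 is a hypothetical helper name; inside %ChoicEff the macro supplies x for each candidate choice set.

```sas
proc iml;
/* Hypothetical choice set: rows = alternatives, columns = x1-x4 */
x = {1 1 1 4,
     2 2 2 1};

/* vector = constant gives a 0/1 vector; here {0 0 0} */
yes1 = (x[1,1:3] = 2);

/* [+] sums the elements: number of yesses in each alternative */
g1 = yes1[+];            /* 0 */
g2 = (x[2,1:3] = 2)[+];  /* 3 */

/* Boolean results are 0 or 1, so they can be added up as violations */
bad = (all(x[1,1:3] = 1) & all(x[2,1:3] = 2)) +
      (all(x[1,1:3] = 2) & all(x[2,1:3] = 1));   /* 1: all no vs. all yes */
bad = bad + (g1 > g2 & x[1,4] < x[2,4]);         /* adds 0 */
bad = bad + (g2 > g1 & x[2,4] < x[1,4]);         /* adds 1: bad = 2 */

print yes1, g1 g2 bad;
quit;
```

A nonzero bad marks the choice set as violating the restrictions, so this set would count two violations.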
I ran the code and it works. Relative D-Efficiency decreased from 79 to 66 and D-error increased from 0.15 to 0.18. However, the choice sets are now more realistic. Thank you very much for your support. I learned a lot about creating a discrete choice experiment, and I will use your suggestions in my studies. Your documentation is also very helpful.