Hi, I'm running a logistic regression and the problem is that not all the coefficients have been estimated.
Premise: I'm running a binary-choice experiment. After creating a choice design using the SAS macros, I ran a pre-test (N=51), and now I'm merging the data with the design and analysing it. The experiment consists of a series of binary choices intended to indirectly determine participants' willingness to pay for specific features of an online service. Each choice is between two generic options, A vs B. I have 14 features, of which 13 are binary (i.e., 1=feature shown, 0=feature not shown) while one feature, price, has 8 levels (1, 2, 3, 4, 5, 6, 7 or 8 dollars). So my design is asymmetric, fractional factorial, and uses blocks.
The design is created as follows:
%mktruns (2 ** 13 8)
%mktex(2**13 8, n=64, seed=205)
proc format;
value Status_update 1 = 1 2 = 0;
value Notes 1 = 1 2 = 0;
value Comments 1 = 1 2 = 0;
value Wall_posts 1 = 1 2 = 0;
value Private_messages 1 = 1 2 = 0;
value Chats 1 = 1 2 = 0;
value Groups 1 = 1 2 = 0;
value Newsfeed 1 = 1 2 = 0;
value Like 1 = 1 2 = 0;
value Photos_videos 1 = 1 2 = 0;
value Events 1 = 1 2 = 0;
value Gaming 1 = 1 2 = 0;
value Fan_pages 1 = 1 2 = 0;
value Price 1 = $1 2 = $2 3 = $3 4 = $4 5 = $5 6 = $6 7 = $7 8 = $8;
run;
%mktlab(data=design, /* input data set */
vars=Status_update Notes Comments Wall_posts Private_messages Chats Groups Newsfeed Like Photos_videos Events Gaming Fan_pages Price, /* new attribute names */
int=f1-f2, /* create 2 columns of 1’s in f1-f2 */
out=final, /* output design add a format statement for the attributes */
stmts=format Status_update Status_update. Notes Notes. Comments Comments. Wall_posts Wall_posts. Private_messages Private_messages. Chats Chats. Groups Groups. Newsfeed Newsfeed. Like Like. Photos_videos Photos_videos. Events Events. Gaming Gaming. Fan_pages Fan_pages. Price Price.)
proc print; run;
%choiceff(data=final, /* candidate set of alternatives */
bestout=sasuser.facebookdes, /* choice design permanently stored */
/* model with stdz orthog coding */
model=class(Status_update Notes Comments Wall_posts Private_messages Chats Groups Newsfeed Like Photos_videos Events Gaming Fan_pages Price / sta),
nsets=64, /* number of choice sets to make */
seed=205, /* random number seed */
flags=f1-f2, /* flag which alt can go where, 2 alts*/
options=relative, /* display relative D-efficiency */
beta=zero) /* assumed beta vector */
%mktblock(data=final, /* input choice design to block */
out=sasuser.facebookdes, /* output blocked choice design stored in permanent SAS data set */
nalts=2, /* two alternatives */
nblocks=4, /* four blocks */
factors=Status_update Notes Comments Wall_posts Private_messages Chats Groups Newsfeed Like Photos_videos Events Gaming Fan_pages Price, /* 14 attributes, x1-x14 */
print=design, /* print the blocked design (only) */
seed=205) /* random number seed */
The design consists of 64 choice sets, blocked into 4 blocks of 8 choice sets each.
I then ran a small pre-test (N=51) in which each participant was shown one randomly chosen block (8 choice sets).
I read the data in as follows (1 means option A was chosen, 2 means option B was chosen):
data chdata;
input Block Sub (c1-c8) (1.) @@;
datalines;
4 1 12212111
3 2 21121121
1 3 12212112
4 4 12211221
2 5 12211221
2 6 21121221
4 7 11111111
2 8 21121221
3 9 21121121
1 10 12212112
2 11 21221221
1 12 12211221
3 13 21122112
2 14 12211221
3 15 22111121
2 16 12221212
2 17 21121221
4 18 12211221
4 19 12212111
2 20 11111111
4 21 12212111
2 22 21121221
3 23 12211121
3 24 21122111
3 25 12211121
4 26 22221111
1 27 21111112
4 28 12211121
2 29 11221221
2 30 12121221
3 31 21121121
4 32 12211121
2 33 11211221
1 34 12211221
1 35 12212112
1 36 11112222
1 37 12211221
1 38 12211121
4 39 12212112
3 40 21121121
1 41 12212222
4 42 12212111
1 43 12211222
2 44 12211221
1 45 11111221
4 46 12212112
4 47 12212112
3 48 12211111
4 49 12212111
1 50 11212112
4 51 12212111
;
This is how I merged the design with the data and then ran the analysis:
%mktmerge(design=sasuser.facebookdes, /* input final blocked choice design */
data=chdata, /* input choice data */
out=desdata, /* output design and data */
blocks=block, /* the blocking variable is block */
nsets=8, /* 8 choice sets per subject */
nalts=2, /* 2 alternatives in each set */
setvars=c1-c8) /* the choices for each subject vars */
%phchoice(on) /* customize PHREG for a choice model */
proc phreg brief data=desdata; /* provide brief summary of strata */
ods output parameterestimates=pe;/* output parameter estimates */
class Status_update Notes Comments Wall_posts Private_messages Chats Groups Newsfeed Like Photos_videos Events Gaming Fan_pages Price / ref=first; /* name all as class vars, ’1’ ref level*/
model c*c(2) = Status_update Notes Comments Wall_posts Private_messages Chats Groups Newsfeed Like Photos_videos Events Gaming Fan_pages Price; /* 1 - chosen, 2 - not chosen */
/* these are independent vars */
strata block sub set; /* set within subject within block */
run; /* identify each choice set */
proc sort data=pe; /* process the parameter estimates */
by descending estimate; /* table by sorting by estimate */
run;
data pe; /* also get rid of the ’2’ level */
set pe; /* in the label */
substr(label, length(label)) = ' ';
run;
proc print label; /* print estimates with largest first */
id label;
label label = '00'x;
var df -- probchisq;
run;
%phchoice(off) /* restore PHREG to a survival PROC */
The output is as follows (log and output attached):
As you can see, not all of the coefficients are estimated. I believe they all should be, since the design should allow all the main effects to be determined. My guess is that either I made a mistake when allocating participants to choice sets (every participant was randomly assigned to one block and therefore saw all eight choice sets in that block), or I'm not using the right parameters in %mktmerge or proc phreg. I can't proceed to the second stage of my study, because I need all the coefficients to calculate the willingness to pay for every feature of the service.
Can anyone help me understand what the issue is? Many thanks.
Here are two of your code snippets. Everything is wrong from MktBlock on down, because you need to block the design that comes out of the ChoicEff macro, not the one that goes into it.
%choiceff(data=final, /* candidate set of alternatives */
bestout=sasuser.facebookdes, /* choice design permanently stored */
%mktblock(data=final, /* input choice design to block */
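A minimal sketch of the corrected flow, assuming the candidate set final from the %mktlab step already exists. The output name sasuser.facebookblk and the macro variable attrs are hypothetical, introduced here only so that the blocked design does not overwrite the ChoicEff output; nblocks=8 follows the correction reported later in this thread. The PROC FREQ step is an optional check that is not part of the original code: with 64 two-alternative choice sets in 8 blocks, each block should show 16 rows.

```sas
/* Hypothetical helper: the 14 attribute names, listed once */
%let attrs = Status_update Notes Comments Wall_posts Private_messages
             Chats Groups Newsfeed Like Photos_videos Events Gaming
             Fan_pages Price;

/* Step 1: search the candidate set for an efficient choice design */
%choiceff(data=final,                  /* candidate set of alternatives   */
          bestout=sasuser.facebookdes, /* unblocked choice design         */
          model=class(&attrs / sta),   /* model with stdz orthog coding   */
          nsets=64,                    /* number of choice sets to make   */
          seed=205,                    /* random number seed              */
          flags=f1-f2,                 /* which alt can go where, 2 alts  */
          options=relative,            /* display relative D-efficiency   */
          beta=zero)                   /* assumed beta vector             */

/* Step 2: block the design that ChoicEff produced, not the candidates */
%mktblock(data=sasuser.facebookdes,    /* ChoicEff output                 */
          out=sasuser.facebookblk,     /* hypothetical new name           */
          nalts=2,                     /* two alternatives                */
          nblocks=8,                   /* eight blocks of 8 sets each     */
          factors=&attrs,              /* the 14 attributes               */
          print=design,                /* print the blocked design (only) */
          seed=205)                    /* random number seed              */

/* Optional check (an addition): expect 16 rows per block, */
/* i.e. 8 choice sets x 2 alternatives                     */
proc freq data=sasuser.facebookblk;
   tables block / nocum;
run;
```

If this naming is used, the design= option of %mktmerge would need to point at the blocked data set rather than at the ChoicEff output.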
Thank you very much @WarrenKuhfeld, quite embarrassing.
One more question, if you don't mind: do you see any other mistake in the code that follows MktBlock? (I don't)
For the record, I changed two lines as follows
%mktblock(data=sasuser.facebookdes, /* input choice design to block */
nblocks=8, /* eight blocks */
and everything seems fine now.
Thanks again, Carlo
Carlo, You are welcome!
I am assuming your data are artificial for testing purposes, so it does not matter that it is being merged with a different design than the one you used previously. If not, you need to discard the data.
You tend to use the same name for input and output data sets. That is not wrong, but as you refine and revise code it makes things more difficult, because you have to go back and rerun previous steps. You might want to reuse names less often.
Your statement substr(label, length(label)) = ' '; removes some distinguishing information from the labels. I believe you want to remove it.
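A sketch of the reporting step with that statement dropped, assuming the pe data set created by the ods output statement in the PROC PHREG step of the original post. Nothing else is changed from the posted code, so the labels keep whatever level information PHREG wrote into them.

```sas
/* Sort the parameter estimates, largest first */
proc sort data=pe;
   by descending estimate;
run;

/* Print the estimates; the DATA step that blanked the last */
/* character of each label is simply omitted                */
proc print data=pe label;
   id label;
   label label = '00'x;   /* suppress the column header for the label */
   var df -- probchisq;
run;
```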
Other than that, it looks right to me.
Best,
Warren
"I am assuming your data are artificial for testing purposes, so it does not matter that it is being merged with a different design than the one you used previously."
Actually, I created a simulated dataset with 8 blocks instead of 4 (and 8 choice sets for each participant) and reran the whole code. I just did not paste the whole thing again, as I'm now confident that it's correct.
Thanks again, very kind!