I designed an experiment in which each of 32 subjects chose one of three leaf disc options. Based on the Candy example in the SAS library, I think predicting the choice probabilities is the best approach, but I'm having trouble fitting the model, or maybe it's just my code. My data seem similar to the Candy example except that there are only 3 alternatives rather than 8. I'm not sure what I've done wrong here.
The data are in the attached txt file. They are structured according to this description from the example: "chosen alternative is indicated by c=1, which means first choice. All second and subsequent choices are unobserved, so the unchosen alternatives are indicated by c=2, which means that all we know is that they would have been chosen after the first choice (as a second or subsequent choice). Both the chosen and the unchosen alternatives must appear in the input data set since both are needed to construct the likelihood function. The c=2 observations enter into the denominator of the likelihood function, and the c=1 observations enter into both the numerator and the denominator."
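For reference, the likelihood described in that passage can be written out as below. This is only a sketch and assumes the usual multinomial logit form; the symbols (x_{sj} for the row of ef, ea, b values of alternative j shown to subject s, beta for the coefficient vector, and j_s for the alternative coded c=1) are my own notation, not taken from the example.

```latex
% Probability that subject s picks alternative i out of the 3 leaf discs
P_s(i) = \frac{\exp(x_{si}^{\top}\beta)}{\sum_{j=1}^{3} \exp(x_{sj}^{\top}\beta)}

% Likelihood over the 32 subjects: the c=1 row supplies the numerator,
% and all three rows (c=1 and c=2) supply the denominator
L(\beta) = \prod_{s=1}^{32}
  \frac{\exp(x_{s j_s}^{\top}\beta)}{\sum_{j=1}^{3} \exp(x_{sj}^{\top}\beta)}
```

Each subject forms one stratum, which is why both the chosen and the unchosen rows have to be present in the data set.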
Current code:
proc print data=choicebtm noobs;
where Subj <= 2;
var Subj set c ef ea b;
run;
Data DesignM;
Infile datalines dlm="09"x;
input ef ea b;
datalines;
0 0 1
0 1 0
1 0 0
;
run;
data choiceBTM;
input choice; drop choice;
Subj = _n_; Set = 1;
do i = 1 to 3;
c = 2 - (i eq choice);
set DesignM point=i;
output;
end;
datalines;
5 6 7 5 2 6 2 6 6 6
;
proc phreg data=ChoiceBTM outest=betas;
strata subj set;
model c*c(2) = ef ea b / ties=breslow;
label ef = ’Euonymus fortunei’ ea = ’Euonymus alatus’
b = ’Buxus’;
run;
This is the log
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 */first set up the original dataset;
70 Data ChoiceBTM;
71 Infile datalines dlm="09"x;
72 input Subj c ef ea b;
73 Set=1;
74 datalines;
NOTE: The data set WORK.CHOICEBTM has 96 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 781.25k
OS Memory 29092.00k
Timestamp 10/27/2022 05:09:33 AM
Step Count 250 Switch Count 2
Page Faults 0
Page Reclaims 90
Page Swaps 0
Voluntary Context Switches 11
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
171 ;
172 run;
173
174 proc print data=choicebtm noobs;
175 where Subj <= 2;
176 var Subj set c ef ea b;
177 run;
NOTE: There were 6 observations read from the data set WORK.CHOICEBTM.
WHERE Subj<=2;
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
user cpu time 0.02 seconds
system cpu time 0.00 seconds
memory 1580.37k
OS Memory 29352.00k
Timestamp 10/27/2022 05:09:33 AM
Step Count 251 Switch Count 0
Page Faults 0
Page Reclaims 95
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16
178
179 Data DesignM;
180 Infile datalines dlm="09"x;
181 input ef ea b;
182 datalines;
NOTE: Invalid data for ef in line 183 1-80.
NOTE: Invalid data for ea in line 184 1-80.
NOTE: Invalid data for b in line 185 1-80.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
185 1 0 0
NOTE: Invalid data errors for file CARDS occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
ef=. ea=. b=. _ERROR_=1 _N_=1
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.DESIGNM has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 747.40k
OS Memory 29092.00k
Timestamp 10/27/2022 05:09:33 AM
Step Count 252 Switch Count 2
Page Faults 0
Page Reclaims 88
Page Swaps 0
Voluntary Context Switches 15
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
186 ;
187 run;
188
189 data choiceBTM;
190 input choice; drop choice;
191 Subj = _n_; Set = 1;
192 do i = 1 to 3;
193 c = 2 - (i eq choice);
194 set DesignM point=i;
195 output;
196 end;
197 datalines;
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
198 5 6 7 5 2 6 2 6 6 6
choice=5 Subj=1 Set=1 i=4 c=2 ef=. ea=. b=. _ERROR_=1 _N_=1
NOTE: The data set WORK.CHOICEBTM has 3 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 1070.53k
OS Memory 29352.00k
Timestamp 10/27/2022 05:09:33 AM
Step Count 253 Switch Count 2
Page Faults 0
Page Reclaims 124
Page Swaps 0
Voluntary Context Switches 11
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 272
199 ;
200 run;
201
202 */ trying this now;
203 proc phreg data=ChoiceBTM outest=betas;
204 strata subj set;
205 model c*c(2) = ef ea b / ties=breslow;
206 label ef = ’Euonymus fortunei’ ea = ’Euonymus alatus’
___________
22
76
ERROR 22-322: Expecting a quoted string.
ERROR 76-322: Syntax error, statement will be ignored.
207 b = ’Buxus’;
208 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.BETAS may be incomplete. When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.BETAS was not replaced because this step was stopped.
NOTE: PROCEDURE PHREG used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.01 seconds
memory 891.21k
OS Memory 29092.00k
Timestamp 10/27/2022 05:09:33 AM
Step Count 254 Switch Count 0
Page Faults 0
Page Reclaims 181
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8
209
210 */tried
211
212 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
213 ODS HTML CLOSE;
214 &GRAPHTERM; ;*';*";*/;RUN;QUIT;
215 QUIT;RUN;
216 ODS HTML5 (ID=WEB) CLOSE;
217
218 FILENAME _GSFNAME;
NOTE: Fileref _GSFNAME has been deassigned.
219 DATA _NULL_;
220 RUN;
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 460.84k
OS Memory 28328.00k
Timestamp 10/27/2022 05:09:33 AM
Step Count 255 Switch Count 0
Page Faults 0
Page Reclaims 24
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 0
221 OPTIONS NOTES STIMER SOURCE SYNTAXCHECK;
222
Hello,
I moved this post (topic) to the 'Statistical Procedures' board under the 'Analytics' header.
Why are you using PROC PHREG instead of PROC BCHOICE?
SAS® 9.4 and SAS® Viya® 3.5 Programming Documentation | SAS 9.4 / Viya 3.5
SAS/STAT 15.2 User's Guide
The BCHOICE Procedure
Example 29.7 Predict the Choice Probabilities
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_bchoice_examples07.htm
Koen
Hi Koen @sbxkoenk
I have tried PROC BCHOICE as well and it didn't work. I was basing my code on document MR2010F, found at this link: https://support.sas.com/resources/papers/tnote/tnote_marketresearch.html
Here is my code. I used the same data as in the original post so I didn't repost it here.
Data ChoiceBTM;
Infile datalines dlm="09"x;
input Subj c ef ea b;
Set=1;
datalines;
*/insert data here;
;
run;
Data DesignM;
Infile datalines dlm="09"x;
input ef ea b;
datalines;
0 0 1
0 1 0
1 0 0
;
run;
*/tried this and it didn't run;
proc bchoice data=ChoiceBTM outpost=bsamp nmc=10000 thin=2 seed=124;
class ef(ref='0') ea(ref='0') b(ref='0') Subj;
model c = ef ea b / choiceset=(Subj);
preddist covariates=DesignM nalter=3 outpred=Predout;
run;
Here is the log. I get a blank results page. And as @Quentin pointed out, I need to fix something but that is the underlying question - how do I fix this?
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 Data DesignM;
70 Infile datalines dlm="09"x;
71 input ef ea b;
72 datalines;
NOTE: Invalid data for ef in line 73 1-80.
NOTE: Invalid data for ea in line 74 1-80.
NOTE: Invalid data for b in line 75 1-80.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
75 1 0 0
NOTE: Invalid data errors for file CARDS occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
ef=. ea=. b=. _ERROR_=1 _N_=1
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.DESIGNM has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 668.71k
OS Memory 24228.00k
Timestamp 10/28/2022 02:40:07 PM
Step Count 30 Switch Count 2
Page Faults 0
Page Reclaims 144
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
76 ;
77 run;
78
79 */tried this and it didn't run;
NOTE: PROCEDURE BCHOICE used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 403.65k
OS Memory 24484.00k
Timestamp 10/28/2022 02:40:07 PM
Step Count 31 Switch Count 2
Page Faults 0
Page Reclaims 249
Page Swaps 0
Voluntary Context Switches 10
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 160
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.BSAMP may be incomplete. When this step was stopped there were 0 observations and 0 variables.
80 proc bchoice data=ChoiceBTM outpost=bsamp nmc=10000 thin=2 seed=124;
81 cla
___
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
82 ss ef(ref='0') ea(ref='0') b(ref='0') Subj;
83 model c = ef ea b / choiceset=(Subj);
84 preddist covariates=DesignM nalter=3 outpred=Predout;
85 run;
86 quit;
87
88 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
98
Hi,
The errors are because you have curly quotes in your code. SAS expects straight quotes. This often happens if you copy code from a Word doc or PDF or similar. If you delete the quotes and retype them, they should become straight, i.e. change:
label ef = ’Euonymus fortunei’ ea = ’Euonymus alatus’
b = ’Buxus’;
to:
label ef = 'Euonymus fortunei' ea = 'Euonymus alatus'
b = 'Buxus';
But also, note that your log shows there were problems reading in the data, so you will likely need to fix these issues as well:
179 Data DesignM;
180 Infile datalines dlm="09"x;
181 input ef ea b;
182 datalines;
NOTE: Invalid data for ef in line 183 1-80.
NOTE: Invalid data for ea in line 184 1-80.
NOTE: Invalid data for b in line 185 1-80.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
185 1 0 0
NOTE: Invalid data errors for file CARDS occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
ef=. ea=. b=. _ERROR_=1 _N_=1
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
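A possible way to put those fixes together is sketched below; treat it as a starting point rather than a tested solution, since I don't have your data. It rests on a few assumptions read off the log. First, the DesignM rows appear to be separated by blanks rather than tabs, so dlm="09"x makes SAS treat each whole line as one value; the sketch simply uses the default blank delimiter. Second, the first DATA step already produced WORK.CHOICEBTM with 96 observations (32 subjects x 3 alternatives) in the c=1/c=2 layout, and the later "data choiceBTM; input choice; ..." step copied from the Candy example then overwrote it with 3 observations, so the sketch leaves that step out. Third, the label strings use straight quotes.

```sas
/* Sketch only. Assumes WORK.ChoiceBTM is the 96-row data set created by   */
/* the first DATA step in the log (Subj, c, ef, ea, b, Set), one row per   */
/* alternative, and that it has not been overwritten by a later step.      */

/* Design matrix: the rows are blank-separated, so no DLM= option is used. */
data DesignM;
   input ef ea b;
   datalines;
0 0 1
0 1 0
1 0 0
;
run;

/* Quick check of the layout: with one choice per subject there should be  */
/* one c=1 row and two c=2 rows for each of the 32 subjects.               */
proc freq data=ChoiceBTM;
   tables c;
run;

/* Same PHREG call as in the question, with straight quotes in the labels. */
proc phreg data=ChoiceBTM outest=betas;
   strata Subj Set;
   model c*c(2) = ef ea b / ties=breslow;
   label ef = 'Euonymus fortunei'
         ea = 'Euonymus alatus'
         b  = 'Buxus';
run;
```

One thing to expect: because ef + ea + b equals 1 in every row, only two of the three effects can be estimated, so PHREG will likely report one parameter as linearly dependent and set it to 0. That is different from the Candy example, where the three attributes vary independently across the 8 alternatives.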