04302013 06:21 PM
Hello All;
I would like to do a Bayesian analysis on a crosssectional data set in which the response variable is categorical and the independent variables include both continues, binary and categorical types.
The response variable is the consumer choice among three different alternatives, namely Y, E and N
The choice of the consumer depends on a set of characteristics pertaining to the consumer including his/her age, gender and socioeconomic class (5 classes). The data set includes over 30 thousand observations and looks like this:
Consumer ID  Consumer choice  consumer age  Gender (=1 if female)  SES class1 (=1 if class=1) 


 

1  Y  29  1  1  0  0  0  
2  E  39  0  0  1  0  0  
3  Y  18  1  0  0  0  1  
4  N  30  1  1  0  0  0 
There is a SAS paper here: Fitting Bayesian hierarchical multinomial logit models in PROC MCMC
which shows how to use PROC MCMC to model consumer choice based on the alternative characteristics. However, in my case, the characteristics are related to the consumer not the choices and this makes it unfeasible to create a data structure like the one used in the mentioned paper.
Any help on modeling this problem is appreciated!
Thanks for reading
05012013 09:19 AM
The characteristics you mention as related to the consumer would be predictors/covariates in a multinomial model, wouldn't they? I assume that the SES variables are mutually exclusive, so the final results and credible intervals should be OK. I don't see a hierarchical nature to your dataset, so I think (hope?) that the analysis should be straightforward. I see you tagged GENMODhave you run the comparable frequentist analysis in GENMOD?
Steve Denham
05012013 03:16 PM
I have run the frequentist analysis in PROC CATMOD and it works perfectly. I was curious to see the Bayesian results.
05032013 03:06 PM
05012013 01:56 PM
With large N, the maximum likelihood estimates for a Bayesian and frequentist analysis should converge to the same estimates, as the data will overwhelm any prior. (In logistic regression analysis, the effective sample size is determined by the group whose % is nearest the boundary so it could be that your effective sample size is much smaller than 30,000 and my comment won't help.).
Doc Muhlbaier
Duke
05012013 03:24 PM
So you mean that there is no merit in Bayesian analysis when I already have the frequentist results due to the fact that with my large sample the results will be the same?
05022013 09:01 AM
"No merit" is a bit strong a term. Once you get estimates, you still have to decide how to use them and the Bayesian and classical frequentist interpretations can still be different.
05092013 03:11 PM
I contacted the author of the mentioned paper and he was kind enough to help me with providing a code for the exact problem described above. I will copy the code here so others with the same question may also use it.
I agree with Doc@Duke's comments  there's some sense in Bayesian statistics that with moderate to large sample sizes, the sample should overwhelm the prior when you're making inferences, as long as you've chosen reasonable, relatively noninformative priors. A Bayesian analysis still has "merit"  you can interpret the posterior estimates as true probabilities, and you have the ability to incorporate your prior beliefs into your estimates. That said, this model will probably take a long time to run, and will probably take a large number of iterations to reach convergence, so you'll have to weight whether the properties of a Bayesian analysis are worth the headache.
Good luck!
Jake
data niam;
input id choice $ age female ses1  ses4;
/* Recode choice into dummy variables */
choice_y = 0;
choice_e = 0;
choice_n = 0;
if choice eq "Y" then choice_y = 1;
else if choice eq "E" then choice_e = 1;
else if choice eq "N" then choice_n = 1;
cards;
1 Y 29 1 1 0 0 0
2 E 39 0 0 1 0 0
3 Y 18 1 0 0 0 1
4 N 30 1 1 0 0 0
;
run;
proc mcmc
data = niam
nmc = 1000 nbi = 100
outpost = posterior;
/* Data */
array Y[3] choice_y choice_e choice_n;
/* Parameters */
array p[3];
array beta0[2];
array beta1[2];
array beta2[2];
array beta3[2];
array beta4[2];
array beta5[2];
parms beta:; /* You may want to divide these up into separate "parms" statements, because I think
if you use a single statement, all the parameters are proposed together, which might lead to slower
convergence */
/* Priors */
prior beta: ~ norm(0, var = 100);
/* Independent, normal priors for the beta coefficients in the model. You should change the priors
to represent your prior beliefs */
/* linear predictor */
array eta[2];
do i = 1 to 2;
eta = beta0 + beta1 * age + beta2 * female + beta3 * ses2 + beta4 * ses3 + beta5 * ses4;
end;
denominator = 1 + exp(eta[1]) + exp(eta[2]);
/* Inverse logit transformation */
do i = 1 to 2;
p = exp(eta) / denominator;
end;
p[3] = 1/denominator;
/* Sampling model */
model y ~ multinom(p);
run;