Re: Bayesian Multinomial Logit Model

niam · Posted 04-30-2013 06:21 PM

Hello All;

I would like to do a Bayesian analysis on a cross-sectional data set in which the response variable is categorical and the independent variables include both continues, binary and categorical types.

The response variable is the consumer choice among three different alternatives, namely Y, E and N

The choice of the consumer depends on a set of characteristics pertaining to the consumer including his/her age, gender and socioeconomic class (5 classes). The data set includes over 30 thousand observations and looks like this:

Consumer ID

Consumer choice

consumer age

Gender (=1 if female)

SES class1 (=1 if class=1)

SES class2 (=1 if class="2")

SES class3 (=1 if class="3")

SES class4 (=1 if class="4")

1

Y

29

1

0

2

E

39

0

1

0

3

Y

18

1

0

1

4

N

30

1

0

There is a SAS paper here: Fitting Bayesian hierarchical multinomial logit models in PROC MCMC

which shows how to use PROC MCMC to model consumer choice based on the alternative characteristics. However, in my case, the characteristics are related to the consumer not the choices and this makes it unfeasible to create a data structure like the one used in the mentioned paper.

Any help on modeling this problem is appreciated!

Thanks for reading

SteveDenham · Posted 05-01-2013 09:19 AM

The characteristics you mention as related to the consumer would be predictors/covariates in a multinomial model, wouldn't they? I assume that the SES variables are mutually exclusive, so the final results and credible intervals should be OK. I don't see a hierarchical nature to your dataset, so I think (hope?) that the analysis should be straightforward. I see you tagged GENMOD--have you run the comparable frequentist analysis in GENMOD?

Steve Denham

niam · Posted 05-01-2013 03:16 PM

I have run the frequentist analysis in PROC CATMOD and it works perfectly. I was curious to see the Bayesian results.

niam · Posted 05-03-2013 03:06 PM

By hierarchical nature, do you mean a panel structure? Can I use the Bayesian random effects in my cross sectional data set?

Doc_Duke · Posted 05-01-2013 01:56 PM

With large N, the maximum likelihood estimates for a Bayesian and frequentist analysis should converge to the same estimates, as the data will overwhelm any prior. (In logistic regression analysis, the effective sample size is determined by the group whose % is nearest the boundary so it could be that your effective sample size is much smaller than 30,000 and my comment won't help.).

Doc Muhlbaier

Duke

niam · Posted 05-01-2013 03:24 PM

So you mean that there is no merit in Bayesian analysis when I already have the frequentist results due to the fact that with my large sample the results will be the same?

Doc_Duke · Posted 05-02-2013 09:01 AM

"No merit" is a bit strong a term. Once you get estimates, you still have to decide how to use them and the Bayesian and classical frequentist interpretations can still be different.

niam · Posted 05-09-2013 03:11 PM

I contacted the author of the mentioned paper and he was kind enough to help me with providing a code for the exact problem described above. I will copy the code here so others with the same question may also use it.

I agree with Doc@Duke's comments -- there's some sense in Bayesian statistics that with moderate to large sample sizes, the sample should overwhelm the prior when you're making inferences, as long as you've chosen reasonable, relatively non-informative priors.  A Bayesian analysis still has "merit" -- you can interpret the posterior estimates as true probabilities, and you have the ability to incorporate your prior beliefs into your estimates.  That said, this model will probably take a long time to run, and will probably take a large number of iterations to reach convergence, so you'll have to weight whether the properties of a Bayesian analysis are worth the headache.

Good luck!


Jake

data niam;
 input id choice $ age female ses1 - ses4;

 /* Recode choice into dummy variables */
 choice_y = 0;
 choice_e = 0;
 choice_n = 0;
 if choice eq "Y" then choice_y = 1;
  else if choice eq "E" then choice_e = 1;
  else if choice eq "N" then choice_n = 1;

 cards;
1    Y    29    1    1    0    0    0
2    E    39    0    0    1    0    0
3    Y    18    1    0    0    0    1
4    N    30    1    1    0    0    0
;
run;

proc mcmc
 data = niam
 nmc = 1000 nbi = 100
 outpost = posterior;

 /* Data */
 array Y[3] choice_y choice_e choice_n;

 /* Parameters */
 array p[3];
 array beta0[2];
 array beta1[2];
 array beta2[2];
 array beta3[2];
 array beta4[2];
 array beta5[2];
 parms beta:; /* You may want to divide these up into separate "parms" statements, because I think
  if you use a single statement, all the parameters are proposed together, which might lead to slower
  convergence */

 /* Priors */
 prior beta: ~ norm(0, var = 100);
 /* Independent, normal priors for the beta coefficients in the model.  You should change the priors
  to represent your prior beliefs */

 /* linear predictor */
 array eta[2];
 do i = 1 to 2;
  eta = beta0 + beta1 * age + beta2 * female + beta3 * ses2 + beta4 * ses3 + beta5 * ses4;
 end;

 denominator = 1 + exp(eta[1]) + exp(eta[2]);

 /* Inverse logit transformation */
 do i = 1 to 2;
  p = exp(eta) / denominator;
 end;
  p[3] = 1/denominator;

 /* Sampling model */
 model y ~ multinom(p);
run;

Catch up on SAS Innovate 2026