I am trying to run a Bayesian logistic regression model, where two parameters need to be constrained to be equal. This model compares two groups in an IRT model, each with its own latent variable theta(the regressor), and each group has its own set of binomial responses, with the response probability determined by theta. I need to constrain some responses to be equal across the groups.
I have some code below, based on the SAS Press book 'Bayesian Analysis of Item Response Theory' by Stone and Zhu. However, they don't have the parameter constraint that I need, which I am trying to specify.
Here is my code. The reposnse data is in the variable x[5], denoting responses to 5 items. The parameters b1[5] and b2[5] are paameters for the item responses for groups 1 and 2 that are allowed to differ. However, responses to the first item have a parameter constraint - their coefficients ItemOne1 and ItemOne2 should be constrained equal across the two groups. The mu and mu_g parameters are the means of the latent regressor theta for the two groups.
proc mcmc data=lsat_g outpost=lsat_bayes seed=23 nthreads=1 nbi=5000 nmc=20000; array ItemOne[2] ItemOne1 ItemOne2; array b1[4]; array b2[4]; array x[5];array mu[2] 0 mu_g; parms ItemOne: 0; parms a1 a2 1; parms b1: b2: 0; prior b1: b2: ~ normal(0, var=16); prior a1 a2 ~ lognormal(0, var=9); parms mu_g 0; prior mu_g ~ normal(0, var=1); beginnodata; /* if (ItemOne1 - ItemOne2 = 0.) then lp = lpdfnorm(ItemOne1,0,4);*/ /* else lp = .;*/ lp = lpdfnorm(ItemOne1,0,4); prior ItemOne1 ItemOne2 ~ general(lp); endnodata; random theta ~ normal(mu[group], var=1) subject=_obs_; llike=0; do j=1 to 5; if group=1 and j=1 then do; prob = logistic(a1*(theta-ItemOne1)); llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob); end; else if group=1 and j > 1 then do; prob = logistic(a1*(theta-b1[j-1])); llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob); end; else if group =2 and j = 1 then do; prob = logistic(a2*(theta-ItemOne2)); llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob); end; else if group =2 and j > 1 then do; prob = logistic(a2*(theta-b2[j-1])); llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob); end; end; model general(llike); run;
I tried to specify the equality cnstraint based on some example code for conjoint analysis, and my model runs. My results adon't seem to agree with the frequentist model I ran using Proc IRT. The beta coefficients estimates b1 and b2 should not be too different, while my Proc MCMC resutlts make them to be so, unlike Proc IRT. Is my syntax correct? How can I force ItemOne1 and ItemOne2 to be equal, while allowing the theta regressor to have different means (mu and mu_g) for the two groups?
I am including the data generating code as well here.
/* SAS program to create the LSAT SAS dataset */ data lsat; input x @5 (x1-x5) (1.) freq; do i=1 to freq; output; end; drop x i freq; datalines; 1 00000 3 2 00001 6 3 00010 2 4 00011 11 5 00100 1 6 00101 1 7 00110 3 8 00111 4 9 01000 1 10 01001 8 11 01010 0 12 01011 16 13 01100 0 14 01101 3 15 01110 2 16 01111 15 17 10000 10 18 10001 29 19 10010 14 20 10011 81 21 10100 3 22 10101 28 23 10110 15 24 10111 80 25 11000 16 26 11001 56 27 11010 21 28 11011 173 29 11100 11 30 11101 61 31 11110 28 32 11111 298 ; run; /* create the LSAT_G data set assumes lsat sas dataset in work library */ data lsat_g; set lsat; if mod(_n_,2)=0 then group=1; else group=2; run;
The commented out section of your code is the way in which you can impose the equality restriction. Other than that, I don't see anything else obviously wrong. What I do see from looking at the diagnostic plots is that the mixing is pretty poor for several parameters and you need to run the chain longer to deal with the autocorrelation. For some of the other ones, (like ItemOne1, b11, b13 and b14) you need more burn-in iterations. This might account for the unexpected results.
There is a good explanation in the documentation as to how to "read" the plots and some possible remedies.
SAS Help Center: Assessing Markov Chain Convergence
The other reason they might ultimately be different than the frequentist model is because your priors are, for the most part, informative. Have you checked the results using non-informative priors?
Thanks so much for your reply!
However, I feel that the ultimate solution might be related to the discussion here at SO: https://stats.stackexchange.com/questions/453715/how-do-you-apply-constrains-on-parameters-in-bayesi...
Following that discussion, where they suggest that constraints in Bayesian Regression should be built-in to the conditional distributions, I introduced the following changes, which produced better- but still not accurate - results. The response probability (already a conditional probability in IRT models), P(Y=1Itheta) needs to be further modified by the additional condition, ItemOne1 = ItemOne2. That is, for this item, response probabilities behave in the same way, regardless of group-specific theta. Below theta_group1 ~ N(0,1), theta_group2 ~ N(mu,1), mu ~N(0,1). The Group2 theta has a hierachical prior.
I added the terms (theta-ItemOne2) or (theta -ItemOne1) inside the conditional probability in my second model.
I am not sure this is exactly right, though. My reasoning is that I want, for Group1, Item1(and similarly for Group2,Item1):
prob = P(Y|theta_group1), with the addition that ItemOne1=ItemOne2
=P(Y|theta_group1,ItemOne1=ItemOne2) , which implies
prob = P(Y|theta_group2,ItemOne1) *P(Y|theta_group2,ItemOne2), since both Y idepends on both ItemOne1 and ItemOne2 being the same.
From Patz and Junker's MCMC notes online, here are the conditional posteriors for this IRT model (except I additioanlly have two different groups as well. Also, I am not interested in the "a" parameter in my application).
ItemOne1 and ItemOne2 are the b_i in the equation above, but now ItemOne1 is from Group1 and ITemOne2 is from Group 2, which have different "theta" distributions. However, the conditional probability needs to be equal regardless of theta. Not sure how to express this, but the closest I came was the modification I mentioned. Please see below.
prior b1: b2: ~ normal(0, var=16);
prior a1 a2 ~ lognormal(0, var=9);
prior ItemOne: ~ normal(0,var=16);
parms mu_g 0;
prior mu_g ~ normal(0, var=1);
random theta ~ normal(mu[group], var=1) subject=_obs_;
llike=0;
do j=1 to 5;
if group=1 and j=1 then do;
prob = logistic(a1*(theta-ItemOne1)*(theta-ItemOne2));
llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
end;
else if group=1 and j > 1 then do;
prob = logistic(a1*(theta-b1[j-1]));
llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
end;
else if group =2 and j = 1 then do;
prob = logistic(a2*(theta-ItemOne2)*(theta-ItemOne1));
llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
end;
else if group =2 and j > 1 then do;
prob = logistic(a2*(theta-b2[j-1]));
llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
end;
All the other sections of the code stay the same except for the changed conditional probabilities while updating ItemOne1 and ItemOne2.. Here are my new results.
The actual results, from the SAS Press book, by Stone and Zhu (apparently out of print now), is below, which is different but not as bad as my first attempt. I feel like if I can get the right conditional probability equation, I can get the right values, but nore sure what that would be. Please note that "b11" and "b21" in Stone/Zhu code correspond to my ItemOne1 and ItemOne2 respectively, and their b12,b13,b14,b15 correspond to my b11, b12,b13, 14 and so on for the other b's.
The code for the last model , from the SAS book by Stone and Zhu is below.
title "Unconstrained model across groups";
proc mcmc data=lsat_g outpost=lsat_bayes_1p_unconstrained seed=23 nthreads=8 nbi=5000 nmc=20000;
array b1[5]; array b2[5]; array x[5];
parms a1 a2 1; parms b1: b2: 0;
prior b1: b2: ~ normal(0, var=16);
prior a1 a2 ~ lognormal(0, var=9);
random theta ~ normal(0, var=1) subject=_obs_;
llike=0;
do j=1 to 5;
if group=1 then do;
prob = logistic(a1*(theta-b1[j]));
llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
end;
else do;
prob = logistic(a2*(theta-b2[j]));
llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
end;
end;
model general(llike);
run;
Please do let me know if you find a solution! Thanks very much for your reply!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.