BookmarkSubscribeRSS Feed
nstdt
Quartz | Level 8

I am trying to run a Bayesian logistic regression model, where two parameters need to be constrained to be equal. This model compares two groups in an IRT model, each with its own latent variable theta(the regressor), and each group has its own set of binomial responses, with the response probability determined by theta. I need to constrain some responses to be equal across the groups.

I have some code below, based on the SAS Press book 'Bayesian Analysis of Item Response Theory' by Stone and Zhu. However, they don't have the parameter constraint that I need, which I am trying to specify.

Here is my code. The reposnse data is in the variable x[5], denoting responses to 5 items. The parameters b1[5] and b2[5] are paameters for the item responses for groups 1 and 2 that are allowed to differ. However, responses to the first item have a parameter constraint - their coefficients ItemOne1 and ItemOne2 should be constrained equal across the two groups. The mu and mu_g parameters are the means of the latent regressor theta for the two groups.

proc mcmc data=lsat_g outpost=lsat_bayes seed=23 nthreads=1 nbi=5000 nmc=20000;
   array ItemOne[2] ItemOne1 ItemOne2;
   array b1[4]; array b2[4]; array x[5];array mu[2] 0 mu_g;   
   parms ItemOne: 0; parms a1 a2 1; parms b1: b2: 0; 

   prior b1: b2: ~ normal(0, var=16);
   prior a1 a2 ~ lognormal(0, var=9); 

   parms mu_g 0;
   prior mu_g ~ normal(0, var=1);

   beginnodata;
/*   	if (ItemOne1 - ItemOne2 = 0.) then lp = lpdfnorm(ItemOne1,0,4);*/
/*	else lp = .;*/
    lp = lpdfnorm(ItemOne1,0,4);
	prior ItemOne1 ItemOne2 ~ general(lp);
   endnodata;

   random theta ~ normal(mu[group], var=1) subject=_obs_; 
   llike=0;
   do j=1 to 5;                     
      if group=1 and j=1 then do;
         prob = logistic(a1*(theta-ItemOne1));
         llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;
	  else if group=1 and j > 1 then do;
		 prob = logistic(a1*(theta-b1[j-1]));
         llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
	  end;
      else if group =2 and j = 1 then do;
				prob = logistic(a2*(theta-ItemOne2));
         		llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;
      else if group =2 and j > 1 then do;
				prob = logistic(a2*(theta-b2[j-1]));
         		llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;
	   
   end;    
   model general(llike);   
run;

 

I tried to specify the equality cnstraint based on some example code for conjoint analysis, and my model runs. My results adon't seem to agree with the frequentist model I ran using Proc IRT. The beta coefficients estimates b1 and b2 should not be too different, while my Proc MCMC resutlts make them to be so, unlike Proc IRT. Is my syntax correct? How can I force ItemOne1 and ItemOne2 to be equal, while allowing the theta regressor to have different means (mu and mu_g) for the two groups?

 

I am including the data generating code as well here.

/* SAS program to create the LSAT SAS dataset */
data lsat;                                                                             
input x @5 (x1-x5) (1.) freq; 
do i=1 to freq; 
   output; 
end;                                                           
drop x i freq;                                                                         
datalines;                                                                                 
  1 00000   3                                                                          
  2 00001   6                                                                          
  3 00010   2                                                                          
  4 00011  11                                                                          
  5 00100   1                                                                          
  6 00101   1                                                                          
  7 00110   3                                                                          
  8 00111   4                                                                          
  9 01000   1                                                                          
 10 01001   8                                                                          
 11 01010   0                                                                          
 12 01011  16                                                                          
 13 01100   0                                                                          
 14 01101   3                                                                          
 15 01110   2                                                                          
 16 01111  15                                                                          
 17 10000  10                                                                          
 18 10001  29                                                                          
 19 10010  14                                                                          
 20 10011  81                                                                          
 21 10100   3                                                                          
 22 10101  28                                                                          
 23 10110  15                                                                          
 24 10111  80                                                                          
 25 11000  16                                                                          
 26 11001  56                                                                          
 27 11010  21                                                                          
 28 11011 173                                                                          
 29 11100  11                                                                          
 30 11101  61                                                                          
 31 11110  28                                                                          
 32 11111 298                                                                          
 ;
run;
/* create the LSAT_G data set assumes lsat sas dataset in work library */
data lsat_g;
   set lsat;
   if mod(_n_,2)=0 then group=1;
   else group=2;
run;
2 REPLIES 2
SAS_Rob
SAS Employee

The commented out section of your code is the way in which you can impose the equality restriction.  Other than that, I don't see anything else obviously wrong.  What I do see from looking at the diagnostic plots is that the mixing is pretty poor for several parameters and you need to run the chain longer to deal with the autocorrelation.  For some of the other ones, (like ItemOne1, b11, b13 and b14) you need more burn-in iterations.  This might account for the unexpected results. 

There is a good explanation in the documentation as to how to "read" the plots and some possible remedies.

SAS Help Center: Assessing Markov Chain Convergence

 

The other reason they might ultimately be different than the frequentist model is because your priors are, for the most part, informative.  Have you checked the results using non-informative priors?

nstdt
Quartz | Level 8

Thanks so much for your reply!

However, I feel that the ultimate solution might be related to the discussion here at SO:  https://stats.stackexchange.com/questions/453715/how-do-you-apply-constrains-on-parameters-in-bayesi...

 

Following that discussion, where they suggest that constraints in Bayesian Regression should be built-in to the conditional distributions, I introduced the following changes, which produced better- but still not accurate - results. The response probability (already a conditional probability in IRT models), P(Y=1Itheta) needs to be further modified by the additional condition, ItemOne1 = ItemOne2. That is, for this item, response probabilities behave in the same way, regardless of group-specific theta. Below theta_group1 ~ N(0,1), theta_group2 ~ N(mu,1), mu ~N(0,1). The Group2 theta has a hierachical prior.

I added the terms (theta-ItemOne2) or (theta -ItemOne1) inside the conditional probability in my second model.

I am not sure this is exactly right, though. My reasoning is that I want, for Group1, Item1(and similarly for Group2,Item1):

prob = P(Y|theta_group1), with the addition that ItemOne1=ItemOne2

            =P(Y|theta_group1,ItemOne1=ItemOne2) , which implies

prob = P(Y|theta_group2,ItemOne1) *P(Y|theta_group2,ItemOne2), since both Y idepends on both ItemOne1 and ItemOne2 being the same.

 

From Patz and Junker's MCMC notes online, here are the conditional posteriors for this IRT model (except I  additioanlly have two different groups as well. Also, I am not interested in the "a" parameter in my application). 

Screenshot 2025-07-22 150557.jpg

 

ItemOne1 and ItemOne2 are the b_i in the equation above, but now ItemOne1 is from Group1 and ITemOne2 is from Group 2, which have different "theta" distributions. However, the conditional probability needs to be equal regardless of theta. Not sure how to express this, but the closest I came was the modification I mentioned. Please see below.

prior b1: b2: ~ normal(0, var=16);
   prior a1 a2 ~ lognormal(0, var=9); 
   prior ItemOne: ~ normal(0,var=16);
   parms mu_g 0;
   prior mu_g ~ normal(0, var=1);

   random theta ~ normal(mu[group], var=1) subject=_obs_; 
   llike=0;
   do j=1 to 5;                     
      if group=1 and j=1 then do;
         prob = logistic(a1*(theta-ItemOne1)*(theta-ItemOne2));
         llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;
	  else if group=1 and j > 1 then do;
		 prob = logistic(a1*(theta-b1[j-1]));
         llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
	  end;
      else if group =2 and j = 1 then do;
				prob = logistic(a2*(theta-ItemOne2)*(theta-ItemOne1));
         		llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;
      else if group =2 and j > 1 then do;
				prob = logistic(a2*(theta-b2[j-1]));
         		llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;

 All the other sections of the code stay the same except for the changed conditional probabilities while updating ItemOne1 and ItemOne2.. Here are my new results.

nstdt_0-1753211501363.png

The actual results, from the SAS Press book, by Stone and Zhu (apparently out of print now), is below, which is different but not as bad as my first attempt. I feel like if I can get the right conditional probability equation, I can get the right values, but nore sure what that would be. Please note that "b11" and "b21" in Stone/Zhu code correspond to my ItemOne1 and ItemOne2 respectively, and their b12,b13,b14,b15 correspond to my b11, b12,b13, 14  and so on for the other b's.

nstdt_1-1753211799871.png

The code for the last model , from the SAS book by Stone and Zhu is below.

title "Unconstrained model across groups";
proc mcmc data=lsat_g outpost=lsat_bayes_1p_unconstrained seed=23 nthreads=8 nbi=5000 nmc=20000;
   array b1[5]; array b2[5]; array x[5];  
   parms a1 a2 1; parms b1: b2: 0; 
prior b1: b2: ~ normal(0, var=16);
   prior a1 a2 ~ lognormal(0, var=9); 
   random theta ~ normal(0, var=1) subject=_obs_;
   llike=0;
   do j=1 to 5;                     
      if group=1 then do;
         prob = logistic(a1*(theta-b1[j]));
         llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;
      else do;
         prob = logistic(a2*(theta-b2[j]));
         llike = llike + x[j] * log(prob) + (1 - x[j]) * log(1 - prob);
      end;
   end;    
   model general(llike);
run;

 Please do let me know if you find a solution! Thanks very much for your reply!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 504 views
  • 2 likes
  • 2 in conversation