SASuser2018
Calcite | Level 5

Dear Experts, 

 

I have a question about how to fit a regression with multiple correlated binary dependent variables. I know that SAS PROC QLIM can fit a bivariate probit model, but I can't find a similar procedure that works with three or more binary variables. I searched for "multivariate probit", but the search results point to PROC MDC. I think MDC is for choosing one of several options, where the choices are mutually exclusive for a given subject, so it doesn't fit my situation. I have typed some numbers below as an example of my data, and I look forward to hearing from the experts.

 

Thank you.  

 

Dependent variables: Y1, Y2 and Y3 -- all binary, and they are correlated.

X1 -- X101 -- independent variables, either continuous or discrete. Each Y may correspond to a different combination of Xs; for example, X1, X3, X10 etc. are in the Y1 equation, X2, X8, X100 etc. are in the Y2 equation, and X3, X7, X52 etc. are in the Y3 equation.

In total I have 10,000 observations.  

 

 

Y1  Y2  Y3  X1  X2  X3  ...  X100  X101
1   0   1   10  11  0   ...  21    32
0   1   0   9   7   1   ...  11    23
1   1   0   1   7   1   ...  13    27

 

4 REPLIES
PaigeMiller
Diamond | Level 26

I don't really understand your problem. It sounds like you want a single model fit that is appropriate for all 3 Y variables and also takes into account the correlation between the Y variables ... but then you say:

 

"each Y may correspond to a different combination of X, for example, X1, X3, X10 etc. are in Y1 equation, X2, X8, X100 etc. are in Y2 equation and X3, X7, X52 etc. are in Y3 equation"

 

As I understand what you are saying, this is no longer a single model. This is three separate model fits, which you can do using PROC LOGISTIC.

 

 

--
Paige Miller
SASuser2018
Calcite | Level 5

Perhaps I didn't explain my problem clearly. Sorry about the confusion. What I want is one model for 3 Ys: all 3 Ys are binary, and the error terms are correlated. Also, the independent variables in the 3 equations are not the same. The model could be described as follows:

 

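\[
Y_1 = X_1 B_1 + e_1, \qquad Y_2 = X_2 B_2 + e_2, \qquad Y_3 = X_3 B_3 + e_3
\]

and

\[
(e_1, e_2, e_3 \mid X) \sim N\left(
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
1 & \rho_{12} & \rho_{13} \\
\rho_{12} & 1 & \rho_{23} \\
\rho_{13} & \rho_{23} & 1
\end{pmatrix}
\right)
\]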
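To make the model above concrete, here is a minimal Python sketch (not SAS) of its data-generating process. It assumes the usual latent-variable reading of a probit model, where the observed Y_j is 1 when X_j*B_j + e_j is positive and 0 otherwise. The function names, the coefficient values, and the randomly drawn covariates are all hypothetical and only serve as an illustration.

```python
import math
import random

def cholesky3(r12, r13, r23):
    """Lower-triangular Cholesky factor of a 3x3 correlation matrix."""
    l22 = math.sqrt(1.0 - r12 ** 2)
    l32 = (r23 - r13 * r12) / l22
    l33 = math.sqrt(1.0 - r13 ** 2 - l32 ** 2)
    return [[1.0, 0.0, 0.0], [r12, l22, 0.0], [r13, l32, l33]]

def simulate_trivariate_probit(n, betas, rho, seed=0):
    """Draw n observations of (y1, y2, y3).

    betas: three coefficient lists, intercept first (hypothetical values).
    rho:   (rho12, rho13, rho23), the error correlations.
    Each equation gets its own standard-normal covariates, mirroring the
    different X sets per equation described in the post.
    """
    rng = random.Random(seed)
    L = cholesky3(*rho)
    rows = []
    for _ in range(n):
        # Correlated errors e = L z, with z independent standard normals.
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        e = [sum(L[j][k] * z[k] for k in range(j + 1)) for j in range(3)]
        y = []
        for j in range(3):
            x = [1.0] + [rng.gauss(0.0, 1.0) for _ in betas[j][1:]]
            latent = sum(b * v for b, v in zip(betas[j], x)) + e[j]
            y.append(1 if latent > 0 else 0)
        rows.append(tuple(y))
    return rows

if __name__ == "__main__":
    data = simulate_trivariate_probit(
        5, [[0.0, 1.0], [0.0, 0.5], [0.0, -0.5, 0.3]], (0.5, 0.3, 0.2)
    )
    for row in data:
        print(row)
```

Fitting three separate binary models to data like this would recover each B_j but would say nothing about rho12, rho13 and rho23, which is the part of the model the question is about.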

 

Basically, I want to use something similar to PROC QLIM (which handles 2 binary outcomes), but it needs to work for at least 3 binary outcomes.
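With three or more outcomes, each observation's likelihood contribution is a multivariate normal rectangle probability, which is commonly approximated by simulation. The sketch below uses the GHK (Geweke-Hajivassiliou-Keane) simulator in plain Python; it is not SAS code and does not describe how any SAS procedure works internally. The function names, the number of draws, and the example index values are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def cholesky3(r12, r13, r23):
    """Lower-triangular Cholesky factor of a 3x3 correlation matrix."""
    l22 = math.sqrt(1.0 - r12 ** 2)
    l32 = (r23 - r13 * r12) / l22
    l33 = math.sqrt(1.0 - r13 ** 2 - l32 ** 2)
    return [[1.0, 0.0, 0.0], [r12, l22, 0.0], [r13, l32, l33]]

def ghk_probability(v, y, rho, draws=2000, seed=0):
    """Estimate P(Y1=y[0], Y2=y[1], Y3=y[2]) by GHK simulation.

    v:   linear indices (X1*B1, X2*B2, X3*B3) for one observation.
    y:   observed 0/1 outcomes.
    rho: (rho12, rho13, rho23), the error correlations.
    """
    rng = random.Random(seed)
    L = cholesky3(*rho)
    total = 0.0
    for _ in range(draws):
        z = []
        weight = 1.0
        for j in range(3):
            partial = sum(L[j][k] * z[k] for k in range(j))
            # Y_j = 1 means e_j > -v_j, which bounds z_j from below;
            # Y_j = 0 bounds it from above. cut is the CDF at that bound.
            cut = STD_NORMAL.cdf((-v[j] - partial) / L[j][j])
            u = rng.random()
            if y[j] == 1:
                weight *= 1.0 - cut
                p = cut + u * (1.0 - cut)
            else:
                weight *= cut
                p = u * cut
            # Keep p strictly inside (0, 1) so inv_cdf is defined.
            p = min(max(p, 1e-12), 1.0 - 1e-12)
            z.append(STD_NORMAL.inv_cdf(p))
        total += weight
    return total / draws

if __name__ == "__main__":
    print(ghk_probability((0.3, -0.2, 0.5), (1, 0, 1), (0.5, 0.3, 0.2)))
```

Summing the log of this probability over all observations gives a simulated log-likelihood that a general-purpose optimizer could maximize over B1, B2, B3 and the three correlations.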

 

Thank you

 

 

 

PaigeMiller
Diamond | Level 26

I'll stick with my original answer. Once you require different X variables in the models for Y1, Y2 and Y3, I don't see how you can make this one single model; it becomes 3 models.

--
Paige Miller
StatDave
SAS Super FREQ

See the Frequently-Asked for Statistics link in the Important Links list on the right of the SAS Statistical Procedures Community page. In that list you will find a "Multivariate logit model" link to a SAS note that describes many available types of logistic models, including the multivariate model (mentioned near the end). Note that errors for logit models are not normally distributed. 
