03-04-2018 12:31 PM
I have a question about how to fit a regression with multiple correlated binary dependent variables. I know that SAS PROC QLIM can fit a bivariate probit model, but I can't find a similar procedure that works with three or more binary variables. Searching for "multivariate probit" turns up PROC MDC, but I think MDC is meant for choosing one of several options, where the choices are mutually exclusive for each subject, so it doesn't fit my situation. I have typed some numbers below as an example of my data, and I look forward to hearing from the experts.
Dependent variables: Y1, Y2 and Y3, all binary and correlated with one another.
Independent variables: X1 through X101, which can be either continuous or discrete. Each Y may correspond to a different combination of X variables; for example, X1, X3, X10, etc. are in the Y1 equation, X2, X8, X100, etc. are in the Y2 equation, and X3, X7, X52, etc. are in the Y3 equation.
In total I have 10,000 observations.
Y1  Y2  Y3  X1  X2  X3  ...  X100  X101
1   0   1   10  11  0   ...  21    32
0   1   0   9   7   1   ...  11    23
1   1   0   1   7   1   ...  13    27
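To try out candidate procedures before running them on the real data, a data set of this general shape can be simulated. The sketch below is a minimal illustration in Python (standard library only, not SAS). It assumes one hypothetical covariate per equation, made-up coefficients in `BETAS`, and made-up error correlations in `CORR`; none of these values come from the post.

```python
import math
import random

def cholesky(a):
    """Lower-triangular factor L of a small symmetric positive-definite matrix (L * L^T = a)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate(n, betas, corr, seed=0):
    """Return n rows of (y, x): three binary outcomes and three covariates."""
    rng = random.Random(seed)
    low = cholesky(corr)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]  # one hypothetical covariate per equation
        z = [rng.gauss(0, 1) for _ in range(3)]  # independent standard normals
        # Correlated errors: e = L z has covariance matrix corr
        e = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(3)]
        # Each outcome is 1 when its latent index is positive
        y = [1 if betas[i] * x[i] + e[i] > 0 else 0 for i in range(3)]
        rows.append((y, x))
    return rows

def phi(rows, i, j):
    """Pearson correlation between binary outcomes i and j."""
    n = len(rows)
    pi = sum(r[0][i] for r in rows) / n
    pj = sum(r[0][j] for r in rows) / n
    pij = sum(r[0][i] * r[0][j] for r in rows) / n
    return (pij - pi * pj) / math.sqrt(pi * (1 - pi) * pj * (1 - pj))

# Assumed values for illustration only
CORR = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.2],
        [0.3, 0.2, 1.0]]
BETAS = [0.5, -0.4, 0.8]

rows = simulate(10_000, BETAS, CORR, seed=1)
print("first row:", rows[0])
print("corr(Y1, Y2):", round(phi(rows, 0, 1), 3))
```

Because the errors are correlated, the simulated outcomes are correlated even though each equation uses a different covariate.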
03-04-2018 02:08 PM - edited 03-04-2018 02:41 PM
I don't really understand your problem. It sounds like you want a single model fit that is appropriate for all 3 Y variables and that also takes into account the correlation between the Y variables ... but then you say:
each Y may correspond to a different combination of X, for example, X1, X3, X10 etc. are in the Y1 equation, X2, X8, X100 etc. are in the Y2 equation and X3, X7, X52 etc. are in the Y3 equation
As I understand what you are saying, this is no longer a single model. This is three separate model fits, which you can do using PROC LOGISTIC.
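The idea of three separate fits can be illustrated outside SAS. The sketch below, in Python with only the standard library, fits a one-covariate logistic regression by Newton-Raphson and applies it to each outcome on its own. The helper `fit_logistic`, the toy data, and the coefficient values in `truth` are all hypothetical, and the toy outcomes are generated independently of one another.

```python
import math
import random

def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p          # score for the intercept
            g1 += (y - p) * x    # score for the slope
            h00 += w             # entries of the information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def toy_outcome(rng, xs, b0, b1):
    """Draw binary outcomes with P(y=1) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return [1 if rng.random() < 1.0 / (1.0 + math.exp(-(b0 + b1 * x))) else 0
            for x in xs]

rng = random.Random(42)
n = 2000
# One hypothetical covariate per outcome, standing in for each equation's own X set
designs = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("Y1", "Y2", "Y3")}
truth = {"Y1": (0.3, 1.0), "Y2": (-0.5, 0.7), "Y3": (0.0, -1.2)}  # assumed values

fits = {}
for name, xs in designs.items():
    ys = toy_outcome(rng, xs, *truth[name])
    fits[name] = fit_logistic(xs, ys)  # each fit sees only its own outcome
    print(name, [round(b, 2) for b in fits[name]])
```

Each fit uses only its own Y, so this approach estimates the three sets of coefficients but says nothing about the correlation between the outcomes.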
03-04-2018 06:44 PM
Perhaps I didn't explain my problem clearly. Sorry about the confusion. What I want is one model for all 3 Ys: all 3 Ys are binary, the error terms are correlated, and the independent variables in the 3 equations are not the same. The model can be described as follows:
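Y1 = X1*B1 + e1;
Y2 = X2*B2 + e2;
Y3 = X3*B3 + e3;
(here X1, X2 and X3 stand for the sets of independent variables in each equation, and B1, B2 and B3 for their coefficients)
and (e1, e2, e3 | X) ~ N((0, 0, 0), S), where the correlation matrix S is
1      rho12  rho13
rho12  1      rho23
rho13  rho23  1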
Basically, I want a procedure similar to PROC QLIM (which handles 2 binary outcomes) that works for at least 3 binary outcomes.
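One way to write this model more formally is as a trivariate probit with latent variables. The latent variables $y_j^{*}$, the indicator notation, and the vector symbols $x_j$ and $\beta_j$ are notational assumptions added here for illustration; they are not taken from the SAS documentation.

```latex
\begin{aligned}
y_j^{*} &= x_j^{\top}\beta_j + e_j, \qquad
y_j = \mathbf{1}\{\, y_j^{*} > 0 \,\}, \qquad j = 1, 2, 3, \\[4pt]
(e_1, e_2, e_3)^{\top} \mid x &\sim
N\!\left(
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
1 & \rho_{12} & \rho_{13} \\
\rho_{12} & 1 & \rho_{23} \\
\rho_{13} & \rho_{23} & 1
\end{pmatrix}
\right)
\end{aligned}
```

The unit variances are the usual normalization, since only the sign of each latent variable is observed. Setting all three correlations to zero reduces the model to three independent probit fits.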
03-05-2018 08:40 AM - edited 03-05-2018 08:41 AM
I'll stick with my original answer. Once you require different X variables in the models for Y1, Y2 and Y3, I don't see how you can make this one single model; it becomes 3 models.
03-05-2018 11:07 AM
See the Frequently Asked-for Statistics link in the Important Links list on the right of the SAS Statistical Procedures Community page. In that list you will find a "Multivariate logit model" link to a SAS note that describes many available types of logistic models, including the multivariate model (mentioned near the end). Note that errors for logit models are not normally distributed.