
## Endogeneity in linear mixed models

Dear All,

I am estimating a multilevel (hierarchical) linear model using PROC MIXED. In simplified form (omitting subscripts and most independent variables), the model is

Y = b0 + b1 X1 + b2 X2 + u + e

where u denotes random effects and e error terms.

The SAS code is as follows:

```
proc mixed data=mydata;
   class i j k m;
   model Y = X1 X2 m / ddfm=bw;
   random intercept / subject=j      type=vc;
   random intercept / subject=k      type=vc;
   random intercept / subject=j*k    type=vc;
   random intercept / subject=i(j*k) type=vc;
run;
```

I suspect that Y and X1 are endogenous because they are simultaneously determined. However, X2 is exogenous. My questions are:

1. How do I test whether Y and X1 are endogenous? I know that there are endogeneity tests in PROC QLIM and other procedures; however, I am not sure how to add random effects in these procedures. Could I use these endogeneity tests without random effects?

2. Should I test for correlation between X1 and e only? Or should I also test for correlation between X1 and u (random effects)?

3. If I were to estimate the following system of equations using 2SLS in PROC SYSLIN, would I still have to worry about endogeneity?

Eq.1   Y = b0 + b1 X1 + b2 X2 + u + e1

Eq.2   X1 = c0 + c2 X2 + c3 Z + v + e2

where Z is an instrumental variable, u and v are random effects, and e1 and e2 are error terms.

4. If yes, how do I include random effects in PROC SYSLIN?

Any and all responses will be greatly appreciated. Many thanks in advance!

Sincerely,

Cuneyt

SAS Employee

## Re: Endogeneity in linear mixed models

Dear Cuneyt,

You are right that PROC QLIM has an endogeneity test. You can also model random effects in PROC QLIM (both random intercepts, as in your case, and random coefficients) for a single SUBJECT value by using the RANDOM statement. However, when you use the RANDOM statement you can have only one MODEL statement. You therefore cannot model your reduced-form equation (X1 = c0 + c2 X2 + c3 Z + v + e2) together with your structural equation (Y = b0 + b1 X1 + b2 X2 + u + e1), and both are needed for the endogeneity test. Still, you should go ahead and test for endogeneity of X1 in PROC QLIM even though you cannot model the random effects: if X1 is correlated with u, with e1, or with both, this will show up in the test and tell you that your main model has an endogeneity problem. The only thing you won't know is which error component X1 is correlated with, because PROC QLIM will treat them as a single error term (say, w = u + e1).
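If you would like a check that keeps the random effects in the model, one possibility is a regression-based (control-function) test in the spirit of Durbin-Wu-Hausman. This is a different technique from the PROC QLIM test described above, and it is only a rough sketch: it assumes Z is a valid instrument, it uses a plain fixed-effects first stage, and the names `stage1` and `x1_resid` are made up for illustration.

```sas
/* Stage 1 (sketch): regress the suspect regressor X1 on the exogenous     */
/* variables and the instrument Z, and keep the residuals.                 */
/* stage1 and x1_resid are hypothetical names.                             */
proc glm data=mydata noprint;
   class m;
   model X1 = X2 Z m;
   output out=stage1 r=x1_resid;
run;
quit;

/* Stage 2 (sketch): refit the original mixed model with the stage-1       */
/* residual added as an extra regressor. SOLUTION prints the fixed-effect  */
/* estimates so the coefficient on x1_resid can be inspected.              */
proc mixed data=stage1;
   class i j k m;
   model Y = X1 X2 m x1_resid / ddfm=bw solution;
   random intercept / subject=j      type=vc;
   random intercept / subject=k      type=vc;
   random intercept / subject=j*k    type=vc;
   random intercept / subject=i(j*k) type=vc;
run;
```

A clearly nonzero coefficient on `x1_resid` would be read as evidence that X1 is endogenous. Treat the result as indicative only, since the first stage here ignores the grouping structure.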

For your second question: in either case you have an endogeneity problem, because one of the assumptions u|(X1, X2) ~ N(0, sigma_u^2) or e1|(X1, X2, u) ~ N(0, sigma_e1^2) is violated, and that implies endogeneity. The endogeneity test in PROC QLIM will therefore give you an answer.
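To spell the reasoning out, write w for the combined error u + e1 (a label chosen here only for convenience). A procedure that ignores the random effects sees only w, and by linearity of covariance:

$$
w = u + e_1, \qquad \operatorname{Cov}(X_1, w) = \operatorname{Cov}(X_1, u) + \operatorname{Cov}(X_1, e_1)
$$

So a correlation between X1 and either component generally shows up as a correlation between X1 and w, unless the two terms happen to cancel exactly. What the combined test cannot do is attribute the correlation to u or to e1.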

For your third question: no, you don't need to worry about endogeneity, because you are already correcting for it by modelling the structural and reduced-form equations together (as you would in PROC QLIM). The only limitation is that you won't be modelling the random effects. That last point also answers your fourth question.
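As a rough illustration, a 2SLS fit of Eq.1 in PROC SYSLIN might look like the sketch below. It assumes Z is a valid instrument for X1 and leaves out the random effects u and v. The class variable m from your PROC MIXED code is also omitted, because PROC SYSLIN has no CLASS statement and m would first have to be coded as dummy variables. The equation label `structural` is just an illustrative name.

```sas
/* Sketch only: 2SLS for Eq.1 with X1 treated as endogenous.               */
/* The INSTRUMENTS statement lists the exogenous variables used in the     */
/* first stage (X2 and the instrument Z). No random effects are modelled.  */
proc syslin data=mydata 2sls;
   endogenous X1;
   instruments X2 Z;
   structural: model Y = X1 X2;
run;
```

With the 2SLS option you do not need to write Eq.2 as a separate MODEL statement; the first-stage regression of X1 on the instruments is carried out internally.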

I hope this helps,

Best regards,

Gunce
