Help with Instrumental Variable approach with proc syslin

anujmehta · Posted 03-28-2022 09:46 PM

I am attempting to perform an IV analysis to help account for unmeasured confounding in an observational study. Here is the summary of my dataset:

Outcome: wage (continuous value - log transformed)

Primary exposure/independent variable: education - binary 1/0

IV: near - binary 1/0 (meets all criteria for a reasonable IV)

var1-var8 - covariates/measured confounders all of which are 1/0 indicator variables, not associated with the IV.

The goal is to estimate the association of education and wage accounting for measured confounders and also unmeasured confounding via an IV analysis.

I want to use a two-stage least squares (2SLS) approach. I can do this in two steps with proc reg: the first model regresses education on near, I output the predicted values, and in the second step I regress wage on pred_education.
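The two-step proc reg approach described above could be sketched roughly as follows. This is only an illustration: the dataset names mydata and stage1 are placeholders that do not appear in the post, and including var1-var8 in both steps is an assumption. Note that standard errors from a manual second step are generally not valid 2SLS standard errors, because they ignore that pred_education is itself estimated.

```sas
/* Sketch only: mydata and stage1 are placeholder dataset names */

/* Step 1: regress the exposure on the instrument (and, by assumption, the covariates) */
proc reg data=mydata noprint;
   model education = near var1-var8;
   output out=stage1 p=pred_education;   /* save predicted education */
run;
quit;

/* Step 2: regress the outcome on the predicted exposure and the covariates */
proc reg data=stage1;
   model wage = pred_education var1-var8;
run;
quit;
```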

I am running into some peculiar results when I try this in proc syslin. I want to use proc syslin to account for possibly correlated error terms across the two models and to carry the SE of the estimates for pred_education forward into step 2.

In using proc syslin I use the following syntax with double adjustments for the covariates var1-var8:

Approach 1

/* mydata: placeholder dataset name */
proc syslin data=mydata 2sls;
   endogenous education;
   instruments near;
   stepone: model education = near var1-var8;
   steptwo: model wage = education var1-var8;
run;

The model runs fine and I get estimates for each model. In the first model, the parameter estimate for near=0.067, SE=0.219, p=0.0021. In the second model, the parameter estimate for education = 0.62, SE=0.2324, p=0.0077. I also get estimates for var1-var8, but I am leaving those out of this post. These differ from what I get with a two-step proc reg, although that is not surprising.

What is surprising is what happened when I tried to adjust for measured confounders in only 1 step with the following code:

Approach 2

/* mydata: placeholder dataset name */
proc syslin data=mydata 2sls;
   endogenous education;
   instruments near;
   stepone: model education = near;
   steptwo: model wage = education var1-var8;
run;

In the stepone model, var1-var8 have been removed, so we only adjust for them in steptwo. When I run this code I get the following estimates:

stepone: parameter estimate for near = 0.1030, SE=0.0226, p<0.0001 [no estimates for var1-var8 as they were not included]

steptwo: parameter estimate for education = 0.62, SE=0.2324, p=0.0077, the same as the first approach with covariate adjustment in both steps. Moreover, the parameter estimates for var1-var8 are the same between approach 1 and 2.

My issue is that if I don't include var1-var8 in stepone, I would expect the predictions for education to change, which should affect the second model in steptwo. My question to the community: why are the results of steptwo the same in Approach 1 and Approach 2 even though stepone differs between the two?

Any insight would be greatly appreciated.

sbxkoenk · Posted 03-29-2022 10:21 AM

This is an econometrics question, so I have moved the topic to the "SAS Forecasting and Econometrics" board.

Koen
