DaveatPitt
Calcite | Level 5

Hello, I'm analyzing some data and need to test whether a 2SLS or an OLS model is more appropriate.

I understand that I need to run the Hausman specification test and the syntax provided in the SAS documentation looks straightforward but I'm not sure how to execute the code.

The data contain a dependent variable Y, a mediator M, and two instruments X and Z, and their interaction.

The model I'm testing is Y = M + X + Z + X*Z.

The code provided in the SAS help guide under 'Hausman specification test' is provided below. My question is what needs to be inputted. What do y1 and y2 represent below? Does p represent predicted values? I assume interc refers to an interaction term? And what is d2?

proc model data=one out=fiml2;
   endogenous y1 y2;
   y1 = py2 * y2 + px1 * x1 + interc;
   y2 = py1 * y1 + pz1 * z1 + d2;
   fit y1 y2 / ols 2sls hausman;
   instruments x1 z1;
run;

Any help and even an example would be much appreciated.

Thanks!

gergely_batho
SAS Employee

Hi,

You are using the example from here:

SAS/ETS(R) 13.2 User's Guide

You can generate the example data set (one) using this link:

SAS/ETS User's Guide Example Programs

y1 and y2 are endogenous variables; x1 and z1 are instrumental variables; py1, py2, px1, pz1, d2, and interc are all parameters.
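As a rough sketch, the documentation example can be annotated with those roles. The comments are added here for illustration and are not part of the SAS documentation; reading `interc` and `d2` as the constant terms of the two equations is an interpretation based on how they enter the equations.

```sas
proc model data=one out=fiml2;
   endogenous y1 y2;                    /* y1, y2: endogenous variables               */
   y1 = py2 * y2 + px1 * x1 + interc;   /* py2, px1: slope parameters;                */
                                        /* interc: constant term, not an interaction  */
   y2 = py1 * y1 + pz1 * z1 + d2;       /* py1, pz1: slope parameters;                */
                                        /* d2: constant term of the second equation   */
   fit y1 y2 / ols 2sls hausman;        /* fit by OLS and 2SLS, request Hausman test  */
   instruments x1 z1;                   /* x1, z1: instrumental variables             */
run;
```

The `p` prefix is only a naming convention for parameters here; it does not denote predicted values.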

DaveatPitt
Calcite | Level 5

Hi Gergely,

Thanks for pointing me to the sample data, which is helpful.

One other question - how does one modify the syntax for a third variable? For example, what if there were two instrumental variables PLUS their interaction in the model.  How would the two lines of the syntax be changed? Does 'interc' capture that? Another way to ask this might be, what if there were 3 instruments, not 2?

Using the varnames from the syntax above, the model I'm trying to run is:

y1 = y2 + x1 + z1 + x1*z1.

is there a way to modify the syntax to accomplish that?

thanks,

Dave

gergely_batho
SAS Employee

When you specify a model in PROC MODEL, you should explicitly use parameters:

y1 = y2par * y2 + x1par * x1 + z1par * z1 + xzpar * x1*z1 + intercept;

In previous examples interc was the intercept parameter, not the interaction.

I suggest you read the documentation of PROC MODEL, especially the parts about 2SLS and instrumental variables.

If you want to include the interaction as an instrumental variable, you can create it in an assignment statement, then use it in the instruments statement. (xz=x1*z1; instruments xz;)

Do you have only 1 equation?

kessler
SAS Employee

Dave,

It looks like Gergely did a good job of showing you how to specify models in PROC MODEL and how to specify an instrument that represents the interaction between the variables X and Z.

To run the Hausman test for your model, you could use something like the following example, in which M is instrumented using X, Z, and X*Z. In the output you should get a Hausman specification test statistic of 7.65 with a p-value of 0.0218. Therefore, at a 5% significance level you would reject the null hypothesis that the OLS estimator is consistent for this model. I hope this helps.

Marc

data d;
   call streaminit (1);
   do i = 1 to 2000;
      r = rand('normal');
      x = r + rand('normal');
      z = r + rand('normal');
      m = r + rand('normal');
      y =  3*m + 1 + rand('normal');
      output;
   end;
run;

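proc model data=d;
   y = pm*m + interc;            /* pm: slope on m; interc: intercept */
   xz = x*z;                     /* interaction term used as an instrument */
   instruments x z xz;
   fit y / ols 2sls hausman;     /* compare OLS and 2SLS with the Hausman test */
quit;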
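
For intuition, a related regression-based check (often called a Durbin-Wu-Hausman or control-function test) can be sketched with PROC REG. This is a different procedure from the HAUSMAN option above, so its statistic is not expected to match the value PROC MODEL reports. It assumes the data set d created above; the names d_xz, stage1, and m_resid are made up for illustration.

```sas
/* Build the interaction instrument (d_xz is an illustrative name) */
data d_xz;
   set d;
   xz = x*z;
run;

/* First stage: regress the endogenous regressor m on the instruments
   and save the residuals as m_resid */
proc reg data=d_xz noprint;
   model m = x z xz;
   output out=stage1 r=m_resid;
run;
quit;

/* Augmented regression: add the first-stage residual to the OLS model.
   A significant t test on m_resid suggests m is endogenous, i.e. that
   OLS is inconsistent and 2SLS is preferable. */
proc reg data=stage1;
   model y = m m_resid;
run;
quit;
```

The coefficient on m in the augmented regression should equal the 2SLS estimate, although the reported standard errors are not the 2SLS standard errors.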

DaveatPitt
Calcite | Level 5

Hi Marc,

Thanks for the help.

Sorry to be a bit slow - I'm not that familiar with 2SLS.

Here's what I ran for 2SLS in PROC SYSLIN; I'm now trying to run the Hausman test in PROC MODEL. Even with the guidance in this thread, I still can't get it to work:

Proc SYSLIN data= one 2SLS FIRST;
   Endogenous m ;
   Instruments x z x*z;
   Model y= m x*z/ OVERID DW PLOT;
Run;


thanks,


Dave

kessler
SAS Employee

Dave,

To do a 2SLS estimation equivalent to the one I provided earlier using PROC MODEL, you could do the following in PROC SYSLIN. The additional DATA step is necessary because I don't believe PROC SYSLIN supports the 'x*z' syntax used in the code you provided.

Marc

data one;
   set d;
   xz = x*z;
run;

proc syslin data=one 2sls first;
   endogenous m;
   instruments x z xz;
   model y = m;
run;

DaveatPitt
Calcite | Level 5

Thanks all  - got the models to work.
