Hausman-Taylor estimator for SAS

This example comes from the 14.1 release of SAS/ETS®. One of the assumptions required for consistency of the random-effects estimator for panel data is that the unobserved effect is uncorrelated with the observed covariates. In practice, this assumption is unlikely to hold, because many covariates in a model of human behavior are likely to be correlated with the unobserved effect. The typical way of dealing with this problem is to use the fixed-effects estimator, which estimates the unobserved effects explicitly and yields consistent estimates even when the unobserved effect is correlated with the regressors. The fixed-effects estimator, while consistent, is quite restrictive in other ways. Most notably, any regressor that does not vary over time within each sampling unit is dropped from the analysis.

So what is an alternative? In the newest release of SAS/ETS, the PANEL procedure introduces the Hausman-Taylor estimator, which offers a middle ground between the fixed-effects and random-effects estimators. It allows for estimation of additional parameters because it relaxes certain restrictions on how the unobserved effect and the covariates are correlated. Enjoy this preview from the upcoming release.

Cornwell and Rupert (1988) analyze data from the Panel Study of Income Dynamics (PSID), an income study of 595 individuals over the seven-year period 1976–1982 inclusive. Of particular interest is the effect of additional schooling on wages. Our analysis replicates that of Baltagi (2008, sec. 7.5), where it is surmised that covariate correlation with individual effects makes a standard random-effects model inadequate.

data psid;
   input id t lwage wks south smsa ms exp exp2 occ ind union fem blk ed;
   label id    = 'Person ID'
         t     = 'Time'
         lwage = 'Log(wages)'
         wks   = 'Weeks worked'
         south = '1 if resides in the South'
         smsa  = '1 if resides in SMSA'
         ms    = '1 if married'
         exp   = 'Years full-time experience'
         exp2  = 'exp squared'
         occ   = '1 if blue-collar occupation'
         ind   = '1 if manufacturing'
         union = '1 if union contract'
         fem   = '1 if female'
         blk   = '1 if black'
         ed    = 'Years of education';
   datalines;
1 1 5.5606799126 32 1 0 1 3 9 0 0 0 0 0 9
1 2 5.7203102112 43 1 0 1 4 16 0 0 0 0 0 9
1 3 5.9964499474 40 1 0 1 5 25 0 0 0 0 0 9
1 4 5.9964499474 39 1 0 1 6 36 0 0 0 0 0 9
1 5 6.0614600182 42 1 0 1 7 49 0 1 0 0 0 9
1 6 6.1737899780 35 1 0 1 8 64 0 1 0 0 0 9
1 7 6.2441701889 32 1 0 1 9 81 0 1 0 0 0 9
2 1 6.1633100510 34 0 0 1 30 900 1 0 0 0 0 11
2 2 6.2146100998 27 0 0 1 31 961 1 0 0 0 0 11
2 3 6.2634000778 33 0 0 1 32 1024 1 1 1 0 0 11
2 4 6.5439100266 30 0 0 1 33 1089 1 1 0 0 0 11
2 5 6.6970300674 30 0 0 1 34 1156 1 1 0 0 0 11
2 6 6.7912201881 37 0 0 1 35 1225 1 1 0 0 0 11
2 7 6.8156399727 30 0 0 1 36 1296 1 1 0 0 0 11

... more lines ...

You begin by fitting a one-way random effects model.

proc panel data=psid;
   id id t;
   model lwage = wks south smsa ms exp exp2 occ ind union fem blk ed / ranone;
run;

The output is shown in Output 20.5.1. The coefficient on variable ED estimates that an additional year of schooling is associated with about a 10.7% increase in wages. However, the results of the Hausman test for random effects show a serious violation of the random-effects assumptions, namely that the regressors are independent of both error components.
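Because the dependent variable is the log of wages, reading the coefficient directly as a percentage is a first-order approximation. A short worked version of that arithmetic for the ED estimate, holding the other regressors fixed, might look like this (the symbol for the estimate is introduced here for illustration only):

```latex
\begin{aligned}
\hat\beta_{\mathrm{ED}} &= 0.10742 \\
100\,\hat\beta_{\mathrm{ED}} &\approx 10.7\% && \text{(first-order approximation)} \\
100\,\bigl(e^{\hat\beta_{\mathrm{ED}}} - 1\bigr) &= 100\,\bigl(e^{0.10742} - 1\bigr) \approx 11.3\% && \text{(exact proportional change)}
\end{aligned}
```

For coefficients of this size the two readings differ by well under one percentage point, so the approximate figure is adequate for interpretation.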

Output 20.5.1: One-Way Random Effects Estimation

The PANEL Procedure

Fuller and Battese Variance Components (RanOne)

Dependent Variable: lwage (Log(wages))

Estimation Method	RanOne
Number of Cross Sections	595
Time Series Length	7

Variance Component for Cross Sections	0.100553
Variance Component for Error	0.023102

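Hausman Test for Random Effects
DF	m Value	Pr > m
9	5288.98	<.0001

Parameter Estimates
Variable	DF	Estimate	Standard Error	t Value	Pr > |t|	Label
Intercept	1	4.030811	0.1044	38.59	<.0001	Intercept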
wks	1	0.000954	0.000740	1.29	0.1971	Weeks worked
south	1	-0.00788	0.0281	-0.28	0.7795	1 if resides in the South
smsa	1	-0.02898	0.0202	-1.43	0.1517	1 if resides in SMSA
ms	1	-0.07067	0.0224	-3.16	0.0016	1 if married
exp	1	0.087726	0.00281	31.27	<.0001	Years full-time experience
exp2	1	-0.00076	0.000062	-12.31	<.0001	exp squared
occ	1	-0.04293	0.0162	-2.65	0.0081	1 if blue-collar occupation
ind	1	0.00381	0.0172	0.22	0.8242	1 if manufacturing
union	1	0.058121	0.0169	3.45	0.0006	1 if union contract
fem	1	-0.30791	0.0572	-5.38	<.0001	1 if female
blk	1	-0.21995	0.0660	-3.33	0.0009	1 if black
ed	1	0.10742	0.00642	16.73	<.0001	Years of education

An alternative could be a fixed-effects (FIXONE) model, but that would not permit estimation of the coefficient for ED, which does not vary within individuals. A compromise is the Hausman-Taylor model, for which you stipulate a set of covariates that are correlated with the individual effects (but uncorrelated with the observation-level errors). You specify the correlated variables with the CORRELATED= option in the INSTRUMENTS statement.
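To see this limitation for yourself, a minimal sketch (assuming the psid data set created above) is to refit the same specification with the FIXONE option. The variables fem, blk, and ed are constant within each person, so their coefficients should not be estimable in this fit:

```sas
/* One-way fixed-effects fit of the same specification.           */
/* fem, blk, and ed do not vary within id, so they are collinear  */
/* with the individual effects and cannot be estimated here.      */
proc panel data=psid;
   id id t;
   model lwage = wks south smsa ms exp exp2 occ ind union fem blk ed / fixone;
run;
```

Comparing the time-varying coefficients from this fit with the random-effects estimates gives an informal sense of what the Hausman test is measuring.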

proc panel data=psid;
   id id t;
   instruments correlated = (wks ms exp exp2 union ed);
   model lwage = wks south smsa ms exp exp2 occ ind union fem blk ed / htaylor;
run;

The results are shown in Output 20.5.2. The table of parameter estimates includes an added column, "Type", which identifies the regressors that are assumed to be correlated with the individual effects (C) and the regressors that do not vary within cross sections (TI). It was stated previously that the Hausman-Taylor model is a compromise between fixed and random effects, and you can think of the compromise this way: You want to fit a random-effects model, but the correlated (C) variables make that model invalid. Thus you fall back on the consistent fixed-effects model, but then the time-invariant (TI) variables become the problem because they are dropped from that model. The solution is to use the Hausman-Taylor estimator.

The estimation results show that an additional year of schooling is now associated with a 13.8% increase in wages. Also presented is a Hausman test that compares this model to the fixed-effects model. As was the case previously when you fit the random-effects model, you can think of the Hausman test as a referendum on the assumptions you are making. For this estimation, the test does not reject those assumptions (p = 0.1539), so your choice of which variables to treat as correlated seems adequate. It also seems to hold true that any correlation present is with the individual-level effects, and not the observation-level errors.

Output 20.5.2: Hausman-Taylor Estimation

The PANEL Procedure

Hausman and Taylor Model for Correlated Individual Effects (HTaylor)

Dependent Variable: lwage (Log(wages))

Variance Component for Cross Sections	0.886993
Variance Component for Error	0.023044

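Hausman Test against Fixed Effects
Coefficients	DF	m Value	Pr > m
9	3	5.26	0.1539

Parameter Estimates
Variable	Type		DF	Estimate	Standard Error	t Value	Pr > |t|	Label
Intercept			1	2.912726	0.2837	10.27	<.0001	Intercept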
wks	C		1	0.000837	0.000600	1.40	0.1627	Weeks worked
south			1	0.00744	0.0320	0.23	0.8159	1 if resides in the South
smsa			1	-0.04183	0.0190	-2.21	0.0274	1 if resides in SMSA
ms	C		1	-0.02985	0.0190	-1.57	0.1159	1 if married
exp	C		1	0.113133	0.00247	45.79	<.0001	Years full-time experience
exp2	C		1	-0.00042	0.000055	-7.67	<.0001	exp squared
occ			1	-0.0207	0.0138	-1.50	0.1331	1 if blue-collar occupation
ind			1	0.013604	0.0152	0.89	0.3720	1 if manufacturing
union	C		1	0.032771	0.0149	2.20	0.0280	1 if union contract
fem		TI	1	-0.13092	0.1267	-1.03	0.3014	1 if female
blk		TI	1	-0.28575	0.1557	-1.84	0.0665	1 if black
ed	C	TI	1	0.137944	0.0212	6.49	<.0001	Years of education

C: correlated with the individual effects

TI: constant (time-invariant) within cross sections

The Hausman-Taylor estimator is at its core an instrumental variables regression, where the instruments are derived from those regressors that are assumed uncorrelated with the individual effects. Technically it is the cross sectional means of these variables that need to be uncorrelated, and not the variables themselves.
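In the notation commonly used for this estimator in textbooks (the symbols below are introduced here for illustration and do not appear in the SAS output), the regressors are partitioned by whether they vary over time and whether they are assumed to be correlated with the individual effect. A sketch of that partition for the model fit above:

```latex
y_{it} = X_{1,it}\beta_1 + X_{2,it}\beta_2 + Z_{1,i}\gamma_1 + Z_{2,i}\gamma_2 + \mu_i + \varepsilon_{it}

\begin{array}{llll}
X_1 & \text{time-varying, uncorrelated with } \mu_i & \texttt{south, smsa, occ, ind} & k_1 = 4 \\
X_2 & \text{time-varying, correlated with } \mu_i & \texttt{wks, ms, exp, exp2, union} & k_2 = 5 \\
Z_1 & \text{time-invariant, uncorrelated with } \mu_i & \texttt{fem, blk} & g_1 = 2 \\
Z_2 & \text{time-invariant, correlated with } \mu_i & \texttt{ed} & g_2 = 1
\end{array}

\text{Instruments: } \; X_{1,it} - \bar X_{1,i}, \quad X_{2,it} - \bar X_{2,i}, \quad \bar X_{1,i}, \quad Z_{1,i}

\text{Order condition: } \; k_1 \ge g_2 \quad (4 \ge 1)
```

The means of the four X1 variables serve as instruments for the single Z2 variable, which leaves k1 - g2 = 3 overidentifying restrictions. That count is consistent with a Hausman test against fixed effects on 3 degrees of freedom.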

The Amemiya-MaCurdy model is a close relative of the Hausman-Taylor model. The only difference between the two is that the Amemiya-MaCurdy model makes the added assumption that the regressors (and not just their means) are uncorrelated with the individual effects. By making that assumption, the Amemiya-MaCurdy model can take advantage of a more efficient set of instrumental variables.

proc panel data=psid;
   id id t;
   instruments correlated = (wks ms exp exp2 union ed);
   model lwage = wks south smsa ms exp exp2 occ ind union fem blk ed / amacurdy;
run;

The results are shown in Output 20.5.3. Little has changed from the Hausman-Taylor model. The Hausman test presented compares the Amemiya-MaCurdy model to the Hausman-Taylor model (not to the fixed-effects model, as previously), and it does not reject the one added assumption (p = 0.3287). You even gained a bit of efficiency in the process; compare the standard errors of the coefficient on variable ED in the two models (0.0212 versus 0.0206).

Output 20.5.3: Amemiya-MaCurdy Estimation

The PANEL Procedure

Amemiya and MaCurdy Model for Correlated Individual Effects (AMaCurdy)

Dependent Variable: lwage (Log(wages))

Variance Component for Cross Sections	0.886993
Variance Component for Error	0.023044

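Hausman Test against Hausman-Taylor
DF	m Value	Pr > m
13	14.67	0.3287

Parameter Estimates
Variable	Type		DF	Estimate	Standard Error	t Value	Pr > |t|	Label
Intercept			1	2.927338	0.2751	10.64	<.0001	Intercept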
wks	C		1	0.000838	0.000599	1.40	0.1622	Weeks worked
south			1	0.007282	0.0319	0.23	0.8197	1 if resides in the South
smsa			1	-0.04195	0.0189	-2.21	0.0269	1 if resides in SMSA
ms	C		1	-0.03009	0.0190	-1.59	0.1127	1 if married
exp	C		1	0.11297	0.00247	45.76	<.0001	Years full-time experience
exp2	C		1	-0.00042	0.000055	-7.72	<.0001	exp squared
occ			1	-0.02085	0.0138	-1.51	0.1299	1 if blue-collar occupation
ind			1	0.013629	0.0152	0.89	0.3709	1 if manufacturing
union	C		1	0.032475	0.0149	2.18	0.0293	1 if union contract
fem		TI	1	-0.13201	0.1266	-1.04	0.2972	1 if female
blk		TI	1	-0.2859	0.1555	-1.84	0.0660	1 if black
ed	C	TI	1	0.137205	0.0206	6.67	<.0001	Years of education

C: correlated with the individual effects

TI: constant (time-invariant) within cross sections

Finally, you should realize that the Hausman-Taylor and Amemiya-MaCurdy estimators are not cure-alls for correlated individual effects. Estimation tacitly relies on the uncorrelated regressors being sufficient to predict the correlated regressors. Otherwise you run into the problem of weak instruments. If you have weak instruments, you will obtain biased estimates with very large standard errors. However, that does not seem to be the case here.
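One informal way to probe instrument strength is to check how well the person-level means of the uncorrelated time-varying regressors predict ED. The sketch below is only a rough diagnostic under stated assumptions: it is not the first stage that PROC PANEL runs internally, and the data set name psid_means and the test label Instr are hypothetical names chosen for illustration.

```sas
/* Collapse to one row per person. With MEAN= and no new names,   */
/* each output variable keeps its original name. fem, blk, and ed */
/* are constant within id, so their means equal their values.     */
proc means data=psid noprint nway;
   class id;
   var south smsa occ ind fem blk ed;
   output out=psid_means mean=;
run;

/* Regress ed on the candidate instruments, then jointly test the */
/* means of the time-varying uncorrelated regressors. A small F   */
/* statistic here would be a warning sign of weak instruments.    */
proc reg data=psid_means;
   model ed = south smsa occ ind fem blk;
   Instr: test south = 0, smsa = 0, occ = 0, ind = 0;
run;
quit;
```

This check ignores the GLS transformation used by the actual estimator, so treat it as a screening device rather than a formal test.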
