Tip: Fixed vs. Random Effects in Panel Data

Started 12-01-2014
Modified 10-05-2015

Panel-data models extend standard regression models to account for group (or panel) effects. As an example, suppose you are studying the effect of union membership on wages, controlling for other factors such as education and experience. In your data, each person is measured over a period of seven years, and you would expect some year-to-year correlation within a given person. You could therefore benefit from a panel model with latent person-level effects, which more adequately accounts for the correlation between yearly incomes within the same person.

This brings up the ever-present question as to whether to treat the person-level effects as fixed or random.  Although that terminology can be traced back to the early days of experimental design, it masks the true nature of the debate between the two methods.  To better grasp the distinction, ask yourself the following:  Does a regression coefficient for union membership represent a comparison of two people, one a dedicated union member and one who was never in a union?  Or does it compare two yearly incomes from the same person who happened to join a union in the interim?

The latter comparison is known as the within effect, because it compares incomes within the same person, and a fixed-effects model estimates it directly. In PROC PANEL, you request a one-way fixed-effects model with the FIXONE option:

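proc panel data = union_data;
   /* cross-section identifier first, then the time identifier */
   id personid year;
   /* FIXONE requests one-way (person-level) fixed effects */
   model lwage = union educ exp / fixone;
run;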
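
Before running the step above, it may help to make sure the data are ordered the way the ID statement expects. The following is a minimal sketch, assuming that union_data is not already sorted by person and then by year within person:

```sas
/* Sort by cross section (personid), then by time (year) within each person. */
/* This overwrites union_data in place; add OUT= to keep the original order. */
proc sort data = union_data;
   by personid year;
run;
```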

Limiting your estimation to within effects is desirable because it is as close as you are going to get to the causal effect of union membership, given the data at hand. Because only within-person comparisons are used, confounding due to unmeasured person-level characteristics that stay constant over time is removed, and you obtain consistent parameter estimates. That accomplishes much, but at the cost of efficiency: all the between-person comparisons in the data are essentially thrown away.
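
One way to see why the within estimator discards between-person information is to write out the within transformation. The notation is introduced here for illustration only and is not taken from PROC PANEL output: y_it stands for lwage for person i in year t, x_it for the regressors, alpha_i for the latent person-level effect, and bars for person-level means over time.

```latex
\begin{align*}
y_{it} &= \mathbf{x}_{it}'\boldsymbol{\beta} + \alpha_i + \varepsilon_{it} \\
\bar{y}_{i} &= \bar{\mathbf{x}}_{i}'\boldsymbol{\beta} + \alpha_i + \bar{\varepsilon}_{i} \\
y_{it} - \bar{y}_{i} &= (\mathbf{x}_{it} - \bar{\mathbf{x}}_{i})'\boldsymbol{\beta} + (\varepsilon_{it} - \bar{\varepsilon}_{i})
\end{align*}
```

Subtracting the second line from the first removes alpha_i, so only deviations from each person's own mean identify the coefficients. Any regressor that never changes within a person drops out in the same way.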

That brings us to the random-effects model, which is typically estimated by generalized least squares (GLS). This model assumes that between-person effects are identical to within-person effects. That’s a big assumption, but if it holds, it lets you bring more data (all the between-person comparisons) to bear on the estimate, gaining efficiency over the fixed-effects estimate while remaining consistent.
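
The way the random-effects estimator blends the two kinds of comparison can be sketched with the usual GLS transformation. This assumes a balanced panel with T periods per person, and the symbols (theta, and the variance components sigma_alpha^2 for the person effect and sigma_epsilon^2 for the idiosyncratic error) are notation introduced here; in practice the variance components are estimated, and the estimation method depends on the software settings.

```latex
\begin{align*}
y_{it} - \theta\,\bar{y}_{i} &= (\mathbf{x}_{it} - \theta\,\bar{\mathbf{x}}_{i})'\boldsymbol{\beta} + (u_{it} - \theta\,\bar{u}_{i}),
\qquad u_{it} = \alpha_i + \varepsilon_{it} \\
\theta &= 1 - \sqrt{\frac{\sigma_{\varepsilon}^{2}}{\sigma_{\varepsilon}^{2} + T\,\sigma_{\alpha}^{2}}}
\end{align*}
```

With theta equal to 1 this reduces to the within transformation, and with theta equal to 0 it is ordinary pooled regression. Values in between retain part of the between-person variation, which is where the efficiency gain comes from.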

You obtain estimates for the random effects model with the RANONE option:

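proc panel data = union_data;
   /* cross-section identifier first, then the time identifier */
   id personid year;
   /* RANONE requests one-way (person-level) random effects */
   model lwage = union educ exp / ranone;
run;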

The coefficient for union membership now has a smaller standard error, but the coefficient itself differs substantially from what it was previously (0.77 versus 0.42). Was your assumption that between-person effects equal within-person effects violated? A Hausman test can help answer that question, and PROC PANEL reports it as part of the output for random-effects estimation.
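
For reference, the Hausman statistic is usually written as a quadratic form in the difference between the two sets of estimates. The subscripts FE and RE and the symbol k (the number of coefficients being compared) are notation introduced here, not labels from the PROC PANEL output.

```latex
\begin{equation*}
H = (\hat{\boldsymbol{\beta}}_{FE} - \hat{\boldsymbol{\beta}}_{RE})'
    \left[\widehat{\operatorname{Var}}(\hat{\boldsymbol{\beta}}_{FE}) - \widehat{\operatorname{Var}}(\hat{\boldsymbol{\beta}}_{RE})\right]^{-1}
    (\hat{\boldsymbol{\beta}}_{FE} - \hat{\boldsymbol{\beta}}_{RE})
    \;\sim\; \chi^{2}_{k}
\end{equation*}
```

The chi-square reference distribution applies asymptotically and only when the random-effects assumption holds, so a large value of H is evidence against that assumption.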

The null hypothesis is that within and between effects are equal for all effects, not just the one for union membership. Here the test rejects that null hypothesis, which leads you to reject the random-effects model in its present form in favor of the fixed-effects model. You may choose to simply stop there and keep your fixed-effects model. If, however, you aren’t satisfied with the precision of your fixed-effects estimator, you could look further into how disparate the between and within effects are. Perhaps there is something you could do to repair the random-effects model.

But we’ll leave that for a later tip.
