I am having trouble figuring out how to estimate a particular type of fixed effects regression. My data looks something like this: Company Date Output Manager 1 Manager 2 Manager 3 1 12/31/2002 500 1055 2291 . 1 3/31/2002 520 1055 2291 . 2 12/31/2002 180 2291 5538 7721 2 3/31/2002 178 2291 5538 7721 2 6/30/2002 188 5538 7721 . 3 9/30/2002 759 7721 . . There are a few thousand companies, each with about 100 observations on different dates. Each company has up to 3 managers, which are identified by a numeric code. I would like to estimate fixed effects for each manager in a regression of output on a number of controls (not shown). I'm having trouble thinking about how to do this given the structure of the data. It's important to notice that the particular column where a manager ID is listed has no intrinsic meaning -- it's just a list of which managers are at the company, and there can be up to 3 at a time. (They're sorted, so if a lower-ID manager leaves, all the IDs shift to the left, as happens for the third observation of company 2 above.) There are about 2,000 different manager IDs, so there are many fixed effects. (That is, the numbers 1055, 2291, 5538, and 7721 are just four of the 2,000 possible values.) I've been trying to think of how to estimate such a model. One approach would be to manually create a dummy variable matrix in SAS/IML and run the regression, but I'm running into memory problems. To be clear, for the data above, the dummy variables would look like this: mgr1055 mgr3765 mgr5538 mgr7721 1 1 0 0 1 1 0 0 1 0 1 1 1 0 1 1 0 0 1 1 0 0 0 1 The other way I thought might work would be to reshape the data so that it's "long", so that, for example, the first observation would generate two observations: Company Date Output Manager 1 12/31/2002 500 1055 1 12/31/2002 500 2291 This would easily generate the fixed effects, but the standard errors would be wrong because the regression doesn't "know" that this is just one observation. Perhaps clustering could fix this, but it's not clear to me if the regression would be correctly specified. So... is there a way to do this with PROC GLM or some other procedure? Any guidance on this would be much appreciated!
... View more