- Is it possible to absorb fixed effects when using ...


02-12-2018 09:51 PM

I have a large dataset with over 15,000 fixed effects. When I run the code below, SAS produces no output and reports that there is not enough memory (I have omitted the other ODS OUTPUT parts to shorten the code).

```
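proc surveyreg data=have;
  cluster id;       /* cluster the standard errors at the id level */
  class year id;    /* year and id enter the model as fixed effects */
  model dependent_var = independent_var year id;
  /* keep only the estimates of interest */
  ods output ParameterEstimates = OutputStats_1
    (where=(Parameter in ('Intercept','independent_var')));
run;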
```

If I did not need to cluster the standard errors at the id level, I could simply have used proc glm and absorbed the fixed-effects variables. I was able to get results with the proc glm approach, but not with proc surveyreg. I assume proc surveyreg is a very slow approach? Is there any way to absorb the fixed effects as in proc glm? I am trying to run a panel data regression with year and id fixed effects and standard errors clustered at the id level.
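For reference, the proc glm approach mentioned above might look roughly like the following. This is only a sketch using the dataset and variable names from the code above; it assumes the data can be sorted by id, which the ABSORB statement requires, and it does not cluster the standard errors.

```sas
/* Sketch only: names follow the code above. No clustered standard errors. */
proc sort data=have;
  by id;                      /* ABSORB needs the data sorted by the absorbed variable */
run;

proc glm data=have;
  absorb id;                  /* id effects are swept out rather than estimated */
  class year;                 /* year fixed effects are estimated explicitly */
  model dependent_var = independent_var year / solution;
run;
quit;
```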

Accepted Solutions

Solution

02-14-2018
02:54 PM


Posted in reply to Yegen

02-14-2018 08:54 AM

This is not really my area, but I will share a few thoughts. First, the CLUSTER variable is usually thought of as a random effect, but you are also listing it on the MODEL statement, which is for fixed effects. Although PROC SURVEYREG allows this syntax, other SAS procedures (e.g., PHREG) do not.

If you want to model ID as a random effect, you might try PROC MIXED or HPMIXED. You can request sandwich estimators, which should be close to the estimates that you would get from SURVEYREG if it supported an ABSORB statement (which it doesn't). The big question, of course, is how many distinct levels of ID do you have, and will PROC MIXED be able to handle your data if it has many levels.
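As a rough illustration of that suggestion, a PROC MIXED call with ID as a random effect and sandwich standard errors might look like this. It is a sketch only, using the names from the original post; whether it runs in reasonable time and memory with this many ID levels is exactly the open question raised above.

```sas
/* Sketch only: dataset and variable names are taken from the question. */
proc mixed data=have empirical;      /* EMPIRICAL requests sandwich (robust) standard errors */
  class year id;
  model dependent_var = independent_var year / solution;
  random intercept / subject=id;     /* id enters as a random effect, not on MODEL */
run;
```

Note that this treats ID as random rather than fixed, so it is not the same model as the one in the question.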

I hope someone more knowledgeable will have more to say. Maybe someone like @sld or @SAS_Rob might have thoughts on this issue.

All Replies


Posted in reply to Rick_SAS

02-14-2018 09:05 AM - edited 02-14-2018 09:06 AM

I would echo what @Rick_SAS is saying as well.

You normally would not want to put the ID variable on the MODEL statement, especially if you have a random sample of subjects from some population. If you do not have a complex survey design, then there are better ways to get robust/sandwich estimators, such as the ones he mentioned.


Posted in reply to Rick_SAS

02-14-2018 11:12 AM

Thanks for your helpful comments, @Rick_SAS and @SAS_Rob. The lower bound for the number of distinct levels of ID is 15,000 (and the upper bound is around 170,000). I will give your suggestion a try.

I also had a conversation with my co-author and we thought of the following. Since fixed effects just demean the LHS and RHS variables, one can compute the mean of each variable at the ID level. Since I have two different FEs (i.e., ID and year), I also computed the mean of the same variables at the year level. I then subtracted both means (i.e., the corresponding ID and year means) from each variable (e.g., Y_t,i - Y_mean_i - Y_mean_t) to obtain the demeaned variables. Finally, I ran PROC SURVEYREG on the demeaned variables with clustering at the id level, and voila, I got the results pretty quickly. PROC SURVEYREG does not seem to like a large number of fixed effects but handles clustering well (whereas PROC GLM handles fixed effects well but does not have a clustering option).
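The demeaning steps described above could be sketched as follows. This is not the exact code that was run: the table name demeaned, the *_dm variable names, and the use of PROC SQL are assumptions for illustration. The sketch also adds back the overall mean, which is the textbook two-way within transformation; subtracting only the ID and year means, as described above, differs by a constant and so changes only the intercept. The transformation reproduces the two-way fixed-effects slope exactly only for a balanced panel with no missing values; for an unbalanced panel a single pass like this is an approximation.

```sas
/* Sketch only: have, id, year, dependent_var and independent_var follow the
   original post; demeaned and the *_dm names are hypothetical. */
proc sql;
  create table demeaned as
  select a.id, a.year,
         a.dependent_var   - i.y_id - t.y_year + g.y_all as dependent_var_dm,
         a.independent_var - i.x_id - t.x_year + g.x_all as independent_var_dm
  from have as a,
       /* means by id */
       (select id, mean(dependent_var) as y_id, mean(independent_var) as x_id
          from have group by id) as i,
       /* means by year */
       (select year, mean(dependent_var) as y_year, mean(independent_var) as x_year
          from have group by year) as t,
       /* overall means (a single row) */
       (select mean(dependent_var) as y_all, mean(independent_var) as x_all
          from have) as g
  where a.id = i.id and a.year = t.year;
quit;

/* Regress the demeaned variables, clustering at the id level. */
proc surveyreg data=demeaned;
  cluster id;
  model dependent_var_dm = independent_var_dm;
run;
```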

Thanks again, @Rick_SAS and @SAS_Rob.