Is it possible to absorb fixed effects when using Proc Surveyreg?

Posted 02-12-2018 09:51 PM

I have a large dataset with over 15,000 fixed effects. When I run the code below, SAS fails to produce any output and reports that there is not enough memory (I am skipping the other ods output parts to shorten my code).

```
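/* Year and id fixed effects, standard errors clustered by id */
proc surveyreg data=have;
  cluster id;
  class year id;
  /* SOLUTION is needed to produce ParameterEstimates when CLASS is used */
  model dependent_var = independent_var year id / solution;
  ods output ParameterEstimates = OutputStats_1
    (where=(Parameter in ('Intercept','independent_var')));
run;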
```

If I were not required to cluster the standard errors at the id level, I could simply have used proc glm and absorbed the fixed effects variables. I was able to get results using the proc glm approach, but not with proc surveyreg. I assume proc surveyreg is a much slower approach? Is there any way to absorb the fixed effects as in proc glm? I am trying to run a panel data regression with year and id fixed effects and standard errors clustered at the id level.
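For readers unfamiliar with the proc glm approach mentioned above, here is a minimal sketch of what absorbing the id effects might look like. The dataset and variable names come from the code in the question; `have_sorted` is a hypothetical name introduced for illustration. Note that this does not cluster the standard errors, which is exactly the limitation described above.

```sas
/* ABSORB requires the data to be sorted by the absorbed variable */
proc sort data=have out=have_sorted;  /* have_sorted: hypothetical name */
  by id;
run;

proc glm data=have_sorted;
  absorb id;                 /* id effects are swept out, not estimated */
  class year;
  model dependent_var = independent_var year / solution;
run;
quit;
```

Because the id effects are absorbed rather than entered as CLASS levels, proc glm never builds the 15,000-column design matrix, which is why it runs where the proc surveyreg call runs out of memory.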

1 ACCEPTED SOLUTION

This is not really my area, but I will share a few thoughts. First, the CLUSTER variable is usually thought of as a random effect, but you are also listing it on the MODEL statement, which is for fixed effects. Although PROC SURVEYREG allows this syntax, other SAS procedures (e.g., PHREG) do not.

If you want to model ID as a random effect, you might try PROC MIXED or HPMIXED. You can request sandwich estimators, which should be close to the estimates that you would get from SURVEYREG if it supported an ABSORB statement (which it doesn't). The big question, of course, is how many distinct levels of ID do you have, and will PROC MIXED be able to handle your data if it has many levels.

I hope someone more knowledgeable will have more to say. Maybe someone like @sld or @SAS_Rob might have thoughts on this issue.
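As a rough sketch of the PROC MIXED suggestion above, assuming the dataset and variable names from the question: the EMPIRICAL option on the PROC MIXED statement requests sandwich (robust) standard errors, and the RANDOM statement treats id as a random intercept instead of a fixed effect. Whether this is feasible with many thousands of id levels is the open question raised in the reply, so treat it as a starting point only.

```sas
/* EMPIRICAL requests sandwich (robust) standard errors */
proc mixed data=have empirical;
  class year id;
  /* id is no longer on the MODEL statement */
  model dependent_var = independent_var year / solution;
  /* id enters as a random intercept; SUBJECT= defines the clusters */
  random intercept / subject=id;
run;
```

This fits a random-effects model, so the coefficient on `independent_var` is not guaranteed to match the fixed-effects estimate from the original specification.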

7 REPLIES

I would echo what @Rick_SAS is saying as well.

You normally would not want to put the ID variable on the MODEL statement, especially if you have a random sample of subjects from some population. If you do not have a complex survey design, then there are better ways to get robust/sandwich estimators, like the ones he mentioned.

Thanks again, @Rick_SAS and @SAS_Rob.

Hi @Yegen,

I'm currently having the same problem with a model of mine (2 million unit fixed effects and 12 time fixed effects). I'm not that knowledgeable about the behind-the-scenes matrix math for variance and standard error calculation, though. Would your demeaning approach still produce the proper clustered standard errors/covariance matrix?

If it matters, I'm attempting to get 2-way clustered errors on both sets of fixed effects using a macro I've found on several academic sites. It runs PROC SURVEYREG twice, once with each cluster, then computes the 2-way clustered errors from the SURVEYREG covariance matrices. I'm wondering if demeaning will ruin that somehow.

Jon
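For reference, the usual two-way clustering formula (Cameron, Gelbach and Miller; Thompson) combines one-way clustered covariance matrices as shown below. Whether the macro described above implements exactly this is an assumption; the symbols are introduced here for illustration only.

$$
\hat{V}_{\text{two-way}} = \hat{V}_{\text{unit}} + \hat{V}_{\text{time}} - \hat{V}_{\text{unit} \cap \text{time}}
$$

Here $\hat{V}_{\text{unit}}$ and $\hat{V}_{\text{time}}$ are the covariance matrices clustered by unit and by time, and $\hat{V}_{\text{unit} \cap \text{time}}$ is clustered on the intersection of the two. When each unit-time cell holds a single observation, the last term reduces to the heteroskedasticity-robust (White) covariance matrix.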

Hi @JLcra,

Yes, that would still be fine. My co-author is a financial econometrician and he also confirmed that it would work well. Here is what I was doing.

1. Find the mean of your LHS and RHS variables by grouping at the unit level.
2. Subtract the mean from Step 1 from each variable (e.g., subtract the mean of LHS from the LHS variable).
3. Now, use the unit-level demeaned variables and find the mean of these demeaned variables by grouping at the time level.
4. As in Step 2, subtract the mean from Step 3 from the values you obtained after Step 2.
5. Then, use these demeaned values in your regression that allows 2-way clustering (see link of reliable code below).

Link of reliable SAS code: http://www.people.hbs.edu/igow/GOT/Code/clus2D.sas
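The demeaning steps above can be sketched in SAS roughly as follows. This is an illustration, not the poster's actual code. It assumes the dataset and variable names from the original question (`have`, `id`, `year`, `dependent_var`, `independent_var`); the names `demean_id`, `demeaned`, `y_id`, `x_id`, `y_dm` and `x_dm` are hypothetical. The final step uses PROC SURVEYREG with one-way clustering on id as a stand-in; for 2-way clustering, the demeaned variables would be passed to the linked macro instead.

```sas
/* Steps 1-2: subtract unit (id) means.
   PROC SQL remerges the group mean onto every row. */
proc sql;
  create table demean_id as
  select id, year,
         dependent_var   - mean(dependent_var)   as y_id,
         independent_var - mean(independent_var) as x_id
  from have
  group by id;
quit;

/* Steps 3-4: subtract time (year) means from the unit-demeaned values */
proc sql;
  create table demeaned as
  select id, year,
         y_id - mean(y_id) as y_dm,
         x_id - mean(x_id) as x_dm
  from demean_id
  group by year;
quit;

/* Step 5: regress the demeaned variables, clustering by id.
   No CLASS statement is needed, so memory use stays small. */
proc surveyreg data=demeaned;
  cluster id;
  model y_dm = x_dm;
run;
```

A single pass of unit-then-time demeaning reproduces the two-way fixed effects estimate exactly only for a balanced panel. With an unbalanced panel, the two demeaning passes would need to be repeated until the group means are close to zero. Standard errors may also differ slightly from a run with explicit CLASS effects, likely because the demeaned regression no longer counts the absorbed effects among its parameters in the degrees-of-freedom adjustment.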

I hope this helps.

You could try my Felm macro, which does it all:

http://olivier.godechot.free.fr/hoparticle.php?id_art=721

Best

Olivier

Hi, thanks for the post. I am running into the same issue. May I ask two related questions?

1. I notice that the t-stats for the coefficients from (i) the demeaning method and (ii) PROC SURVEYREG with the CLASS statement are slightly different. Why, and how do we control for that?

2. The link to the reliable code no longer works. Do you have the new link?

Thanks
