- 2SLS regression with fixed effects and clustered standard errors

Posted 07-05-2021 10:57 AM
(1806 views)

Hi, everyone. I have a panel data set like this:

| firm | year | early_refin | turn_call | asset | leverage | elimat |
|------|------|-------------|-----------|-------|----------|--------|
| AAA | 1990 | 0 | 0 | 6 | 0.5 | 0 |
| AAA | 1991 | 0 | 0 | 6 | 0.5 | 0 |
| AAA | 1992 | 0 | 1 | 6 | 0.5 | 0 |
| BBB | 1990 | 0 | 0 | 8 | 0.01 | 0.22 |
| BBB | 1991 | 1 | 0 | 8 | 0.01 | 0.22 |
| BBB | 1992 | 1 | 0 | 9 | 0.01 | 0 |
| CCC | 1990 | 1 | 0 | 11 | 0.02 | 0 |
| CCC | 1991 | 1 | 0 | 11 | 0.02 | 0.65 |
| CCC | 1992 | 1 | 1 | 11 | 0.02 | 0 |

**Column definition:**

early_refin - dependent variable of 1st stage (dummy variable)

turn_call - instrument variable (dummy variable)

asset, leverage - control variables

elimat - dependent variable of 2nd stage

And I have to run a two-stage least square regression with this data set.

**The two-stage regression model** I want to run is as follows (simplified):

1st - early_refin = A + B*turn_call + C*asset + D*leverage + e

2nd - elimat = a + b* estimated early_refin + c* asset + d*leverage + ϵ

I also want to **include firm fixed effects and year fixed effects** and **cluster standard errors at firm level**.
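Written out with the fixed effects made explicit, the two stages could look like the following. The subscripts $i$ (firm) and $t$ (year) and the symbols $\alpha_i$, $\lambda_t$, $\mu_i$, $\tau_t$ for the firm and year effects are notation assumed here for illustration; the coefficient letters follow the simplified equations above, with the intercepts $A$ and $a$ absorbed into the fixed effects.

```latex
\begin{align}
\text{early\_refin}_{it} &= \alpha_i + \lambda_t + B\,\text{turn\_call}_{it}
    + C\,\text{asset}_{it} + D\,\text{leverage}_{it} + e_{it} \\
\text{elimat}_{it} &= \mu_i + \tau_t + b\,\widehat{\text{early\_refin}}_{it}
    + c\,\text{asset}_{it} + d\,\text{leverage}_{it} + \epsilon_{it}
\end{align}
```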

I've tried PROC PANEL, but it seems that PROC PANEL cannot perform a 2SLS regression.

I would appreciate it if anyone could give me some ideas about how to finish this task.

Thank you in advance.

30 REPLIES


Hello,

PROC PANEL is indeed for panel data (time-series cross-sectional data) but it is for single-equation models with common pooled estimates of the coefficients of the explanatory variables.

It has been more than a decade since I last used Two-Stage Least Squares Estimation (2SLS) with "first-stage" *instrumental variable regression*, but I remember that you should do it with PROC MODEL (SAS/ETS) or PROC SYSLIN (SAS/ETS).

Both PROC SYSLIN and PROC MODEL can deal with panel data.
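As a rough illustration of the PROC MODEL route, a minimal sketch of the simplified model from the original post might look like this. It covers only the plain 2SLS fit: the fixed effects and the firm-level clustering are not handled here, and the parameter names `b0` to `b3` are made up for the example.

```sas
/* Sketch only: plain 2SLS, no fixed effects, no clustered standard errors */
proc model data=panel_data;
   endogenous early_refin;                 /* regressor treated as endogenous */
   instruments turn_call asset leverage;   /* excluded instrument plus the exogenous controls */
   parms b0 b1 b2 b3;                      /* hypothetical parameter names */
   elimat = b0 + b1*early_refin + b2*asset + b3*leverage;
   fit elimat / 2sls;                      /* request two-stage least squares */
run;
quit;
```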

Let me know if you can sort it out. If not, I can maybe spend a bit more time on it.

Good luck,

Koen


Hi Koen,

Thank you for your helpful advice.

The code I tried yesterday was as follows:

```
proc syslin data=panel_data 2SLS first;
by firm year;
endogenous early_refin;
instruments turn_call asset leverage;
model elimat = early_refin asset leverage;
run;
```

However, I failed to get a result (all the parameter estimates displayed 0).

And also, I couldn't find an option where I could specify cluster standard errors at firm level in the documentation.

Do you have any idea about how to fix this?

Thank you so much for your help.


Hello,

It is very strange that all the parameter estimates are displayed as 0.

Do you have a warning or a note in the LOG?

Which SAS version are you using? If you have SAS VIYA and SAS Econometrics, you can use PROC CPANEL. PROC CPANEL can fit instrumental variables (IV) regression models for panel data, and maybe it can deal with clustered standard errors as well (I haven't checked the latter yet). Many procedures in SAS can deal with clustered standard errors, so maybe PROC CPANEL has that capability too.

I'm sorry that I am only now telling you about the existence of PROC CPANEL (CAS-enabled) but I have just discovered it myself thanks to your question.

Let me know if you have SAS Econometrics and PROC CPANEL.

Good luck,

Koen


Hi Koen,

My program actually crashed after running for a couple of minutes. I think there might be something wrong, or my data set is too large.

I use SAS 9.4 and I don't have SAS Econometrics, so I cannot run PROC CPANEL.

I found PROC SURVEYREG quite useful. It can deal with both fixed effects and clustered standard errors, and I used two PROC SURVEYREGs to perform 2-stage regression.

```
*first stage;
proc surveyreg data=panel_data;
   class firm year;
   cluster firm;
   model early_refin = turn_call asset leverage firm year / adjrsq solution;
   output out=firststage p=early_refi;
run;

*second stage;
proc surveyreg data=firststage;
   class firm year;
   cluster firm;
   model elimat = early_refi asset leverage firm year / adjrsq solution;
run;
```

However, the result I got didn't match the result my colleague got using R. The estimates are very different from hers, so I'm still looking for possible problems in my program.

I am not sure whether I should put the fixed effect variables (firm and year) in the MODEL statement. The result without them in the MODEL statement looks more similar to my colleague's, but I think they should be there. Do you have any idea about this?


Hello @YIN_YI_JEN ,

Yes. You should put fixed effect variables (firm and year) in the MODEL statement indeed.

In another post on this communities site, I have found a link to an article that may be of interest to you in this context:

Computing Clustered Standard Errors for Two-Stage Least Squares in SAS

(2007) Tanguy Brachet, University of Pennsylvania

https://works.bepress.com/tbrachet/2/

I haven't read the article so I don't have an opinion on the quality.

Kind regards,

Koen


Hi Koen,

Thank you for your reply and the useful information.

I'm trying the method provided in the document that you sent me as a link, but I don't understand what the author meant by "Estimate the structural equation as usual and save the 2SLS residuals (PROC SYSLIN)."

Specifically, I would like to know how to correctly use PROC SYSLIN to do what the author stated. Could you provide me the code using PROC SYSLIN with the variables in my original post?


Hello,

I think I understand it.

But my working day is over. Other commitments tonight.

I will try to deliver you the PROC SYSLIN code tomorrow before EOB (Brussels time).

Kind regards,

Koen


Hi Koen,

No worries. I'm just working very hard to figure this out, because this is the last mile of my MA thesis.

I am more than thankful for your consistently quick replies.


Hello,

>> Could you provide me the code using PROC SYSLIN with the variables in my original post?

Can you test this?

Let me know if it works out for you!

```
proc syslin data=panel_data 2SLS first out=work.pred;
by firm year;
endogenous early_refin;
instruments turn_call asset leverage;
model elimat = early_refin asset leverage;
output PREDICTED=abc RESIDUAL=xyz;
run;
```

Koen


Hi Koen,

This code works when I drop the BY statement. Is the BY statement for fixed effects?

And I'm trying PROC CPANEL which you've suggested before. Since I only have SAS 9.4, I'm looking for a way to start a CAS session. I was following the steps from a SAS tutorial on YouTube, but problems happened again.

The code I used to start a CAS session is as follows (identical to the tutorial):

```
%let path=/Users/lab433/Documents/SAS_dataset/RFS;
libname mycas "&path";
cas mySession sessopts=(caslib=casuser timeout=1800 locale="en_US");
```

However, the log said

"A host name is required to start a session. Use the HOST= CAS statement option, CASHOST=system option, or the _CASHOST_ environment variable to set the host name."

After checking the SAS documentation, I still don't have any idea what to specify for the HOST= option. Do you know how to start a CAS session within SAS 9.4?

I really appreciate your kindness and help.


Hello @YIN_YI_JEN ,

I will tell you about what the BY-statement exactly is doing later today. Not much time now.

Now that I look at your code again, I do not think you need a BY-statement.

You need one big model and not as many models as there are BY-groups!
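To make that concrete: assuming the data has one row per firm-year, as in the sample in the original post, `by firm year;` gives PROC SYSLIN a single observation per BY group, so there is nothing to estimate within a group. A sketch of the same call with the BY statement removed, so that one model is fitted to the pooled data, would be the following. It still contains no fixed effects and no clustered standard errors; `abc` and `xyz` are the output variable names used earlier in this thread.

```sas
/* One pooled 2SLS model; no BY statement */
proc syslin data=panel_data 2sls first out=work.pred;
   endogenous early_refin;                  /* first-stage dependent variable */
   instruments turn_call asset leverage;    /* instrument plus exogenous controls */
   model elimat = early_refin asset leverage;
   output predicted=abc residual=xyz;       /* saves the 2SLS predictions and residuals */
run;
```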

You cannot run PROC CPANEL (SAS Econometrics procedure) if you do not have SAS VIYA.

There's no way to start a CAS session if you only have SAS 9.4.

For CAS, you need SAS VIYA (or a co-existing SAS 9.4 + VIYA).
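For an environment that does include VIYA, a minimal setup sketch could look like the code below. The host name `viya.example.com` is a placeholder and 5570 is only the usual default port, so both are assumptions to replace with the values of the actual server; inside SAS Studio on VIYA the host is normally preconfigured and the OPTIONS line can be dropped. Note that `libname mycas "&path";` in the earlier attempt defines an ordinary path-based library, whereas a CAS libref needs the CAS engine.

```sas
/* Placeholder host and port: replace with the values of your own VIYA server */
options cashost="viya.example.com" casport=5570;

/* Start the session (same options as in the tutorial code above) */
cas mySession sessopts=(caslib=casuser timeout=1800 locale="en_US");

/* Bind a libref to a caslib through the CAS engine */
libname mycas cas caslib=casuser;

/* Copy the local table into CAS so that CAS-enabled procedures can read it */
data mycas.panel_data;
   set work.panel_data;
run;
```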

Kind regards,

Koen


Hello,

I just signed up for SAS Viya trial, but I have difficulties "activating" the environment (starting a CAS session, creating a CASLIB, passing data to CAS, etc.), since I am new to Jupyter Notebook.

Could you provide me the code to do all those "settings" before I can run a PROC CPANEL? If it's too complicated, is there any "SAS Viya on Jupyter Notebook" tutorial that you recommend? (I haven't found one on YouTube yet.)

Thanks a lot.


Hello @YIN_YI_JEN ,

Why are you using 'Jupyter Notebook' if it's new to you?

In that case I would just use 'SAS Studio 5.x' in your preferred browser

http(s)://SAS_VIYA_SERVER_NAME/SASStudioV

Can you login to SAS Studio? ... then I will tell you in great detail how to do the things you want to accomplish.

Cheers,

Koen


Hello,

Can you get SAS Studio V to work?

Or do you want to stick to Jupyter Notebook?

The latter is also fine of course but I'm not that familiar with coding in Jupyter Notebook. Only reading into it 🙄.

Kind regards,

Koen
