- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Dear all,
i have to perform panel data regression with industry-fixed effects and year fixed effects
my sample is having cross-section of companies and time series of years. moreover, industry classification of each company is also mentioned in the data set.
the format of my data set is as follows.
company_name | year | industry_code | y | x1 | x2 | x3 |
20 Microns Ltd | 2011 | 1 | 956.5 | 69.8 | 610.1 | 542.7 |
20 Microns Ltd. | 2012 | 1 | 82.3 | 5.1 | 142.8 | 281.8 |
20 Microns Ltd. | 2013 | 1 | -25.3 | -38.2 | -98.7 | 123 |
20 Microns Ltd. | 2014 | 1 | 413.7 | 26.1 | 130.4 | 65.8 |
7Seas Entertainment Ltd. | 2011 | 2 | -121.2 | 100.9 | -96.5 | 88 |
7Seas Entertainment Ltd. | 2012 | 2 | -20.9 | -80.7 | 18.6 | 246 |
7Seas Entertainment Ltd. | 2013 | 2 | -20.9 | -80.7 | 18.6 | 246 |
7Seas Entertainment Ltd. | 2014 | 2 | -2.8 | -21.8 | -104.8 | 41.4 |
7Seas Entertainment Ltd. | 2015 | 2 | 7871.9 | 1086.5 | 5092.9 | 5177.8 |
A B B India Ltd. | 2011 | 3 | -5687.4 | -686 | -4771.1 | 1111.3 |
A B B India Ltd. | 2012 | 3 | 678.6 | -263.9 | -134.7 | 657.5 |
A B B India Ltd. | 2013 | 3 | 3450.1 | 26.3 | 1066.3 | 1654.9 |
A B B India Ltd. | 2014 | 3 | 1010.7 | -41.6 | 1379.2 | 41.8 |
A B B India Ltd. | 2015 | 3 | 4309.7 | 88.4 | -2329.2 | 6173.1 |
3I Infotech Ltd. | 2011 | 1 | 4309.7 | 88.4 | -2329.2 | 6173.1 |
3I Infotech Ltd. | 2012 | 1 | 39.2 | -636 | -20 | 235.2 |
3I Infotech Ltd. | 2013 | 1 | 1056.3 | 32.1 | 611.9 | 555.8 |
3I Infotech Ltd. | 2014 | 1 | 887.9 | 56.9 | 992.2 | 3359.1 |
3I Infotech Ltd. | 2015 | 1 | -174.9 | 1.2 | -519.6 | 232.9 |
3I Infotech Ltd. | 2016 | 1 | -828.7 | 542.1 | -1084.5 | 2738.2 |
industry fixed effects is to be estimated based on industry code and time series fixed effects has to be estimated based on year
please suggest me SAS code
thanks in advance
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is a system error, can you check your SAS and whether it has the ETS license? By the way I just noticed your industry id has duplicated time sequence. Proc Panel suppose you id time look like:
id | time |
1 | 1 |
1 | 2 |
1 | 3 |
1 | 4 |
2 | 1 |
2 | 2 |
2 | 3 |
2 | 4 |
… | … |
If you have duplicated time periods in one industry I suggest you use dummy directly. Also in proc panel the id statement requires cross section id comes first and then time id:
data testdat;
input company_name $ year industry_code y x1 x2 x3;
datalines;
20_Microns_Ltd 2011 1 956.5 69.8 610.1 542.7
20_Microns_Ltd 2012 1 82.3 5.1 142.8 281.8
20_Microns_Ltd 2013 1 -25.3 -38.2 -98.7 123
20_Microns_Ltd 2014 1 413.7 26.1 130.4 65.8
7Seas_Entertai 2011 2 -121.2 100.9 -96.5 88
7Seas_Entertai 2012 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2013 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2014 2 -2.8 -21.8 -104.8 41.4
7Seas_Entertai 2015 2 7871.9 1086.5 5092.9 5177.8
A_B_B_India_Ltd. 2011 3 -5687.4 -686 -4771.1 1111.3
A_B_B_India_Ltd. 2012 3 678.6 -263.9 -134.7 657.5
A_B_B_India_Ltd. 2013 3 3450.1 26.3 1066.3 1654.9
A_B_B_India_Ltd. 2014 3 1010.7 -41.6 1379.2 41.8
A_B_B_India_Ltd. 2015 3 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2011 1 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2012 1 39.2 -636 -20 235.2
3I_Infotech_Ltd. 2013 1 1056.3 32.1 611.9 555.8
3I_Infotech_Ltd. 2014 1 887.9 56.9 992.2 3359.1
3I_Infotech_Ltd. 2015 1 -174.9 1.2 -519.6 232.9
3I_Infotech_Ltd. 2016 1 -828.7 542.1 -1084.5 2738.2
run;
proc sort data=testdat;
by company_name year;
run;
proc panel data=testdat;
id company_name year;
class industry_code year;
model y = x1 x2 year industry_code/ pooled;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you can specify an assumed response distribution, then you can fit an appropriate model that takes account of the correlation among the observations within each company by using PROC GEE. You could fit a model that allows you to see if there are differences among the industries and differences (or trend) over time. For example, something like the following can test for industry differences and a linear trend over time, assuming that the response is normally distributed and an autoregressive correlation structure within companies. Note that validity of the GEE model requires a large number of clusters (level of SUBJECT=), so more would be needed than you show in your example data.
proc gee;
class company_name industry_code;
model y=industry_code year x1-x3 / type3;
repeated subject=company_name / type=ar;
run;
- Tags:
- GEE
- Panel Data
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm not sure what you want, for cross section, you just want industry fix effect only or you want both industry and company effect???
For the first one you can do:
proc panel data=your_data_name;
id industry year;
model y = x1 x2 x3 / fixtwo;
run;
For the second one there's no built in model, you can add all dummies to the pooled model:
proc panel data=your_data_name;
id company_name year;
class company_name year industry;
model y = x1 x2 x3 company_name year industry/ pooled;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I want both industry and year fixed effects, but not company fixed effects.
When I use proc panel, I am getting error, because, industry code is repeated for more than one company in the same year.
It is becoming repeated measures panel data and I don't know how to handle it.
Please help me in this regard.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The error indicate you may need to sort the data first.
proc sort data=your_data;
by industry year;
run;
PROC PANEL assumes you've already sorted data by cross section first then time.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
dear xyz0505
greeting of the day
when i run the panel regression with the code suggested by you.
But, i got the following error.
proc panel data=Model_file;
35 model Y = X1- X13 / fixtwo;
36 id year Industry;
37 run;
ERROR: Unable to create an instance of the TKEPAN TK Extension.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PANEL used (Total process time):
real time 0.09 seconds
cpu time 0.00 seconds
I am not able to understand it .
please help me in this regard
thanks in advance
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is a system error, can you check your SAS and whether it has the ETS license? By the way I just noticed your industry id has duplicated time sequence. Proc Panel suppose you id time look like:
id | time |
1 | 1 |
1 | 2 |
1 | 3 |
1 | 4 |
2 | 1 |
2 | 2 |
2 | 3 |
2 | 4 |
… | … |
If you have duplicated time periods in one industry I suggest you use dummy directly. Also in proc panel the id statement requires cross section id comes first and then time id:
data testdat;
input company_name $ year industry_code y x1 x2 x3;
datalines;
20_Microns_Ltd 2011 1 956.5 69.8 610.1 542.7
20_Microns_Ltd 2012 1 82.3 5.1 142.8 281.8
20_Microns_Ltd 2013 1 -25.3 -38.2 -98.7 123
20_Microns_Ltd 2014 1 413.7 26.1 130.4 65.8
7Seas_Entertai 2011 2 -121.2 100.9 -96.5 88
7Seas_Entertai 2012 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2013 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2014 2 -2.8 -21.8 -104.8 41.4
7Seas_Entertai 2015 2 7871.9 1086.5 5092.9 5177.8
A_B_B_India_Ltd. 2011 3 -5687.4 -686 -4771.1 1111.3
A_B_B_India_Ltd. 2012 3 678.6 -263.9 -134.7 657.5
A_B_B_India_Ltd. 2013 3 3450.1 26.3 1066.3 1654.9
A_B_B_India_Ltd. 2014 3 1010.7 -41.6 1379.2 41.8
A_B_B_India_Ltd. 2015 3 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2011 1 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2012 1 39.2 -636 -20 235.2
3I_Infotech_Ltd. 2013 1 1056.3 32.1 611.9 555.8
3I_Infotech_Ltd. 2014 1 887.9 56.9 992.2 3359.1
3I_Infotech_Ltd. 2015 1 -174.9 1.2 -519.6 232.9
3I_Infotech_Ltd. 2016 1 -828.7 542.1 -1084.5 2738.2
run;
proc sort data=testdat;
by company_name year;
run;
proc panel data=testdat;
id company_name year;
class industry_code year;
model y = x1 x2 year industry_code/ pooled;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
thank you for your wonderful suggestion to use dummy variables for the industries groups to capture the fixed effects.
As we know, we cannot introduce the dummy variables for all the groups of industries present in the sample, because it leads to dummy variable trap.
here, i have two alternative ways
1. select one industry group as reference group and not introduce dummy variable to that reference group; or
2. introduce all the dummy variables, but not include constant in the model,
can be please suggest me which alternative is better.
thanks in advance.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PROC PANEL drop the redundant dummies for you.