BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Dear all,

i have to perform panel data regression with industry-fixed effects and year fixed effects

my sample is having  cross-section of companies and time series of years. moreover, industry classification of each company is also mentioned in the data set. 

the format of my data set is as follows. 

company_nameyearindustry_codeyx1x2x3
20 Microns Ltd20111956.569.8610.1542.7
20 Microns Ltd.2012182.35.1142.8281.8
20 Microns Ltd.20131-25.3-38.2-98.7123
20 Microns Ltd.20141413.726.1130.465.8
7Seas Entertainment Ltd.20112-121.2100.9-96.588
7Seas Entertainment Ltd.20122-20.9-80.718.6246
7Seas Entertainment Ltd.20132-20.9-80.718.6246
7Seas Entertainment Ltd.20142-2.8-21.8-104.841.4
7Seas Entertainment Ltd.201527871.91086.55092.95177.8
A B B India Ltd.20113-5687.4-686-4771.11111.3
A B B India Ltd.20123678.6-263.9-134.7657.5
A B B India Ltd.201333450.126.31066.31654.9
A B B India Ltd.201431010.7-41.61379.241.8
A B B India Ltd.201534309.788.4-2329.26173.1
3I Infotech Ltd.201114309.788.4-2329.26173.1
3I Infotech Ltd.2012139.2-636-20235.2
3I Infotech Ltd.201311056.332.1611.9555.8
3I Infotech Ltd.20141887.956.9992.23359.1
3I Infotech Ltd.20151-174.91.2-519.6232.9
3I Infotech Ltd.20161-828.7542.1-1084.52738.2

 

industry fixed effects is to be estimated based on industry code and time series fixed effects has to be estimated based on year

please suggest  me SAS code 

thanks in advance 

1 ACCEPTED SOLUTION

Accepted Solutions
xyz0505
SAS Employee

This is a system error, can you check your SAS and whether it has the ETS license? By the way I just noticed your industry id has duplicated time sequence. Proc Panel suppose you id time look like:

id time
1 1
1 2
1 3
1 4
2 1
2 2
2 3
2 4

If you have duplicated time periods in one industry I suggest you use dummy directly. Also in proc panel the id statement requires cross section id comes first and then time id:

 

data testdat;
input company_name $ year industry_code y x1 x2 x3;
datalines;
20_Microns_Ltd 2011 1 956.5 69.8 610.1 542.7
20_Microns_Ltd 2012 1 82.3 5.1 142.8 281.8
20_Microns_Ltd 2013 1 -25.3 -38.2 -98.7 123
20_Microns_Ltd 2014 1 413.7 26.1 130.4 65.8
7Seas_Entertai 2011 2 -121.2 100.9 -96.5 88
7Seas_Entertai 2012 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2013 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2014 2 -2.8 -21.8 -104.8 41.4
7Seas_Entertai 2015 2 7871.9 1086.5 5092.9 5177.8
A_B_B_India_Ltd. 2011 3 -5687.4 -686 -4771.1 1111.3
A_B_B_India_Ltd. 2012 3 678.6 -263.9 -134.7 657.5
A_B_B_India_Ltd. 2013 3 3450.1 26.3 1066.3 1654.9
A_B_B_India_Ltd. 2014 3 1010.7 -41.6 1379.2 41.8
A_B_B_India_Ltd. 2015 3 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2011 1 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2012 1 39.2 -636 -20 235.2
3I_Infotech_Ltd. 2013 1 1056.3 32.1 611.9 555.8
3I_Infotech_Ltd. 2014 1 887.9 56.9 992.2 3359.1
3I_Infotech_Ltd. 2015 1 -174.9 1.2 -519.6 232.9
3I_Infotech_Ltd. 2016 1 -828.7 542.1 -1084.5 2738.2
run;
proc sort data=testdat;
by company_name year;
run;

proc panel data=testdat;
id company_name year;
class industry_code year;
model y = x1 x2 year industry_code/ pooled;
run;

View solution in original post

8 REPLIES 8
StatDave
SAS Super FREQ

If you can specify an assumed response distribution, then you can fit an appropriate model that takes account of the correlation among the observations within each company by using PROC GEE. You could fit a model that allows you to see if there are differences among the industries and differences (or trend) over time. For example, something like the following can test for industry differences and a linear trend over time, assuming that the response is normally distributed and an autoregressive correlation structure within companies. Note that validity of the GEE model requires a large number of clusters (level of SUBJECT=), so more would be needed than you show in your example data.

proc gee;
class company_name industry_code;
model y=industry_code year  x1-x3 / type3;
repeated subject=company_name / type=ar;
run;

xyz0505
SAS Employee

I'm not sure what you want, for cross section, you just want industry fix effect only or you want both industry and company effect???

 

For the first one you can do:

proc panel data=your_data_name;
  id industry year;
  model y = x1 x2 x3 / fixtwo;
run;

 

For the second one there's no built in model, you can add all dummies to the pooled model:

proc panel data=your_data_name;
  id company_name year;

  class company_name year industry;
  model y = x1 x2 x3 company_name year industry/ pooled;
run;

srikanthyadav44
Quartz | Level 8
Dear Mr.xyz0505
I want both industry and year fixed effects, but not company fixed effects.
When I use proc panel, I am getting error, because, industry code is repeated for more than one company in the same year.
It is becoming repeated measures panel data and I don't know how to handle it.
Please help me in this regard.
xyz0505
SAS Employee

The error indicate you may need to sort the data first.

 

proc sort data=your_data;
by industry year;
run;

 

PROC PANEL assumes you've already sorted data by cross section first then time.

srikanthyadav44
Quartz | Level 8

dear xyz0505

greeting of the day

when i run the panel regression with the code suggested by you. 

But, i got the following error. 

 

proc panel data=Model_file;
35 model Y = X1- X13 / fixtwo;
36 id year Industry;
37 run;

ERROR: Unable to create an instance of the TKEPAN TK Extension.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PANEL used (Total process time):
real time 0.09 seconds
cpu time 0.00 seconds

 

I am not able to understand it . 

please help me in this regard

thanks in advance 

 

xyz0505
SAS Employee

This is a system error, can you check your SAS and whether it has the ETS license? By the way I just noticed your industry id has duplicated time sequence. Proc Panel suppose you id time look like:

id time
1 1
1 2
1 3
1 4
2 1
2 2
2 3
2 4

If you have duplicated time periods in one industry I suggest you use dummy directly. Also in proc panel the id statement requires cross section id comes first and then time id:

 

data testdat;
input company_name $ year industry_code y x1 x2 x3;
datalines;
20_Microns_Ltd 2011 1 956.5 69.8 610.1 542.7
20_Microns_Ltd 2012 1 82.3 5.1 142.8 281.8
20_Microns_Ltd 2013 1 -25.3 -38.2 -98.7 123
20_Microns_Ltd 2014 1 413.7 26.1 130.4 65.8
7Seas_Entertai 2011 2 -121.2 100.9 -96.5 88
7Seas_Entertai 2012 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2013 2 -20.9 -80.7 18.6 246
7Seas_Entertai 2014 2 -2.8 -21.8 -104.8 41.4
7Seas_Entertai 2015 2 7871.9 1086.5 5092.9 5177.8
A_B_B_India_Ltd. 2011 3 -5687.4 -686 -4771.1 1111.3
A_B_B_India_Ltd. 2012 3 678.6 -263.9 -134.7 657.5
A_B_B_India_Ltd. 2013 3 3450.1 26.3 1066.3 1654.9
A_B_B_India_Ltd. 2014 3 1010.7 -41.6 1379.2 41.8
A_B_B_India_Ltd. 2015 3 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2011 1 4309.7 88.4 -2329.2 6173.1
3I_Infotech_Ltd. 2012 1 39.2 -636 -20 235.2
3I_Infotech_Ltd. 2013 1 1056.3 32.1 611.9 555.8
3I_Infotech_Ltd. 2014 1 887.9 56.9 992.2 3359.1
3I_Infotech_Ltd. 2015 1 -174.9 1.2 -519.6 232.9
3I_Infotech_Ltd. 2016 1 -828.7 542.1 -1084.5 2738.2
run;
proc sort data=testdat;
by company_name year;
run;

proc panel data=testdat;
id company_name year;
class industry_code year;
model y = x1 x2 year industry_code/ pooled;
run;

srikanthyadav44
Quartz | Level 8
dear Mr. xyz0505
thank you for your wonderful suggestion to use dummy variables for the industries groups to capture the fixed effects.
As we know, we cannot introduce the dummy variables for all the groups of industries present in the sample, because it leads to dummy variable trap.
here, i have two alternative ways
1. select one industry group as reference group and not introduce dummy variable to that reference group; or
2. introduce all the dummy variables, but not include constant in the model,

can be please suggest me which alternative is better.
thanks in advance.
xyz0505
SAS Employee

PROC PANEL drop the redundant dummies for you.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1916 views
  • 3 likes
  • 3 in conversation