Dear all,

I need your help finding an adequate proc to analyze a panel dataset with several thousand firm-year observations. My dataset, of which I have also attached a small fragment of 100 observations, includes the variables firm_id, year, Industry_id, the dependent variable Y and three independent variables X1-X3. My goal is to run an industry and year fixed effects regression with standard errors clustered at the firm level. I have seen that several options are possible, but I wonder whether I have understood them correctly, how they differ, and which would be appropriate for me to use.

Option 1: Proc surveyreg

proc surveyreg data=testset;
cluster firm_id;
class Industry_id Year;
model Y = X1 X2 X3 / solution;
run;
quit;

The problem I see here is that proc surveyreg is mainly meant for analyzing survey data, not regular panel data. Or am I wrong, and this should not be of any concern? Furthermore, is it correct that the cluster statement is responsible for the firm-level clustering and the class statement for the fixed effects? If not, what exactly is the class statement doing in this case?

Option 2: Proc GLM

proc glm data=testset;
class industry_id year;
model Y = X1 X2 X3 /solution;
run;
quit;

Unfortunately, I don't see how to cluster by firm with this option. Is there a statement for that? Also, is the class statement correct here for a year and industry FE regression? Otherwise, would the absorb statement be the correct way to account for fixed effects?

Option 3: Proc panel

proc sort data=testset;
by firm_id year;
run;
proc panel data=testset;
id firm_id year;
model Y = X1 X2 X3 /Fixtwo;
run;

After sorting the dataset, I tried proc panel, but here I don't see where to include the Industry_id variable and thus account for the industry FE. Also, is my assumption correct that the Fixtwo option accounts for 1) the firm FE (due to firm_id) and 2) the year FE? How could I instead include Industry_id as a FE and cluster the standard errors at the firm level?

A fourth option might be proc tscsreg:

proc tscsreg data=testset;
id firm_id year;
model Y = X1 X2 X3 /fixtwo;
run;
quit;

But again, here I don't cluster the standard errors, and I would again account for firm and year FE instead of an industry FE.

So in general, my problem is that I can't find a way to combine the clustering with the FE approach, and I also don't know how to include the Industry_id variable to get an industry FE regression. Has anybody already run this kind of regression, and/or could you please help me with this problem?
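
For reference, this is the variant of Option 1 that I would try next. It is only a sketch and rests on one assumption on my part: that the class statement merely declares Industry_id and Year as categorical, and that they only enter the regression as fixed effects (one dummy per level) once they are also listed on the right-hand side of the model statement. The cluster statement would then be what produces the firm-level clustered standard errors.

```sas
/* Sketch only: industry and year FE as dummy sets, SEs clustered by firm.   */
/* Assumption: class variables act as FE only when they appear in the model. */
proc surveyreg data=testset;
   cluster firm_id;                 /* firm-level clustering of the SEs      */
   class Industry_id Year;          /* declare both as categorical           */
   model Y = X1 X2 X3 Industry_id Year / solution;
run;
```

If this is right, the coefficients on X1-X3 should match a plain dummy-variable regression, and only the standard errors would change because of the cluster statement.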