I have firm-year observations with industry data. Using this data, I have created an industry*year interaction called 'group1'. I would like to
1) cluster standard errors by firm and year
2) include firm and group1 fixed effects
What I used before (which worked when the fixed effects were only industry and year) is below:
proc surveyreg data=work;
   cluster fyear gvkey;
   class gvkey group1;
   model dv=iv1 iv2 iv3 gvkey group1 / adjrsq solution;
run;
This no longer works because I have too many variables for fixed effects (both firm and group1). I also cannot do 2-way clustering (fyear and gvkey) in this procedure.
I have been downloading the dataset and running the tests in Stata, but I believe there must be a way to get the same results using SAS. Would anyone be able to help me with any suggestions? Thank you in advance.
Multilevel (hierarchical) linear models are generally fit by using PROC MIXED (or GLIMMIX for certain response variables). In PROC MIXED you can use the RANDOM statement (and SUBJECT= option) to specify the nested relationships, such as students within classroom within school within districts.
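As a rough illustration of the RANDOM statement with SUBJECT=, here is a minimal sketch of a random-intercept model for the students/classroom/school/district example. The data set name (work.students) and the variables score, ses, district, school, and classroom are hypothetical, not taken from your data, so treat this as a starting point rather than a recommended specification.

```sas
/* Hypothetical data: one row per student, with IDs for the    */
/* district, school, and classroom the student belongs to.     */
proc mixed data=work.students method=reml;
   class district school classroom;
   /* Fixed effect of a student-level covariate */
   model score = ses / solution;
   /* One random intercept per level of the nesting */
   random intercept / subject=district;
   random intercept / subject=school(district);
   random intercept / subject=classroom(school district);
run;
```

Each RANDOM statement adds a variance component for one level of the hierarchy, and the nested SUBJECT= effects tell the procedure that school IDs are only unique within a district, and classroom IDs only within a school.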
There are many papers that you can find if you search for
sas proceedings hierarchical models "proc mixed"
such as
Using PROC MIXED in Hierarchical Linear Models
A Multilevel Model Primer Using SAS PROC MIXED
There are also papers that show how to use PROC GLIMMIX for generalized linear models and PROC NLMIXED for nonlinear models. Replace "proc mixed" in the search with the relevant terms if you are interested in those topics.
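For a non-normal response, the PROC GLIMMIX syntax looks much the same. The sketch below assumes a hypothetical binary outcome named passed (coded 0/1) and a single grouping variable school in a hypothetical data set work.students; none of these names come from the original question.

```sas
/* Hypothetical data: binary outcome 'passed', students nested in schools */
proc glimmix data=work.students method=laplace;
   class school;
   /* Logistic model for the probability that passed = 1 */
   model passed(event='1') = ses / dist=binary link=logit solution;
   /* Random intercept for each school */
   random intercept / subject=school;
run;
```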
Since you mention STATA, I will mention that there is an excellent book that compares mixed models in different software packages: West, B. T., Welch, K. B., & Galecki, A. T. (2015). Linear mixed models: A practical guide using statistical software (2nd ed.). Boca Raton, FL: CRC Press. It compares SAS, SPSS, Stata, R/S-plus, and HLM.