14 hours ago
Dear all,
I am having a problem using PROC PANEL to run a GMM model. I have a dataset (attached) with the following variables:
GVKEY = firm identification code
FYEAR = financial year
Samples:
DIV = dummy variable 1 or 0
Dependent variable:
RD = research & development costs, R&D
Independent variables:
RDRD = quadratic term of R&D
MB = market to book ratio
SGWTH = sales growth
CHG_WC = change of main effect
STKISSUES = stock issues
My requirements are as follows:
1. I wish to run a GMM with RD as the dependent variable. Lagged RD should be included as an independent variable, and the model should be estimated for two samples, DIV = 0 and DIV = 1.
2. No lagged values of the independent variables RDRD, MB and SGWTH.
3. Lagged values of the independent variables CHG_WC and STKISSUES are to be generated.
4. Instrument validity is to be assessed using the AR and Sargan tests.
I am using the following program to run the regression; however, no results were generated because of a shortage of memory:
proc sort data=want1;
by gvkey fyear;
run;
proc panel data=FC.want1;
by DIV;
id gvkey fyear;
model RD = LAGRD RDRD MB SGWTH CHG_WC STKISSUES
      / gmm nolevels twostep maxband=5 artest=2;
run;
I am not sure whether there is any problem with the above program. Can PROC PANEL generate the lagged variables of CHG_WC and STKISSUES automatically?
Thank you, and I hope to get a reply soon.
MSPAK
I would advise three things to do before proceeding with this analysis.
1. There are many observations with missing values in one or more of the variables in the regression. Deleting those observations creates significant gaps in the time series for each firm. By default, PROC PANEL will calculate lags based on observations that are consecutive in time, regardless of the time gap. If this is what you want, then you are good. If not, then this would disqualify a bigger subset of firms from the analysis.
2. I would advise against using DIV as a BY-group since DIV varies within firms. Doing so would split the data in two and just exacerbate the problem in 1. I would recommend using DIV as a covariate in the model instead.
3. You can create lagged variables in PROC PANEL by using what I call "dry-run" mode. For example,
proc panel data = want1;
id gvkey fyear;
lag rd(1) chg_wc(1) stkissues(1) / out = want2;
run;
I would start there. Please email if you have any further questions.
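As a rough sketch of how the two suggestions above might fit together: the step below first checks what the LAG statement wrote to want2, then fits a single model with DIV as a covariate instead of a BY-group. The lagged column names (rd_1, chg_wc_1, stkissues_1) are an assumption based on the X1_1 naming pattern used elsewhere in this thread, so please confirm them in the PROC CONTENTS output before running the model. The MODEL options are simply the ones from the original post.

```sas
/* Inspect the output of the "dry-run" LAG step.                      */
/* Assumption: lagged columns are named rd_1, chg_wc_1, stkissues_1.  */
proc contents data=want2;
run;

/* One pooled model with DIV as a covariate rather than a BY-group.   */
/* Options after the slash are copied from the original post.         */
proc panel data=want2;
   id gvkey fyear;
   model rd = rd_1 rdrd mb sgwth chg_wc chg_wc_1 stkissues stkissues_1 div
         / gmm nolevels twostep maxband=5 artest=2;
run;
```

Keeping MAXBAND= small limits the number of moment conditions, which may also help with the memory shortage reported in the original post.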
Thank you, bobby, for your suggested program for generating lags using PROC PANEL.
I have another question on PROC PANEL.
The user's guide states that the DEPVAR option in the INSTRUMENTS statement specifies instruments related to the dependent variable. With nothing specified, both level and differenced dependent variables are included in the instrument matrix. My question is:
If nothing is specified for the DEPVAR option, how do I know how many lags of the dependent variable are used in levels and in differences? Does SAS provide an option to show the number of lags used?
Thank you.
MSPAK
Yes, several options control lags/leads in the instrument matrix.
The MAXBAND= option specifies the maximum number of time periods (per instrumental variable) that are allowed into the moment conditions. This applies to both dependent and independent variables. To fully understand how these moment conditions are created, you might also be interested in checking the BANDOPT= option.
BANDOPT=CENTERED | LEADING | TRAILING specifies which observations are included in the instrument list when the MAXBAND= option is specified. You can specify the following values: CENTERED uses both leading and trailing observations. LEADING uses only leading observations. TRAILING uses only trailing observations.
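A minimal sketch of how these options might be combined, assuming the want2 data set and the rd_1, chg_wc_1 and stkissues_1 lag names from earlier in this thread (both are assumptions to verify against your own data). The MAXBAND= and BANDOPT= values here are illustrative only, not a recommendation.

```sas
/* Sketch only: restrict the instrument band for a GMM fit.          */
/* Assumption: want2 holds the lagged columns rd_1, chg_wc_1 and     */
/* stkissues_1 created by the LAG statement.                         */
proc panel data=want2;
   id gvkey fyear;
   instruments depvar;     /* bare DEPVAR: level and differenced instruments */
   model rd = rd_1 rdrd mb sgwth chg_wc chg_wc_1 stkissues stkissues_1
         / gmm twostep
           maxband=3          /* at most 3 time periods per instrument */
           bandopt=trailing   /* use only trailing observations        */
           artest=2;
run;
```

Rerunning with a different MAXBAND= value and comparing the output is one way to see how the band limit changes the instrument set.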
Thank you, bbridgerb, for the reply. I will try to run the regression and see whether I can find out how many lagged levels are used as instruments for the equation in differences and how many lagged differences are used as instruments for the equation in levels.
I have a few more questions:
1. Consider the following model:
Y = β1Y(t-1) + β2X1(t) + β3X1(t-1) + β4X2(t) + β5X2(t-1) + ...
After running the GMM regression using PROC PANEL, I should be able to obtain all the coefficients β1, β2, and so on.
I can get the combined coefficient for X1 by adding β2 + β3. However, how can I obtain the combined p-value for X1? I do not think I can simply add the two p-values. Could anyone suggest an option in PROC PANEL, or any other SAS procedure that can be used for GMM?
2. How do I get the difference-in-Hansen test (p-value) from the GMM if I run the above regression for two samples, for example Big and Small firms? If I need to make a comparison, do I need to run the GMM for these two samples in a single program?
Thank you.
MSPAK
1. To get p-values and other inference for combinations of parameter estimates (e.g. Beta_2 + Beta_3), you can issue a TEST statement, such as
TEST X1 + X1_1 = 0;
which will give you the p-value for the combined effect Beta_2 + Beta_3.
2. For testing across two sub-samples, there is nothing currently in PROC PANEL to do this directly. As an alternative, I would fit it as one regression, placing an indicator variable for small vs. big firm any place that will take it. That is, you would have a main effect for this indicator, and then interact that indicator with all the regressors in the model.
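Both suggestions could be sketched roughly as follows. Everything named here is hypothetical: SMALL is an assumed 0/1 indicator for small firms that would need to exist in your data, the pooled data set and the sm_ interaction columns are made up for illustration, and the lag names follow the X1_1 pattern above. Only CHG_WC is interacted to keep the sketch short; in practice you would interact the indicator with every regressor.

```sas
/* Hypothetical: SMALL is a 0/1 small-firm indicator already in want2. */
data pooled;
   set want2;
   sm_rd_1     = small * rd_1;      /* indicator x lagged dependent var */
   sm_chg_wc   = small * chg_wc;    /* indicator x current CHG_WC       */
   sm_chg_wc_1 = small * chg_wc_1;  /* indicator x lagged CHG_WC        */
run;

proc panel data=pooled;
   id gvkey fyear;
   model rd = rd_1 chg_wc chg_wc_1
              small sm_rd_1 sm_chg_wc sm_chg_wc_1
         / gmm nolevels twostep maxband=5;
   /* Combined effect of CHG_WC: current plus lagged coefficient */
   test chg_wc + chg_wc_1 = 0;
   /* Joint test that the small-firm interactions are all zero */
   test sm_rd_1 = 0, sm_chg_wc = 0, sm_chg_wc_1 = 0;
run;
```

The second TEST statement is one way to compare the two groups inside a single regression. It is not a difference-in-Hansen test.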