Hi
I have run a regression on a dataset that contains two different IDs, with one X and one Y variable. I get an intercept and a slope coefficient for each of the two IDs. My ultimate goal is to take the difference between the two intercepts, b0(ID5) - b0(ID1), and obtain the Newey-West t-statistic of this difference. Please guide me in this regard, thanks.
Here is the code I am using for the regression:
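ods exclude all;                      /* suppress printed output */
proc model data=Have;
   by ID;                             /* fit the model separately for each ID */
   endo Y;
   exog X;
   instruments _exog_;
   parms b0 b1;
   Y = b0 + b1*X;
   /* GMM with a Bartlett kernel gives Newey-West-type (HAC) standard errors */
   fit Y / gmm kernel=(bart,5,0) vardef=n;
   ods output parameterestimates=Want;
run;
ods exclude none;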
What is the model? Are the Month and Year variables important? Do you want the slopes to depend on the ID, or just the intercepts?
If the model is Y = X and the ID only affects the intercept, you can use
proc glm data=Have;
   class ID(ref='5');
   model Y = X ID / solution;
run;
In the ParameterEstimates table, the row for 'ID 1' shows the difference between the ID=1 level and the reference level (which is ID=5). The 't Value' column gives the value of the t statistic and the 'Pr > |t|' column gives the p-value for that statistic.
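If you want the difference reported directly in the direction you asked for, b0(ID5) - b0(ID1), an ESTIMATE statement is one option. This is only a sketch: it assumes ID has exactly the two levels 1 and 5 (so the coefficients are listed in the order 1, 5), and the t statistic it reports is the ordinary OLS one, not a Newey-West statistic.

```sas
proc glm data=Have;
   class ID(ref='5');
   model Y = X ID / solution;
   /* Assumption: ID has only the levels 1 and 5, in that order. */
   /* Coefficients follow the level order: -1*(ID=1) + 1*(ID=5). */
   estimate 'b0(ID5) - b0(ID1)' ID -1 1;
run;
quit;
```

The estimate should simply be the negative of the 'ID 1' row in the ParameterEstimates table, with the same standard error.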
Do you want OLS or time series regression?
For OLS regression, you can use the CLASS statement and set the reference level of the ID variable to ID1. For example, the following sets the reference level for the TYPE variable in an OLS regression:
proc glm data=sashelp.cars;
   where type in ('Sedan' 'SUV' 'Truck' 'Wagon' 'Sports');
   class type(ref='SUV'); /* set reference level for classification variable */
   model mpg_city = weight type / solution;
run;
I believe I answered your first question. The intercept for ID=5 is the Intercept term in the model. The intercept for ID=1 is (Intercept + the estimate for ID=1). If you also want the slope to depend on the ID variable, change the model to
MODEL Y = X | ID / solution;
I don't know anything about Newey-West corrected standard errors, but you can ask questions about PROC MODEL and time series in the SAS Forecasting and Econometric community. From Wikipedia, it appears that Newey-West corrections are for autocorrelated and heteroskedastic errors in time series data. OLS regression assumes that you do not have autocorrelation or heteroskedasticity.
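For what it is worth, the dummy-variable idea above could in principle be carried over to the PROC MODEL code from the original question, so that the intercept difference becomes a parameter of its own. The following is an untested sketch, not a verified recipe. It assumes that ID is numeric, that Have is sorted by ID and in time order within each ID, and that Stacked, D5, D5X, d0, and d1 are names made up for illustration. Note also that stacking the two IDs makes the kernel treat the data as one long series and ignores any correlation between the two IDs at the same date, so please confirm the approach with the econometrics experts.

```sas
/* Assumption: ID is numeric; D5 and D5X are hypothetical helper variables */
data Stacked;
   set Have;
   where ID in (1, 5);
   D5  = (ID = 5);      /* 1 for ID 5, 0 for ID 1 */
   D5X = D5 * X;        /* lets the slope differ by ID */
run;

proc model data=Stacked;
   endo Y;
   exog X D5 D5X;
   instruments _exog_;
   parms b0 b1 d0 d1;
   /* b0, b1: intercept and slope for ID 1                    */
   /* d0 = b0(ID5) - b0(ID1), d1 = b1(ID5) - b1(ID1)          */
   Y = b0 + d0*D5 + b1*X + d1*D5X;
   /* same GMM / Bartlett kernel settings as the original code */
   fit Y / gmm kernel=(bart,5,0) vardef=n;
   ods output parameterestimates=Want;
run;
quit;
```

Under these assumptions, the row for d0 in the parameter estimates would hold the intercept difference together with a t statistic based on the kernel-corrected covariance.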