I want to do a special rolling window.
I want to estimate a variable X every quarter using the 60 past observations of variable Y. This is an example the variable X of december 2015 is estimed using data from january 2010 to december 2015 of the variable Y. (if I do not have engouh data Y we can use the data if there is more that 36 observtions). then, we repeat the estimation for march 2015 using the data from april 2010 to march 2015.
Then I will have an estimate of X every quarter (march juin september and december of each year).
How can I do that?????
A bit of a blast from the past, but one of the first results when searching for 'rolling regression' is this post:
You asked a very similar question a few years ago 🙂
https://communities.sas.com/t5/SAS-Procedures/macro-for-a-rolling-window-regression/m-p/177594
Some example input data would be helpful. You should convert an existing SAS dataset (or at least enough to test your need with) as a datastep. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as a text file (txt extension) to show exactly what you have and that we can test code against.
Your case should include 3 cases, one with the 60 values, one with fewer than 60 but more than 36 and one with fewer.
Also you should show what the desired output for the example data would be.
However your description "Then I will have an estimate of X every quarter (march juin september and december of each year)." does not appear to be a rolling window, that would be a quarterly summary for the previous 3 months for each month: March, April, May etc.
It seems like you really want a per calendar quarter summary. Which if you have a SAS date value for the dates can be easily accomplished.
The task is to run rolling 60-month regressions of monthly returns.
the variable estimated at 31Dec2015, for instance, uses data from 31Jan2010 to 31Dec2015, inclusive.
so it is an estimation using a rolling window
Show data, input and desired output.
I have really a trouble to put data can you please explain for me more clearly that the link how to do it
Use PROC EXPAND with the CONVERT statement.
For what proc expand Reeza
@sasphd wrote:
For what proc expand Reeza
I don't know what that means.
Look at the example and links in the article "Compute a moving average in SAS"
If you don't have big table, try SQL.
proc sql;
create table want as
select *,(select mean(x) from have where date between intnx('month',a.date,-6,'s') and a.date and id=a.id) as rolling_6month_mean
from have as a;
quit;
I did not want to calculate a mean. I want to use the data to do regression something like that
http://pages.stern.nyu.edu/~adesouza/sasfinphd/index/node25.html
@sasphd wrote:
I did not want to calculate a mean. I want to use the data to do regression something like that
http://pages.stern.nyu.edu/~adesouza/sasfinphd/index/node25.html
I get a "Page not found" for that link.
Boehmer, Broussard, and Kallunki (2002) recommend using macros to run rolling regressions. While macros make impossible tasks possible, they aren't particularly efficient. I describe here a macro-independent way of running rolling regressions, and doing similar tasks.
The task is to run rolling 24-month regressions of monthly stock excess returns on the Fama-French-Carhart factors. That is, at the end of each month, I need to:
I first create a dataset containing ``rankdates'', which are the date identifiers for the rolling regression. A rankdate of 31Dec2001, for instance, uses data from 31Jan2000 to 31Dec2001, inclusive.
To do this, I first get the first and last date for each permno, and then create a complete list by filling in the in-between dates. I do this because if I were to just use the permno-date pairs available in MSF, if a return is missing for a particular month, then that month will not be a rankdate, even though it might be perfectly valid to make that date a rankdate.
data firstandlastdates; set crsp.msf(keep=permno date); by permno; /*MSF is always sorted by permno date*/ retain firstdate; date=intnx('month', date, 1)-1; if first.permno then firstdate=date; if last.permno then do; lastdate=date; output; end; run; data permnosrankdates(rename=(date=rankdate)); set firstandlastdates; date=firstdate; do while(date<=lastdate); output; date=intnx('month', date+1, 1)-1; end; run;
For each rankdate, I then get the list of the 24 dates from which that rankdate will use data.
data permnosrankdates; set permnosrankdates; date=rankdate; i=1; do while(i<=24); output; date=intnx('month', date, 0)-1; i=i+1; end; run;
permnosrankdates is a dataset that looks, in part, like this:
Obs PERMNO rankdate date i 1 10000 19851231 31DEC1985 1 2 10000 19851231 30NOV1985 2 3 10000 19851231 31OCT1985 3 4 10000 19851231 30SEP1985 4 5 10000 19851231 31AUG1985 5 6 10000 19851231 31JUL1985 6 7 10000 19851231 30JUN1985 7 8 10000 19851231 31MAY1985 8 9 10000 19851231 30APR1985 9 10 10000 19851231 31MAR1985 10 11 10000 19851231 28FEB1985 11 12 10000 19851231 31JAN1985 12 13 10000 19851231 31DEC1984 13 14 10000 19851231 30NOV1984 14 15 10000 19851231 31OCT1984 15 16 10000 19851231 30SEP1984 16 17 10000 19851231 31AUG1984 17 18 10000 19851231 31JUL1984 18 19 10000 19851231 30JUN1984 19 20 10000 19851231 31MAY1984 20 21 10000 19851231 30APR1984 21 22 10000 19851231 31MAR1984 22 23 10000 19851231 29FEB1984 23 24 10000 19851231 31JAN1984 24 25 10000 19860131 31JAN1986 1 26 10000 19860131 31DEC1985 2 27 10000 19860131 30NOV1985 3 28 10000 19860131 31OCT1985 4 29 10000 19860131 30SEP1985 5 30 10000 19860131 31AUG1985 6
We don't need to keep i, but I kept it for clarity.
Once we have this, all we need to do is merge it with the factors and the returns:
data ff; set ff.factors_monthly(keep=date rf smb hml umd mktrf); date=intnx('month', date, 1)-1; run; proc sort data=permnosrankdates; by date permno; data permnosrankdates; merge permnosrankdates(in=a) ff(in=b); by date; if a and b; run; data msf; set crsp.msf(keep=permno date ret); where ret is not missing; date=intnx('month', date, 1)-1; run; proc sort data=msf; by date permno; run; /*permnosrankdates is already sorted*/ data permnosrankdates; merge permnosrankdates(in=a) msf(in=b); by date permno; if a and b; run;
Notice that I merged by date, not rankdate.
And now all that remains is to calculate excess returns and run the regressions:
data permnosrankdates; set permnosrankdates; exret=ret-rf; run; proc sort data=permnosrankdates; by permno rankdate; proc reg data=permnosrankdates outest=est edf; by permno rankdate; model exret=mktrf smb hml umd; run;
Notice I run proc reg by rankdate, not date.
Running proc print on EST yields the following output:
Obs PERMNO rankdate _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept mktrf 1 10000 19860228 MODEL1 PARMS exret . -0.2624 0.000 2 10000 19860331 MODEL1 PARMS exret . 1.9027 -32.219 3 10000 19860430 MODEL1 PARMS exret . 1.4644 -31.110 4 10000 19860531 MODEL1 PARMS exret . 4.3923 2.102 5 10000 19860630 MODEL1 PARMS exret . -14.0666 -191.074 6 10000 19860731 MODEL1 PARMS exret 0.48716 -0.1872 -4.129 7 10000 19860831 MODEL1 PARMS exret 0.34479 -0.1559 -2.784 8 10000 19860930 MODEL1 PARMS exret 0.29117 -0.1216 -0.492 9 10000 19861031 MODEL1 PARMS exret 0.25216 -0.1214 -0.499 10 10000 19861130 MODEL1 PARMS exret 0.25809 -0.0767 -1.790 11 10000 19861231 MODEL1 PARMS exret 0.27266 -0.1228 -0.709 12 10000 19870131 MODEL1 PARMS exret 0.25251 -0.1217 -0.665 13 10000 19870228 MODEL1 PARMS exret 0.23828 -0.1090 -0.784 14 10000 19870331 MODEL1 PARMS exret 0.24214 -0.1270 -1.872 15 10000 19870430 MODEL1 PARMS exret 0.22978 -0.1254 -1.942 16 10000 19870531 MODEL1 PARMS exret 0.22052 -0.1194 -1.983 17 10000 19870630 MODEL1 PARMS exret 0.22052 -0.1194 -1.983 Obs smb hml umd exret _IN_ _P_ _EDF_ _RSQ_ 1 0.000 0.00 0.000 -1 0 1 0 . 2 0.000 0.00 0.000 -1 1 2 0 1.00000 3 -71.326 0.00 0.000 -1 2 3 0 1.00000 4 340.707 477.88 0.000 -1 3 4 0 1.00000 5 -884.257 -1322.51 506.315 -1 4 5 0 1.00000 6 -13.672 -16.38 7.106 -1 4 5 1 0.05020 7 -8.529 -10.71 5.470 -1 4 5 2 0.54571 8 7.327 3.24 4.754 -1 4 5 3 0.51848 9 7.303 3.22 4.751 -1 4 5 4 0.52996 10 3.312 -0.84 3.657 -1 4 5 5 0.42145 11 4.391 0.97 3.748 -1 4 5 6 0.30088 12 4.378 0.91 3.725 -1 4 5 7 0.30569 13 4.342 -0.04 3.353 -1 4 5 8 0.31433 14 0.837 -3.20 2.548 -1 4 5 9 0.26778 15 0.669 -3.35 2.531 -1 4 5 10 0.27495 16 0.604 -3.41 2.454 -1 4 5 11 0.27148 17 0.604 -3.41 2.454 -1 4 5 11 0.27148
We can tell how many observations we have for each date by looking at the _EDF_: the error degrees of freedom. We asked for this by including edf in the proc model statement. If we want to keep only regressions for which there were 12 observations or more, we keep those where _EDF_>=7.
Added by @Reeza: This is originally from:
http://pages.stern.nyu.edu/~adesouza/sasfinphd/index/node25.html
That would have been SAS 8 of some flavor I believe. I would seriously look through the SAS/ETS procedures to see which features may have been added to address your issues.
1) Make a macro to do it.
%macro reg(start=,end=);
proc reg data=have(where=(date between &start and &end));
............
%mend;
data _null_;
do i=1 to nobs-window;
call execute('%reg(start=i , end=i+window)');
end;
run;
2) IML code .
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.