Dear SAS Community,
As a first step to perform to perform a rolling regression I would like to create rolling regression windows.
Keintz (2012) suggested following approach:
DATA rwin / view=rwin;
ws = 90;
nwin = nrecs - ws +1;
do w=1 to nwin;
do p=w to w + ws -1;
set myseries point=p nobs=nrecs;
output;
end;
end;
stop;
run;
However, I want to create annual rolling windows but with different numbers of observations per window. Window 1 must contain all firm observations from 1950-1959, window 2 all firm observation of the years 1951-1960 etc…. the last window contains all observations from 2006-2015. The number of observations differ between years.
Is it possible to solve the problem by altering the code of keintz (2012)?
Thanks very much in advance
Best regards
Anja
Hi,
Can you post test data (form of a datastep) with some simple data, and what the output should look like. If you have a date then:
data have; thedate="01jan1950"d; output; thedate="01jan1956"d; output; thedate="01jan1962"d; output; thedate="01jan1970"d; output; run; data want; set have; window=floor((year(thedate)-1950) / 10)+1; run;
So you can get a window number based on year very simply with a formula.
Hello RW9,
Thanks for your reply. This is how my data set looks like (rudimenarily)
Beob. Year FirmID Earnings LaggedEarnings 1 2 3 4 5 6 7 8 9 10 11 12 13
1970 | 1 | 100 | 0 |
1971 | 1 | 200 | 100 |
1972 | 1 | 200 | 200 |
1973 | 1 | 200 | 200 |
1974 | 1 | 200 | 200 |
1970 | 2 | 100 | 0 |
1971 | 2 | 200 | 100 |
1972 | 2 | 200 | 200 |
1970 | 3 | 100 | 0 |
1971 | 3 | 200 | 100 |
1972 | 3 | 200 | 200 |
1973 | 3 | 200 | 200 |
1974 | 3 | 200 | 200 |
And that is how the ouput should look like (but with 10 years rolling regression instead of 3 years)
Beob. Window Year FirmID Earnings LaggedEarnings 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
1 | 1970 | 1 | 100 | 0 |
1 | 1970 | 2 | 100 | 0 |
1 | 1970 | 3 | 100 | 0 |
1 | 1971 | 1 | 200 | 100 |
1 | 1971 | 2 | 200 | 100 |
1 | 1971 | 3 | 200 | 100 |
1 | 1972 | 1 | 200 | 200 |
1 | 1972 | 2 | 200 | 200 |
1 | 1972 | 3 | 200 | 200 |
2 | 1971 | 1 | 200 | 100 |
2 | 1971 | 2 | 200 | 100 |
2 | 1971 | 3 | 200 | 100 |
2 | 1972 | 1 | 200 | 200 |
2 | 1972 | 2 | 200 | 200 |
2 | 1972 | 3 | 200 | 200 |
2 | 1973 | 1 | 200 | 200 |
2 | 1973 | 3 | 200 | 2001 |
3 | 1972 | 1 | 200 | 200 |
3 | 1972 | 3 | 200 | 200 |
3 | 1973 | 1 | 200 | 200 |
3 | 1973 | 3 | 200 | 2001 |
3 | 1974 | 1 | 200 | 200 |
3 | 1974 | 3 | 200 | 200 |
Thanks !!
Hi,
Its a good idea to post test data in the form of a datastep.
data have; input Year FirmID Earnings LaggedEarnings; datalines; 1970 1 100 0 1971 1 200 100 1972 1 200 200 1973 1 200 200 1974 1 200 200 1970 2 100 0 1971 2 200 100 1972 2 200 200 1970 3 100 0 1971 3 200 100 1972 3 200 200 1973 3 200 200 1974 3 200 200 ; run; data want (drop=inner); set have; retain window firm_id inner; if _n_=1 then do; window=1; firm_id=1; inner=1; end; else do; firm_id=sum(firm_id,1); if firm_id > 3 then do; firm_id=1; inner=sum(inner,1); if inner > 3 then do; inner=1; window=sum(window,1); end; end; end; run;
Hi,
Thank you very much. Sorry, my statement was a misleading.
I want to divide my dataset by time-windows. The window "1970" should contain all firm data of years 1961-1970. The window "1971" should contain all firm data of years 1962-1971 and so on.... So in the end all firms of the year 1970 are listed in 10 different windows (1970-1979).
Afterwants I want to run a pooled regression. Like:
PROC Reg Data=want Outest=Regression;
By Window;
Run;
So that I get regression coefficients for each window.
It is not very elegant because I have a lot of doublings in my want dataset but since I am new to SAS I do not know how to code pooled rolling regression otherwise.
Thank you very much.
Something like (maybe a bit overcomplicated, don't really have time to look at it at the moment):
data have; input Year FirmID Earnings LaggedEarnings; datalines; 1970 1 100 0 1971 1 200 100 1972 1 200 200 1973 1 200 200 1974 1 200 200 1970 2 100 0 1971 2 200 100 1972 2 200 200 1970 3 100 0 1971 3 200 100 1972 3 200 200 1973 3 200 200 1974 3 200 200 1983 3 200 200 1984 3 200 200 ; run; data want; length window 8; set have (where=(1=0)); run; proc sql noprint; select min(YEAR),max(YEAR) into :LOW,:HIGH from HAVE; quit; %macro Get_Window; %local window; %let window=1; %do i=&low. %to &high. %by 10; data temp; set have (where=(&i. <= year <= %eval(&i.+10))); window=&window.; run; proc append base=want data=temp; run; %let window=%eval(&window.+1); %end; %mend; %Get_Window;
Hello RW9,
thanks again. Your macro runs smoothly and it divides the data into 10-years windows (e.g All firm data from 1960-1969--> window 1, data from 1970-1979 --> window 2...).
But do you maybe habe an idea how to create rolling 10-years windows? Like window 1 1960-1969; window 2 1961-197o ....?
Try dropping the "%by 10" part. You will have to fiddle with it from there, I am travelling now.
RW9, thank you very much. It works
You made my day!!
I'd recommend the following sequence:
The code here applies that process to sashelp.stocks (10 days instead of 10 years)
For a 10-year window this process of making 10-year rolling SSCP may not be particlularly faster than making 10-year rolling oriiginal data windows, but for daily and monthly data, where each window commonly has more records, it becomes notably superior.
proc sort data=sashelp.stocks
out=stocks (rename=(close=y open=ylag1 high=x1 low=x2 volume=x3 adjclose=x4));
by stock date;
run;
/* Make a data set view of squares and cross-products */
data sqcp /view=sqcp;
set stocks;
array vars {7} intercept y ylag1 x1-x4;
retain intercept 1;
array sqcp {7,7};
do r=1 to 7;
sqcp{r,r}=vars{r}**2;
if r<7 then do c=r+1 to 7;
sqcp{r,c}=vars{r}*vars{c};
sqcp{c,r}=sqcp{r,c};
end;
end;
drop r c;
run;
/* Make rolling sscp (SUM of squares & cross-prods) data for 10-day windows */
proc expand data=sqcp out=rsscp_data2 (where=(intercept=10)) method=none;
by stock;
id date;
convert _numeric_ / transform=(movsum 10);
run;
/* Pool sscp data over all stocks */
proc summary data=rsscp_data nway;
class date;
var _numeric_;
output out=pooled_rsscp_data (drop=_TYPE_ _FREQ_) sum=;
run;
/* Reformat the data as a TYPE=SSCP data set, ready for PROC REG */
data pooled_rolling_sscp (type=SSCP drop=sqcp: n v);
length _TYPE_ $8 _NAME_ $32;
set pooled_rsscp_data;
array vars {7} intercept y ylag1 x1-x4;
array sqcp {7,7};
array vnames {7} $32 _temporary_ ;
if _n_=1 then do v=1 to dim(vars); vnames{v}=vname(vars{v}); end;
_TYPE_='SSCP';
do n=1 to 7;
_NAME_=vnames{n};
do v=1 to 7; vars{v}=sqcp{n,v}; end;
output;
end;
_type_='N';
_NAME_=' ';
do v=1 to 7; vars{v}=sqcp{1,1}; end;
output;
run;
proc reg data=pooled_rolling_sscp (obs=8);
var y ylag1 x1-x4;
model y=ylag1 x1-x4;
run;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.