Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- How to create rolling windows with different numbers of observations p...

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 12-08-2016 11:20 AM
(2763 views)

Dear SAS Community,

As a first step to perform to perform a rolling regression I would like to create rolling regression windows.

Keintz (2012) suggested following approach:

DATA rwin / view=rwin;

ws = 90;

nwin = nrecs - ws +1;

do w=1 to nwin;

do p=w to w + ws -1;

set myseries point=p nobs=nrecs;

output;

end;

end;

stop;

run;

However, I want to create annual rolling windows but with different numbers of observations per window. Window 1 must contain all firm observations from 1950-1959, window 2 all firm observation of the years 1951-1960 etc…. the last window contains all observations from 2006-2015. The number of observations differ between years.

Is it possible to solve the problem by altering the code of keintz (2012)?

Thanks very much in advance

Best regards

Anja

9 REPLIES 9

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi,

Can you post test data (form of a datastep) with some simple data, and what the output should look like. If you have a date then:

data have; thedate="01jan1950"d; output; thedate="01jan1956"d; output; thedate="01jan1962"d; output; thedate="01jan1970"d; output; run; data want; set have; window=floor((year(thedate)-1950) / 10)+1; run;

So you can get a window number based on year very simply with a formula.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hello RW9,

Thanks for your reply. This is how my data set looks like (rudimenarily)

Beob. Year FirmID Earnings LaggedEarnings 1 2 3 4 5 6 7 8 9 10 11 12 13

1970 | 1 | 100 | 0 |

1971 | 1 | 200 | 100 |

1972 | 1 | 200 | 200 |

1973 | 1 | 200 | 200 |

1974 | 1 | 200 | 200 |

1970 | 2 | 100 | 0 |

1971 | 2 | 200 | 100 |

1972 | 2 | 200 | 200 |

1970 | 3 | 100 | 0 |

1971 | 3 | 200 | 100 |

1972 | 3 | 200 | 200 |

1973 | 3 | 200 | 200 |

1974 | 3 | 200 | 200 |

And that is how the ouput should look like (but with 10 years rolling regression instead of 3 years)

Beob. Window Year FirmID Earnings LaggedEarnings 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

1 | 1970 | 1 | 100 | 0 |

1 | 1970 | 2 | 100 | 0 |

1 | 1970 | 3 | 100 | 0 |

1 | 1971 | 1 | 200 | 100 |

1 | 1971 | 2 | 200 | 100 |

1 | 1971 | 3 | 200 | 100 |

1 | 1972 | 1 | 200 | 200 |

1 | 1972 | 2 | 200 | 200 |

1 | 1972 | 3 | 200 | 200 |

2 | 1971 | 1 | 200 | 100 |

2 | 1971 | 2 | 200 | 100 |

2 | 1971 | 3 | 200 | 100 |

2 | 1972 | 1 | 200 | 200 |

2 | 1972 | 2 | 200 | 200 |

2 | 1972 | 3 | 200 | 200 |

2 | 1973 | 1 | 200 | 200 |

2 | 1973 | 3 | 200 | 2001 |

3 | 1972 | 1 | 200 | 200 |

3 | 1972 | 3 | 200 | 200 |

3 | 1973 | 1 | 200 | 200 |

3 | 1973 | 3 | 200 | 2001 |

3 | 1974 | 1 | 200 | 200 |

3 | 1974 | 3 | 200 | 200 |

Thanks !!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi,

Its a good idea to post test data in the form of a datastep.

data have; input Year FirmID Earnings LaggedEarnings; datalines; 1970 1 100 0 1971 1 200 100 1972 1 200 200 1973 1 200 200 1974 1 200 200 1970 2 100 0 1971 2 200 100 1972 2 200 200 1970 3 100 0 1971 3 200 100 1972 3 200 200 1973 3 200 200 1974 3 200 200 ; run; data want (drop=inner); set have; retain window firm_id inner; if _n_=1 then do; window=1; firm_id=1; inner=1; end; else do; firm_id=sum(firm_id,1); if firm_id > 3 then do; firm_id=1; inner=sum(inner,1); if inner > 3 then do; inner=1; window=sum(window,1); end; end; end; run;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi,

Thank you very much. Sorry, my statement was a misleading.

I want to divide my dataset by time-windows. The window "1970" should contain all firm data of years 1961-1970. The window "1971" should contain all firm data of years 1962-1971 and so on.... So in the end all firms of the year 1970 are listed in 10 different windows (1970-1979).

Afterwants I want to run a pooled regression. Like:

PROC Reg Data=want Outest=Regression;

By Window;

Run;

So that I get regression coefficients for each window.

It is not very elegant because I have a lot of doublings in my want dataset but since I am new to SAS I do not know how to code pooled rolling regression otherwise.

Thank you very much.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Something like (maybe a bit overcomplicated, don't really have time to look at it at the moment):

data have; input Year FirmID Earnings LaggedEarnings; datalines; 1970 1 100 0 1971 1 200 100 1972 1 200 200 1973 1 200 200 1974 1 200 200 1970 2 100 0 1971 2 200 100 1972 2 200 200 1970 3 100 0 1971 3 200 100 1972 3 200 200 1973 3 200 200 1974 3 200 200 1983 3 200 200 1984 3 200 200 ; run; data want; length window 8; set have (where=(1=0)); run; proc sql noprint; select min(YEAR),max(YEAR) into :LOW,:HIGH from HAVE; quit; %macro Get_Window; %local window; %let window=1; %do i=&low. %to &high. %by 10; data temp; set have (where=(&i. <= year <= %eval(&i.+10))); window=&window.; run; proc append base=want data=temp; run; %let window=%eval(&window.+1); %end; %mend; %Get_Window;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hello RW9,

thanks again. Your macro runs smoothly and it divides the data into 10-years windows (e.g All firm data from 1960-1969--> window 1, data from 1970-1979 --> window 2...).

But do you maybe habe an idea how to create rolling 10-years windows? Like window 1 1960-1969; window 2 1961-197o ....?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Try dropping the "%by 10" part. You will have to fiddle with it from there, I am travelling now.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

RW9, thank you very much. It works

You made my day!!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I'd recommend the following sequence:

- Make sure your dataset is sorted by ID/YEAR
- Make a data set VIEW containing all the needed squares and cross-products - call it SQ_CP_BY_IDYEAR
- Run a proc expand to generate 10-year rolling SSCP values for each firm
- Run proc summary to pool data over all firms
- Convert the rolling sscp into a TYPE=SSCP dataset, which PROC REG can process

The code here applies that process to sashelp.stocks (10 days instead of 10 years)

For a 10-year window this process of making 10-year rolling SSCP may not be particlularly faster than making 10-year rolling oriiginal data windows, but for daily and monthly data, where each window commonly has more records, it becomes notably superior.

```
proc sort data=sashelp.stocks
out=stocks (rename=(close=y open=ylag1 high=x1 low=x2 volume=x3 adjclose=x4));
by stock date;
run;
/* Make a data set view of squares and cross-products */
data sqcp /view=sqcp;
set stocks;
array vars {7} intercept y ylag1 x1-x4;
retain intercept 1;
array sqcp {7,7};
do r=1 to 7;
sqcp{r,r}=vars{r}**2;
if r<7 then do c=r+1 to 7;
sqcp{r,c}=vars{r}*vars{c};
sqcp{c,r}=sqcp{r,c};
end;
end;
drop r c;
run;
/* Make rolling sscp (SUM of squares & cross-prods) data for 10-day windows */
proc expand data=sqcp out=rsscp_data2 (where=(intercept=10)) method=none;
by stock;
id date;
convert _numeric_ / transform=(movsum 10);
run;
/* Pool sscp data over all stocks */
proc summary data=rsscp_data nway;
class date;
var _numeric_;
output out=pooled_rsscp_data (drop=_TYPE_ _FREQ_) sum=;
run;
/* Reformat the data as a TYPE=SSCP data set, ready for PROC REG */
data pooled_rolling_sscp (type=SSCP drop=sqcp: n v);
length _TYPE_ $8 _NAME_ $32;
set pooled_rsscp_data;
array vars {7} intercept y ylag1 x1-x4;
array sqcp {7,7};
array vnames {7} $32 _temporary_ ;
if _n_=1 then do v=1 to dim(vars); vnames{v}=vname(vars{v}); end;
_TYPE_='SSCP';
do n=1 to 7;
_NAME_=vnames{n};
do v=1 to 7; vars{v}=sqcp{n,v}; end;
output;
end;
_type_='N';
_NAME_=' ';
do v=1 to 7; vars{v}=sqcp{1,1}; end;
output;
run;
proc reg data=pooled_rolling_sscp (obs=8);
var y ylag1 x1-x4;
model y=ylag1 x1-x4;
run;
quit;
```

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.