Why do you set macrovar EARLIEST_ENDDATE to August 2017? Seems to me that is the latest enddate, which is not a needed parameter for this task.
Given your earliest_begdate=31dec1996, deterimine what window size is the shortest permisssable (5 months, 15 months after 31dec1996?). Then use the corresponding date as EARLIEST_ENDDATE.
I don't have time to work on something this complex, especially to download a file and work with it. Here's an example I did for rolling means. It should give you the idea on how to do your recursive regression. Good Luck. https://gist.github.com/statgeek/e5e43ff45a4ba1f64d0873ff3bc35974
Are you saying that you want to do a regression for every rolling window of every size from NV+1 up to 141 months, for each country (where NV is n of vars in your model)?
In other words, if each country has 240 months of data, and 5 variables in your regression model, you want the following collection of windows
235 rolling windows of size 6
234 rolling windows of size 7
...
100 rolling windows of size 141
==============================
22,780 rolling windows per country containing 1,464,720 records
I simply don't see the point of doing this.
But on the subject of efficient performance of rolling regressions, it makes much more sense to construct rolling SSCP's which can then be passed to PROC REG, with an accompanying BY COUNTRY WINDOW_ID statement (or in your case BY COUNTRY START_DATE END_DATE). The rolling SSCP can be efficiently constructed by single pass through the original 240 observations, adding a new observation to the SSCP and simultaneously dropping the old observation leaving the rolling window. That's relatively trivial, but quite efficient You have the non-trivial complication of wanting a range of window sizes.
Thanks for your help. Not exactly.
I have 10 countries with each country has 239 months of data,
I want the following collection of 98 recursive windows for EACH country, as follows:
Country 1:
Window 1 has the first 141 observations (first 141 months)
Window 2 has the first 142 observations (first 142 months)
Window 3 has the first 143 observations (first 141 months)
...
Window 98 has the first 239 observations (first 239 months: and this is the full sample period)
Country 2:
Window 1 has the first 141 observations (first 141 months)
Window 2 has the first 142 observations (first 142 months)
Window 3 has the first 143 observations (first 141 months)
...
Window 98 has the first 239 observations (first 239 months: and this is the full sample period)
an so on.....
==============================
The best way to provide data examples is in the form of data step code. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
Data presented that way removes ambiguities of character/numeric or date value vs character and a few other issues common with pasted text "data'.
Then post a second data step of the resultant data set for the example input.
Here's a sample program that does "recursive" regressions using sashelp.stocks. Each regression is identified by STOCK, BEG_DATE, and END_DATE, where BEG_DATE in your case is a constant. The "trick" is not to constantly re-read observations to perform recursive regressions, but rather to pass through the data once, creating recursive SSCP's, which can then be read into PROC REG, with a BY STOCK BEG_DATE END_DATE set of by groups. Of course, you'll probably want to tell the proc reg to output datasets of estimates rather than printing to the listing file as I have below:
proc sort data=sashelp.stocks out=have;
by stock date;
format date date9.;
run;
%let earliest_begdate=01jan1990;
%let earliest_enddate=01jan1992;
data recursive_sscp (type=SSCP drop=_I _J _nv_plus_intercept _nobs rename=(date=end_date));
set have (keep=stock date close high low open);
by stock;
where date >= "&earliest_begdate"d;
retain beg_date ;
format beg_date date9.;
array vars{*} Intercept Open High Low Close;
array vnames{5} $32 _temporary_ ('Intercept','Open','High','Low','Close');
retain _nv_plus_intercept 5;
array sscp{5,5} _temporary_;
intercept=1;
_nobs+1;
if first.stock then do;
beg_date=date;
_nobs=1;
call missing(of sscp{*});
end;
do _I=1 to _nv_plus_intercept;
sscp{_I,_I}+vars{_I}**2;
if _I < _nv_plus_intercept then do _J=_I+1 to _nv_plus_intercept ;
sscp{_I,_J}+vars{_I}*vars{_J};
sscp{_J,_I}=sscp{_I,_J};
end;
end;
if date>="&earliest_enddate"d;
_type_='SSCP';
do _I=1 to _nv_plus_intercept;
_name_=vnames{_I};
do _J=1 to _nv_plus_intercept;
vars{_j}=sscp{_I,_J};
end;
output;
end;
_type_='N';
_name_=' ';
do _J=1 to _nv_plus_intercept;
vars{_J}=_nobs;
end;
output;
run;
proc reg data=recursive_sscp plots=none;
by stock beg_date end_date;
model close=open low high;
quit;
Note that not only is there a variable named _TYPE_ in data set recursive_sscp, but also the dataset itself has TYPE=SSCP (run a proc contents on it and you'll see). The longer your recursive windows become, the more likely you'll see performance improvements.
[edited addition]
Also I have set to macrovars to tell SAS when the beginning of each window is to be, and also the range of the shortest acceptable window.
Why do you set macrovar EARLIEST_ENDDATE to August 2017? Seems to me that is the latest enddate, which is not a needed parameter for this task.
Given your earliest_begdate=31dec1996, deterimine what window size is the shortest permisssable (5 months, 15 months after 31dec1996?). Then use the corresponding date as EARLIEST_ENDDATE.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.