Recursive Data


10-31-2017 02:03 PM

Hello All,

Input data set: 'developingrank' (attached), which contains time series data for N countries.

Code: I use the following code to build a data set for rolling regression (i.e., dropping earlier observations as additional observations become available):

data roll;
  set developingrank;
  i=1;
  do while(i<=141);
    output;
    rankdate=intnx('month', date, 0)-1;
    i=i+1;
  end;
run;

data roll;
  set roll;
  where ('30Aug08'd <= rankdate);
run;

Output (attached): Roll

The output of the above code is data set 'roll' (attached), so that

- the first rankdate (Dec 1996) includes data from December 1996 to August 2008 (141 observations),
- the second rankdate (Jan 1997) includes data from January 1997 to September 2008 (141 observations),
- the third rankdate (Feb 1997) includes data from February 1997 to October 2008 (141 observations), and so on.

REQUIRED CODE: I want to change the above SAS code to run recursive regression (i.e., keep the starting date, December 1996, fixed, and add an observation to the end of the sample with every run of the regression).

REQUIRED OUTPUT DATASET: The desired output dataset should be as follows:

- the first rankdate (Dec 1996) includes data from December 1996 to August 2008 (141 observations),
- the second rankdate (Jan 1997) includes data from December 1996 to September 2008 (142 observations),
- the third rankdate (Feb 1997) includes data from December 1996 to October 2008 (143 observations), and so on.

Any feedback will be appreciated. Thanks in advance.
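One brute-force way to build the requested expanding-window data set is to write each observation once for every window that contains it. The following is only a minimal sketch, not tested against the attached data: it assumes developingrank starts in December 1996, has a numeric SAS date variable named date and a country identifier named country (an assumption), and it labels each window by its end date rather than by rankdate. The macro variables first_end and last_end, the value given to last_end, and the data set name recursive are all hypothetical placeholders.

```sas
/* Hypothetical parameters: end of the first (141-month) window and of the last window */
%let first_end = 31aug2008;
%let last_end  = 31oct2016;   /* placeholder: set to the last month in your data */

data recursive;
  set developingrank;
  /* Earliest window containing this row: the window ending in the row's own
     month, or the first window if the row lies inside the initial 141 months */
  _start = max(intnx('month', date, 0, 'end'), "&first_end"d);
  /* Number of later month-end windows that also contain this row */
  _n = intck('month', _start, "&last_end"d);
  do _k = 0 to _n;            /* the loop is skipped when _n < 0 */
    end_date = intnx('month', _start, _k, 'end');
    output;
  end;
  format end_date date9.;
  drop _start _n _k;
run;

/* Group the copies so that each end_date forms one estimation sample */
proc sort data=recursive;
  by country end_date date;
run;
```

Each BY group of country and end_date then holds all observations from the start of the sample through end_date. The price is a much larger data set, because every row is copied into every window that contains it.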


All Replies


Posted in reply to Yamani

10-31-2017 02:05 PM

I am attaching the output data set (Roll)


Posted in reply to Yamani

10-31-2017 03:18 PM - edited 10-31-2017 03:19 PM

I don't have time to work on something this complex, especially to download a file and work with it. Here's an example I did for rolling means. It should give you the idea on how to do your recursive regression. Good Luck. https://gist.github.com/statgeek/e5e43ff45a4ba1f64d0873ff3bc35974


Posted in reply to Yamani

10-31-2017 11:30 PM

Are you saying that you want to do a regression for every rolling window of every size from NV+1 up to 141 months, for each country (where NV is n of vars in your model)?

In other words, if each country has 240 months of data, and 5 variables in your regression model, you want the following collection of windows

235 rolling windows of size 6

234 rolling windows of size 7

...

100 rolling windows of size 141

==============================

22,780 rolling windows per country containing 1,464,720 records

I simply don't see the point of doing this.

But on the subject of efficient performance of rolling regressions, it makes much more sense to construct rolling SSCPs, which can then be passed to PROC REG with an accompanying BY COUNTRY WINDOW_ID statement (or in your case BY COUNTRY START_DATE END_DATE). The rolling SSCP can be constructed efficiently in a single pass through the original 240 observations, adding each new observation to the SSCP while simultaneously dropping the old observation that leaves the rolling window. That's relatively trivial, but quite efficient. You have the non-trivial complication of wanting a range of window sizes.
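To make the single-pass update concrete, here is a sketch in notation that is not part of the original post: let $z_t$ be the row vector holding the intercept, the regressors, and the dependent variable for month $t$, and let $w$ be the window length. The SSCP matrix of the window ending at month $t$ follows from the previous one by adding one outer product and subtracting another:

```latex
S_t \;=\; \sum_{s=t-w+1}^{t} z_s^{\top} z_s
    \;=\; S_{t-1} \;+\; z_t^{\top} z_t \;-\; z_{t-w}^{\top} z_{t-w}
```

When the start of the sample is held fixed, nothing ever leaves the window, so the subtracted term drops out and only the running sum remains.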


Posted in reply to mkeintz

11-01-2017 10:18 AM - edited 11-01-2017 10:34 AM

Thanks for your help. Not exactly.

I have 10 countries, each with 239 months of data.

I want the following collection of 99 recursive windows for EACH country:

Country 1:

Window 1 has the first 141 observations (first 141 months)

Window 2 has the first 142 observations (first 142 months)

Window 3 has the first 143 observations (first 143 months)

...

Window 99 has the first 239 observations (first 239 months; this is the full sample period)

Country 2:

Window 1 has the first 141 observations (first 141 months)

Window 2 has the first 142 observations (first 142 months)

Window 3 has the first 143 observations (first 143 months)

...

Window 99 has the first 239 observations (first 239 months; this is the full sample period)

and so on.

==============================


Posted in reply to Yamani

11-01-2017 10:34 AM

The best way to provide data examples is in the form of data step code. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

Data presented that way removes ambiguities about character vs. numeric variables, or date values vs. character strings, and a few other issues common with pasted text "data".

Then post a second data step of the resultant data set for the example input.


Posted in reply to Yamani

11-01-2017 11:34 AM - edited 11-01-2017 11:36 AM

Here's a sample program that does "recursive" regressions using sashelp.stocks. Each regression is identified by STOCK, BEG_DATE, and END_DATE, where BEG_DATE in your case is a constant. The "trick" is not to constantly re-read observations to perform recursive regressions, but rather to pass through the data once, creating recursive SSCP's, which can then be read into PROC REG, with a BY STOCK BEG_DATE END_DATE set of by groups. Of course, you'll probably want to tell the proc reg to output datasets of estimates rather than printing to the listing file as I have below:

proc sort data=sashelp.stocks out=have;
  by stock date;
  format date date9.;
run;

%let earliest_begdate=01jan1990;
%let earliest_enddate=01jan1992;

data recursive_sscp (type=SSCP drop=_I _J _nv_plus_intercept _nobs rename=(date=end_date));
  set have (keep=stock date close high low open);
  by stock;
  where date >= "&earliest_begdate"d;
  retain beg_date ;
  format beg_date date9.;
  array vars{*} Intercept Open High Low Close;
  array vnames{5} $32 _temporary_ ('Intercept','Open','High','Low','Close');
  retain _nv_plus_intercept 5;
  array sscp{5,5} _temporary_;

  intercept=1;
  _nobs+1;

  /* New stock: remember the window start and reset the running sums */
  if first.stock then do;
    beg_date=date;
    _nobs=1;
    call missing(of sscp{*});
  end;

  /* Add the current observation to the running SSCP (kept symmetric) */
  do _I=1 to _nv_plus_intercept;
    sscp{_I,_I}+vars{_I}**2;
    if _I < _nv_plus_intercept then do _J=_I+1 to _nv_plus_intercept ;
      sscp{_I,_J}+vars{_I}*vars{_J};
      sscp{_J,_I}=sscp{_I,_J};
    end;
  end;

  /* Only write a window once it reaches the shortest acceptable length */
  if date>="&earliest_enddate"d;

  /* One SSCP row per variable ... */
  _type_='SSCP';
  do _I=1 to _nv_plus_intercept;
    _name_=vnames{_I};
    do _J=1 to _nv_plus_intercept;
      vars{_j}=sscp{_I,_J};
    end;
    output;
  end;

  /* ... followed by one row holding the number of observations */
  _type_='N';
  _name_=' ';
  do _J=1 to _nv_plus_intercept;
    vars{_J}=_nobs;
  end;
  output;
run;

proc reg data=recursive_sscp plots=none;
  by stock beg_date end_date;
  model close=open low high;
quit;

Note that not only is there a variable named _TYPE_ in data set recursive_sscp, but the data set itself is also given the type SSCP through the TYPE=SSCP data set option in the DATA statement.

[edited addition]

Also, I have set two macrovars to tell SAS when each window is to begin, and when the shortest acceptable window ends.
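As noted in this post, the estimates are usually more useful in a data set than in the listing. A minimal sketch of that variation, assuming the recursive_sscp data set built by the program above; the output data set name recursive_est is hypothetical:

```sas
/* Write one row of coefficient estimates per STOCK / BEG_DATE / END_DATE
   instead of printing every regression */
proc reg data=recursive_sscp outest=recursive_est noprint;
  by stock beg_date end_date;
  model close = open low high;
run;
quit;

/* Look at the first few windows */
proc print data=recursive_est (obs=5);
run;
```

Because the input is an SSCP summary rather than raw observations, anything that needs observation-level values (residuals, predicted values) is not available from this step.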


Posted in reply to mkeintz

11-01-2017 12:40 PM

Thanks for your reply. This is helpful.

I adjusted the code you sent to fit my data. However, the regression is estimated just once for each country, with the same beginning and ending date. I don't know how to pass through the data once.

proc sort data=ma.developed out=have;
  by country date;
  format date date9.;
run;

%let earliest_begdate=31dec1996;
%let earliest_enddate=31aug2017;

data recursive_sscp (type=SSCP drop=_I _J _nv_plus_intercept _nobs rename=(date=end_date));
  set have (keep=country date return MA);
  by country;
  where date >= "&earliest_begdate"d;
  retain beg_date ;
  format beg_date date9.;
  array vars{*} Intercept return MA;
  array vnames{3} $32 _temporary_ ('Intercept','return','MA');
  retain _nv_plus_intercept 3;
  array sscp{3,3} _temporary_;

  intercept=1;
  _nobs+1;

  if first.country then do;
    beg_date=date;
    _nobs=1;
    call missing(of sscp{*});
  end;

  do _I=1 to _nv_plus_intercept;
    sscp{_I,_I}+vars{_I}**2;
    if _I < _nv_plus_intercept then do _J=_I+1 to _nv_plus_intercept ;
      sscp{_I,_J}+vars{_I}*vars{_J};
      sscp{_J,_I}=sscp{_I,_J};
    end;
  end;

  if date>="&earliest_enddate"d;

  _type_='SSCP';
  do _I=1 to _nv_plus_intercept;
    _name_=vnames{_I};
    do _J=1 to _nv_plus_intercept;
      vars{_j}=sscp{_I,_J};
    end;
    output;
  end;

  _type_='N';
  _name_=' ';
  do _J=1 to _nv_plus_intercept;
    vars{_J}=_nobs;
  end;
  output;
run;

proc reg data=recursive_sscp plots=none;
  by country beg_date end_date;
  model RETURN = MA;
quit;

Solution

11-01-2017
01:43 PM


Posted in reply to Yamani

11-01-2017 01:33 PM

Why do you set macrovar EARLIEST_ENDDATE to August 2017? Seems to me that is the latest enddate, which is not a needed parameter for this task.

Given your earliest_begdate=31dec1996, determine the shortest permissible window size (5 months? 15 months after 31dec1996?). Then use the corresponding date as EARLIEST_ENDDATE.
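For the 141-month minimum window described in the original question, the matching end date can be derived from the begin date instead of being typed in. A small sketch; the macro variable min_months is hypothetical and not part of the posted code:

```sas
%let earliest_begdate = 31dec1996;
%let min_months       = 141;   /* shortest permissible window, in months */

/* Last day of the month that is (min_months - 1) months after the begin month */
%let earliest_enddate = %sysfunc(intnx(month,"&earliest_begdate"d,%eval(&min_months-1),end),date9.);

%put earliest_enddate=&earliest_enddate;
```

With these inputs the result should be 31AUG2008, since December 1996 plus 140 months is August 2008.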


Posted in reply to mkeintz

11-01-2017 01:43 PM

mkeintz,

Awesome!! Finally it works! I was working on this point for more than a week!

I really appreciate your help. Thanks a lot!

Here is the adjusted code, in case anyone else needs it:

proc sort data=ma.developed out=have;
  by country date;
  format date date9.;
run;

%let earliest_begdate=31dec1996;
%let earliest_enddate=31aug2008;

data recursive_sscp (type=SSCP drop=_I _J _nv_plus_intercept _nobs rename=(date=end_date));
  set have (keep=country date return MA);
  by country;
  where date >= "&earliest_begdate"d;
  retain beg_date ;
  format beg_date date9.;
  array vars{*} Intercept return MA;
  array vnames{3} $32 _temporary_ ('Intercept','return','MA');
  retain _nv_plus_intercept 3;
  array sscp{3,3} _temporary_;

  intercept=1;
  _nobs+1;

  if first.country then do;
    beg_date=date;
    _nobs=1;
    call missing(of sscp{*});
  end;

  do _I=1 to _nv_plus_intercept;
    sscp{_I,_I}+vars{_I}**2;
    if _I < _nv_plus_intercept then do _J=_I+1 to _nv_plus_intercept ;
      sscp{_I,_J}+vars{_I}*vars{_J};
      sscp{_J,_I}=sscp{_I,_J};
    end;
  end;

  if date>="&earliest_enddate"d;

  _type_='SSCP';
  do _I=1 to _nv_plus_intercept;
    _name_=vnames{_I};
    do _J=1 to _nv_plus_intercept;
      vars{_j}=sscp{_I,_J};
    end;
    output;
  end;

  _type_='N';
  _name_=' ';
  do _J=1 to _nv_plus_intercept;
    vars{_J}=_nobs;
  end;
  output;
run;

proc reg data=recursive_sscp plots=none;
  by country beg_date end_date;
  model RETURN = MA;
quit;
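A quick way to gain confidence in the SSCP-based results is to run one window directly on the raw data and compare. The sketch below is a suggested check, not part of the posted solution. It assumes the sorted data set have and the two macro variables from the code above are still available, and that the dates are month-end values so that an observation dated 31aug2008 exists.

```sas
/* Raw-data regression over the first window only, for comparison with the
   BY group end_date=31AUG2008 in the SSCP-based output */
proc reg data=have plots=none;
  where "&earliest_begdate"d <= date <= "&earliest_enddate"d;
  by country;
  model return = MA;
run;
quit;
```

If the coefficient estimates for a country agree with those of its first window in the recursive output, the accumulation logic is behaving as intended for that window.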
