Yamani
Obsidian | Level 7

Hello all,

 

I want to create a data set that will allow me to run recursive regression.

 

********************************************************* Data Set *****************************************

 

* My dataset is panel monthly data with 10 countries

* Sample period: From December 1996 to August 2017.

 

Here is an example to show my data set.

Date                  Country     Y       X      Number      Observation

12/31/1996       Australia                           1                   1

01/31/1997      Australia                            1                   2

......

.....

.....

08/31/2017     Australia                              1                   250

12/31/1996     Canada                               2                   1

........

.......

08/31/2017     Canada                               2                  250

......

.....

......

12/31/1996     UK                                      10                   1

.......

......

......

08/31/2017     UK                                      10                   250

 

************************************ Goal: Run Recursive Regression ************************************

 

* I want to run several recursive regressions for EACH country, i.e., keep the starting date (December 1996) fixed and add one observation to the end of the sample with every run of the regression.

More specifically:

- the first regression is run with data from December 1996 to August 2008,

- the second regression is run with data from December 1996 to September 2008,

- the third regression is run with data from December 1996 to October 2008,

...........................

........................

........................

- the last regression is run with data from December 1996 to August 2017.

 

I was thinking of creating a new date variable (call it Rankdate) that shows the ending date of each recursive regression, as follows:

    Rankdate                     Date             Country     Y       X       Number      Observations

 August 2008             12/31/1996       Australia                            1                   1

 August 2008              01/31/1997      Australia                            1                   2

......

August 2008               08/31/2008      Australia                           1                 141

September 2008         12/31/1996       Australia                          1                 142

.....

September 2008         09/30/2008     Australia                              1                 283

...................

.....................

August 2017                  12/31/1996       Australia                          1 

............

August 2017                  08/31/2017       Australia                          1

August 2008                  12/31/1996        Canada                             2                   1

........

.......

                                  

......

.....

......

August 2017            08/31/2017                   UK                            10

 

After having the above dataset, I can easily run PROC REG or PROC PANEL by RANKDATE and by COUNTRY.

 

My problem, however, is how to create the above data set.

 

Thanks for your help in advance.                              

7 REPLIES 7
Reeza
Super User

You're asking too broad a question. I would first explore SAS/ETS to make sure there isn't a PROC that does what you want. I'm not familiar enough with it to answer that question.

 

Second, build your macro to do a single regression with your data parameters. 

 

Third, build your data set with the start/end dates needed.

 

Fourth, use Call Execute and your data set from Step3 plus the macro from Step2 to run all your regressions. 
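
The steps above could be sketched roughly as follows. This is only a minimal illustration, not tested against your data: the macro name %recreg, the data set names sampledata and enddates, the model y = x, and the Aug 2008 / Aug 2017 end dates are assumptions taken from the description in the first post, and sampledata is assumed to be sorted by country.

```sas
/* Step 2 (hypothetical macro): one regression on a window with a fixed start */
%macro recreg(end=);
    proc reg data=sampledata noprint outest=est_&end.;
        where date <= &end.;     /* only the end date moves between runs */
        by country;              /* assumes sampledata is sorted by country */
        model y = x;
    run;
    quit;
%mend recreg;

/* Step 3: one row per window end date (month-ends, Aug 2008 to Aug 2017) */
data enddates;
    do k = 0 to intck('month', '31AUG2008'd, '31AUG2017'd);
        enddate = intnx('month', '31AUG2008'd, k, 'end');
        output;
    end;
    format enddate date9.;
run;

/* Step 4: queue one macro call per end date */
data _null_;
    set enddates;
    /* %nrstr delays macro execution until after this data step finishes */
    call execute(cats('%nrstr(%recreg)(end=', enddate, ')'));
run;
```

Each call writes its estimates to a separate data set named est_ followed by the numeric SAS date value, so the results would still need to be stacked afterwards.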

 

 

 

 

Yamani
Obsidian | Level 7

Hi Reeza,

 

Thanks for your reply. All I need is to build my data set with the required start/end dates, as described in my first post. Once I have the data, I can run the regressions. The problem, however, is that I don't know how to build the data.

Reeza
Super User

Then show us what you have, very clearly and preferably in a data step. 

And show exactly what output you'd expect from that input.

 

So simplify the problem to single (or several) cases that illustrate your issue and the expected output and we can help from there. 

 

You'll likely need a DO loop along with the INTNX function which increments months. 
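
A DO loop with INTNX along those lines might look like the sketch below. It is an untested illustration under stated assumptions: the input is a hypothetical panel sampledata with variables country, date, y and x (one row per country and month), the first window ends in August 2008, and the last ends in August 2017. Each observation is copied into every window whose end date falls on or after it, which keeps the start date fixed while the end date grows.

```sas
data recursive;
    set sampledata;   /* assumed: one row per country and month */

    /* earliest window containing this row: it ends at Aug 2008,   */
    /* or at this row's own month-end if the row is later than that */
    rankdate = max(intnx('month', date, 0, 'end'), '31AUG2008'd);

    do while (rankdate <= '31AUG2017'd);
        output;                                         /* row belongs to this window */
        rankdate = intnx('month', rankdate, 1, 'end');  /* next month-end */
    end;

    format rankdate date9.;
run;

/* order the result for BY-group processing */
proc sort data=recursive;
    by country rankdate date;
run;
```

The fourth INTNX argument ('end') aligns the result to the last day of the month, so there is no need for the add-a-month-and-subtract-one-day arithmetic.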

Yamani
Obsidian | Level 7

Hi Reeza,

 

Thanks again for your willingness to help. I really appreciate it.

 

******************************** First: Data set

I am attaching a sample of my data set (sampledata) for two countries: Australia and Canada

 

******************************** Second: Analysis

 

I built one data set to run rolling regression (which drops earlier observations as additional observations become available).

 

Here is the code I used to build the data set for rolling regression (it works fine):

 

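/* One row per country holding its first and last month-end dates */
data firstandlastdates;
    set sampledata(keep=country date);
    by country;
    retain firstdate;
    date = intnx('month', date, 1) - 1;      /* snap to the last day of the month */
    if first.country then firstdate = date;
    if last.country then do;
        lastdate = date;
        output;
    end;
run;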

 

 

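/* One row per country and month-end between firstdate and lastdate */
data developedrank(rename=(date=rankdate));
    set firstandlastdates;
    date = firstdate;
    do while (date <= lastdate);
        output;
        date = intnx('month', date + 1, 1) - 1;   /* next month-end */
    end;
run;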

 

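/* For each rankdate, step back month by month to list the 141 dates in its window */
data sampleresults(drop=firstdate lastdate);
    set developedrank;
    format date DATE9.;
    date = rankdate;
    i = 1;
    do while (i <= 141);
        output;
        date = intnx('month', date, 0) - 1;       /* previous month-end */
        i = i + 1;
    end;
run;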

 

For your reference, I am attaching the 'sampleresults' file (the output from the above code)

 


******************** Third: Required

I want to adjust the above code to use it to run recursive regression rather than rolling regression:

 

Specifically, I want to adjust the code so that it keeps the starting date (i.e., December 1996) fixed and adds one observation to the end of the sample with every run of the regression.

Yamani
Obsidian | Level 7

I am also attaching my sample data.

Yamani
Obsidian | Level 7
Hello All,

Let me simplify my previous question:

******************** Input data set: 'developingrank'

It is attached and includes time series data for N countries.

******************** Code

I use the following code to build a data set for rolling regression (i.e., dropping earlier observations as additional observations become available):

data roll;
    set developingrank;
    i=1;
    do while(i<=141);
        output;
        rankdate=intnx('month', date, 0)-1;
        i=i+1;
    end;
run;

data roll;
    set roll;
    where ('30Aug08'd <= rankdate);
run;

******************** Output (attached): Roll

The output of the above code is data set 'roll' (attached), so that

- the first rankdate (Dec 1996) includes data from December 1996 to August 2008 (141 observations),

- the second rankdate (Jan 1997) includes data from January 1997 to September 2008 (141 observations),

- the third rankdate (Feb 1997) includes data from February 1997 to October 2008 (141 observations), and so on.

******************** REQUIRED CODE

I want to change the above SAS code to run recursive regression, i.e., keep the starting date (December 1996) fixed and add one observation to the end of the sample with every run of the regression.

******************** REQUIRED OUTPUT DATASET

So, the desired output dataset should be as follows:

- the first rankdate (Dec 1996) includes data from December 1996 to August 2008 (141 observations),

- the second rankdate (Jan 1997) includes data from December 1996 to September 2008 (142 observations),

- the third rankdate (Feb 1997) includes data from December 1996 to October 2008 (143 observations), and so on ........

Any feedback will be appreciated. Thanks in advance.
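
For reference, a data set with this expanding-window shape could be built with a PROC SQL join, sketched below. This is an untested illustration with assumptions throughout: it starts from a hypothetical plain panel sampledata (country, date, y, x; one row per country and month) rather than from 'developingrank', it labels each window by its end date rather than by Dec 1996, Jan 1997, and so on, and the table names ends, recursive_sql, window_sizes and rec_est are made up for the example.

```sas
proc sql;
    /* one row per country and window end date, starting with Aug 2008 */
    create table ends as
    select distinct country, date as rankdate format=date9.
    from sampledata
    where date >= '01AUG2008'd;

    /* each window keeps every observation up to and including its end date */
    create table recursive_sql as
    select e.rankdate, s.*
    from ends as e
         inner join sampledata as s
         on s.country = e.country and s.date <= e.rankdate
    order by s.country, e.rankdate, s.date;

    /* sanity check: window sizes should grow by one (141, 142, 143, ...) */
    create table window_sizes as
    select country, rankdate, count(*) as nobs
    from recursive_sql
    group by country, rankdate;
quit;

/* one regression per country and window; y and x are assumed variable names */
proc reg data=recursive_sql noprint outest=rec_est;
    by country rankdate;
    model y = x;
run;
quit;
```

Checking window_sizes before running the regressions is a cheap way to confirm that the start date really stays fixed.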
Yamani
Obsidian | Level 7
I am attaching the output data set (Roll)

