carbs
Calcite | Level 5

I need to run a rolling time-series regression to test my regression model, and I found a suitable example (link below). The idea is to run the monthly regression over 5-year windows, moving forward 1 year at a time. My regression is of the following type: Identity = meanHML meanMOM. My dates are in the form 1990-07, 1990-08, ... through 2008-12. If someone could help me modify the code to suit this, I would be grateful.

http://www.sascommunity.org/wiki/Rolling_Calculations

ArtC
Rhodochrosite | Level 12

I am not sure about your question, but I did notice that in your code above you have:

%if %lowcase(%substr(&date,1,4))= year %then %let year_date=1;      *Don't know if should put something here?;

%else %let year_date=0;

You may not place a comment or other statement between the %IF and its %ELSE.
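
To illustrate the rule, here is a minimal sketch. The macro name check_date and the test value year_end are made up for the example and are not from the original code. The comment is written as a /* */ block and placed before the %IF, so nothing sits between the %IF statement and its %ELSE.

```sas
%macro check_date(date);
  /* Comment goes here, before the %IF, not between %IF and %ELSE */
  %if %lowcase(%substr(&date,1,4)) = year %then %let year_date=1;
  %else %let year_date=0;
  %put year_date=&year_date;   /* write the result to the log */
%mend check_date;

%check_date(year_end)
```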

carbs
Calcite | Level 5

Thanks for the hint. However, I decided to change the approach away from macros, since they were quite difficult for me to understand. I've edited the original question into a new form.

carbs
Calcite | Level 5

Could someone help me out with this rolling regression issue? I've tried to modify the code found on sascommunity.org/wiki, but unsuccessfully. Here's what I've written. I know it's not correct, but it's at least a starting point. I have the data already sorted by date (dataset tt). I hope someone can steer the code in a more sensible direction. Thanks!!

%let nmonths=60;

data expanded(drop = nn);

length span $ 13;

set tt;

do nn = 1 to &nmonths;

   span = catx(   '-'

                , put(intnx('month',199007,nn-&nmonths),yymm6.)

                , put(intnx('month',200812,nn-1     ),yymm6.)

              );

   output;

   end;

run;

*Re-order. ;

proc sort data=expanded;

by span date;

run;

*Finally, run all of the regressions independently in a single step;


proc reg data=expanded noprint outest=regout;

by span;

model Identity = meanHML meanMOM;

run;

mkeintz
PROC Star

You can establish a 60 row (60 month) array.  The first 60 months of your data go into this array, and then are written out to make data for the first 5-year window.  Then the next 12 months can be read in, overwriting rows 1 through 12 of the array (i.e. replacing year 1 data with year 6 data).  At that point write out the full array again, yielding data for the 2nd window.  Read another 12 months into rows 13-24 (replacing year 2 data with year 7 data), and then write out the 3rd 60-month window.  Et cetera.  No subsequent sorting is required, which (in combination with making a data VIEW instead of a data FILE) should make things faster.

data want (keep=id endyear endmo y x1 x2)/ view=want;

  array vars {60,3}  /*60 rows for Y, x1 and x2 */;

  do n=1 by 1 until (last.id);

    set have (keep=id date y x1 x2);

    by id;

    row=mod(n-1,60)+1;   /* for N=1 .. 60 , row=1..60, then N=61==>row=1, N=62==>row=2 etc. */

    vars{row,1}=y;  vars{row,2}=x1;  vars{row,3}=x2;

    if n>=60 and mod(n,12)=0 then do;  ** If we are at least at obs 60 and it is divisible by 12 ... **;

      endyear=year(date);  endmo=month(date);

      do row=1 to 60;   /* write out the full 60-month window */

        y=vars{row,1}; x1=vars{row,2}; x2=vars{row,3};

        output;

      end;

    end;

  end;   /* closes the "do n=1 by 1 until (last.id)" loop */

run;

proc reg data=want noprint outest=regout;

  by id endyear endmo;

  model y=x1 x2;

quit;

Now you might be concerned that while the data for the first window (year1-year5) will have 60 rows in chronological order, not all windows will (for instance, for the second window, the data will be ordered year6, year2 ... year5).  But PROC REG gives the same result no matter the order.

But if you do care, then, instead of the "do row=1 to 60" block, you can do this:

   do J=N-59 to N;

     row=mod(J-1,60)+1;

     y=vars{row,1};  x1=vars{row,2};  x2=vars{row,3};

    output;

  end;
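
For the data in the original question, the same idea might look like the sketch below. It rests on several assumptions that are not confirmed in the thread: TT is sorted by DATE, has exactly one row per month with no gaps, has no ID variable, and DATE is a numeric SAS date rather than text such as '1990-07' (a text date would need converting first). The output names WINDOWS and REGOUT and the end-of-file flag EOF are chosen only for illustration.

```sas
/* Sketch only: one 60-month window is written out every 12 months. */
data windows (keep=endyear endmo Identity meanHML meanMOM);

  array vars {60,3} _temporary_;   /* circular buffer: 60 months x 3 variables */

  do n = 1 by 1 until (eof);
    set tt (keep=date Identity meanHML meanMOM) end=eof;

    row = mod(n-1, 60) + 1;        /* n=61 overwrites row 1, n=62 row 2, ... */
    vars{row,1} = Identity;
    vars{row,2} = meanHML;
    vars{row,3} = meanMOM;

    /* once 60 months are available, emit a window at every 12th month */
    if n >= 60 and mod(n, 12) = 0 then do;
      endyear = year(date);
      endmo   = month(date);
      do j = n-59 to n;            /* oldest month first, newest last */
        row = mod(j-1, 60) + 1;
        Identity = vars{row,1};
        meanHML  = vars{row,2};
        meanMOM  = vars{row,3};
        output;
      end;
    end;
  end;
run;

/* one regression per window, identified by the window's last month */
proc reg data=windows noprint outest=regout;
  by endyear endmo;
  model Identity = meanHML meanMOM;
run;
quit;
```

With 222 monthly rows (1990-07 through 2008-12), this would produce windows ending every June from 1995 to 2008; the last six months of 2008 do not complete another 12-month step and so do not end a window.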

Now even though WANT is a data VIEW and not a data FILE, this can take a long time.  It would likely be a good deal faster if you could use the DATA step to make rolling sums of squares and cross-products (SSCP), which can be read by PROC REG.  Not only is less data passed from one step to the next, but making a rolling SSCP in the data step ought to be faster.  I'm working on a paper to show this, and I'm hoping to present it at NESUG 2012.

