BookmarkSubscribeRSS Feed
AnjaV
Calcite | Level 5

Dear SAS Community,

As a first step to perform to perform a rolling regression I would like to create rolling regression windows.

Keintz (2012) suggested following approach:

 

DATA rwin / view=rwin;

ws = 90;

nwin = nrecs - ws +1;

do w=1 to nwin;

do p=w to w + ws -1;

set myseries point=p nobs=nrecs;

output;

end;

end;

stop;

run;

 

However, I want to create annual rolling windows but with different numbers of observations per window. Window 1 must contain all firm observations from 1950-1959, window 2 all firm observation of the years 1951-1960 etc…. the last window contains all observations from 2006-2015. The number of observations differ between years.

 

Is it possible to solve the problem by altering the code of keintz (2012)?

Thanks very much in advance

Best regards 

Anja

9 REPLIES 9
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Can you post test data (form of a datastep) with some simple data, and what the output should look like.  If you have a date then:

data have;
  thedate="01jan1950"d; output;
  thedate="01jan1956"d; output;
  thedate="01jan1962"d; output;
  thedate="01jan1970"d; output;
run;

data want;
  set have;
  window=floor((year(thedate)-1950) / 10)+1;
run;

So you can get a window number based on year very simply with a formula.  

AnjaV
Calcite | Level 5

Hello RW9,

 

Thanks for your reply. This is how my data set looks like (rudimenarily)

 

Beob. Year FirmID Earnings LaggedEarnings 1 2 3 4 5 6 7 8 9 10 11 12 13

197011000
19711200100
19721200200
19731200200
19741200200
197021000
19712200100
19722200200
197031000
19713200100
19723200200
19733200200
19743200200

 

And that is how the ouput should look like (but with 10 years rolling regression instead of 3 years)

Beob. Window Year FirmID Earnings LaggedEarnings 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

1197011000
1197021000
1197031000
119711200100
119712200100
119713200100
119721200200
119722200200
119723200200
219711200100
219712200100
219713200100
219721200200
219722200200
219723200200
219731200200
2197332002001
319721200200
319723200200
319731200200
3197332002001
319741200200
319743200200

 Thanks !!

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Its a good idea to post test data in the form of a datastep.

data have;
  input Year FirmID Earnings LaggedEarnings;
datalines;
1970	1	100	0
1971	1	200	100
1972	1	200	200
1973	1	200	200
1974	1	200	200
1970	2	100	0
1971	2	200	100
1972	2	200	200
1970	3	100	0
1971	3	200	100
1972	3	200	200
1973	3	200	200
1974	3	200	200
;
run;

data want (drop=inner);
  set have;
  retain window firm_id inner;
  if _n_=1 then do;
    window=1;
    firm_id=1;
    inner=1;
  end;
  else do;
    firm_id=sum(firm_id,1);
    if firm_id > 3 then do;
      firm_id=1;
      inner=sum(inner,1);
      if inner > 3 then do;
        inner=1;
        window=sum(window,1);
      end;
    end;
  end;
run;
AnjaV
Calcite | Level 5

Hi,

 

Thank you very much. Sorry, my statement was a misleading.

 

I want to divide my dataset by time-windows. The window "1970" should contain all firm data of years 1961-1970. The window "1971" should contain all firm data of years 1962-1971 and so on.... So in the end all firms of the year 1970 are listed in 10 different windows (1970-1979). 

 

Afterwants I want to run a pooled regression. Like:

 

PROC Reg Data=want Outest=Regression;

By Window;

Run; 

 

So that I get regression coefficients for each window.

 

It is not very elegant because I have a lot of doublings in my want dataset but since I am new to SAS I do not know how to code pooled rolling regression otherwise.

 

Thank you very much.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like (maybe a bit overcomplicated, don't really have time to look at it at the moment):

data have;
  input Year FirmID Earnings LaggedEarnings;
datalines;
1970	1	100	0
1971	1	200	100
1972	1	200	200
1973	1	200	200
1974	1	200	200
1970	2	100	0
1971	2	200	100
1972	2	200	200
1970	3	100	0
1971	3	200	100
1972	3	200	200
1973	3	200	200
1974	3	200	200
1983	3	200	200
1984	3	200	200
;
run;

data want;
  length window 8;
  set have (where=(1=0));
run;

proc sql noprint;
  select  min(YEAR),max(YEAR)
  into    :LOW,:HIGH
  from    HAVE;
quit;

%macro Get_Window;
  %local window;
  %let window=1;
  %do i=&low. %to &high. %by 10;
    data temp;
      set have (where=(&i. <= year <= %eval(&i.+10)));
      window=&window.;
    run;
    proc append base=want data=temp;
    run;
    %let window=%eval(&window.+1);
  %end;
%mend;
%Get_Window;
AnjaV
Calcite | Level 5

Hello RW9,

 

thanks again. Your macro runs smoothly and it divides the data into 10-years windows (e.g All firm data from 1960-1969--> window 1, data from 1970-1979 --> window 2...).

But do you maybe habe an idea how to create rolling 10-years windows? Like window 1 1960-1969; window 2 1961-197o ....?

 

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Try dropping the "%by 10" part.  You will have to fiddle with it from there, I am travelling now.

AnjaV
Calcite | Level 5

RW9, thank you very much. It works

 

You made my day!!

mkeintz
PROC Star

 

I'd recommend the following sequence:

 

  1.  Make sure your dataset is sorted by ID/YEAR
  2. Make a data set VIEW containing all the needed squares and cross-products - call it  SQ_CP_BY_IDYEAR
  3. Run a proc expand to generate 10-year rolling SSCP values for each firm
  4. Run proc summary to pool data over all firms
  5. Convert the rolling sscp into a TYPE=SSCP dataset, which PROC REG can process

 

The code here applies that process to sashelp.stocks (10 days instead of 10 years)

 

For a 10-year window this process of making 10-year rolling SSCP may not be particlularly faster than making 10-year rolling oriiginal data windows, but for daily and monthly data, where each window commonly has more records, it becomes notably superior.

 

proc sort data=sashelp.stocks 
   out=stocks (rename=(close=y open=ylag1 high=x1 low=x2 volume=x3 adjclose=x4));
  by stock date;
run;

/* Make a data set view of squares and cross-products */
data sqcp   /view=sqcp;
  set stocks;
  array vars {7} intercept y ylag1 x1-x4;
  retain intercept 1;

  array sqcp {7,7};
  do r=1 to 7;
    sqcp{r,r}=vars{r}**2;
    if r<7 then do c=r+1 to 7;
      sqcp{r,c}=vars{r}*vars{c};
      sqcp{c,r}=sqcp{r,c};
    end;
  end;
  drop r c;
run;

/* Make rolling sscp (SUM of squares & cross-prods) data for 10-day windows */
proc expand data=sqcp  out=rsscp_data2 (where=(intercept=10)) method=none;
  by stock;
  id date;
  convert _numeric_  / transform=(movsum 10);
run;

/* Pool sscp data over all stocks */
proc summary data=rsscp_data  nway;
  class date;
  var _numeric_;
  output out=pooled_rsscp_data (drop=_TYPE_ _FREQ_) sum=;
run;

/* Reformat the data as a TYPE=SSCP data set, ready for PROC REG */
data pooled_rolling_sscp (type=SSCP drop=sqcp:  n v);
  length _TYPE_ $8 _NAME_ $32;
  set pooled_rsscp_data;  

  array vars {7} intercept y ylag1 x1-x4;
  array sqcp {7,7};

  array vnames {7} $32 _temporary_ ;
  if _n_=1 then do v=1 to dim(vars); vnames{v}=vname(vars{v}); end;

  _TYPE_='SSCP';
  do n=1 to 7;
    _NAME_=vnames{n};
    do v=1 to 7;  vars{v}=sqcp{n,v};  end;
    output;
  end;

  _type_='N';
  _NAME_=' ';
  do v=1 to 7; vars{v}=sqcp{1,1}; end;
  output;
run;

proc reg data=pooled_rolling_sscp (obs=8);
  var y ylag1 x1-x4;
  model y=ylag1 x1-x4;
run;
quit;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 3029 views
  • 2 likes
  • 3 in conversation