DATA Step, Macro, Functions and more

rolling window regression

Reply
Contributor
Posts: 42

rolling window regression

Hi, I am using the following code on the data which some what looks like below;

permnodateerrmrfhmlsmb
11/31/2008    
12/29/2008    
1uptil 12/31/2017    
24/31/2009    
25/31/2009    
2uptil 12/31/2017    
31/31/2009    
32/29/2009    
3uptil 12/31/2013   

 


* Macro with parameters; %MACRO RW (data= ,out_ds= ,model_equation= ,permno= , date= ,STARTd= ,ENDd= ,regprint=); %DO i=1 %TO ?? ; /*?*?NUMBER OF 36th month rolling window */ /* all the dates available so ? covers the whole period*/ /* dm log 'clear'; dm 'odsresults; clear'; */ %let Date = %Sysfunc( InputN( 31jan2008 ,Date9.)) ; /*to create a macro variable for reference date which is the beginning of the rolling window date*/ %let dated = %Sysfunc( PutN( &date , Date9.)) ; /*to input date and give it format by putn*/ %put Date = &Date ; %put Dated = &Dated; %IF &i =1 %THEN %DO; %let START = %Sysfunc( InputN( 31jan2008, Date9. )) ; /*to create a macro variable for reference date which is the beginning of the rolling window date*/ %let STARTd = %Sysfunc( PutN( &start , Date9.)) ; /*to input date by input and give it format by putn*/ %put START = &START ; %put STARTd = &STARTd; %let END= %Sysfunc( Intnx( Month , &start , 35,end )) ; %let ENDd=%Sysfunc( PutN( &END , Date9.)) ; %put end = &end; %put endd = &endd; %GOTO CONTINUE;/*TO SEND THE FIRST WINDOW(DATE) TO THE NEXT STEP WHICH IS SQL FOR SUBSETTING THE MAIN DATA*/ %END; %Else %IF &i>1 %THEN %DO; %let Start = %Sysfunc( Intnx( month, &Date ,&i-1)) ;/*to specify intervals with intnx ,start should be 1 month after the previouse starting date*/ %let End = %Sysfunc( Intnx( Month , &start , 35,end)) ;/*monthly interval for end to specify 36 rolling window*/ %put start = &start end = &end; %let STARTd=%Sysfunc( PutN( &START , Date9.)) ; /*to get DATE9. FORMAT FOR date instead of default date*/ %let ENDd=%Sysfunc( PutN( &END , Date9.)) ; %PUT STARTd=&STARTd ENDd=&ENDd; %continue: proc sql NOPRINT; /* TO SUBSET OUR COMPLETE DATASET IN ACCORDANCE TO THE ROLLING WINDOWS INTERVALS*/ create table temp1 as select* from TEST where DATE BETWEEN "&STARTd"d and "&ENDd"d; by permno date; quit; /*TO CALCULATE NOBS AS A MACRO VARIABLE TO KNOW HOW MANY funds we got per loop FOR EACH WINDOWs*/ /*TEMP1 IS ALREADY SORTED BY CROSS AND DATE*/ proc means DATA=temp1 N NOPRINT ; BY PERMNO; var permno; OUTPUT out=temp2 N(PERMNO)=N; RUN; proc sql noprint; /*TO GET NOBS PER WINDOW IN A MACRO VARIBLE*/ select count(*) into :OBSN from TEMP2 quit; %put Count=&OBSN.;/*&OBSN CONTAINS NOBS FOR EACH WINDOW TO BE USED FOR ff reg*/ /*1-2: RUNNING multiple regression/PER INTERVALS out EACH TIME*/*********************************************; proc reg data=temp1 noprint outest=Out_Estimates ; model ER = RMRF HML SMB; by permno; run; quit; title 'out_estimates, loop ' &i; proc print data=Out_Estimates;run; %END; * Show INPUT data this regression: ; title 'Temp1, loop ' &i ; proc print data=temp1; RUN; %END; %MEND RW; *Invoke the macro; %RW (data= TEST , out_ds= temp1, model_equation= ER = RMRF HML SMB, permno= permno , date=date , STARTd= 31jan2008 , ENDd= 31dec2017, regprint=noprint) ; QUIT;

 I am confused with how to deal with the following part for my code.

%DO i=1 %TO ?? ; /*?*?NUMBER OF 36th month rolling window */ 
                       /* all the dates available so ? covers the whole period*/
  1. I have 2048 funds, the data is monthly from jan-08 to dec-17 but for some funds the data is available for the entire period but for the others for some of the years only. However, for each fund I have at-least 5 years of consecutive data. e.g for fund one date range will be jan-08 to Dec-17 but for other it might be Apr-10 to Mar-15 that is, the date may vary from one fund to the next.

How should I deal with this part given my data?

 

  1. Also, the size of my rolling window is not remaining constant i.e 36 rather it is variating for each window. Can you point out the issue given my code?

 

Contributor
Posts: 32

Re: rolling window regression

Please can we have sample output dataset you want?.

Contributor
Posts: 42

Re: rolling window regression

Posted in reply to emrancaan

The sample output should look like this,

 

permnodateerrmrfhmlsmb    
131/1/2008        
129/02/2008        
 goes on uptil s6 times with the last value as      
131/12/2010        
          
Obs_MODEL__TYPE__DEPVAR__RMSE_Interceptrmrfhmlsmber
1MODEL1PARMSer      
          
permnodateerrmrfhmlsmb    
129/02/2008        
131/03/2008        
 goes on uptil s6 times with the last value as      
131/01/2011        
          
Obs_MODEL__TYPE__DEPVAR__RMSE_Interceptrmrfhmlsmber
1MODEL1PARMSer      
          
it should keep creating rolling window for permno one uptil the last date available for this data  
with the window size of 36 only and when this is done then it should start creating the table for permno two the same way 
          
permnodateer hmlsmb  rmrf    
231/03/2010        
231/04/2010        
 goes on uptil s6 times with the last value as      
229/02/2012        

 

 

Super User
Posts: 13,034

Re: rolling window regression

First before getting any recommendations on setting a value for loop limit exactly HOW is that loop index to be used?

 

If you want something to be done for each level of a variable in a data set the preferred method is to use BY Group processing so each level of the variable (or combinations of levels for multiple variables) are processed.

See this not particularly useful model but demonstrates regression using a by group:

data work.class;
   set sashelp.class;
run;

proc sort data=work.class;
   by sex;
run;

proc reg data=work.class;
   by sex;
   model weight = age;
run;
quit;

Your install should have the sashelp.class variable so you can run this.

 

The key bit I only need to know that I want the regression for each level of a variable. So sort the data by that variable and use BY in the regression. Output for each level.

 

When you subset by time then sorting after the subset and then the regression using your PERMNO variable.

 

Your issue with changing interval lengths is because for some start periods you don't have as many records in the window. Up to you decide what to do in that case.

Ask a Question
Discussion stats
  • 3 replies
  • 88 views
  • 0 likes
  • 3 in conversation