Rolling regression with conditions

Reply
Contributor
Posts: 31

Rolling regression with conditions

[ Edited ]

Hi.

I have a (company ID-time) panel data. Now I have to do 5-year rolling regression of COGS (as dependent variable) for each firm with 3 independent variables:

- COGS's first lag (COGS (t-1))

- current SALE (SALE (t))

- SALE's first lag (SALE (t-1))

and retain all of the coefficients for each date that has rolling window with >= 10 observations after filtering for outliers

 

 The problem is:

+/ The rolling windows for each regression must have at least 10 observations (if any variable has more than 10 missing values in a rolling window, then skip the regression for that window)

+/ Quarterly growth rates of COGS, SALEQ and ASSET are no more than plus or minus 75% (if they are >75% or <-75%, then drop these observations from regressions)

.

The sample data is like this:

 

gvkeydateCompanyCOGSQSALEAsset
100119830331A1.258210
100119830630A1.42.312
100119830930A1.5314
100119831231A1.6 416
……..………….…….…….…….
100219860331B2.3 520
100219860630B3.5 622
100219861031B6.5 6.525
100219860131B7929
……..……..……..……..……..……..
    
     

The output is like this:

gvkeydateCompanyInterceptCoef_lagcogsCoef_saleCoef_lagsale
100119830331A…….…….…….…….
100119830630A…….…….…….…….
100119830930A…….…….…….…….
100119831231A…….…….…….…….
……..………….…….…….…….…….
100219860331B…….…….…….…….
100219860630B…….…….…….…….
100219861031B…….…….…….…….
100219860131B…….…….…….…….
……..……..……..…….…….…….…….
    
     
     
     
     
     
     
     
Contributor
Posts: 31

Re: Rolling regression with conditions

Posted in reply to trungcva112
Please. Anything could help
Trusted Advisor
Posts: 1,145

Re: Rolling regression with conditions

Posted in reply to trungcva112

If you drop quarters with greater than a 75% change in cogsq or sale, then how do you define lag(cogsq) and lag(sale) for the subsequent quarter.

 

If  Q3 has 80% change vs Q2, then what values will you use as lag of coqsq and sale in Q4?

Contributor
Posts: 31

Re: Rolling regression with conditions

Posted in reply to trungcva112
Good question. For example, If cogs in Q3 has 80% change vs Q2, then the lag of cogs in Q4 would be cogs in Q2 (or equivalently the nearest quarter that has cogs growth rate within (-75%, +75%).
Trusted Advisor
Posts: 1,145

Re: Rolling regression with conditions

Posted in reply to trungcva112

So does this mean that the concept of lagged values of cogsq and sale can represent varying time spans?   Let's say you have successive changes in sales of +80%, -%78%, and +90%, for a highly seasonal company.  Then your lagged value of cogsq and sale will represent one-year old values.

 

And more generally, I don't get this notion of dropping dramatic changes anyway, since it is likely to introduce even more dramatic changes in your analysis data.  Let's say a company is in a growth spurt, with successive changes in sales of 78% (for Q2) and 20% (for Q3).  By dropping Q2 and inserting Q1 sales as the lagged value for Q3, you will have introduced a "change" of 96% between current and "preceding" value of the variable.  That would increase the beta coefficient for lagged sale.

 

What you MIGHT want to do instead is insert a dummy variable indicating records with large absolute proportional changes, and keep those records.

 

 

Contributor
Posts: 31

Re: Rolling regression with conditions

Posted in reply to trungcva112
Thanks for your advice. But how can I insert this dummy variable and other conditions in the rolling regression?
Contributor
Posts: 31

Re: Rolling regression with conditions

Posted in reply to trungcva112
Anyone has any idea?
Trusted Advisor
Posts: 1,145

Re: Rolling regression with conditions

Posted in reply to trungcva112

This program should give you then needed infrastructure:

 

data need1 (keep=window data cogsq sale lag_: outlier_:) /view=need1;

  do n=1 by 1 until (last.gvkey);
   set have;
   by gvkey;
   lag_cogsq=lag(cogsq);
   lag_sale=lag(sale);
   if n=1 then call missing(of lag_:);
   else do;
     outlier_cogsq=1-(0.25<cogsq/lag_cogsq<1.75);
     outlier_sale=1-(0.25<sale/lag_sale<1.75);
   end;

   array var {*} data cogsq sale lag_cogsq lag_sale outlier_cogsq outlier_sale;
   array data {100,7}; /*Up to 100 historic records for 7 vars*/

   do v=1 to 7; data{n,v}=var{v};end
  end;

  if n>10 then do end=11 to n;
    beg=max(2,n-59);
    window=end-11;
    do row=beg to end;
      do v=1 to 7; var{v}=data{row,v};end;
      output;
    end;
  end;
run;

proc reg data=need;
 ......
 run;
quit;

 

 

Remember data set NEED will be roughly 20 times the size of HAVE (you want 5-year rolling quarterly data). So I made NEED a data set VIEW instead of a data set FILE.  It's only activated when a subsequent PROC calls for NEED, and the data is streamed directly to the proc instead of a disk file.

Contributor
Posts: 31

Re: Rolling regression with conditions

Hi everyone.

I have describe the data in more detail. Could anyone has an idea? Because previous codes do not work
Trusted Advisor
Posts: 1,145

Re: Rolling regression with conditions

Posted in reply to trungcva112
  1. The phrase "do not work" is not a usefull description of the problem.  What results did you get vs what you  expected? Please show those results.
     
  2. Test this with a sample data set with two gvkeys and just a couple years of data.  And if you are testing my suggestion, change the dataset view VNEED to a data set file.  Then you can examine the  intermediate data set to confirm it is constructed as intended.
Ask a Question
Discussion stats
  • 9 replies
  • 212 views
  • 0 likes
  • 2 in conversation