## Regression over rolling windows for each company

Frequent Contributor
Posts: 75

Hi All,

I have daily data (day is denoted t). At the end of each month, for each firm i, I need to run a regression over a rolling one-year window and keep the beta and gamma estimates. The regression looks like this:

Y(t,i) = a + beta(t,i)⋅X(t) + beta(t-1,i)⋅X(t-1) + gamma(t,i)⋅Z(t) + gamma(t-1,i)⋅Z(t-1) + e(t,i)

I need to keep beta(t,i)+beta(t-1,i) and gamma(t,i)+gamma(t-1,i) for each company at the end of each month.

For example, at the end of December 2000 (last day of December), for company i, I need to run the above regression using daily data from January 2000 until December 2000 and obtain beta(t,i)+beta(t-1,i) and gamma(t,i)+gamma(t-1,i) for the end of December 2000. Then, at the end of January 2001 (last day of January), for the same company i, I need to run the same regression using daily data from February 2000 until January 2001 and obtain beta and gamma estimates for the end of January 2001. And so on.
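One way to write the target quantities compactly is sketched below. The symbols $W_m$, $B_{i,m}$, $G_{i,m}$ and the superscripts $(0)$ and $(1)$ for the current-day and lagged coefficients are notation assumed here, not part of the original post. For firm $i$ and month-end $m$, the regression is fitted on the days $t$ in the 12-month window $W_m$ that ends at $m$:

```latex
\begin{aligned}
Y_{t,i} &= a_{i,m} + \beta^{(0)}_{i,m} X_t + \beta^{(1)}_{i,m} X_{t-1}
         + \gamma^{(0)}_{i,m} Z_t + \gamma^{(1)}_{i,m} Z_{t-1} + e_{t,i},
         \qquad t \in W_m, \\
B_{i,m} &= \hat\beta^{(0)}_{i,m} + \hat\beta^{(1)}_{i,m}, \qquad
G_{i,m} = \hat\gamma^{(0)}_{i,m} + \hat\gamma^{(1)}_{i,m}.
\end{aligned}
```

Here $B_{i,m}$ and $G_{i,m}$ are the two sums to keep for each firm and month-end.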

I have around 7000 companies, so speed is important :-)

Also, obviously, if the daily data for a company start in, say, January 1996, then my first beta and gamma estimates will be for the end of December 1996. All monthly observations between January 1996 and November 1996 should have missing values.

Kind regards,

Ruslan

Frequent Contributor
Posts: 75

Guys,

Super User
Posts: 10,778

## Re: Regression over rolling windows for each company

It is a regression with lagged regressors. Use BY-group processing for speed.

```
/* Simulated daily data: two companies, SAS dates 1000 to 2000. */
data have;
  do company=1 to 2;
    do date=1000 to 2000;
      x=ceil(100*ranuni(0));
      y=ceil(100*ranuni(0));
      z=ceil(100*ranuni(0));
      output;
    end;
  end;
  format date date9.;
run;

/* Lag x and z by one day; the lags restart on the first day of each month. */
data want;
  set have;
  by company date groupformat;
  format date monyy7.;
  lag_x=lag(x);
  lag_z=lag(z);
  if first.date then call missing(lag_x,lag_z);
run;

/* One regression per company and calendar month. */
proc reg data=want outest=outest noprint;
  by company date;
  format date monyy7.;
  model y=x lag_x z lag_z;
quit;
```
Posts: 5,529

## Re: Regression over rolling windows for each company

Test this with a small number of companies:

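```
/* Simulated daily data: two companies, 1998 through 2002. */
data have;
  call streaminit(86896);
  do company = 1, 2;
    do date = "01JAN1998"d to "31DEC2002"d;
      x = rand("UNIFORM") * 100;
      y = rand("UNIFORM") * 100;
      z = rand("UNIFORM") * 100;
      output;
    end;
  end;
  format date yymmdd10.;
run;

/* Copy each daily row into the 12 rolling windows that contain it.
   reg labels a window by the month in which it ends. */
data regs;
  set have;
  do i = 0 to 11;
    reg = intnx("MONTH", date, i);
    output;
  end;
  format reg yymm.;
  drop i;
run;

/* Keep only complete windows. The 365-row cutoff assumes one row per
   calendar day; lower it for trading-day data. */
proc sql;
  create table byGroups as
  select company, reg, date, x, y, z
  from regs
  group by company, reg
  having count(*) >= 365
  order by company, reg, date;
quit;

/* Add one-day lags that restart at the beginning of each window. */
data byGroupsReg / view=byGroupsReg;
  do until(last.reg);
    set byGroups; by company reg;
    output;
    lagX = x;
    lagZ = z;
  end;
run;

/* One regression per company and window. */
proc reg data=byGroupsReg
    outest=byGroupStats(drop=_model_ _type_ _depvar_ y)
    plots=none /* noprint */;
  by company reg;
  model y = x lagX z lagZ;
run;
quit;
```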
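The question asks for the sums beta(t,i)+beta(t-1,i) and gamma(t,i)+gamma(t-1,i) rather than the individual coefficients. A minimal sketch of that last step follows, assuming `byGroupStats` was produced by the PROC REG step above, so that its coefficient columns carry the regressor names `x`, `lagX`, `z`, and `lagZ`. The names `coefSums`, `beta_sum`, and `gamma_sum` are hypothetical.

```sas
/* Sum the current-day and lagged coefficients for each company and window. */
data coefSums;                      /* hypothetical output name */
  set byGroupStats;
  beta_sum  = x + lagX;             /* coefficient on X(t) plus coefficient on X(t-1) */
  gamma_sum = z + lagZ;             /* coefficient on Z(t) plus coefficient on Z(t-1) */
  keep company reg beta_sum gamma_sum;
run;
```

The result has one row per `company` and `reg`, which can be merged back onto a monthly panel by those two keys.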
PG
Super User
Posts: 10,778

## Re: Regression over rolling windows for each company

Oops, I missed the rolling windows.

```
/* Simulated daily data: two companies, SAS dates 1000 to 2000. */
data have;
  do company=1 to 2;
    do date=1000 to 2000;
      x=ceil(100*ranuni(0));
      y=ceil(100*ranuni(0));
      z=ceil(100*ranuni(0));
      output;
    end;
  end;
  format date date9.;
run;

/* Add one-day lags of x and z within each company. */
data have;
  set have;
  by company;
  lag_x=lag(x);
  lag_z=lag(z);
  if first.company then call missing(lag_x,lag_z);
run;

/* One row per company and month: the month in which each window ends. */
data key;
  set have;
  by company date groupformat;
  format date monyy7.;
  if first.date then do;
    monyy=put(date,monyy7.);
    output;
  end;
  keep company monyy;
run;

/* For each key row, pull the 12 months of daily rows ending in that month. */
data want;
  if _n_=1 then do;
    if 0 then set have;
    declare hash h(dataset:'have',hashexp:20);
    h.definekey('company','date');
    h.definedata('x','y','z','lag_x','lag_z');
    h.definedone();
  end;
  set key;
  temp=input(monyy,monyy7.);
  win_start=intnx('month',temp,-11,'b');
  win_end=intnx('month',temp,0,'e');
  group+1;
  do i=win_start to win_end;
    if h.find(key:company,key:i)=0 then output;
  end;
  drop i temp win_start win_end date;
run;

/* One regression per window; OUTEST= keeps the coefficient estimates. */
proc reg data=want outest=outest noprint;
  by group company monyy;
  model y= x lag_x z lag_z;
quit;
```
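The code above also fits windows shorter than 12 months, whereas the question asks for missing values there and for the sums of the current and lagged coefficients. A minimal sketch of both steps follows. It assumes each company has one `key` row for every consecutive month, and it relies on `group` in `want` being the row number of `key`. The names `key_flag`, `month_no`, `sums`, `beta_sum`, and `gamma_sum` are hypothetical.

```sas
/* Number the months of each company; group = _n_ mirrors group+1 in WANT. */
data key_flag;                      /* hypothetical name */
  set key;
  by company;
  group = _n_;
  if first.company then month_no = 0;
  month_no + 1;                     /* 1 for the first month of a company, 2 for the next, ... */
run;

/* Attach the month counter to the estimates and build the two sums. */
data sums;                          /* hypothetical name */
  merge outest(in=in_est) key_flag(keep=group month_no);
  by group;
  if in_est;
  beta_sum  = x + lag_x;            /* coefficient on X(t) plus coefficient on X(t-1) */
  gamma_sum = z + lag_z;            /* coefficient on Z(t) plus coefficient on Z(t-1) */
  /* Assumption: a window is complete from a company's 12th month onward. */
  if month_no < 12 then call missing(beta_sum, gamma_sum);
  keep company monyy beta_sum gamma_sum;
run;
```

Merging on `group` rather than on `monyy` avoids sorting a character month label, which would not sort chronologically.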