Hi
I have a data set with monthly returns for about 4000 different firms. I want to create a new column with "cumulative past returns from month t – 12 to t − 2" for each firm separately, excluding the most recent one i.e. "t − 1". Here firm is identified by PERMNO.
Please guide me how to take cumulative lag (t-12, t-2) by firm?
See a sample of my data set below. Thanks.
Data have;
input Date: mmddyy10. PERMNO Ret;
format Date mmddyy10.;
datalines;
01/31/2001 21227 -0.03
02/28/2001 21227 -0.06
03/31/2001 21227 -0.04
04/30/2001 21227 0.04
05/31/2001 21227 0.12
06/30/2001 21227 -0.08
07/31/2001 21227 0.12
08/31/2001 21227 0.02
09/30/2001 21227 0.11
10/31/2001 21227 0.17
11/30/2001 21227 -0.11
12/31/2001 21227 -0.04
01/31/2002 21227 -0.2
02/28/2002 21227 0.11
03/31/2002 21227 -0.06
04/30/2002 21227 -0.03
06/30/2005 22100 -0.01
07/31/2005 22100 0.02
08/31/2005 22100 0.09
09/30/2005 22100 0.006
10/31/2005 22100 -0.07
11/30/2005 22100 0.06
12/31/2005 22100 0.09
01/31/2006 22100 0.07
02/28/2006 22100 -0.15
03/31/2006 22100 0.05
04/30/2006 22100 0.08
05/31/2006 22100 -0.01
06/30/2006 22100 -0.02
07/31/2006 22100 -0.14
08/31/2006 22100 -0.10
09/30/2006 22100 0.007
10/31/2006 22100 0.05
11/30/2006 22100 0.06
12/31/2006 22100 0.12
run;
@RM6 To run a regression, I need to control for past returns. One of the controls is "past year return" and literature uses "lag t-12, lag t-2" for this variable. This means "cumulative return of past 11 months excluding the most recent month". Hope it clarifies.
Two questions:
Why do you want Cumulative past monthly returns from time t-12 to t-2 at time t? Seems counter intuitive to me?
Do you have SAS/ETS? If so, PROC EXPAND might be worth looking into 🙂
1- I need to measure "past year return" and literature uses "lag t-12, lag t-2" for this variable. This means "cumulative return of past 11 months excluding the most recent month". Cumulative t-12,t-2 is the sum of lag-returns from lag 2 to lag 12.
2- Yes, I do have SAS/ETS.
Data Want (Drop = MyCumSum_dummy);
Retain MyCumSum_dummy Nr;
Set have;
By PERMNO;
If First.PERMNO Then Nr = 0;
Nr + 1;
MyCumSum_dummy = Sum(MyCumSum_dummy, Lag2(Ret), IfN(Nr >= 12, -Lag13(Ret), 0));
MyCumSum = IfN(Nr >= 12, MyCumSum_dummy, .);
Run;
* OR;
Proc TimeData Data = have
Out = _NULL_
OutArray = want2;
By PERMNO;
ID Date Interval = MONTH;
Var Ret;
OutArray MyCumSum;
Do t = 1 To _LENGTH_;
MyCumSum[t] = Ret[t-12]+Ret[t-11]+Ret[t-10]+Ret[t-9]+
Ret[t-8]+Ret[t-7]+Ret[t-6]+Ret[t-5]+
Ret[t-4]+Ret[t-3]+Ret[t-2];
End;
Run;
@Peter_C I need to use something like
Ret[t-12]+Ret[t-11]+Ret[t-10]+Ret[t-9]+
Ret[t-8]+Ret[t-7]+Ret[t-6]+Ret[t-5]+
Ret[t-4]+Ret[t-3]+Ret[t-2]
where "t-n" are lags. But the problem is that I need to keep rolling it for each month "t" by PERMNO.
When doing rolling sums/means, the easiest is often to put the values in a temporary array. Only here we have to use LAG because we do not want the newest values in:
data want; array values (0:10) 8 _temporary_; call missing(of values(*)); do _N_=0 by 1 until(last.permno); set have; by permno; sum=sum(of values(*)); output; values(mod(_N_,11))=ifn(_N_=0,.,lag(Ret)); /* use IFN so that LAG is always called */ end; run;
If you do not want the sum before you have all 11 values, use "if _N_>=10 then sum=sum(of values(*));"
If I understand what you mean.
Data have;
input Date: mmddyy10. PERMNO Ret;
format Date mmddyy10.;
datalines;
01/31/2001 21227 -0.03
02/28/2001 21227 -0.06
03/31/2001 21227 -0.04
04/30/2001 21227 0.04
05/31/2001 21227 0.12
06/30/2001 21227 -0.08
07/31/2001 21227 0.12
08/31/2001 21227 0.02
09/30/2001 21227 0.11
10/31/2001 21227 0.17
11/30/2001 21227 -0.11
12/31/2001 21227 -0.04
01/31/2002 21227 -0.2
02/28/2002 21227 0.11
03/31/2002 21227 -0.06
04/30/2002 21227 -0.03
06/30/2005 22100 -0.01
07/31/2005 22100 0.02
08/31/2005 22100 0.09
09/30/2005 22100 0.006
10/31/2005 22100 -0.07
11/30/2005 22100 0.06
12/31/2005 22100 0.09
01/31/2006 22100 0.07
02/28/2006 22100 -0.15
03/31/2006 22100 0.05
04/30/2006 22100 0.08
05/31/2006 22100 -0.01
06/30/2006 22100 -0.02
07/31/2006 22100 -0.14
08/31/2006 22100 -0.10
09/30/2006 22100 0.007
10/31/2006 22100 0.05
11/30/2006 22100 0.06
12/31/2006 22100 0.12
;
run;
proc sql;
create table want as
select *,(select sum(ret) from have where PERMNO=a.PERMNO and
Date between intnx('month',a.date,-12,'e') and intnx('month',a.date,-2,'e')) as sum
from have as a;
quit;
For each t, you want rolling window returns, where the window size is 11 months, and the window end-point is t-2. Keep a rolling window of the sum of the log(ret+1) for lagged returns,,, and then just exp that sum to get cumulative returns:
Data have;
input Date: mmddyy10. PERMNO Ret;
format Date mmddyy10.;
datalines;
01/31/2001 21227 -0.03
02/28/2001 21227 -0.06
03/31/2001 21227 -0.04
04/30/2001 21227 0.04
05/31/2001 21227 0.12
06/30/2001 21227 -0.08
07/31/2001 21227 0.12
08/31/2001 21227 0.02
09/30/2001 21227 0.11
10/31/2001 21227 0.17
11/30/2001 21227 -0.11
12/31/2001 21227 -0.04
01/31/2002 21227 -0.2
02/28/2002 21227 0.11
03/31/2002 21227 -0.06
04/30/2002 21227 -0.03
06/30/2005 22100 -0.01
07/31/2005 22100 0.02
08/31/2005 22100 0.09
09/30/2005 22100 0.006
10/31/2005 22100 -0.07
11/30/2005 22100 0.06
12/31/2005 22100 0.09
01/31/2006 22100 0.07
02/28/2006 22100 -0.15
03/31/2006 22100 0.05
04/30/2006 22100 0.08
05/31/2006 22100 -0.01
06/30/2006 22100 -0.02
07/31/2006 22100 -0.14
08/31/2006 22100 -0.10
09/30/2006 22100 0.007
10/31/2006 22100 0.05
11/30/2006 22100 0.06
12/31/2006 22100 0.12
run;
data want (drop=_:);
set have ;
by permno;
_month+1;
if first.permno then _month=1;
_logret=log(1+ret);
_sumlog+lag2(_logret);
_sumlog=_sumlog -ifn(_n_>12,0,lag12(_logret));
if _month>12 then window_12_2_ret=exp(_sumlog)-1;
run;
Notes:
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.