Hi, I want to calculate compound yearly return for each company (gvkey) for each fiscal year (with year end is FY_date). The issue is that companies may have different year end. Monthly return is mret.
Basically, I want a new variable, say:
cum_mret = (1+ return of the first month of fiscal year)*(1+return of the second month of fiscal year)*...*(return of the 12th month of fiscal year).
Monthly returns used to calculate cum_mret are determined based on FY_date. For example, gvkey has the year end of 31/05/2004, then first month used to calculate cum_mret is 6/2003 and the 12th month used to calculate cum_mret is 5/2004.
Some months of some gvkey have missing values for mret.
Here are some observations. Could you please give me some suggestion about the code?
gvkey | year | month | mret | FY_Date |
242 | 2003 | 1 | -0.09524 | 31/05/2003 |
242 | 2003 | 2 | 0.342105 | 31/05/2003 |
242 | 2003 | 3 | -0.01961 | 31/05/2003 |
242 | 2003 | 4 | -0.04 | 31/05/2003 |
242 | 2003 | 5 | 0 | 31/05/2003 |
242 | 2003 | 6 | 0 | 31/05/2003 |
242 | 2003 | 7 | 0.125 | 31/05/2003 |
242 | 2003 | 8 | 0.388889 | 31/05/2003 |
242 | 2003 | 9 | 0.08 | 31/05/2003 |
242 | 2003 | 10 | 0.049383 | 31/05/2003 |
242 | 2003 | 11 | 0.117647 | 31/05/2003 |
242 | 2003 | 12 | 0.010526 | 31/05/2003 |
242 | 2004 | 1 | 0.229167 | 31/05/2004 |
242 | 2004 | 2 | 0.228814 | 31/05/2004 |
242 | 2004 | 3 | 0.186207 | 31/05/2004 |
242 | 2004 | 4 | -0.00581 | 31/05/2004 |
242 | 2004 | 5 | 0.122807 | 31/05/2004 |
242 | 2004 | 6 | -0.01563 | 31/05/2004 |
242 | 2004 | 7 | -0.02646 | 31/05/2004 |
242 | 2004 | 8 | -0.04076 | 31/05/2004 |
242 | 2004 | 9 | 0.121813 | 31/05/2004 |
242 | 2004 | 10 | 0.010101 | 31/05/2004 |
242 | 2004 | 11 | 0.045 | 31/05/2004 |
242 | 2004 | 12 | 0.253589 | 31/05/2004 |
242 | 2005 | 1 | -0.05725 | 31/05/2005 |
242 | 2005 | 2 | 0.40081 | 31/05/2005 |
242 | 2005 | 3 | -0.02601 | 31/05/2005 |
242 | 2005 | 4 | 0.017804 | 31/05/2005 |
242 | 2005 | 5 | 0.180758 | 31/05/2005 |
242 | 2005 | 6 | -0.00988 | 31/05/2005 |
242 | 2005 | 7 | -0.02494 | 31/05/2005 |
242 | 2005 | 8 | -0.03964 | 31/05/2005 |
242 | 2005 | 9 | 0.159787 | 31/05/2005 |
242 | 2005 | 10 | 0.145809 | 31/05/2005 |
242 | 2005 | 11 | -0.002 | 31/05/2005 |
242 | 2005 | 12 | -0.01205 | 31/05/2005 |
Don't calculate cumulative products, get cumulative SUMS of log(mret +1), then take the antilog and subtract 1 at end of fiscal year:
data yearly_returns (keep=gvkey fy_date annual_ret nmonths);
set have;
by gvkey fy_date;
if first.fy_date then do;
sum_log_ret_plus1=0;
nmonths=0;
end;
sum_log_ret_plus1 + log(1+mret);
nmonhs+1;
if last.fy_date; /* editted typographical error, was "if last_fydate;" */
annual_ret=exp(sum_log_ret_plus1)-1;
run;
If you want SQL:
proc sql;
create table want as
select gvkey,fy_date, count(*) as nmonths, exp(sum(log(mret+1)))-1 as annual_return
from have
group by gvkey,fy_date;
quit;
I suspect the DATA step will be faster, since it takes advantage of the order of data in HAVE.
Shouldn't FY_date change after month=5, not month=12? Please clarify.
Thanks @PGStats. FY_date of the same a company (gvkey) rarely changes. But I do not expect that it is fixed. However, different FY_dates for different companies are common. Is it clear?
Thank you very much! Thierry
It's possible that YEAR and MONTH are meant to be fiscal year/fiscal month, not calendar year/month. In which case FY_DATE changes after month=12.
The problem is that fiscal years ending in Jan through May (actually Jan 15- Jun 14) of calendar year X are identified as fiscal year X-1. So the fiscal year ending on 31/05/2003 should be identified as FY 2002. But that's not what @tritringuyen's data says. It has variable YEAR=2003.
Perhaps the OP just assigned YEAR = YEAR(fy_date),
If so, it should have been YEAR=ifn(month(fy_date)>5,year(fy_date),year(fy_date)-1).
Don't calculate cumulative products, get cumulative SUMS of log(mret +1), then take the antilog and subtract 1 at end of fiscal year:
data yearly_returns (keep=gvkey fy_date annual_ret nmonths);
set have;
by gvkey fy_date;
if first.fy_date then do;
sum_log_ret_plus1=0;
nmonths=0;
end;
sum_log_ret_plus1 + log(1+mret);
nmonhs+1;
if last.fy_date; /* editted typographical error, was "if last_fydate;" */
annual_ret=exp(sum_log_ret_plus1)-1;
run;
If you want SQL:
proc sql;
create table want as
select gvkey,fy_date, count(*) as nmonths, exp(sum(log(mret+1)))-1 as annual_return
from have
group by gvkey,fy_date;
quit;
I suspect the DATA step will be faster, since it takes advantage of the order of data in HAVE.
Thank you very much, @mkeintz. In fact, the fyear and month in my data is calendar year and month (not fiscal year). The only information about fiscal year given is FY_date (fiscal year end). I suppose that the code should be changed?
Thanks a lot! Thierry
@tritringuyen wrote:
.Thank you very much, @mkeintz. In fact, the fyear and month in my data is calendar year and month (not fiscal year). The only information about fiscal year given is FY_date (fiscal year end). I suppose that the code should be changed?
This is a good opportunity to study the logic of the program and change the variable name used if the proposed name really serves the same purpose as fy_date. Experiment a little.
Dear @mkeintz, thanks for your reply. I wrote the code to transform calendar year/month (year and month) to fiscal year/month (fyear and fmonth) based on FY_date (fiscal year end). Your prior code works very well with the data I posted previously. But after having fyear and fmonth, I do not rely on FY_date to compute yearly return.
new data looks like this (different sample, but variables are the same with the prior post):
gvkey | monthyear | year | month | mret | FY_Date | FY_year | FY_month | fmonth | fyear |
215 | 2005m1 | 2005 | 1 | -0.07499 | 30/09/2005 | 2005 | 9 | 4 | 2005 |
215 | 2005m2 | 2005 | 2 | 0.32432 | 30/09/2005 | 2005 | 9 | 5 | 2005 |
215 | 2005m3 | 2005 | 3 | 0 | 30/09/2005 | 2005 | 9 | 6 | 2005 |
215 | 2005m4 | 2005 | 4 | -0.18368 | 30/09/2005 | 2005 | 9 | 7 | 2005 |
215 | 2005m5 | 2005 | 5 | 0.175003 | 30/09/2005 | 2005 | 9 | 8 | 2005 |
215 | 2005m6 | 2005 | 6 | 0.170211 | 30/09/2005 | 2005 | 9 | 9 | 2005 |
215 | 2005m7 | 2005 | 7 | 0.072727 | 30/09/2005 | 2005 | 9 | 10 | 2005 |
215 | 2005m8 | 2005 | 8 | 0 | 30/09/2005 | 2005 | 9 | 11 | 2005 |
215 | 2005m9 | 2005 | 9 | 0 | 30/09/2005 | 2005 | 9 | 12 | 2005 |
215 | 2005m10 | 2005 | 10 | 0.016952 | 30/09/2005 | 2005 | 9 | 1 | 2006 |
215 | 2005m11 | 2005 | 11 | -0.01667 | 30/09/2005 | 2005 | 9 | 2 | 2006 |
215 | 2005m12 | 2005 | 12 | 0 | 30/09/2005 | 2005 | 9 | 3 | 2006 |
215 | 2006m1 | 2006 | 1 | 0 | 30/09/2006 | 2006 | 9 | 4 | 2006 |
215 | 2006m2 | 2006 | 2 | 0.372884 | 30/09/2006 | 2006 | 9 | 5 | 2006 |
215 | 2006m3 | 2006 | 3 | 0.370371 | 30/09/2006 | 2006 | 9 | 6 | 2006 |
215 | 2006m4 | 2006 | 4 | 0.054053 | 30/09/2006 | 2006 | 9 | 7 | 2006 |
215 | 2006m5 | 2006 | 5 | -0.06838 | 30/09/2006 | 2006 | 9 | 8 | 2006 |
215 | 2006m6 | 2006 | 6 | 0.08257 | 30/09/2006 | 2006 | 9 | 9 | 2006 |
215 | 2006m7 | 2006 | 7 | 0.593219 | 30/09/2006 | 2006 | 9 | 10 | 2006 |
215 | 2006m8 | 2006 | 8 | -0.16489 | 30/09/2006 | 2006 | 9 | 11 | 2006 |
215 | 2006m9 | 2006 | 9 | 0.063693 | 30/09/2006 | 2006 | 9 | 12 | 2006 |
215 | 2006m10 | 2006 | 10 | 0.389222 | 30/09/2006 | 2006 | 9 | 1 | 2007 |
215 | 2006m11 | 2006 | 11 | 0.189654 | 30/09/2006 | 2006 | 9 | 2 | 2007 |
215 | 2006m12 | 2006 | 12 | 0.23913 | 30/09/2006 | 2006 | 9 | 3 | 2007 |
215 | 2007m1 | 2007 | 1 | 0.125731 | 30/09/2007 | 2007 | 9 | 4 | 2007 |
215 | 2007m2 | 2007 | 2 | 0.002611 | 30/09/2007 | 2007 | 9 | 5 | 2007 |
215 | 2007m3 | 2007 | 3 | 0.184896 | 30/09/2007 | 2007 | 9 | 6 | 2007 |
215 | 2007m4 | 2007 | 4 | 0.006594 | 30/09/2007 | 2007 | 9 | 7 | 2007 |
215 | 2007m5 | 2007 | 5 | 0.087336 | 30/09/2007 | 2007 | 9 | 8 | 2007 |
215 | 2007m6 | 2007 | 6 | -0.04235 | 30/09/2007 | 2007 | 9 | 9 | 2007 |
215 | 2007m7 | 2007 | 7 | 0.037894 | 30/09/2007 | 2007 | 9 | 10 | 2007 |
215 | 2007m8 | 2007 | 8 | 0.103449 | 30/09/2007 | 2007 | 9 | 11 | 2007 |
215 | 2007m9 | 2007 | 9 | -0.09375 | 30/09/2007 | 2007 | 9 | 12 | 2007 |
215 | 2007m10 | 2007 | 10 | 0.016227 | 30/09/2007 | 2007 | 9 | 1 | 2008 |
215 | 2007m11 | 2007 | 11 | -0.00798 | 30/09/2007 | 2007 | 9 | 2 | 2008 |
215 | 2007m12 | 2007 | 12 | -0.01207 | 30/09/2007 | 2007 | 9 | 3 | 2008 |
215 | 2008m1 | 2008 | 1 | -0.07847 | 30/09/2008 | 2008 | 9 | 4 | 2008 |
215 | 2008m2 | 2008 | 2 | 0.051111 | 30/09/2008 | 2008 | 9 | 5 | 2008 |
215 | 2008m3 | 2008 | 3 | 0.078224 | 30/09/2008 | 2008 | 9 | 6 | 2008 |
215 | 2008m4 | 2008 | 4 | 0.066667 | 30/09/2008 | 2008 | 9 | 7 | 2008 |
215 | 2008m5 | 2008 | 5 | 0.049633 | 30/09/2008 | 2008 | 9 | 8 | 2008 |
215 | 2008m6 | 2008 | 6 | 0.102022 | 30/09/2008 | 2008 | 9 | 9 | 2008 |
215 | 2008m7 | 2008 | 7 | -0.03834 | 30/09/2008 | 2008 | 9 | 10 | 2008 |
215 | 2008m8 | 2008 | 8 | 0 | 30/09/2008 | 2008 | 9 | 11 | 2008 |
215 | 2008m9 | 2008 | 9 | -0.16943 | 30/09/2008 | 2008 | 9 | 12 | 2008 |
215 | 2008m10 | 2008 | 10 | -0.272 | 30/09/2008 | 2008 | 9 | 1 | 2009 |
215 | 2008m11 | 2008 | 11 | 0.01923 | 30/09/2008 | 2008 | 9 | 2 | 2009 |
215 | 2008m12 | 2008 | 12 | 0.126685 | 30/09/2008 | 2008 | 9 | 3 | 2009 |
I wrote the following code:
data yearly_return;
do until (last.fyear);
set monthly_return;
by gvkey fyear;
if first.fyear then
compound_ret=1;
compound_ret=compound_ret*(1+mret);
end;
run;
data yearly_return;
set yearly_return;
yret = compound_ret - 1;
run;
I manually checked the results of my code. It seems ok. What do you think? Am I correct?
If I want to use your code, how the code should be modified?
Thank you very much!
Best Regards,
Tri Tri
t
@tritringuyen wrote:
Dear @mkeintz, thanks for your reply. I wrote the code to transform calendar year/month (year and month) to fiscal year/month (fyear and fmonth) based on FY_date (fiscal year end). Your prior code works very well with the data I posted previously. But after having fyear and fmonth, I do not rely on FY_date to compute yearly return.
new data looks like this (different sample, but variables are the same with the prior post):
gvkey monthyear year month mret FY_Date FY_year FY_month fmonth fyear 215 2005m1 2005 1 -0.07499 30/09/2005 2005 9 4 2005 215 2005m2 2005 2 0.32432 30/09/2005 2005 9 5 2005 215 2005m3 2005 3 0 30/09/2005 2005 9 6 2005 215 2005m4 2005 4 -0.18368 30/09/2005 2005 9 7 2005 215 2005m5 2005 5 0.175003 30/09/2005 2005 9 8 2005 215 2005m6 2005 6 0.170211 30/09/2005 2005 9 9 2005 215 2005m7 2005 7 0.072727 30/09/2005 2005 9 10 2005 215 2005m8 2005 8 0 30/09/2005 2005 9 11 2005 215 2005m9 2005 9 0 30/09/2005 2005 9 12 2005 215 2005m10 2005 10 0.016952 30/09/2005 2005 9 1 2006 215 2005m11 2005 11 -0.01667 30/09/2005 2005 9 2 2006 215 2005m12 2005 12 0 30/09/2005 2005 9 3 2006 215 2006m1 2006 1 0 30/09/2006 2006 9 4 2006 215 2006m2 2006 2 0.372884 30/09/2006 2006 9 5 2006 215 2006m3 2006 3 0.370371 30/09/2006 2006 9 6 2006 215 2006m4 2006 4 0.054053 30/09/2006 2006 9 7 2006 215 2006m5 2006 5 -0.06838 30/09/2006 2006 9 8 2006 215 2006m6 2006 6 0.08257 30/09/2006 2006 9 9 2006 215 2006m7 2006 7 0.593219 30/09/2006 2006 9 10 2006 215 2006m8 2006 8 -0.16489 30/09/2006 2006 9 11 2006 215 2006m9 2006 9 0.063693 30/09/2006 2006 9 12 2006 215 2006m10 2006 10 0.389222 30/09/2006 2006 9 1 2007 215 2006m11 2006 11 0.189654 30/09/2006 2006 9 2 2007 215 2006m12 2006 12 0.23913 30/09/2006 2006 9 3 2007 215 2007m1 2007 1 0.125731 30/09/2007 2007 9 4 2007 215 2007m2 2007 2 0.002611 30/09/2007 2007 9 5 2007 215 2007m3 2007 3 0.184896 30/09/2007 2007 9 6 2007 215 2007m4 2007 4 0.006594 30/09/2007 2007 9 7 2007 215 2007m5 2007 5 0.087336 30/09/2007 2007 9 8 2007 215 2007m6 2007 6 -0.04235 30/09/2007 2007 9 9 2007 215 2007m7 2007 7 0.037894 30/09/2007 2007 9 10 2007 215 2007m8 2007 8 0.103449 30/09/2007 2007 9 11 2007 215 2007m9 2007 9 -0.09375 30/09/2007 2007 9 12 2007 215 2007m10 2007 10 0.016227 30/09/2007 2007 9 1 2008 215 2007m11 2007 11 -0.00798 30/09/2007 2007 9 2 2008 215 2007m12 2007 12 -0.01207 30/09/2007 2007 9 3 2008 215 2008m1 2008 1 -0.07847 30/09/2008 2008 9 4 2008 215 2008m2 2008 2 0.051111 30/09/2008 2008 9 5 2008 215 2008m3 2008 3 0.078224 30/09/2008 2008 9 6 2008 215 2008m4 2008 4 0.066667 30/09/2008 2008 9 7 2008 215 2008m5 2008 5 0.049633 30/09/2008 2008 9 8 2008 215 2008m6 2008 6 0.102022 30/09/2008 2008 9 9 2008 215 2008m7 2008 7 -0.03834 30/09/2008 2008 9 10 2008 215 2008m8 2008 8 0 30/09/2008 2008 9 11 2008 215 2008m9 2008 9 -0.16943 30/09/2008 2008 9 12 2008 215 2008m10 2008 10 -0.272 30/09/2008 2008 9 1 2009 215 2008m11 2008 11 0.01923 30/09/2008 2008 9 2 2009 215 2008m12 2008 12 0.126685 30/09/2008 2008 9 3 2009
I wrote the following code:
data yearly_return;
do until (last.fyear);
set monthly_return;
by gvkey fyear;
if first.fyear then
compound_ret=1;
compound_ret=compound_ret*(1+mret);
end;
run;
data yearly_return;
set yearly_return;
yret = compound_ret - 1;
run;I manually checked the results of my code. It seems ok. What do you think? Am I correct?
If I want to use your code, how the code should be modified?
Your code seems logically equivalent to mine, although
Thank you very much @mkeintz. I use your code. You are an excellent advisor!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.