BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tritringuyen
Quartz | Level 8

Hi, I want to calculate compound yearly return for each company (gvkey) for each fiscal year (with year end is FY_date). The issue is that companies may have different year end. Monthly return is mret.

 

Basically, I want a new variable, say:

 

cum_mret = (1+ return of the first month of fiscal year)*(1+return of the second month of fiscal year)*...*(return of the 12th month of fiscal year).

 

Monthly returns used to calculate cum_mret are determined based on FY_date. For example, gvkey has the year end of 31/05/2004, then first month used to calculate cum_mret is 6/2003 and the 12th month used to calculate cum_mret is 5/2004.

 

Some months of some gvkey have missing values for mret.

 

Here are some observations. Could you please give me some suggestion about the code?

 

gvkeyyearmonthmretFY_Date
24220031-0.0952431/05/2003
242200320.34210531/05/2003
24220033-0.0196131/05/2003
24220034-0.0431/05/2003
24220035031/05/2003
24220036031/05/2003
242200370.12531/05/2003
242200380.38888931/05/2003
242200390.0831/05/2003
2422003100.04938331/05/2003
2422003110.11764731/05/2003
2422003120.01052631/05/2003
242200410.22916731/05/2004
242200420.22881431/05/2004
242200430.18620731/05/2004
24220044-0.0058131/05/2004
242200450.12280731/05/2004
24220046-0.0156331/05/2004
24220047-0.0264631/05/2004
24220048-0.0407631/05/2004
242200490.12181331/05/2004
2422004100.01010131/05/2004
2422004110.04531/05/2004
2422004120.25358931/05/2004
24220051-0.0572531/05/2005
242200520.4008131/05/2005
24220053-0.0260131/05/2005
242200540.01780431/05/2005
242200550.18075831/05/2005
24220056-0.0098831/05/2005
24220057-0.0249431/05/2005
24220058-0.0396431/05/2005
242200590.15978731/05/2005
2422005100.14580931/05/2005
242200511-0.00231/05/2005
242200512-0.0120531/05/2005
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

@tritringuyen

 

Don't calculate cumulative products, get cumulative SUMS of log(mret +1), then take the antilog and subtract 1 at end of fiscal year:

 

data yearly_returns (keep=gvkey fy_date annual_ret nmonths);

  set have;

  by gvkey fy_date;

  if first.fy_date then do;

    sum_log_ret_plus1=0;

    nmonths=0;

  end;

 

  sum_log_ret_plus1 + log(1+mret);

   nmonhs+1;

 

  if last.fy_date;   /* editted typographical error, was  "if last_fydate;" */

  annual_ret=exp(sum_log_ret_plus1)-1;

run;

 

 

If you want SQL:

 

proc sql;

  create table want as

  select gvkey,fy_date, count(*) as nmonths, exp(sum(log(mret+1)))-1 as annual_return

  from have

  group by gvkey,fy_date;

quit;

 

I suspect the DATA step will be faster, since it takes advantage of the order of data in HAVE.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

9 REPLIES 9
PGStats
Opal | Level 21

Shouldn't FY_date change after month=5, not month=12? Please clarify.

PG
tritringuyen
Quartz | Level 8

Thanks @PGStats. FY_date of the same a company (gvkey) rarely changes. But I do not expect that it is fixed. However, different FY_dates for different companies are common. Is it clear?

 

Thank you very much! Thierry

 

 

mkeintz
PROC Star

@PGStats:

 

It's possible that YEAR and MONTH are meant to be fiscal year/fiscal month, not calendar year/month. In which case FY_DATE changes after month=12.

 

The problem is that fiscal years ending in Jan through May (actually Jan 15- Jun 14) of calendar year X are identified as fiscal year X-1.  So the fiscal year ending on 31/05/2003 should be identified as FY 2002.  But that's not what @tritringuyen's data says. It has variable YEAR=2003.

 

Perhaps the OP just assigned YEAR = YEAR(fy_date), 

If so, it should have been YEAR=ifn(month(fy_date)>5,year(fy_date),year(fy_date)-1).

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

@tritringuyen

 

Don't calculate cumulative products, get cumulative SUMS of log(mret +1), then take the antilog and subtract 1 at end of fiscal year:

 

data yearly_returns (keep=gvkey fy_date annual_ret nmonths);

  set have;

  by gvkey fy_date;

  if first.fy_date then do;

    sum_log_ret_plus1=0;

    nmonths=0;

  end;

 

  sum_log_ret_plus1 + log(1+mret);

   nmonhs+1;

 

  if last.fy_date;   /* editted typographical error, was  "if last_fydate;" */

  annual_ret=exp(sum_log_ret_plus1)-1;

run;

 

 

If you want SQL:

 

proc sql;

  create table want as

  select gvkey,fy_date, count(*) as nmonths, exp(sum(log(mret+1)))-1 as annual_return

  from have

  group by gvkey,fy_date;

quit;

 

I suspect the DATA step will be faster, since it takes advantage of the order of data in HAVE.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tritringuyen
Quartz | Level 8

Thank you very much, @mkeintz. In fact, the fyear and month in my data is calendar year and month (not fiscal year). The only information about fiscal year given is FY_date (fiscal year end). I suppose that the code should be changed?

 

Thanks a lot! Thierry

mkeintz
PROC Star

@tritringuyen wrote:

.Thank you very much, @mkeintz. In fact, the fyear and month in my data is calendar year and month (not fiscal year). The only information about fiscal year given is FY_date (fiscal year end). I suppose that the code should be changed?


 

This is a good opportunity to study the logic of the program and change the variable name used if the proposed name really serves the same purpose as fy_date.  Experiment a little.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tritringuyen
Quartz | Level 8

Dear @mkeintz, thanks for your reply. I wrote the code to transform calendar year/month (year and month) to fiscal year/month (fyear and fmonth) based on FY_date (fiscal year end). Your prior code works very well with the data I posted previously. But after having fyear and fmonth, I do not rely on FY_date to compute yearly return.

 

new data looks like this (different sample, but variables are the same with the prior post):

 

gvkeymonthyearyearmonthmretFY_DateFY_yearFY_monthfmonthfyear
2152005m120051-0.0749930/09/20052005942005
2152005m2200520.3243230/09/20052005952005
2152005m320053030/09/20052005962005
2152005m420054-0.1836830/09/20052005972005
2152005m5200550.17500330/09/20052005982005
2152005m6200560.17021130/09/20052005992005
2152005m7200570.07272730/09/200520059102005
2152005m820058030/09/200520059112005
2152005m920059030/09/200520059122005
2152005m102005100.01695230/09/20052005912006
2152005m11200511-0.0166730/09/20052005922006
2152005m12200512030/09/20052005932006
2152006m120061030/09/20062006942006
2152006m2200620.37288430/09/20062006952006
2152006m3200630.37037130/09/20062006962006
2152006m4200640.05405330/09/20062006972006
2152006m520065-0.0683830/09/20062006982006
2152006m6200660.0825730/09/20062006992006
2152006m7200670.59321930/09/200620069102006
2152006m820068-0.1648930/09/200620069112006
2152006m9200690.06369330/09/200620069122006
2152006m102006100.38922230/09/20062006912007
2152006m112006110.18965430/09/20062006922007
2152006m122006120.2391330/09/20062006932007
2152007m1200710.12573130/09/20072007942007
2152007m2200720.00261130/09/20072007952007
2152007m3200730.18489630/09/20072007962007
2152007m4200740.00659430/09/20072007972007
2152007m5200750.08733630/09/20072007982007
2152007m620076-0.0423530/09/20072007992007
2152007m7200770.03789430/09/200720079102007
2152007m8200780.10344930/09/200720079112007
2152007m920079-0.0937530/09/200720079122007
2152007m102007100.01622730/09/20072007912008
2152007m11200711-0.0079830/09/20072007922008
2152007m12200712-0.0120730/09/20072007932008
2152008m120081-0.0784730/09/20082008942008
2152008m2200820.05111130/09/20082008952008
2152008m3200830.07822430/09/20082008962008
2152008m4200840.06666730/09/20082008972008
2152008m5200850.04963330/09/20082008982008
2152008m6200860.10202230/09/20082008992008
2152008m720087-0.0383430/09/200820089102008
2152008m820088030/09/200820089112008
2152008m920089-0.1694330/09/200820089122008
2152008m10200810-0.27230/09/20082008912009
2152008m112008110.0192330/09/20082008922009
2152008m122008120.12668530/09/20082008932009

 

 

I wrote the following code:

 


data yearly_return;
do until (last.fyear);
set monthly_return;
by gvkey fyear;
if first.fyear then
compound_ret=1;
compound_ret=compound_ret*(1+mret);
end;
run;


data yearly_return;
set yearly_return;
yret = compound_ret - 1;
run;

I manually checked the results of my code. It seems ok. What do you think? Am I correct?

 

If I want to use your code, how the code should be modified?

 

Thank you very much!

 

Best Regards,

 

Tri Tri

mkeintz
PROC Star

t

 


@tritringuyen wrote:

Dear @mkeintz, thanks for your reply. I wrote the code to transform calendar year/month (year and month) to fiscal year/month (fyear and fmonth) based on FY_date (fiscal year end). Your prior code works very well with the data I posted previously. But after having fyear and fmonth, I do not rely on FY_date to compute yearly return.

 

new data looks like this (different sample, but variables are the same with the prior post):

 

gvkey monthyear year month mret FY_Date FY_year FY_month fmonth fyear
215 2005m1 2005 1 -0.07499 30/09/2005 2005 9 4 2005
215 2005m2 2005 2 0.32432 30/09/2005 2005 9 5 2005
215 2005m3 2005 3 0 30/09/2005 2005 9 6 2005
215 2005m4 2005 4 -0.18368 30/09/2005 2005 9 7 2005
215 2005m5 2005 5 0.175003 30/09/2005 2005 9 8 2005
215 2005m6 2005 6 0.170211 30/09/2005 2005 9 9 2005
215 2005m7 2005 7 0.072727 30/09/2005 2005 9 10 2005
215 2005m8 2005 8 0 30/09/2005 2005 9 11 2005
215 2005m9 2005 9 0 30/09/2005 2005 9 12 2005
215 2005m10 2005 10 0.016952 30/09/2005 2005 9 1 2006
215 2005m11 2005 11 -0.01667 30/09/2005 2005 9 2 2006
215 2005m12 2005 12 0 30/09/2005 2005 9 3 2006
215 2006m1 2006 1 0 30/09/2006 2006 9 4 2006
215 2006m2 2006 2 0.372884 30/09/2006 2006 9 5 2006
215 2006m3 2006 3 0.370371 30/09/2006 2006 9 6 2006
215 2006m4 2006 4 0.054053 30/09/2006 2006 9 7 2006
215 2006m5 2006 5 -0.06838 30/09/2006 2006 9 8 2006
215 2006m6 2006 6 0.08257 30/09/2006 2006 9 9 2006
215 2006m7 2006 7 0.593219 30/09/2006 2006 9 10 2006
215 2006m8 2006 8 -0.16489 30/09/2006 2006 9 11 2006
215 2006m9 2006 9 0.063693 30/09/2006 2006 9 12 2006
215 2006m10 2006 10 0.389222 30/09/2006 2006 9 1 2007
215 2006m11 2006 11 0.189654 30/09/2006 2006 9 2 2007
215 2006m12 2006 12 0.23913 30/09/2006 2006 9 3 2007
215 2007m1 2007 1 0.125731 30/09/2007 2007 9 4 2007
215 2007m2 2007 2 0.002611 30/09/2007 2007 9 5 2007
215 2007m3 2007 3 0.184896 30/09/2007 2007 9 6 2007
215 2007m4 2007 4 0.006594 30/09/2007 2007 9 7 2007
215 2007m5 2007 5 0.087336 30/09/2007 2007 9 8 2007
215 2007m6 2007 6 -0.04235 30/09/2007 2007 9 9 2007
215 2007m7 2007 7 0.037894 30/09/2007 2007 9 10 2007
215 2007m8 2007 8 0.103449 30/09/2007 2007 9 11 2007
215 2007m9 2007 9 -0.09375 30/09/2007 2007 9 12 2007
215 2007m10 2007 10 0.016227 30/09/2007 2007 9 1 2008
215 2007m11 2007 11 -0.00798 30/09/2007 2007 9 2 2008
215 2007m12 2007 12 -0.01207 30/09/2007 2007 9 3 2008
215 2008m1 2008 1 -0.07847 30/09/2008 2008 9 4 2008
215 2008m2 2008 2 0.051111 30/09/2008 2008 9 5 2008
215 2008m3 2008 3 0.078224 30/09/2008 2008 9 6 2008
215 2008m4 2008 4 0.066667 30/09/2008 2008 9 7 2008
215 2008m5 2008 5 0.049633 30/09/2008 2008 9 8 2008
215 2008m6 2008 6 0.102022 30/09/2008 2008 9 9 2008
215 2008m7 2008 7 -0.03834 30/09/2008 2008 9 10 2008
215 2008m8 2008 8 0 30/09/2008 2008 9 11 2008
215 2008m9 2008 9 -0.16943 30/09/2008 2008 9 12 2008
215 2008m10 2008 10 -0.272 30/09/2008 2008 9 1 2009
215 2008m11 2008 11 0.01923 30/09/2008 2008 9 2 2009
215 2008m12 2008 12 0.126685 30/09/2008 2008 9 3 2009

 

 

I wrote the following code:

 


data yearly_return;
do until (last.fyear);
set monthly_return;
by gvkey fyear;
if first.fyear then
compound_ret=1;
compound_ret=compound_ret*(1+mret);
end;
run;


data yearly_return;
set yearly_return;
yret = compound_ret - 1;
run;

I manually checked the results of my code. It seems ok. What do you think? Am I correct?

 

If I want to use your code, how the code should be modified?


 

 

Your code seems logically equivalent to mine, although

  1. You can put the "yret=compund_ret-1;" statement in the first data step, between "end" and "run".  This eliminates a superfluous pass through the data.
  2. You do not track NMONTHS (although you could), so you will have a bit more difficulty identifying occasions when companies change end-of-fiscal-year dates, i.e. instances in which the fiscal year does not have 12 months.  So there is no guarantee that all your Y_RETs will be annualized.
  3. If you want to use my code, then you can use FYEAR instead of fy_date.  There's no particular reason to do so.  I just wrote it without use of the "do until" technique, to more directly adhere to the most basic aspects of the data step.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tritringuyen
Quartz | Level 8

Thank you very much @mkeintz. I use your code. You are an excellent advisor!

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Early bird rate extended! Save $200 when you sign up by March 31.

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 5401 views
  • 2 likes
  • 3 in conversation