BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tritringuyen
Quartz | Level 8

Hi, I want to calculate compound yearly return for each company (gvkey) for each fiscal year (with year end is FY_date). The issue is that companies may have different year end. Monthly return is mret.

 

Basically, I want a new variable, say:

 

cum_mret = (1+ return of the first month of fiscal year)*(1+return of the second month of fiscal year)*...*(return of the 12th month of fiscal year).

 

Monthly returns used to calculate cum_mret are determined based on FY_date. For example, gvkey has the year end of 31/05/2004, then first month used to calculate cum_mret is 6/2003 and the 12th month used to calculate cum_mret is 5/2004.

 

Some months of some gvkey have missing values for mret.

 

Here are some observations. Could you please give me some suggestion about the code?

 

gvkeyyearmonthmretFY_Date
24220031-0.0952431/05/2003
242200320.34210531/05/2003
24220033-0.0196131/05/2003
24220034-0.0431/05/2003
24220035031/05/2003
24220036031/05/2003
242200370.12531/05/2003
242200380.38888931/05/2003
242200390.0831/05/2003
2422003100.04938331/05/2003
2422003110.11764731/05/2003
2422003120.01052631/05/2003
242200410.22916731/05/2004
242200420.22881431/05/2004
242200430.18620731/05/2004
24220044-0.0058131/05/2004
242200450.12280731/05/2004
24220046-0.0156331/05/2004
24220047-0.0264631/05/2004
24220048-0.0407631/05/2004
242200490.12181331/05/2004
2422004100.01010131/05/2004
2422004110.04531/05/2004
2422004120.25358931/05/2004
24220051-0.0572531/05/2005
242200520.4008131/05/2005
24220053-0.0260131/05/2005
242200540.01780431/05/2005
242200550.18075831/05/2005
24220056-0.0098831/05/2005
24220057-0.0249431/05/2005
24220058-0.0396431/05/2005
242200590.15978731/05/2005
2422005100.14580931/05/2005
242200511-0.00231/05/2005
242200512-0.0120531/05/2005
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

@tritringuyen

 

Don't calculate cumulative products, get cumulative SUMS of log(mret +1), then take the antilog and subtract 1 at end of fiscal year:

 

data yearly_returns (keep=gvkey fy_date annual_ret nmonths);

  set have;

  by gvkey fy_date;

  if first.fy_date then do;

    sum_log_ret_plus1=0;

    nmonths=0;

  end;

 

  sum_log_ret_plus1 + log(1+mret);

   nmonhs+1;

 

  if last.fy_date;   /* editted typographical error, was  "if last_fydate;" */

  annual_ret=exp(sum_log_ret_plus1)-1;

run;

 

 

If you want SQL:

 

proc sql;

  create table want as

  select gvkey,fy_date, count(*) as nmonths, exp(sum(log(mret+1)))-1 as annual_return

  from have

  group by gvkey,fy_date;

quit;

 

I suspect the DATA step will be faster, since it takes advantage of the order of data in HAVE.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

9 REPLIES 9
PGStats
Opal | Level 21

Shouldn't FY_date change after month=5, not month=12? Please clarify.

PG
tritringuyen
Quartz | Level 8

Thanks @PGStats. FY_date of the same a company (gvkey) rarely changes. But I do not expect that it is fixed. However, different FY_dates for different companies are common. Is it clear?

 

Thank you very much! Thierry

 

 

mkeintz
PROC Star

@PGStats:

 

It's possible that YEAR and MONTH are meant to be fiscal year/fiscal month, not calendar year/month. In which case FY_DATE changes after month=12.

 

The problem is that fiscal years ending in Jan through May (actually Jan 15- Jun 14) of calendar year X are identified as fiscal year X-1.  So the fiscal year ending on 31/05/2003 should be identified as FY 2002.  But that's not what @tritringuyen's data says. It has variable YEAR=2003.

 

Perhaps the OP just assigned YEAR = YEAR(fy_date), 

If so, it should have been YEAR=ifn(month(fy_date)>5,year(fy_date),year(fy_date)-1).

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

@tritringuyen

 

Don't calculate cumulative products, get cumulative SUMS of log(mret +1), then take the antilog and subtract 1 at end of fiscal year:

 

data yearly_returns (keep=gvkey fy_date annual_ret nmonths);

  set have;

  by gvkey fy_date;

  if first.fy_date then do;

    sum_log_ret_plus1=0;

    nmonths=0;

  end;

 

  sum_log_ret_plus1 + log(1+mret);

   nmonhs+1;

 

  if last.fy_date;   /* editted typographical error, was  "if last_fydate;" */

  annual_ret=exp(sum_log_ret_plus1)-1;

run;

 

 

If you want SQL:

 

proc sql;

  create table want as

  select gvkey,fy_date, count(*) as nmonths, exp(sum(log(mret+1)))-1 as annual_return

  from have

  group by gvkey,fy_date;

quit;

 

I suspect the DATA step will be faster, since it takes advantage of the order of data in HAVE.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tritringuyen
Quartz | Level 8

Thank you very much, @mkeintz. In fact, the fyear and month in my data is calendar year and month (not fiscal year). The only information about fiscal year given is FY_date (fiscal year end). I suppose that the code should be changed?

 

Thanks a lot! Thierry

mkeintz
PROC Star

@tritringuyen wrote:

.Thank you very much, @mkeintz. In fact, the fyear and month in my data is calendar year and month (not fiscal year). The only information about fiscal year given is FY_date (fiscal year end). I suppose that the code should be changed?


 

This is a good opportunity to study the logic of the program and change the variable name used if the proposed name really serves the same purpose as fy_date.  Experiment a little.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tritringuyen
Quartz | Level 8

Dear @mkeintz, thanks for your reply. I wrote the code to transform calendar year/month (year and month) to fiscal year/month (fyear and fmonth) based on FY_date (fiscal year end). Your prior code works very well with the data I posted previously. But after having fyear and fmonth, I do not rely on FY_date to compute yearly return.

 

new data looks like this (different sample, but variables are the same with the prior post):

 

gvkeymonthyearyearmonthmretFY_DateFY_yearFY_monthfmonthfyear
2152005m120051-0.0749930/09/20052005942005
2152005m2200520.3243230/09/20052005952005
2152005m320053030/09/20052005962005
2152005m420054-0.1836830/09/20052005972005
2152005m5200550.17500330/09/20052005982005
2152005m6200560.17021130/09/20052005992005
2152005m7200570.07272730/09/200520059102005
2152005m820058030/09/200520059112005
2152005m920059030/09/200520059122005
2152005m102005100.01695230/09/20052005912006
2152005m11200511-0.0166730/09/20052005922006
2152005m12200512030/09/20052005932006
2152006m120061030/09/20062006942006
2152006m2200620.37288430/09/20062006952006
2152006m3200630.37037130/09/20062006962006
2152006m4200640.05405330/09/20062006972006
2152006m520065-0.0683830/09/20062006982006
2152006m6200660.0825730/09/20062006992006
2152006m7200670.59321930/09/200620069102006
2152006m820068-0.1648930/09/200620069112006
2152006m9200690.06369330/09/200620069122006
2152006m102006100.38922230/09/20062006912007
2152006m112006110.18965430/09/20062006922007
2152006m122006120.2391330/09/20062006932007
2152007m1200710.12573130/09/20072007942007
2152007m2200720.00261130/09/20072007952007
2152007m3200730.18489630/09/20072007962007
2152007m4200740.00659430/09/20072007972007
2152007m5200750.08733630/09/20072007982007
2152007m620076-0.0423530/09/20072007992007
2152007m7200770.03789430/09/200720079102007
2152007m8200780.10344930/09/200720079112007
2152007m920079-0.0937530/09/200720079122007
2152007m102007100.01622730/09/20072007912008
2152007m11200711-0.0079830/09/20072007922008
2152007m12200712-0.0120730/09/20072007932008
2152008m120081-0.0784730/09/20082008942008
2152008m2200820.05111130/09/20082008952008
2152008m3200830.07822430/09/20082008962008
2152008m4200840.06666730/09/20082008972008
2152008m5200850.04963330/09/20082008982008
2152008m6200860.10202230/09/20082008992008
2152008m720087-0.0383430/09/200820089102008
2152008m820088030/09/200820089112008
2152008m920089-0.1694330/09/200820089122008
2152008m10200810-0.27230/09/20082008912009
2152008m112008110.0192330/09/20082008922009
2152008m122008120.12668530/09/20082008932009

 

 

I wrote the following code:

 


data yearly_return;
do until (last.fyear);
set monthly_return;
by gvkey fyear;
if first.fyear then
compound_ret=1;
compound_ret=compound_ret*(1+mret);
end;
run;


data yearly_return;
set yearly_return;
yret = compound_ret - 1;
run;

I manually checked the results of my code. It seems ok. What do you think? Am I correct?

 

If I want to use your code, how the code should be modified?

 

Thank you very much!

 

Best Regards,

 

Tri Tri

mkeintz
PROC Star

t

 


@tritringuyen wrote:

Dear @mkeintz, thanks for your reply. I wrote the code to transform calendar year/month (year and month) to fiscal year/month (fyear and fmonth) based on FY_date (fiscal year end). Your prior code works very well with the data I posted previously. But after having fyear and fmonth, I do not rely on FY_date to compute yearly return.

 

new data looks like this (different sample, but variables are the same with the prior post):

 

gvkey monthyear year month mret FY_Date FY_year FY_month fmonth fyear
215 2005m1 2005 1 -0.07499 30/09/2005 2005 9 4 2005
215 2005m2 2005 2 0.32432 30/09/2005 2005 9 5 2005
215 2005m3 2005 3 0 30/09/2005 2005 9 6 2005
215 2005m4 2005 4 -0.18368 30/09/2005 2005 9 7 2005
215 2005m5 2005 5 0.175003 30/09/2005 2005 9 8 2005
215 2005m6 2005 6 0.170211 30/09/2005 2005 9 9 2005
215 2005m7 2005 7 0.072727 30/09/2005 2005 9 10 2005
215 2005m8 2005 8 0 30/09/2005 2005 9 11 2005
215 2005m9 2005 9 0 30/09/2005 2005 9 12 2005
215 2005m10 2005 10 0.016952 30/09/2005 2005 9 1 2006
215 2005m11 2005 11 -0.01667 30/09/2005 2005 9 2 2006
215 2005m12 2005 12 0 30/09/2005 2005 9 3 2006
215 2006m1 2006 1 0 30/09/2006 2006 9 4 2006
215 2006m2 2006 2 0.372884 30/09/2006 2006 9 5 2006
215 2006m3 2006 3 0.370371 30/09/2006 2006 9 6 2006
215 2006m4 2006 4 0.054053 30/09/2006 2006 9 7 2006
215 2006m5 2006 5 -0.06838 30/09/2006 2006 9 8 2006
215 2006m6 2006 6 0.08257 30/09/2006 2006 9 9 2006
215 2006m7 2006 7 0.593219 30/09/2006 2006 9 10 2006
215 2006m8 2006 8 -0.16489 30/09/2006 2006 9 11 2006
215 2006m9 2006 9 0.063693 30/09/2006 2006 9 12 2006
215 2006m10 2006 10 0.389222 30/09/2006 2006 9 1 2007
215 2006m11 2006 11 0.189654 30/09/2006 2006 9 2 2007
215 2006m12 2006 12 0.23913 30/09/2006 2006 9 3 2007
215 2007m1 2007 1 0.125731 30/09/2007 2007 9 4 2007
215 2007m2 2007 2 0.002611 30/09/2007 2007 9 5 2007
215 2007m3 2007 3 0.184896 30/09/2007 2007 9 6 2007
215 2007m4 2007 4 0.006594 30/09/2007 2007 9 7 2007
215 2007m5 2007 5 0.087336 30/09/2007 2007 9 8 2007
215 2007m6 2007 6 -0.04235 30/09/2007 2007 9 9 2007
215 2007m7 2007 7 0.037894 30/09/2007 2007 9 10 2007
215 2007m8 2007 8 0.103449 30/09/2007 2007 9 11 2007
215 2007m9 2007 9 -0.09375 30/09/2007 2007 9 12 2007
215 2007m10 2007 10 0.016227 30/09/2007 2007 9 1 2008
215 2007m11 2007 11 -0.00798 30/09/2007 2007 9 2 2008
215 2007m12 2007 12 -0.01207 30/09/2007 2007 9 3 2008
215 2008m1 2008 1 -0.07847 30/09/2008 2008 9 4 2008
215 2008m2 2008 2 0.051111 30/09/2008 2008 9 5 2008
215 2008m3 2008 3 0.078224 30/09/2008 2008 9 6 2008
215 2008m4 2008 4 0.066667 30/09/2008 2008 9 7 2008
215 2008m5 2008 5 0.049633 30/09/2008 2008 9 8 2008
215 2008m6 2008 6 0.102022 30/09/2008 2008 9 9 2008
215 2008m7 2008 7 -0.03834 30/09/2008 2008 9 10 2008
215 2008m8 2008 8 0 30/09/2008 2008 9 11 2008
215 2008m9 2008 9 -0.16943 30/09/2008 2008 9 12 2008
215 2008m10 2008 10 -0.272 30/09/2008 2008 9 1 2009
215 2008m11 2008 11 0.01923 30/09/2008 2008 9 2 2009
215 2008m12 2008 12 0.126685 30/09/2008 2008 9 3 2009

 

 

I wrote the following code:

 


data yearly_return;
do until (last.fyear);
set monthly_return;
by gvkey fyear;
if first.fyear then
compound_ret=1;
compound_ret=compound_ret*(1+mret);
end;
run;


data yearly_return;
set yearly_return;
yret = compound_ret - 1;
run;

I manually checked the results of my code. It seems ok. What do you think? Am I correct?

 

If I want to use your code, how the code should be modified?


 

 

Your code seems logically equivalent to mine, although

  1. You can put the "yret=compund_ret-1;" statement in the first data step, between "end" and "run".  This eliminates a superfluous pass through the data.
  2. You do not track NMONTHS (although you could), so you will have a bit more difficulty identifying occasions when companies change end-of-fiscal-year dates, i.e. instances in which the fiscal year does not have 12 months.  So there is no guarantee that all your Y_RETs will be annualized.
  3. If you want to use my code, then you can use FYEAR instead of fy_date.  There's no particular reason to do so.  I just wrote it without use of the "do until" technique, to more directly adhere to the most basic aspects of the data step.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tritringuyen
Quartz | Level 8

Thank you very much @mkeintz. I use your code. You are an excellent advisor!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 6164 views
  • 2 likes
  • 3 in conversation