BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
abdulla
Pyrite | Level 9
gvkeydateRET
1000119940131-0.04762
10001199402280
1000119940331-0.00429
1000119940429-0.14493
10001199405310.067797
10001199406300.107302
10001199407290.072464
1000119940831-0.02703
10001199409300.038333
1000119941031-0.05405
1000119941130-0.04286
1000119941230-0.03343
1000119950131-0.03125
1000119950228-0.02621
10001199503310.006377
10001199504280
10001199505310.05
10001199506300.060317
10001199507310
1000119950831-0.0303
10001199509290.04375
1000119951031-0.0303
10001199511300.09375
1000119951229

0.082857

 

I have a large data set with many gvkey and monthly return(ret) like the above

To calculate the annual return from monthly return, I have used the following code. 

 

Data annual_returns;
set return;
return1= 1+ret;
if first.date then return2= return1;
else return2= return1*(lag(return2));
annual_return=return2-1;
by gvkey date;
run;

 

But I am not getting the expected result. I tried to use the codes available in the forum but none is giving me the expected result. The formula is given below.

APY = (1+r1)*(1+r2) *(1+r3) *(1+r4) *(1+r5) *(1+r6) *(1+r7) *(1+r8) *(1+r9) *(1+r10) *(1+r11) *(1+r12) – 1

Let’s say the twelve monthly returns are 2%, 2.2%, 2.1%, -1.5%, 2%, 2.4%, 1%, -1.2%, -0.5%, 0.7%, 1%, and 1.5%.

Using the above monthly returns, we can calculate the annualized returns as follows:

APY = (1.02)(1.022)(1.021)(0.985)(1.02)(1.024)(1.01)(0.988)(0.995)(1.007)(1.01)(1.015) – 1

Annualized return = 0.1223 or 12.23%

 

What mistake am I making? What should be the correct code? 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Below the code implementing the logic as you define it. The code assumes that you have data for every single month (or at least for January and December; no missings).

data have;
  infile datalines truncover;
  input gvkey date:yymmdd8. RET;
  format date date9.;
  datalines;
10001 19940131 -0.04762
10001 19940228 0
10001 19940331 -0.00429
10001 19940429 -0.14493
10001 19940531 0.067797
10001 19940630 0.107302
10001 19940729 0.072464
10001 19940831 -0.02703
10001 19940930 0.038333
10001 19941031 -0.05405
10001 19941130 -0.04286
10001 19941230 -0.03343
10001 19950131 -0.03125
10001 19950228 -0.02621
10001 19950331 0.006377
10001 19950428 0
10001 19950531 0.05
10001 19950630 0.060317
10001 19950731 0
10001 19950831 -0.0303
10001 19950929 0.04375
10001 19951031 -0.0303
10001 19951130 0.09375
10001 19951229 0.082857
;

data want;
  set have;
  by gvkey date;
  retain apy;
  if first.gvkey or month(date)=1 then apy=1+ret;
  else apy=apy*(1+ret);
  if month(date)=12 then apy=apy-1;
run;

proc print;
run;

 

View solution in original post

5 REPLIES 5
mkeintz
PROC Star

@abdulla wrote:
gvkey date RET
10001 19940131 -0.04762
10001 19940228 0
10001 19940331 -0.00429
10001 19940429 -0.14493
10001 19940531 0.067797
10001 19940630 0.107302
10001 19940729 0.072464
10001 19940831 -0.02703
10001 19940930 0.038333
10001 19941031 -0.05405
10001 19941130 -0.04286
10001 19941230 -0.03343
10001 19950131 -0.03125
10001 19950228 -0.02621
10001 19950331 0.006377
10001 19950428 0
10001 19950531 0.05
10001 19950630 0.060317
10001 19950731 0
10001 19950831 -0.0303
10001 19950929 0.04375
10001 19951031 -0.0303
10001 19951130 0.09375
10001 19951229

0.082857

 

I have a large data set with many gvkey and monthly return(ret) like the above

To calculate the annual return from monthly return, I have used the following code. 

 

Data annual_returns;
set return;
return1= 1+ret;
if first.date then return2= return1;
else return2= return1*(lag(return2));
annual_return=return2-1;
by gvkey date;
run;

 

But I am not getting the expected result. I tried to use the codes available in the forum but none is giving me the expected result. The formula is given below.

APY = (1+r1)*(1+r2) *(1+r3) *(1+r4) *(1+r5) *(1+r6) *(1+r7) *(1+r8) *(1+r9) *(1+r10) *(1+r11) *(1+r12) – 1

Let’s say the twelve monthly returns are 2%, 2.2%, 2.1%, -1.5%, 2%, 2.4%, 1%, -1.2%, -0.5%, 0.7%, 1%, and 1.5%.

Using the above monthly returns, we can calculate the annualized returns as follows:

APY = (1.02)(1.022)(1.021)(0.985)(1.02)(1.024)(1.01)(0.988)(0.995)(1.007)(1.01)(1.015) – 1

Annualized return = 0.1223 or 12.23%

 

What mistake am I making? What should be the correct code? 


 

If the program is as you posted, not only are you not getting the results you expect, you're not getting any results at all.  In particular you have

 if first.date then return2= return1;

 but there is no first.date variable, because there is not BY DATE statement.  SAS would have stopped the data step and notified you via the log.  Didn't your log show such a message.

 

Please show us an actual program used - at least the relevant parts.  And also show us the results you actually got from a completed data step (saying you didn't get what you expected is not always enough info to diagnose).

 

Help us help you.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
abdulla
Pyrite | Level 9

After using this code, I am getting return2 column is giving me similar result to return1 column. Though I have used by gvkey and date, the program doesn't recognize first.date and so probably gives me the return1 result in return2 column. I don't know how to solve this problem.

 

I have given the actual code I have used. The result I want is similar to the example I have given

Patrick
Opal | Level 21

Below the code implementing the logic as you define it. The code assumes that you have data for every single month (or at least for January and December; no missings).

data have;
  infile datalines truncover;
  input gvkey date:yymmdd8. RET;
  format date date9.;
  datalines;
10001 19940131 -0.04762
10001 19940228 0
10001 19940331 -0.00429
10001 19940429 -0.14493
10001 19940531 0.067797
10001 19940630 0.107302
10001 19940729 0.072464
10001 19940831 -0.02703
10001 19940930 0.038333
10001 19941031 -0.05405
10001 19941130 -0.04286
10001 19941230 -0.03343
10001 19950131 -0.03125
10001 19950228 -0.02621
10001 19950331 0.006377
10001 19950428 0
10001 19950531 0.05
10001 19950630 0.060317
10001 19950731 0
10001 19950831 -0.0303
10001 19950929 0.04375
10001 19951031 -0.0303
10001 19951130 0.09375
10001 19951229 0.082857
;

data want;
  set have;
  by gvkey date;
  retain apy;
  if first.gvkey or month(date)=1 then apy=1+ret;
  else apy=apy*(1+ret);
  if month(date)=12 then apy=apy-1;
run;

proc print;
run;

 

abdulla
Pyrite | Level 9
Thanks, Patrick,

This one works. But I am not still sure why apy=apy*(1+ret). Won't we need apy=lag(apy)*(1+ret)? Why doesn't lag(apy) work? I tried to use lag but got missing observations.
Patrick
Opal | Level 21

@abdulla wrote:
Thanks, Patrick,

This one works. But I am not still sure why apy=apy*(1+ret). Won't we need apy=lag(apy)*(1+ret)? Why doesn't lag(apy) work? I tried to use lag but got missing observations.

You normally use the lag() function on input variables. You could use it on calculated variables but you would need to ensure that you populate the variable with the calculated value before you call the lag() function. The lag function will write the value to a queue the moment you call it. You would also have to use the lag() function unconditionally as else not all values would get written to the queue (from which lag() then also reads the "old" values).

Using Retain is just much simpler and clearer. It keeps a value until you overwrite it.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 3370 views
  • 2 likes
  • 3 in conversation