Hello everyone,
I want to calculate buy and hold abnormal return for 252 days for daily dataset for different firms (identify by cusip), I provide my dataset bellow.
BHAR(1) = { [(1+ret) of day1] * [(1+ret) of day2] * [(1+ret) of day 3] * [(1+ret) of day4] * ... * [(1+ret) of day 252] } - { [(1+vwretd) of day1] * [(1+vwretd) of day2] * [(1+vwretd) of day3] * [(1+vwretd) of day4] * ... * [(1+vwretd) of day252 }
I really appreciate your help.
Thanks in advance
Let's say your answer to each question is yes. Then: (code amended):
data want (drop=_:);
set have;
by cusip notsorted;
if first.cusip then call missing(_obs,_sum_log_cusip_ret_plus1,_sum_log_index_ret_plus1);
_obs+1;
_sum_log_cusip_ret_plus1 + log(1+ret) - ifn(_obs>252,lag252(log(1+ret)),0) ;
_sum_log_index_ret_plus1 + log(1+vwretd) - ifn(_obs>252,lag252(log(1+vwretd)),0);
window_start = lag251(year) ;
if _obs>=252 ;
bhar_252 = exp(_sum_log_cusip_ret_plus1) - exp(_sum_log_index_ret_plus1) ;
run;
This assumes your data are sorted by cusip and date (which you have labelled as year). Now if the data are not sorted by cusip, but are sorted by date within cusip, you can change
by cusip;
to
by cusip notsorted;
Now note that this involves calculating 252-day cumulative returns for VWRETD for each CUSIP. But the VWRETD is based on a market index, and thus has the same value for every cusip on each given date. This means that if you have, say, 400 CUSIPs, you will be calculating the same values 400 times for the VWRETD components, where you really need to do it only once. After calculating those returns once, you can store them in an array indexed by date. Then, for each CUSIP, just retrieve them for calculating BHAR by date. In other words, you could cut down your total calculations by half (400 times for 400 CUSIP returns, but only once for index returns).
Edited note: I didn't notice a missing close paren in the lag252 argument. It should have been lag252(log(1+ret)), not lag252(log(1+ret). Same with the vwretd-based component. It's now corrected in the code above.
Also I have added retrieval of a window starting date, as per other comments by @Fer_.
Thanks for your reply.
Yes I have a large dataset. I provide a sample bellow as text. For every year in my sample, I have stock return and market return without missing data.( I already delete them)
cusip | year | ret | vwretd |
11001 | 19900123 | -0.055 | -0.002 |
11001 | 19900124 | -0.0098 | -0.006 |
11001 | 19900125 | -0.009 | -0.013 |
11001 | 19900128 | -0.0087 | -0.007 |
.. | … | … | … |
11001 | 20211231 | 0.003 | 0.0056 |
11002 | 19900204 | -0.0019 | -0.004 |
11002 | 19900205 | -0.014 | -0.0045 |
… | … | … | … |
11002 | 20211231 | 0.004 | -0.007 |
11003 | 19900115 | 0.0013 | 0.0016 |
11003 | 19900116 | -0.014 | 0.001 |
… | … | … | |
11003 | 20211231 | 0.005 | -0.007 |
Let's say your answer to each question is yes. Then: (code amended):
data want (drop=_:);
set have;
by cusip notsorted;
if first.cusip then call missing(_obs,_sum_log_cusip_ret_plus1,_sum_log_index_ret_plus1);
_obs+1;
_sum_log_cusip_ret_plus1 + log(1+ret) - ifn(_obs>252,lag252(log(1+ret)),0) ;
_sum_log_index_ret_plus1 + log(1+vwretd) - ifn(_obs>252,lag252(log(1+vwretd)),0);
window_start = lag251(year) ;
if _obs>=252 ;
bhar_252 = exp(_sum_log_cusip_ret_plus1) - exp(_sum_log_index_ret_plus1) ;
run;
This assumes your data are sorted by cusip and date (which you have labelled as year). Now if the data are not sorted by cusip, but are sorted by date within cusip, you can change
by cusip;
to
by cusip notsorted;
Now note that this involves calculating 252-day cumulative returns for VWRETD for each CUSIP. But the VWRETD is based on a market index, and thus has the same value for every cusip on each given date. This means that if you have, say, 400 CUSIPs, you will be calculating the same values 400 times for the VWRETD components, where you really need to do it only once. After calculating those returns once, you can store them in an array indexed by date. Then, for each CUSIP, just retrieve them for calculating BHAR by date. In other words, you could cut down your total calculations by half (400 times for 400 CUSIP returns, but only once for index returns).
Edited note: I didn't notice a missing close paren in the lag252 argument. It should have been lag252(log(1+ret)), not lag252(log(1+ret). Same with the vwretd-based component. It's now corrected in the code above.
Also I have added retrieval of a window starting date, as per other comments by @Fer_.
Thanks for your response.
I got these two errors:
ERROR 72-185: The LAG function call has too many arguments.
ERROR 71-185: The IFN function call does not have enough arguments.
for these lines of code:
sum_log_cusip_ret_plus1 + log(1+ret)- ifn(_obs>252,lag252(log(1+ret),0));
_sum_log_index_ret_plus1 + log(1+vwretd) - ifn(_obs>252,lag252(log(1+vwretd),0));
Do you know how to solve them? Thanks in advance.
This is why it's helpful to offer sample data in the form of a working data step. I usually test against sample data. In any case, this appears to be the result of a forgotten close paren, which I've changed in my initial response.
And yes I want the rolling 252-days, which means that for each date and each cusip, I want to calculate bhar for that date and 251 days after that. For example for day1, I need bhar from day1 to day 252, then day 2 to day 253, day 3 to day 254...... until then end for each cusip.
I do not have missing values. I already delete them.
yes
@Fer_ wrote:
And yes I want the rolling 252-days, which means that for each date and each cusip, I want to calculate bhar for that date and 251 days after that. For example for day1, I need bhar from day1 to day 252, then day 2 to day 253, day 3 to day 254...... until then end for each cusip.
My previous code used the end-of-window date, but I've added code to provide beginning of window date.
@mkeintz your code does a 252 observation rolling calculation. If you're missing 6 months of data in there for some reason it will use part of one year and part of the other? (6 months isn't realistic but the trading days do vary by CUSPID IIRC).
Thanks for the code, unfortunately as my dataset is too large, I just could provide a small sample of it. I have a question about the code. My dataset starts from 1990 to 2021, but by running the code the window start is from 1991, which means that for every date the code notice to 252 days before that date. but I want to see 252 days ahead of each date. BHAR for day1= (1+ret day1)(1+ret of day2)....(1+ret day252)- ((1+vwretd day1)(1+vwretd day2)...(1+vwretd day252)).
I am thankful for your help.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.