- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am trying to perform an event study.
I have a time-series data which contains the return of stocks from all the companies from 01-Jan-2021 to 31-Dec-2021.
I have put a small sample of my data and the my expected output below for your kind consideration. The dataset is attached at the end.
Here is my sample data:
Company_ID Date Return Reporting_Date Event_Window Gap Estimation_Window
1001 01JAN2021 0.01
1001 04JAN2021 0.02
1001 05JAN2021 0.03
1001 06JAN2021 0.09
1001 07JAN2021 0.05
1001 08JAN2021 0.06
1001 11JAN2021 0.07
1001 12JAN2021 0.08
1001 13JAN2021 0.09
1001 14JAN2021 0.1 14JAN2021 3 2 5
1001 15JAN2021 0.11
1001 18JAN2021 0.12
1001 19JAN2021 0.13
1001 20JAN2021 0.14
1001 21JAN2021 0.15
1001 22JAN2021 0.16
1001 25JAN2021 0.17
1001 26JAN2021 0.18
1001 27JAN2021 0.19
1001 28JAN2021 0.2
1001 29JAN2021 0.21
1001 01FEB2021 0.22
1001 02FEB2021 0.23
1001 03FEB2021 0.24
1001 04FEB2021 0.25
1001 05FEB2021 0.26
1001 08FEB2021 0.27
1001 09FEB2021 0.26 09FEB2021 3 2 5
1001 10FEB2021 0.29
1001 11FEB2021 0.3
1001 12FEB2021 0.31
1001 15FEB2021 0.32
1001 16FEB2021 0.33
1001 17FEB2021 0.34
1001 18FEB2021 0.35
1001 19FEB2021 0.36
1001 22FEB2021 0.37
1001 23FEB2021 0.38
1001 24FEB2021 0.39
1001 25FEB2021 0.4
1001 26FEB2021 0.41
1002 01JAN2021 0.22
1002 04JAN2021 0.23
1002 05JAN2021 0.24
1002 06JAN2021 0.25
1002 07JAN2021 0.26
1002 08JAN2021 0.27
1002 11JAN2021 0.28
1002 12JAN2021 0.29
1002 13JAN2021 0.3
1002 14JAN2021 0.31
1002 15JAN2021 0.32
1002 18JAN2021 0.39 18JAN2021 3 2 5
1002 19JAN2021 0.34
1002 20JAN2021 0.35
1002 21JAN2021 0.36
1002 22JAN2021 0.37
1002 25JAN2021 0.38
1002 26JAN2021 0.39
1002 27JAN2021 0.4
1002 28JAN2021 0.41
1002 29JAN2021 0.42
1003 01JAN2021 0.43
1003 04JAN2021 0.44
1003 05JAN2021 0.45
1003 06JAN2021 0.46
1003 07JAN2021 0.47 07JAN2021 3 2 5
1003 08JAN2021 0.48
1003 11JAN2021 0.49
1003 12JAN2021 0.5
1003 13JAN2021 0.51
1003 14JAN2021 0.52
1003 15JAN2021 0.53
1003 18JAN2021 0.54
1003 19JAN2021 0.55
1003 20JAN2021 0.56
1003 21JAN2021 0.57
1004 01JAN2021 0.03
1004 04JAN2021 0.06 04JAN2021 3 2 5
1004 05JAN2021 0.09
1004 06JAN2021 0.12
1004 07JAN2021 0.15
As you can see, every company has just 1 (one) reporting date in each month. That is, all the other fields in the column "Reporting_Date" are blank.
I want to perform the event study around the Reporting date of a company.
Essentially, I am taking
1. "Event_Window" of 3 days around the "Reporting_Date"
2. A "Gap" of 2 days both before and after the Event_Window, which I do not want to include in my calculation (total 4 days)
3. And an "Estimation_Window" of 5 days both before and after the Gap days (total 10 days)
For instance, for the company_ID 1001:
Event days are: Event_Date_-1(13JAN2021), Event_Date_0(14JAN2021), Event_Date_+1(15JAN2021)
Gap days are 11JAN2021, 12JAN2021, 18JAN2021, 19JAN2021
Estimation_Windows are 04, 05, 06, 07, 08, 20, 21, 22, 25, 26 JAN2021
For every Event_Date, first I need to Average the Return during the 10 days of the Estimation_Window, and then divide the Return by this Average.
For Instance, for Company_ID 1001 and Reporting_Date 2021-01-14:
Average of the Estimation window is (0.02+0.03+0.9+0.05+0.06+0.14+0.15+0.16+0.17+0.18)/10 = 0.105
The values that I want is:
Event_Date_-1: 0.09/0.105 = 0.857142857142857
Event_Date_0: 0.1/0.105 = 0.952380952380952
Event_Date_+1: 0.11/0.105 = 1.04761904761905
My output for the sample data that I have provided should be:
Company ID | Date | Event_Date_-1 | Event_Date_0 | Event_Date_+1 |
1001 | 2021-01-14 | 0.857142857 | 0.952 | 1.047619048 |
1001 | 2021-02-09 | 0.964285714 | 0.929 | 1.035714286 |
1002 | 2021-01-18 | 0.96969697 | 1.181818182 | 1.03030303 |
As you can see, in total, I need at least the data for 17 days around the Reporting_Date to perform my analysis. So, for instance, for the Company_ID 1003 and 1004, there is not enough dates (only 14 days instead of the needed 17 days for Company_ID 1003 and only 05 days instead of the needed 17 days for Company_ID 1004). Therefore, I want to drop this observation from my analysis and create a separate list of dropped observations, containing the Company_ID and Reporting_Date as column heads as follows:
Company_ID | Reporting_Date |
1003 | 2021-01-07 |
1004 | 2021-01-04 |
Please note that the days in the Estimation_Window and Gap cannot be simply calculated by considering them as continuous calendar days. As you will notice, these are trading days and the holidays are not included in the table.
Thank you so much for your time!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another solution using a hash object:
data work.Enhanced;
set work.have;
obsnum = _n_;
run;
data
work.want(keep= Company_ID Reporting_Date Before_Event_Date At_Event_Date After_Event_Date)
work.NotEnoughData(keep= Company_ID Reporting_Date)
;
set work.Enhanced;
where not missing(Reporting_Date);
length
Average Reporting_Return 8
Before_Event_Date At_Event_Date After_Event_Date 8
;
if _n_ = 1 then do;
declare hash h(dataset: 'work.enhanced');
h.defineKey('obsnum', 'company_id');
h.defineData('Return');
h.defineDone();
end;
reporting_return = return;
average = 0;
/* check if data for the estimation windows is available */
start_before = obsnum - 8;
end_before = obsnum - 4;
before_ok = (h.check(key: start_before, key: company_id) = 0)
and (h.check(key: end_before, key: company_id) = 0);
start_after = obsnum + 4;
end_after = obsnum + 8;
after_ok = (h.check(key: start_after, key: company_id) = 0)
and (h.check(key: end_after, key: company_id) = 0);
if before_ok and after_ok then do;
do i = start_before to end_before;
rc = h.find(key: i, key: company_id);
average = average + return;
end;
do i = start_after to end_after;
rc = h.find(key: i, key: company_id);
average = average + return;
end;
average = average / 10;
At_Event_Date = reporting_return / average;
rc = h.find(key: obsnum-1, key: company_id);
Before_Event_Date = Return / average;
rc = h.find(key: obsnum+1, key: company_id);
After_Event_Date = Return / average;
output work.want;
end;
else do;
output work.NotEnoughData;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
this should do the trick:
LIBNAME in "C:\Temp" access=readonly;
DATA h01;
length month 8;
SET in.have;
month=month(date);
RUN;
PROC SORT data=h01 out=h02; BY company_id month date; RUN;
LIBNAME in CLEAR;
*Add row number;
data h03;
length id 8;
set h02;
BY company_id month date;
if first.month then id=0;
id+1;
run;
PROC SQL;
*Merge reporting_date to all rows by company and month;
CREATE TABLE h04 AS
SELECT a.id,b.id as bid,a.company_id,month(a.date) as month, a.date,a.return
,b.reporting_date,b.event_window,b.gap,b.estimation_window,b.date as date0
FROM h03 a
LEFT JOIN h03(where=(not missing(reporting_date))) b
ON a.company_id eq b.company_id
AND month(a.date) eq month(b.date)
ORDER BY a.company_id,calculated month,a.date
;
*Check single assignment of reporting_date by company and month;
CREATE TABLE check0 AS
SELECT DISTINCT company_id,month,reporting_date
FROM h04
ORDER BY company_id,month,reporting_date
;
QUIT;
DATA _NULL_;
set check0;
BY company_id month reporting_date;
if not first.month and last.month then put 'E' 'RROR: several reporting dates by company and month found' company_id= month= reporting_date=;
RUN;
*clean up;
PROC DATASETS lib=work nolist; delete check0; RUN;QUIT;
PROC FORMAT;
value period
1='Estimation window'
2='Gap'
3.1='Event date -1'
3.2='Event date'
3.3='Event date +1'
;
RUN;
DATA h05;
SET h04;
length period rangelo rangehi 8;
*Determine period for Estimation window;
rangelo=bid-1-gap-estimation_window;
rangehi=bid+1+gap+estimation_window;
if rangelo<=id<=rangehi then period=1;
*Determine period for Gap;
rangelo=bid-1-gap;
rangehi=bid+1+gap;
if rangelo<=id<=rangehi then period=2;
*Determine period for Event;
rangelo=bid-1;
rangehi=bid+1;
if rangelo=id then period=3.1;
else if id=bid then period=3.2;
else if rangehi=id then period=3.3;
drop range:;
format period period.;
RUN;
PROC SQL;
*Calculate avg;
CREATE TABLE h06_avg_ew AS
SELECT company_id,month,period,avg(return) as avg_ew
FROM h05
WHERE period eq 1
GROUP BY company_id,month,period
;
*Merge avg to corresponding event period;
CREATE TABLE h07 AS
SELECT a.*,b.avg_ew
FROM h05 a
LEFT JOIN h06_avg_ew b
ON a.company_id eq b.company_id
AND a.month eq b.month
AND floor(a.period) eq 3
ORDER BY company_id,month,date
;
QUIT;
*Calculate results;
DATA h08;
set h07;
where floor(period) eq 3;
by company_id month date;
length result 8 periodc $50;
if not missing(avg_ew) then result=return/avg_ew;
periodc=vvalue(period);
keep company_id month date0 period: result;
RUN;
*Transpose name and label variables;
PROC TRANSPOSE data=h08 out=h09(drop=_name_) prefix=period ;
by company_id month date0 ;
var result;
id period;
idlabel periodc;
format period best.;
RUN;
DATA want;
set h09;
rename date0=date;
keep company_id date0 period:;
RUN;
PROC DATASETS lib=work nolist; delete h0:; RUN;QUIT;
- Cheers -
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello Oligolas,
Thank you so much for your time and effort.
Unfortunately, when I am trying to run the program, I have receiving the following output:
As you will notice, the result for Company_ID 1001 on 09FEB2021 is not as desired.
Also, we are receiving results for Company_ID 1003 and 1004 in our output.
Is there any way to further fix this issue?
I thank you again for your kind support!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have got the code for my problem, thanks to Andreas.
But I would like to thank you again for your time and effort.
Really appreciate your work!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@mmh oh yes, I see in february you also take dates from January in order to determine the average of the estimation window, whereas I consider only the values in each month.
- Cheers -
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
And this might come in handy for me in the future.
Thanks again!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another solution using a hash object:
data work.Enhanced;
set work.have;
obsnum = _n_;
run;
data
work.want(keep= Company_ID Reporting_Date Before_Event_Date At_Event_Date After_Event_Date)
work.NotEnoughData(keep= Company_ID Reporting_Date)
;
set work.Enhanced;
where not missing(Reporting_Date);
length
Average Reporting_Return 8
Before_Event_Date At_Event_Date After_Event_Date 8
;
if _n_ = 1 then do;
declare hash h(dataset: 'work.enhanced');
h.defineKey('obsnum', 'company_id');
h.defineData('Return');
h.defineDone();
end;
reporting_return = return;
average = 0;
/* check if data for the estimation windows is available */
start_before = obsnum - 8;
end_before = obsnum - 4;
before_ok = (h.check(key: start_before, key: company_id) = 0)
and (h.check(key: end_before, key: company_id) = 0);
start_after = obsnum + 4;
end_after = obsnum + 8;
after_ok = (h.check(key: start_after, key: company_id) = 0)
and (h.check(key: end_after, key: company_id) = 0);
if before_ok and after_ok then do;
do i = start_before to end_before;
rc = h.find(key: i, key: company_id);
average = average + return;
end;
do i = start_after to end_after;
rc = h.find(key: i, key: company_id);
average = average + return;
end;
average = average / 10;
At_Event_Date = reporting_return / average;
rc = h.find(key: obsnum-1, key: company_id);
Before_Event_Date = Return / average;
rc = h.find(key: obsnum+1, key: company_id);
After_Event_Date = Return / average;
output work.want;
end;
else do;
output work.NotEnoughData;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I do not know how to thank you!
The solution works perfectly fine 🙂
Do you mind if I send you a quick text if sometimes later I try to modify my event estimations and run in to any problems?
Again, I am really grateful for your kind support!
All the best!