Hello I'm working on some project and need to count number of events between any of two event dates for each firm. I have data look like this:
event date | firm cusip | number of event +/- 183 days | number of events +/-365 days | number of event +/- 548 days |
1991/2/21 | 21 | |||
1992/3/24 | 21 | |||
1994/3/7 | 361 | |||
1997/6/10 | 361 | |||
1998/9/9 | 361 | |||
2000/1/11 | 361 | |||
2009/6/15 | 361 | |||
1996/8/2 | 375 | |||
1997/1/23 | 375 | |||
1998/3/24 | 375 | |||
1999/7/21 | 375 | |||
1999/8/26 | 375 | |||
1999/11/28 | 375 | |||
2000/1/13 | 375 | |||
2000/5/8 | 375 | |||
2000/6/15 | 375 | |||
2000/6/30 | 375 | |||
2000/7/4 | 375 | |||
2000/7/6 | 375 | |||
2000/9/25 | 375 | |||
2000/10/5 | 375 | |||
2000/10/16 | 375 | |||
2001/2/19 | 375 | |||
2004/4/1 | 375 | |||
2005/6/22 | 375 | |||
2005/12/20 | 375 | |||
2007/6/14 | 375 | |||
2008/4/16 | 375 | |||
2008/8/22 | 375 | |||
2008/10/23 | 375 | |||
2009/3/31 | 375 | |||
2009/12/10 | 375 | |||
2011/6/12 | 375 | |||
2011/8/1 | 375 | |||
2012/3/6 | 375 | |||
2013/5/7 | 375 |
For column named number of event +/- , I need to count, for the same firm, number of event dates within the range of [-183,+183], [-365,+365],[-548,+548] relative to the event date on the same row. So for example, for firm with cusip=21, there are 397 days between two event date, 1991/2/21 and 1992/3/24, so in the first row, there are 0 event within the range of [-183,+183] relative to 1991/2/21, but there are 1 event within the range of [-365,+365], and 1 event within the range of [-548,+548]. In the second row, I need to count the number of event for the same firm within these three ranges relative to the date of 1992/3/24. I need to do the same for three firms. This is the first time I do this kind of counting staff and I tried several method but they do not work on all of the firms. Please help!!
Logan
data have ;
length firm event 8;
informat event yymmdd10.;
format event yymmdd10.;
input event firm @@;
cards;
1991/2/21 21 1992/3/24 21
1994/3/7 361 1997/6/10 361 1998/9/9 361 2000/1/11 361 2009/6/15 361
1996/8/2 375 1997/1/23 375 1998/3/24 375 1999/7/21 375 1999/8/26 375
1999/11/28 375 2000/1/13 375 2000/5/8 375 2000/6/15 375 2000/6/30 375
2000/7/4 375 2000/7/6 375 2000/9/25 375 2000/10/5 375 2000/10/16 375
2001/2/19 375 2004/4/1 375 2005/6/22 375 2005/12/20 375 2007/6/14 375
2008/4/16 375 2008/8/22 375 2008/10/23 375 2009/3/31 375
2009/12/10 375 2011/6/12 375 2011/8/1 375 2012/3/6 375 2013/5/7 375
;;;;
proc sql noprint ;
create table want as
select a.firm
, a.event
, sum( case when abs(a.event - b.event) <= 183 then 1 else 0 end) as n183
, sum( case when abs(a.event - b.event) <= 365 then 1 else 0 end) as n365
, sum( case when abs(a.event - b.event) <= 548 then 1 else 0 end) as n548
from have a left join have b
on a.firm = b.firm and a.event ne b.event
group by 1,2
order by 1,2
;
quit;
proc print width=min;
run;
firm event n183 n365 n548
---- ---------- ---- ---- ----
21 1991-02-21 0 0 1
21 1992-03-24 0 0 1
361 1994-03-07 0 0 0
361 1997-06-10 0 0 1
361 1998-09-09 0 0 2
361 2000-01-11 0 0 1
361 2009-06-15 0 0 0
375 1996-08-02 1 1 1
375 1997-01-23 1 1 2
375 1998-03-24 0 0 3
375 1999-07-21 3 8 12
375 1999-08-26 3 8 13
375 1999-11-28 4 11 12
375 2000-01-13 8 11 12
375 2000-05-08 9 12 12
375 2000-06-15 8 12 12
375 2000-06-30 8 12 12
375 2000-07-04 8 12 12
375 2000-07-06 8 12 12
375 2000-09-25 8 10 12
375 2000-10-05 8 10 12
375 2000-10-16 8 10 12
375 2001-02-19 3 8 11
375 2004-04-01 0 0 1
375 2005-06-22 1 1 2
375 2005-12-20 1 1 2
375 2007-06-14 0 1 4
375 2008-04-16 1 4 4
375 2008-08-22 2 3 5
375 2008-10-23 2 3 5
375 2009-03-31 1 4 4
375 2009-12-10 0 1 3
375 2011-06-12 1 2 2
375 2011-08-01 1 2 2
375 2012-03-06 0 2 3
375 2013-05-07 0 0 1
proc sort data=have;
by firm_cusip event_date;
run;
data want;
set have;
by firm_cusip event_date;
events=event_date-lag(event_date);
if first.firm_cusip then events=0;
if events>=-183 and events<=183 then numer_of_events_183=1;
if events>=-397 and events<=397 then numer_of_events_397=1;
if events>=-548 and events<=548 then numer_of_events_548=1;
run;
Thanks for your reply,but it seems doesn't work. For example, for the first event date, 21FEB1991, results show that number of events within 183 days is 1, however, it should be 0 because the time interval between this event date and next one is 397, so there is no event between 21FEB1991 and 21FEB1991+183 days. Same for 365 range. Please help!
data have ;
length firm event 8;
informat event yymmdd10.;
format event yymmdd10.;
input event firm @@;
cards;
1991/2/21 21 1992/3/24 21
1994/3/7 361 1997/6/10 361 1998/9/9 361 2000/1/11 361 2009/6/15 361
1996/8/2 375 1997/1/23 375 1998/3/24 375 1999/7/21 375 1999/8/26 375
1999/11/28 375 2000/1/13 375 2000/5/8 375 2000/6/15 375 2000/6/30 375
2000/7/4 375 2000/7/6 375 2000/9/25 375 2000/10/5 375 2000/10/16 375
2001/2/19 375 2004/4/1 375 2005/6/22 375 2005/12/20 375 2007/6/14 375
2008/4/16 375 2008/8/22 375 2008/10/23 375 2009/3/31 375
2009/12/10 375 2011/6/12 375 2011/8/1 375 2012/3/6 375 2013/5/7 375
;;;;
proc sql noprint ;
create table want as
select a.firm
, a.event
, sum( case when abs(a.event - b.event) <= 183 then 1 else 0 end) as n183
, sum( case when abs(a.event - b.event) <= 365 then 1 else 0 end) as n365
, sum( case when abs(a.event - b.event) <= 548 then 1 else 0 end) as n548
from have a left join have b
on a.firm = b.firm and a.event ne b.event
group by 1,2
order by 1,2
;
quit;
proc print width=min;
run;
firm event n183 n365 n548
---- ---------- ---- ---- ----
21 1991-02-21 0 0 1
21 1992-03-24 0 0 1
361 1994-03-07 0 0 0
361 1997-06-10 0 0 1
361 1998-09-09 0 0 2
361 2000-01-11 0 0 1
361 2009-06-15 0 0 0
375 1996-08-02 1 1 1
375 1997-01-23 1 1 2
375 1998-03-24 0 0 3
375 1999-07-21 3 8 12
375 1999-08-26 3 8 13
375 1999-11-28 4 11 12
375 2000-01-13 8 11 12
375 2000-05-08 9 12 12
375 2000-06-15 8 12 12
375 2000-06-30 8 12 12
375 2000-07-04 8 12 12
375 2000-07-06 8 12 12
375 2000-09-25 8 10 12
375 2000-10-05 8 10 12
375 2000-10-16 8 10 12
375 2001-02-19 3 8 11
375 2004-04-01 0 0 1
375 2005-06-22 1 1 2
375 2005-12-20 1 1 2
375 2007-06-14 0 1 4
375 2008-04-16 1 4 4
375 2008-08-22 2 3 5
375 2008-10-23 2 3 5
375 2009-03-31 1 4 4
375 2009-12-10 0 1 3
375 2011-06-12 1 2 2
375 2011-08-01 1 2 2
375 2012-03-06 0 2 3
375 2013-05-07 0 0 1
Hello Tom, thanks for your help and it works for most of the deals. However, it does not work so well if there are two or more events happened on the same day. The program will keep only one of the event if multiple event happened on the same day. If I have an ID variable used to identify different event happened on the same day, can this problem be solved. Thank you!
So why don't you just make an ID variable?
data want ; set have; id+1; run;
Hi AJ,
I don't understand how you are getting "1 event within the range of [-365,+365], and 1 event within the range of [-548,+548]" for firm cusip 21?
My results replicate presented in Tom's post.
Regards,
Scott
You are correct Scott, my mistake
Hello Scott, the program Tom write works well only if there is one event on the event date. The program will keep only one of the event if multiple event happened on the same day. If I have an ID variable used to identify different event happened on the same day, can this problem be solved. Thank you!
data have ; length firm event 8; informat event yymmdd10.; format event yymmdd10.; input event firm @@; cards; 1991/2/21 21 1992/3/24 21 1994/3/7 361 1997/6/10 361 1998/9/9 361 2000/1/11 361 2009/6/15 361 1996/8/2 375 1997/1/23 375 1998/3/24 375 1999/7/21 375 1999/8/26 375 1999/11/28 375 2000/1/13 375 2000/5/8 375 2000/6/15 375 2000/6/30 375 2000/7/4 375 2000/7/6 375 2000/9/25 375 2000/10/5 375 2000/10/16 375 2001/2/19 375 2004/4/1 375 2005/6/22 375 2005/12/20 375 2007/6/14 375 2008/4/16 375 2008/8/22 375 2008/10/23 375 2009/3/31 375 2009/12/10 375 2011/6/12 375 2011/8/1 375 2012/3/6 375 2013/5/7 375 ;;;; proc sql ; create table want as select a.firm , a.event ,(select count(*) from have where event between a.event and a.event+183) as n183 from have a; quit;
Xia Keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.