Below I have trade-level data. Each row represents the date in which an insider made a transaction. I am trying to determine the number of trades made by an insider within a 3 year interval before the transaction date.
In the example below we can see Bowen made a number of trades in multiple years.
In 2004, he would have made 0 trades in the preceding 3 years (2001/2002/2003).
In 2007, he would have only made 1 trade in the preceding 3 years (2004/2005/2006).
I am trying to generate this information for all my observations.
I would like the frequency of trades within the preceding 3 years to be another column in my dataset because I am trying to eliminate observations in which there was less than 3 trades in the preceding 3 years.
Thank you!
See the annotated code:
/* use a data step with datalines to present data */
data have;
input @1 trans_year @7 trans_date :yymmdd8. @17 name $30.;
format trans_date yymmdd10.;
datalines;
2004 20041108 BOWEN KEVIN C
2007 20070618 BOWEN KEVIN C
2007 20070919 BOWEN KEVIN C
2007 20071116 BOWEN KEVIN C
2007 20071121 BOWEN KEVIN C
2007 20071126 BOWEN KEVIN C
2008 20081129 BOWEN KEVIN C
;
/* determine the size of the later array */
proc sql noprint;
select min(trans_year), max(trans_year) into :minyear, :maxyear
from have;
quit;
data want;
set have;
by name trans_date; /* trans_date because we want to force correct chronological order */
/* make the array start 3 years before any data, so we don't have wrong indexes later */
array years{%eval(&minyear. - 3):&maxyear.} _temporary_;
if first.name then call missing(of years{*});
years{trans_year} + 1;
prior_trans = sum(
0, /* 0 because we want to have zero even if all array members are missing */
years{trans_year - 3},
years{trans_year - 2},
years{trans_year - 1}
);
run;
Note how data is presented in a piece of code, so everybody can recreate the dataset by simply copying the code and running it, without having to make any guesses about attributes or content.
Creating data on the fly with a data step is a useful SAS skill on its own, I recommend that you acquire that skill for your own toolbox.
Can you provide that "example" as text? Or better yet, provide some example data in the form of data step code? It is awful hard to code against a picture.
Second, are your "dates" actually SAS date values? That would be numeric variable with a yymmn8. format from what you show. If not the first thing will be to create such so the concept of "with in time interval" makes sense.
It might also help to show how you expect the result to appear. Not quite clear what you want.
Hey ballardw,
Apologies for the initial post, this is my first time posting on this forum.
Does this help?
And yes the dates are actually SAS date values.
1 | 2004 | 20041108 | BOWEN KEVIN C |
2 | 2007 | 20070618 | BOWEN KEVIN C |
3 | 2007 | 20070919 | BOWEN KEVIN C |
4 | 2007 | 20071116 | BOWEN KEVIN C |
5 | 2007 | 20071121 | BOWEN KEVIN C |
6 | 2007 | 20071126 | BOWEN KEVIN C |
7 | 2008 | 20081129 | BOWEN KEVIN C |
As for expected results, this would be what I am looking for. The right column represents the number of previous trades in the preceding 3 years from the transaction date
1 | 2004 | 20041108 | BOWEN KEVIN C | 0 |
2 | 2007 | 20070618 | BOWEN KEVIN C | 1 |
3 | 2007 | 20070919 | BOWEN KEVIN C | 1 |
4 | 2007 | 20071116 | BOWEN KEVIN C | 1 |
5 | 2007 | 20071121 | BOWEN KEVIN C | 1 |
6 | 2007 | 20071126 | BOWEN KEVIN C | 1 |
7 | 2008 | 20081129 | BOWEN KEVIN C | 5 |
You really want the count of the preceding 3 calendar years, not the preceding rolling 36 months?
data want;
set have;
by name notsorted;
array tr_year{2000:2020} _temporary_;
if first.name then call missing(of tr_year{*});
tr_year{year}+1;
n_prior3=sum(0,tr_year{year-3},tr_year{year-2},tr_year{year-1});
run;
See the annotated code:
/* use a data step with datalines to present data */
data have;
input @1 trans_year @7 trans_date :yymmdd8. @17 name $30.;
format trans_date yymmdd10.;
datalines;
2004 20041108 BOWEN KEVIN C
2007 20070618 BOWEN KEVIN C
2007 20070919 BOWEN KEVIN C
2007 20071116 BOWEN KEVIN C
2007 20071121 BOWEN KEVIN C
2007 20071126 BOWEN KEVIN C
2008 20081129 BOWEN KEVIN C
;
/* determine the size of the later array */
proc sql noprint;
select min(trans_year), max(trans_year) into :minyear, :maxyear
from have;
quit;
data want;
set have;
by name trans_date; /* trans_date because we want to force correct chronological order */
/* make the array start 3 years before any data, so we don't have wrong indexes later */
array years{%eval(&minyear. - 3):&maxyear.} _temporary_;
if first.name then call missing(of years{*});
years{trans_year} + 1;
prior_trans = sum(
0, /* 0 because we want to have zero even if all array members are missing */
years{trans_year - 3},
years{trans_year - 2},
years{trans_year - 1}
);
run;
Note how data is presented in a piece of code, so everybody can recreate the dataset by simply copying the code and running it, without having to make any guesses about attributes or content.
Creating data on the fly with a data step is a useful SAS skill on its own, I recommend that you acquire that skill for your own toolbox.
Here is another solution, using proc sql:
data have;
input @1 trans_year @7 trans_date :yymmdd8. @17 name $30.;
format trans_date yymmdd10.;
datalines;
2004 20041108 BOWEN KEVIN C
2007 20070618 BOWEN KEVIN C
2007 20070919 BOWEN KEVIN C
2007 20071116 BOWEN KEVIN C
2007 20071121 BOWEN KEVIN C
2007 20071126 BOWEN KEVIN C
2008 20081129 BOWEN KEVIN C
;
proc sql;
create table mix as
select a.name
,a.trans_date
,a.trans_year
,b.trans_date as prev_trans_date
,case b.trans_date
when . then 0
else 1
end as cnt
from have a
left join have b
on (a.name = b.name and
(b.trans_year < a.trans_year
and a.trans_year - b.trans_year <= 3)
or b.trans_date = .)
order by a.name, a.trans_date, b.trans_date
;
create table want as
select name
,trans_date
,sum(cnt) as cnt_trans
from mix
group by name, trans_date
;
quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: