BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mistletoad
Calcite | Level 5

Below I have trade-level data. Each row represents the date in which an insider made a transaction. I am trying to determine the number of trades made by an insider within a 3 year interval before the transaction date. 

In the example below we can see Bowen made a number of trades in multiple years. 

In 2004, he would have made 0 trades in the preceding 3 years (2001/2002/2003).

In 2007, he would have only made 1 trade in the preceding 3 years (2004/2005/2006).

I am trying to generate this information for all my observations.

I would like the frequency of trades within the preceding 3 years to be another column in my dataset because I am trying to eliminate observations in which there was less than 3 trades in the preceding 3 years.

 

Thank you!

Mistletoad_0-1621542489468.png

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

See the annotated code:

/* use a data step with datalines to present data */
data have;
input @1 trans_year @7 trans_date :yymmdd8. @17 name $30.;
format trans_date yymmdd10.;
datalines;
2004  20041108  BOWEN KEVIN C  
2007  20070618  BOWEN KEVIN C  
2007  20070919  BOWEN KEVIN C  
2007  20071116  BOWEN KEVIN C  
2007  20071121  BOWEN KEVIN C  
2007  20071126  BOWEN KEVIN C  
2008  20081129  BOWEN KEVIN C  
;

/* determine the size of the later array */
proc sql noprint;
select min(trans_year), max(trans_year) into :minyear, :maxyear
from have;
quit;

data want;
set have;
by name trans_date; /* trans_date because we want to force correct chronological order */
/* make the array start 3 years before any data, so we don't have wrong indexes later */
array years{%eval(&minyear. - 3):&maxyear.} _temporary_;
if first.name then call missing(of years{*});
years{trans_year} + 1;
prior_trans = sum(
  0, /* 0 because we want to have zero even if all array members are missing */
  years{trans_year - 3},
  years{trans_year - 2},
  years{trans_year - 1}
);
run;

Note how data is presented in a piece of code, so everybody can recreate the dataset by simply copying the code and running it, without having to make any guesses about attributes or content.

Creating data on the fly with a data step is a useful SAS skill on its own, I recommend that you acquire that skill for your own toolbox.

View solution in original post

5 REPLIES 5
ballardw
Super User

Can you provide that "example" as text? Or better yet, provide some example data in the form of data step code? It is awful hard to code against a picture.

 

Second, are your "dates" actually SAS date values? That would be numeric variable with a yymmn8. format from what you show. If not the first thing will be to create such so the concept of "with in time interval" makes sense.

 

It might also help to show how you expect the result to appear. Not quite clear what you want.

Mistletoad
Calcite | Level 5

Hey ballardw,

Apologies for the initial post, this is my first time posting on this forum.

Does this help?

And yes the dates are actually SAS date values.

 

 
1200420041108BOWEN KEVIN C 
2200720070618BOWEN KEVIN C 
3200720070919BOWEN KEVIN C 
4200720071116BOWEN KEVIN C 
5200720071121BOWEN KEVIN C 
6200720071126BOWEN KEVIN C 
7200820081129BOWEN KEVIN C 

 

As for expected results, this would be what I am looking for. The right column represents the number of previous trades in the preceding 3 years from the transaction date

 

 
1200420041108BOWEN KEVIN C0
2200720070618BOWEN KEVIN C1
3200720070919BOWEN KEVIN C1
4200720071116BOWEN KEVIN C1
5200720071121BOWEN KEVIN C1
6200720071126BOWEN KEVIN C1
7200820081129BOWEN KEVIN C5
    

 

mkeintz
PROC Star

You really want the count of the preceding 3 calendar years, not the preceding rolling 36 months?

 

data want;
  set have;
  by name notsorted;

  array tr_year{2000:2020} _temporary_;
  if first.name then call missing(of tr_year{*});
  tr_year{year}+1;
  n_prior3=sum(0,tr_year{year-3},tr_year{year-2},tr_year{year-1});
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User

See the annotated code:

/* use a data step with datalines to present data */
data have;
input @1 trans_year @7 trans_date :yymmdd8. @17 name $30.;
format trans_date yymmdd10.;
datalines;
2004  20041108  BOWEN KEVIN C  
2007  20070618  BOWEN KEVIN C  
2007  20070919  BOWEN KEVIN C  
2007  20071116  BOWEN KEVIN C  
2007  20071121  BOWEN KEVIN C  
2007  20071126  BOWEN KEVIN C  
2008  20081129  BOWEN KEVIN C  
;

/* determine the size of the later array */
proc sql noprint;
select min(trans_year), max(trans_year) into :minyear, :maxyear
from have;
quit;

data want;
set have;
by name trans_date; /* trans_date because we want to force correct chronological order */
/* make the array start 3 years before any data, so we don't have wrong indexes later */
array years{%eval(&minyear. - 3):&maxyear.} _temporary_;
if first.name then call missing(of years{*});
years{trans_year} + 1;
prior_trans = sum(
  0, /* 0 because we want to have zero even if all array members are missing */
  years{trans_year - 3},
  years{trans_year - 2},
  years{trans_year - 1}
);
run;

Note how data is presented in a piece of code, so everybody can recreate the dataset by simply copying the code and running it, without having to make any guesses about attributes or content.

Creating data on the fly with a data step is a useful SAS skill on its own, I recommend that you acquire that skill for your own toolbox.

KlausBücher
Fluorite | Level 6

Here is another solution, using proc sql:

 

data have;
input @1 trans_year @7 trans_date :yymmdd8. @17 name $30.;
format trans_date yymmdd10.;
datalines;
2004 20041108 BOWEN KEVIN C
2007 20070618 BOWEN KEVIN C
2007 20070919 BOWEN KEVIN C
2007 20071116 BOWEN KEVIN C
2007 20071121 BOWEN KEVIN C
2007 20071126 BOWEN KEVIN C
2008 20081129 BOWEN KEVIN C
;

proc sql;
create table mix as
select a.name
,a.trans_date
,a.trans_year
,b.trans_date as prev_trans_date
,case b.trans_date
when . then 0
else 1
end as cnt
from have a
left join have b
on (a.name = b.name and
(b.trans_year < a.trans_year
and a.trans_year - b.trans_year <= 3)
or b.trans_date = .)
order by a.name, a.trans_date, b.trans_date
;
create table want as
select name
,trans_date
,sum(cnt) as cnt_trans
from mix
group by name, trans_date
;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 863 views
  • 1 like
  • 5 in conversation