BookmarkSubscribeRSS Feed
Neal3321
Fluorite | Level 6

Hello all, 

 

I am trying to identify traders who place transactions in the same month in each of three consecutive years in one company. Once a trader meets the criteria, these three transactions and all his subsequent transactions in that same month in that company should be identified.

Assume I have a sample data below.

data have;
input ID  STOCK   trandate $12.;
datalines;
1    1   10/15/2009
1    1   01/01/2010
1    1   01/10/2011
1    1   01/15/2012
1    1   01/01/2013
1    2   01/30/2011
1    2   01/30/2012
1    2   01/30/2012
1    2   01/30/2013
1    2   01/30/2014
1    2   01/30/2015
2    1   01/20/2010
2    1   01/15/2011
2    1   01/16/2012
2    1   02/01/2013
2    2   02/01/2010
2    2   02/10/2011
2    2   02/10/2012
2    2   02/10/2013
2    2   02/10/2014
2    2   01/10/2015
; run;

What I need:

 

ID Stock  trandate type
1    1   10/15/2009 0
1    1   01/01/2010 1
1    1   01/10/2011 1
1    1   01/15/2012 1
1    1   01/01/2013 1
1    2   01/30/2011 1
1    2   01/30/2012 1
1    2   01/30/2012 1
1    2   01/30/2013 1
1    2   01/30/2014 1
1    2   01/30/2015 1
2    1   01/20/2010 0
2    1   01/15/2011 0
2    1   01/16/2012 0
2    1   02/01/2013 0
2    2   02/01/2010 1
2    2   02/10/2011 1
2    2   02/10/2012 1
2    2   02/10/2013 1
2    2   02/10/2014 1
2    2   01/10/2015 0

I attach the code I use to make the classification. The code is very long and I do not think it is efficient. Could you help me out?

 
6 REPLIES 6
Astounding
PROC Star

Why aren't you trying to flag ID=2, STOCK=1 observations?  It looks like those meet the criteria you set up for same month, consecutive years.

mkeintz
PROC Star

@Neal3321 wrote:

Hello all, 

 

I am trying to identify traders who place transactions in the same month in each of three consecutive years in one company. Once a trader meets the criteria, these three transactions and all his subsequent transactions in that same month in that company should be identified.

Assume I have a sample data below.

data have;
input ID  STOCK   trandate $12.;
datalines;
1    1   10/15/2009
1    1   01/01/2010
1    1   01/10/2011
1    1   01/15/2012
1    1   01/01/2013
1    2   01/30/2011
1    2   01/30/2012
1    2   01/30/2012
1    2   01/30/2013
1    2   01/30/2014
1    2   01/30/2015
2    1   01/20/2010
2    1   01/15/2011
2    1   01/16/2012
2    1   02/01/2013
2    2   02/01/2010
2    2   02/10/2011
2    2   02/10/2012
2    2   02/10/2013
2    2   02/10/2014
2    2   01/10/2015
; run;

What I need:

 

ID Stock  trandate type
1    1   10/15/2009 0
1    1   01/01/2010 1
1    1   01/10/2011 1
1    1   01/15/2012 1
1    1   01/01/2013 1
1    2   01/30/2011 1
1    2   01/30/2012 1
1    2   01/30/2012 1
1    2   01/30/2013 1
1    2   01/30/2014 1
1    2   01/30/2015 1
2    1   01/20/2010 0
2    1   01/15/2011 0
2    1   01/16/2012 0
2    1   02/01/2013 0
2    2   02/01/2010 1
2    2   02/10/2011 1
2    2   02/10/2012 1
2    2   02/10/2013 1
2    2   02/10/2014 1
2    2   01/10/2015 0

I attach the code I use to make the classification. The code is very long and I do not think it is efficient. Could you help me out?


 

As observed by @Astounding , you may have overlooked the succession of January trades for ID=1, stock=1.

 

Here is a program the meets your objectives:

 

data have;
input ID  STOCK   trandate :mmddyy10.;
format trandate date9.;
datalines;
1    1   10/15/2009
1    1   01/01/2010
1    1   01/10/2011
1    1   01/15/2012
1    1   01/01/2013
1    2   01/30/2011
1    2   01/30/2012
1    2   01/30/2012
1    2   01/30/2013
1    2   01/30/2014
1    2   01/30/2015
2    1   01/20/2010
2    1   01/15/2011
2    1   01/16/2012
2    1   02/01/2013
2    2   02/01/2010
2    2   02/10/2011
2    2   02/10/2012
2    2   02/10/2013
2    2   02/10/2014
2    2   01/10/2015
;

data want (drop=_:);
  array trades {2008:2015,12} _temporary_;
  array type1_start_year {1:12} _temporary_;

  call missing(of type1_start_year{*});
  call missing(of trades{*});

  /* Read a set of transactions for one stock, ...
     ... and fill in trade dummies by year and month */
  do until (last.stock);
    set have;
    by id stock;
    trades{year(trandate),month(trandate)}=1;
  end;

  /* Look at the 2-way dummy array for 3 consecutive 1's in ...
     ... any month column, and identify earliest qualifying year*/
  do _y=2008 to 2013;
    do _m=1 to 12;
      if n(trades{_y,_m},trades{_y+1,_m},trades{_y+2,_m})=3 then type1_start_year{_m}=min(_y,type1_start_year{_m});
    end;
  end;

  /* Reread the same transactions, and for qualifying months, 
     compare the transaction year to the earliest qualifying year */
  do until (last.stock);
    set have;
    by id stock;
    if .<type1_start_year{month(trandate)}<=year(trandate) then type=1;
    else type=0;
    output;
  end;
run;

Notes that it reads data in clusters of ID/STOCK, tracking year and month of trades in a 2-way array named TRADES.  At the end of the cluster, it examines every column (i.e. month) of TRADES looking for a sequence of 3 ones in 3 successive rows (years).  If such a sequence is found then the one-way array TYPE1_START_YEAR has a starting year set in the element corresponding to the qualifying month.

 

Then the data are re-read, and after examining the TRANDATE for qualifying month and year, is output.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PGStats
Opal | Level 21

First step: make tranDate a proper SAS date. Then use function intnx to move the dates by month intervals. Here is one way to do this:

 

data have;
input ID STOCK trandate :mmddyy10.;
format tranDate yymmdd10.;
datalines;
1    1   10/15/2009
1    1   01/01/2010
1    1   01/10/2011
1    1   01/15/2012
1    1   01/01/2013
1    2   01/30/2011
1    2   01/30/2012
1    2   01/30/2012
1    2   01/30/2013
1    2   01/30/2014
1    2   01/30/2015
2    1   01/20/2010
2    1   01/15/2011
2    1   01/16/2012
2    1   02/01/2013
2    2   02/01/2010
2    2   02/10/2011
2    2   02/10/2012
2    2   02/10/2013
2    2   02/10/2014
2    2   01/10/2015
;

proc sql;
create table firsts as
select unique
	a.ID, a.STOCK,
	intnx("month", a.tranDate, 0) as firstMonth format=mmyy.
from have as a
where 
	exists (select * from have where ID=a.ID and STOCK=a.STOCK and 
		intnx("month", a.tranDate, 12) = intnx("month", tranDate, 0) ) and
	exists (select * from have where ID=a.ID and STOCK=a.STOCK and 
		intnx("month", a.tranDate, 24) = intnx("month", tranDate, 0) ) and
	exists (select * from have where ID=a.ID and STOCK=a.STOCK and 
		intnx("month", a.tranDate, 36) = intnx("month", tranDate, 0) );
quit;
		
data allmonths;
set firsts;
output;
do i = 1 to 3;
	firstMonth = intnx("month", firstMonth, 12);
	output;
	end;
drop i;
run;

proc sort data=allMonths nodupkey; by id stock firstMonth; run;
		
proc sql;
create table want as
select 
	a.*,
	b.id is not missing as type
from 
	have as a left join 
	allMonths as b on a.id=b.id and a.stock=b.stock and
		intnx("month", a.tranDate, 0) = b.firstMonth
order by id, stock, tranDate;
quit;
PG
Ksharp
Super User
data have;
input ID  STOCK   trandate : mmddyy12.;
format trandate mmddyy10.;
datalines;
1    1   10/15/2009
1    1   01/01/2010
1    1   01/10/2011
1    1   01/15/2012
1    1   01/01/2013
1    2   01/30/2011
1    2   01/30/2012
1    2   01/30/2012
1    2   01/30/2013
1    2   01/30/2014
1    2   01/30/2015
2    1   01/20/2010
2    1   01/15/2011
2    1   01/16/2012
2    1   02/01/2013
2    2   02/01/2010
2    2   02/10/2011
2    2   02/10/2012
2    2   02/10/2013
2    2   02/10/2014
2    2   01/10/2015
;
run;
proc sort data=have;
by id stock trandate;
run;
data temp;
 set have;
 by id stock;
 year=year(trandate);
 month=month(trandate);
 if first.stock or dif(year) >1 then group+1;
run;
data temp1;
 set temp;
 by group month notsorted;
 group1+first.month;
run;
proc sql;
create table want as
select id,stock,trandate,ifn(count(*)>3,1,0) as type
 from temp1
  group by group1
   order by 1,2,3;
quit;
Neal3321
Fluorite | Level 6

Hello all,

 

Thank you for your suggestions. I did not flag ID2, Stock1 observations because they are in different month (the first three is Jan and the last one is Feb). I attempt to flag transactions in same month for at least three consecutive years.

 

Thanks again.

hashman
Ammonite | Level 13

@Neal3321:

It looks like a nice DoW-loop exercise. Note that I've purposely changed the year in the 3rd record for (ID,stock)=(2,1) to 2013 to reveal a gap in consecutive years between this and the prior record. Otherwise, as @Astounding has noted, you end up with type=1.

data have ;                                                                                                                             
  input id stock trandate :mmddyy10. ;                                                                                                  
  format trandate yymmd7. ;                                                                                                             
  cards ;                                                                                                                               
1  1  10/15/2009                                                                                                                        
1  1  01/01/2010                                                                                                                        
1  1  01/10/2011                                                                                                                        
1  1  01/15/2012                                                                                                                        
1  1  01/01/2013                                                                                                                        
1  2  01/30/2011                                                                                                                        
1  2  01/30/2012                                                                                                                        
1  2  01/30/2012                                                                                                                        
1  2  01/30/2013                                                                                                                        
1  2  01/30/2014                                                                                                                        
1  2  01/30/2015                                                                                                                        
2  1  01/20/2010                                                                                                                        
2  1  01/15/2011                                                                                                                        
2  1  01/16/2013                                                                                                                        
2  1  02/01/2014                                                                                                                        
2  2  02/01/2010                                                                                                                        
2  2  02/10/2011                                                                                                                        
2  2  02/10/2012                                                                                                                        
2  2  02/10/2013                                                                                                                        
2  2  02/10/2014                                                                                                                        
2  2  01/10/2015                                                                                                                        
;                                                                                                                                       
run ;                                                                                                                                   
                                                                                                                                        
data v / view = v ;                                                                                                                     
  set have ;                                                                                                                            
  _yr  = year  (trandate) ;                                                                                                             
  _mon = month (trandate) ;                                                                                                             
run ;                                                                                                                                   
                                                                                                                                        
data want (drop = _:) ;                                                                                                                 
  do _m = 1 by 1 until (last._mon) ;                                                                                                    
    do until (last._yr) ;                                                                                                               
      set v ;                                                                                                                           
      by id stock _mon _yr notsorted ;                                                                                                  
      _q = sum (_q, 1) ;                                                                                                                
    end ;                                                                                                                               
    _yr_dif = dif (_yr) ;                                                                                                               
    if 1 < _m <= 3 and _yr_dif ne 1 then _yr_gap = 1 ;                                                                                  
    if _m = 3 and not _yr_gap then type = 1 ;                                                                                           
  end ;                                                                                                                                 
  type = sum (type, 0) ;                                                                                                                
  do _q = 1 to _q ;                                                                                                                     
    set v ;                                                                                                                             
    output ;                                                                                                                            
  end ;                                                                                                                                 
run ;                                                    

The purpose of the innder DoW-loop by _YR is to scramble consecutive records with the same year withing the same month into counting it as a single year-month occurrence by letting _M increment only for each iteration of the outer DoW by _MON.

 

Kind regards

Paul D. 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1611 views
  • 0 likes
  • 6 in conversation