Help using Base SAS procedures

How to select rows having transactions on three consecutive days

Accepted Solution Solved
Reply
Contributor
Posts: 38
Accepted Solution

How to select rows having transactions on three consecutive days

Below is my sample data. I want to select those rows which have at least 3 consecutive transactions. For example, for id 11 we have transaction on 5th, 6th and 7th day of January. But 12,13,14 does not have 3 consecutive transactions. I want to create a code which will select only id 11 and its all 3 consecutive transactions. I think it can be done in PROC SQL using some kind of counter. But not sure how to go about it.  Any idea about it will be highly appreciated.

 

data transaction;
input @1 id 2.
	  @3 amount 3.
	  @8 date MMDDYY10. ;
format date date9.;
datalines;
11 100 01/05/2015
11 150 01/06/2015
11 200 01/07/2015
12 150 01/25/2015
12 300 01/15/2015
13 100 01/08/2015
14 400 01/02/2015 
;
run;

Accepted Solutions
Solution
‎12-12-2017 01:26 PM
Esteemed Advisor
Posts: 5,627

Re: How to select rows having transactions on three consecutive days

Two methods:

 


proc sql;
create table consec as
select unique
    a.*
from
    transaction as a inner join
    (select b.id, b.date, d.date 
     from 
        transaction as b inner join
        transaction as c on b.id=c.id and intnx('day',b.date,1)=c.date inner join
        transaction as d on c.id=d.id and intnx('day',c.date,1)=d.date) 
     on a.id=b.id and a.date between b.date and d.date
order by id, date;
select * from consec;
quit;

data sequences;
retain series 0;
set transaction; by id;
prevDate = lag(date);
if first.id then series = series + 1;
else if date ne intnx('day', prevDate, 1) then series = series + 1;
drop prevDate;
run;

data consecutive;
do c = 1 by 1 until(last.series);
    set sequences; by id series;
    end;
do until(last.series);
    set sequences; by id series;
    if c >= 3 then output;
    end;
drop series c;
run;

proc print data=consecutive noobs; run;
PG

View solution in original post


All Replies
Contributor
Posts: 35

Re: How to select rows having transactions on three consecutive days

A couple of questions: 1) do you need to handle instances of multiple transactions on the same day? and 2) how do you want to handle instances where transactions occur on 4 or more consecutive days.

 

Intuitively, a self-join would be better than any form of counter, or alternatively, multi-pass data step solutions. 

Contributor
Posts: 38

Re: How to select rows having transactions on three consecutive days

Posted in reply to DaveBirch

1. Yes. I also need to find situations where there are multiple transactions in one day. But I can find it easily by using a group by and count function within proc sql. But not within same query.

 

2. So, the aim is to get a general process in the end, like 3/4/5 or any number of consecutive day transactions. For time being, I am just starting with 3.

Solution
‎12-12-2017 01:26 PM
Esteemed Advisor
Posts: 5,627

Re: How to select rows having transactions on three consecutive days

Two methods:

 


proc sql;
create table consec as
select unique
    a.*
from
    transaction as a inner join
    (select b.id, b.date, d.date 
     from 
        transaction as b inner join
        transaction as c on b.id=c.id and intnx('day',b.date,1)=c.date inner join
        transaction as d on c.id=d.id and intnx('day',c.date,1)=d.date) 
     on a.id=b.id and a.date between b.date and d.date
order by id, date;
select * from consec;
quit;

data sequences;
retain series 0;
set transaction; by id;
prevDate = lag(date);
if first.id then series = series + 1;
else if date ne intnx('day', prevDate, 1) then series = series + 1;
drop prevDate;
run;

data consecutive;
do c = 1 by 1 until(last.series);
    set sequences; by id series;
    end;
do until(last.series);
    set sequences; by id series;
    if c >= 3 then output;
    end;
drop series c;
run;

proc print data=consecutive noobs; run;
PG
Contributor
Posts: 38

Re: How to select rows having transactions on three consecutive days

Thanks a lot. This solves the problem perfectly. 

Contributor
Posts: 38

Re: How to select rows having transactions on three consecutive days

I understood the proc sql method. While data step method works fine too, I am still not sure how it is working. I get the first part of the data step method where this is mentioned.

if date ne intnx('day', prevDate, 1) then series = series + 1;

  

but the next part after that where you are using do until, I am not being able to understand how its working, specially why do until is being used twice. Can you give me some hints kindly ?

Esteemed Advisor
Posts: 5,627

Re: How to select rows having transactions on three consecutive days

In the first loop, counter c counts the number of rows with the same id and series number, i.e. the length of the sequence. In the second loop, going over exactly the same sequence, the rows are output if the sequence length is >= 3.

PG
Contributor RM6
Contributor
Posts: 24

Re: How to select rows having transactions on three consecutive days

hi,

 

I know this is not your requirement, but I have added some more lines of data , hope it helps

data transaction;

input @1 id 2.

@3 amount 3.

@8 date MMDDYY10. ;

format date date9.;

datalines;

 

11 100 01/05/2015

11 150 01/06/2015

11 200 01/07/2015

11 200 01/10/2015

11 100 01/15/2015

11 150 01/16/2015

11 200 01/17/2015

12 150 01/25/2015

12 300 01/15/2015

13 100 01/08/2015

14 400 01/02/2015

;

run;

proc sort data=transaction; by id date; run;

data transaction_1 ;

set transaction;

by id date;

 

dt= lag(date);

if first.id then dt=.;

 

diff = date-dt;

format dt date9.;

run;

data transaction_2;

set transaction_1(where=(diff=1));

if not missing(date) then do;

 

dummy = date;

output;

end;

if not missing(dt) then do;

 

dummy = dt;

output;

end;

format dummy date9.;

run;

proc sql;

create table fin as

select a.* from transaction a

inner join (select distinct id,dummy from transaction_2) b

on a.id= b.id

and a.date=b.dummy

 

;

quit;

 

PROC Star
Posts: 277

Re: How to select rows having transactions on three consecutive days

Here is a solution using SET with POINT=:

proc sort data=transaction;
  by id date;
run;

data want;
  set transaction(keep=id date);
  by id date;
  if first.id then
    call missing(date2,date3);
  if last.date and not missing(date3) then do;
    if date-date3=2 then do p2=p3 to _N_;
      set transaction point=p2;
      output;
      end;
    end;
  if first.date then do;
    date3=date2;
    p3=p2;
    date2=date;
    p2=_N_;
    end;
  retain date2 date3 p3 p2;
  drop date2 date3 p3;
run;

 

Contributor
Posts: 38

Re: How to select rows having transactions on three consecutive days

Thanks for providing the solution. I am going through the code and can not understand p2/p3 part 

 

if date-date3=2 then do p2=p3 to _N_;
      set transaction point=p2;

 

Have not encountered this before p2 = p3 or p3 =p2. Can you please give some clues about it ?

Super User
Posts: 10,860

Re: How to select rows having transactions on three consecutive days

data transaction;
input @1 id 2.
	  @3 amount 3.
	  @8 date MMDDYY10. ;
format date date9.;
datalines;
11 100 01/05/2015
11 150 01/06/2015
11 200 01/07/2015
12 150 01/25/2015
12 300 01/15/2015
13 100 01/08/2015
14 400 01/02/2015 
;
run;
data temp;
 set transaction;
 by id;
 dif=dif(date);
 if first.id then dif=1;
run;
data temp1;
 set temp;
 by id dif notsorted;
 group+first.dif;
run;
proc sql;
create table want as
 select *
  from temp1
   group by group
    having sum(dif=1) ge 3;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 561 views
  • 0 likes
  • 6 in conversation