DATA Step, Macro, Functions and more

Calculate the Days Between Two Transactions for 4 Cohorts

Reply
New Contributor JFM
New Contributor
Posts: 4

Calculate the Days Between Two Transactions for 4 Cohorts

I’m trying to calculate the days between two transactions for 4 cohorts.


The cohorts defined by all those which purchased on on these dates:


'28SEP2014'd, '28DEC2014'd, '29MAR2015'd, '28JUN2015'd


And it’s the difference between the last transaction before the transaction date which define the cohorts. Here is an example:  


I start with this.


CUSTOMER_IDtransaction_date
231254412-Oct-14
231254421-Oct-14
231254419-Nov-14
231254411-Dec-14
231254428-Dec-14
23125449-Jan-15
231254410-Jan-15
231254411-Jan-15
261613110-Aug-08
26161317-Dec-08
261613116-Dec-12
261613112-Apr-13
261613124-Jul-13
261613128-Sep-14
26161313-Nov-14
26161319-Mar-15
2452666728-Jun-15
8489539829-Mar-15
555566616-Dec-12
555566612-Apr-13
555566630-Jul-13
555566628-Sep-14
555566630-Sep-14
555566615-Oct-14


Then I should get this.


CUSTOMER_IDtransaction_date
231254411-Dec-14
231254428-Dec-14
261613124-Jul-13
261613128-Sep-14
555566630-Jul-13
555566628-Sep-14


Noticed these two observations were dropped from above because these customers had no previous transactions

24526667

28-Jun-15

84895398

29-Mar-15

And then the final  output should look like this:

CUSTOMER_IDtransaction_dateDays Between
231254428-Dec-1417
261613128-Sep-14431
555566628-Sep-14425

Any assistance will be greatly appreciated. Thanks!!!

New Contributor
Posts: 4

Re: Calculate the Days Between Two Transactions for 4 Cohorts

Hi. You can try this solution. Maybe not the most sophisticated but it should work.

data TestData;

infile datalines dlm=' ';

input Var1  date date9. ;

format date date9.;

datalines;

2312544 12-Oct-14

2312544 21-Oct-14

2312544 19-Nov-14

2312544 11-Dec-14

2312544 28-Dec-14

2312544 9-Jan-15

2312544 10-Jan-15

2312544 11-Jan-15

2616131 10-Aug-08

2616131 7-Dec-08

2616131 16-Dec-12

2616131 12-Apr-13

2616131 24-Jul-13

2616131 28-Sep-14

2616131 3-Nov-14

2616131 9-Mar-15

24526667 28-Jun-15

84895398 29-Mar-15

5555666 16-Dec-12

5555666 12-Apr-13

5555666 30-Jul-13

5555666 28-Sep-14

5555666 30-Sep-14

5555666 15-Oct-14

;

data testdata;

set testdata;

if date in('28SEP2014'd, '28DEC2014'd, '29MAR2015'd, '28JUN2015'd)

  then flag=1;

  run;

Proc Sort Data=testdata Out= test_sorted;

By var1 date ;

Run;

data Want;

set test_sorted;

by var1;

if first.var1 and last.var1 then delete; * singel observations;

lag_date=lag(date);

if flag then days=date-lag_date;

if flag;

drop flag;

run;

Super User
Super User
Posts: 7,401

Re: Calculate the Days Between Two Transactions for 4 Cohorts

Much the same really, just one datastep rather than 2.

proc sort data=testdata;

  by var1 date;

run;

data want (drop=lstdate);

  set testdata;

  retain lstdate;

  by var1;

  if first.var1 then lstdate=date;

  if date in ('28sep2014'd,'28dec2014'd,'29mar2015'd,'28jun2015'd) and lag(var1)=var1 then do;

    days_between=date - lstdate;

    output;

  end;

  lstdate=date;

run;

Ask a Question
Discussion stats
  • 2 replies
  • 279 views
  • 3 likes
  • 3 in conversation