BookmarkSubscribeRSS Feed
RobertNYC
Obsidian | Level 7

Hi,

I have a data set with customer IDs and transaction dates. I need to calculate 5 variables using this data. They are.

1.  Average Days Between Each Transaction

2.  Average Days Between Transactions at 6 month intervals

3.  Average Days Between Transactions at 12 Month intervals

4.  Average Days First and Last Transaction

5.  Average Total Transactions

And then I need to aggregate these all together.


And then group by year.


I start with this data:


CUSTOMER_IDtransaction_date
44076713-Jan-07
4407675-Jan-12
4407673-Jun-09
44076717-Jan-12
4407679-Apr-09
44076710-Aug-08
179502310-Jul-12
179502311-Aug-11
179502330-Nov-09
17950239-Jan-12
179502320-Jan-11
179502310-Apr-09
17950232-Nov-11
17950238-Feb-12
179502317-Aug-11
179502319-Nov-13
362769529-May-06
362769515-May-03
362769523-Mar-15
362769518-May-06


here are some examples (they counts don't match the sample data)


One output would look like thisAvery Days Between Transaction Counts

avr_Days_Betw_Each_transAvrDaysBetw_trans_sixMonth_IntervalsavrDaysBetw_trans_year_Intervalsavr_Days between_first_last_transTotal_Trans_Count
1030100100050


and one would look like this


Yearavr_Days_Betw_Each_transAvrDaysBetw_trans_sixMonth_IntervalsavrDaysBetw_trans_year_Intervalsavr_Days between_first_last_transTotal_Trans_Count
20121030100100050
2013306020055585
2014407030045690



Any assistance would be appreciated. Thanks

15 REPLIES 15
Steelers_In_DC
Barite | Level 11

This should be a pretty good start.

data have;

infile cards dsd;

informat customer_id $7. transaction_date date8.;

format customer_id $7. transaction_date date8.;

input customer_id$    transaction_date;

cards;

440767,13-Jan-07

440767,5-Jan-12

440767,3-Jun-09

440767,17-Jan-12

440767,9-Apr-09

440767,10-Aug-08

1795023,10-Jul-12

1795023,11-Aug-11

1795023,30-Nov-09

1795023,9-Jan-12

1795023,20-Jan-11

1795023,10-Apr-09

1795023,2-Nov-11

1795023,8-Feb-12

1795023,17-Aug-11

1795023,19-Nov-13

3627695,29-May-06

3627695,15-May-03

3627695,23-Mar-15

3627695,18-May-06

;

proc sort data= have;by customer_id transaction_date;

data prep;

set have;

format l_transaction_date date8.;

retain first_transaction;

by customer_id transaction_date;

l_transaction_date = lag(transaction_date);

if not first.customer_id then do;

date_diff = transaction_date-l_transaction_date;

end;

if first.customer_id then first_transaction = transaction_date;

if last.customer_id then last_transaction = transaction_date;

if last.customer_id then first_last_diff = last_transaction-first_transaction;

run;

proc sql;

create table prep2 as

select distinct customer_id,transaction_date,date_diff,first_last_diff,avg(date_diff) as avg_days

from prep

group by customer_id;

RobertNYC
Obsidian | Level 7

Thanks this, its helpful  help. I'm still not sure  how to do  by 6 month intervals and and by 12 month intervals and do a transaction count.

Steelers_In_DC
Barite | Level 11

You have to explain what you mean with 6 and 12 month intervals.  I added an intck() to get the month difference and fields to get the difference but your explanation isn't very thorough about what you mean.  Take a look at the month6_diff and month12_diff fields.

data have;

infile cards dsd;

informat customer_id $7. transaction_date date8.;

format customer_id $7. transaction_date date8.;

input customer_id$    transaction_date;

cards;

440767,13-Jan-07

440767,5-Jan-12

440767,3-Jun-09

440767,17-Jan-12

440767,9-Apr-09

440767,10-Aug-08

1795023,10-Jul-12

1795023,11-Aug-11

1795023,30-Nov-09

1795023,9-Jan-12

1795023,20-Jan-11

1795023,10-Apr-09

1795023,2-Nov-11

1795023,8-Feb-12

1795023,17-Aug-11

1795023,19-Nov-13

3627695,29-May-06

3627695,15-May-03

3627695,23-Mar-15

3627695,18-May-06

;

proc sort data= have;by customer_id transaction_date;

data prep;

set have;

format l_transaction_date date8.;

retain first_transaction;

by customer_id transaction_date;

l_transaction_date = lag(transaction_date);

if not first.customer_id then do;

date_diff = transaction_date-l_transaction_date;

month_diff = intck('month',l_transaction_date,transaction_date);

end;

if first.customer_id then first_transaction = transaction_date;

if last.customer_id then last_transaction = transaction_date;

if last.customer_id then first_last_diff = last_transaction-first_transaction;

if month_diff >= 6 then do;

month6_diff = transaction_date-l_transaction_date;

end;

if month_diff >= 12 then do;

month12_diff = transaction_date-l_transaction_date;

end;

run;

proc sql;

create table want as

select distinct customer_id,transaction_date,date_diff,first_last_diff,avg(date_diff) as avg_days,month_diff,month6_diff,month12_diff

from prep

group by customer_id;

RobertNYC
Obsidian | Level 7

Yes sorry I should have been more specific.

So in a 6 month period (intervals) – So I guess you would have to say what are the average days between purchases during every 182.5 days and for year 365 days.

Steelers_In_DC
Barite | Level 11

data prep;

set have;

format l_transaction_date date8.;

retain first_transaction;

by customer_id transaction_date;

l_transaction_date = lag(transaction_date);

if not first.customer_id then do;

date_diff = transaction_date-l_transaction_date;

end;

if first.customer_id then first_transaction = transaction_date;

if last.customer_id then last_transaction = transaction_date;

if last.customer_id then first_last_diff = last_transaction-first_transaction;

if 182.5 <= date_diff <= 365 then month6_diff = date_diff;

if date_diff > 365 then month12_diff = date_diff;

year = year(transaction_date);

run;

proc sql;

create table want as

select distinct customer_id,transaction_date,date_diff,first_last_diff,avg(date_diff) as avg_days,avg(month6_diff) as avg_6month,avg(month12_diff) as avg_12month

from prep

group by customer_id;

RobertNYC
Obsidian | Level 7

I'm not sure thats right.  Its the average days between purchases every during a 182.5 window

First 182.5 days: I had 3 purchases and the average time between them was 27 days

Second 182.5 days: I had 5 purchases and then the average time between them was 15 days

Third  182.5 days: I had 2 purchases and then the average time between them was 30 days

So 6 moth average is 24.

Does that make more sense?

Vish33
Lapis Lazuli | Level 10

Try this. Need to do some modifications at your end.

data have;
input CUSTOMER_ID transaction_date date9.;
format transaction_date date9.;
cards;
440767 13-Jan-07
440767 5-Jan-12
440767 3-Jun-09
440767 17-Jan-12
440767 9-Apr-09
440767 10-Aug-08
1795023 10-Jul-12
1795023 11-Aug-11
1795023 30-Nov-09
1795023 9-Jan-12
1795023 20-Jan-11
1795023 10-Apr-09
1795023 2-Nov-11
1795023 8-Feb-12
1795023 17-Aug-11
1795023 19-Nov-13
3627695 29-May-06
3627695 15-May-03
3627695 23-Mar-15
3627695 18-May-06
;
run;

proc sort data=have;
by CUSTOMER_ID transaction_date;
run;

data want;
set have;
by customer_id ;
   retain transaction_date;
  dt=lag(transaction_date);
avr_Days_Betw_Each_trans=mean(intck('day',transaction_date,dt));
AvrDaysBetw_trans_sixMonth=mean(intck('QTR2',transaction_date,dt));
avrDaysBetw_trans_year_Intervals=mean(intck('year',transaction_date,dt));
if first.customer_id then ftran=transaction_date;
if last.customer_id then ltran=transaction_date;
avr_Days_first_last_trans=mean(intck('day',ftran,ltran));
format dt ftran ltran date9.;
drop ftran ltran dt;
run;

proc sql;
select avg(avr_Days_Betw_Each_trans) as avr_Days_Betw_Each_trans,
avg(AvrDaysBetw_trans_sixMonth) as AvrDaysBetw_trans_sixMonth,
avg(avrDaysBetw_trans_year_Intervals) as avrDaysBetw_trans_year_Intervals,
avg(avr_Days_first_last_trans) asavr_Days_first_last_trans
from want;
quit;

Rgds,

Vish

RobertNYC
Obsidian | Level 7

Hi Vish33,

I'm just wondering  why you used the 'QTR'  option to calculate the 6 momth averages.

As I mentioned above, Its the average days between purchases every during a 182.5 window

First 182.5 days: I had 3 purchases and the average time between them was 27 days

Second 182.5 days: I had 5 purchases and then the average time between them was 15 days

Third  182.5 days: I had 2 purchases and then the average time between them was 30 days

So 6 moth average is 24.

ndp
Quartz | Level 8 ndp
Quartz | Level 8

do you need this for each customer? or just each year?

RobertNYC
Obsidian | Level 7

Hi Aggregated by each year and then all years aggregated. Thanks !

ndp
Quartz | Level 8 ndp
Quartz | Level 8

One more question day 1= first transaction date or day 1= 1 Jan of first transcation year?

RobertNYC
Obsidian | Level 7

Hi ndp,

First transaction date or day. Thanks!!!

ndp
Quartz | Level 8 ndp
Quartz | Level 8

lets asay we have transaction dates dt. Then does the table below make sense esspecially for mon6/12 vars?

dt                _dt           day      mon6      mon12

13JAN07 13JAN07      1           1           1

10AUG08 13JAN07      576      4           2

09APR09 10AUG08      243      2           1

03JUN09 09APR09      56         1           1

05JAN12 03JUN09      947         6           3

17JAN12 05JAN12      13           1           1

RobertNYC
Obsidian | Level 7

Yes it does!!!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 3638 views
  • 6 likes
  • 4 in conversation