Solved: Re: How to flag a customer based on previous 5 yr rolling sales

deep3 · Posted 05-07-2020 07:40 AM

Hi,

I am trying to find for the year 2019 sales, if a customer has made a purchase in previous rolling 5 years by creating a flag variable (1 = No purchase in past 5 yrs 0 = Purchase in past 5 yrs).

Year 2019

Cust A, time gap between last 2 transactions is more than 5 years , so flag is 1 in 2019.

Cust B time gap between last 2 transactions is less than 5 years, so flag is zero

Cust C has made first and only transaction in 2019 so flag is 1

I have tried by using lag however it doesn't consider rolling five years

Input

CUSTOMER_ID Sale_Date Flag Year

A 02/03/2010 . 2010

A 03/05/2013 . 2013

A 01/11/2019 1 2019

B 06/08/2017 . 2017

B 04/08/2019 . 2019

C 05/09/2019 1 2019

SALE_DT1= INTNX('MONTH',Sale_Date,0,"BEGINNING");
MONTHS_SINCE = INTCK('MONTH', LAG(SALE_DT1),SALE_DT1);
IF CUSTOMER_ID ^= LAG(CUSTOMER_ID) THEN MONTHS_SINCE = .;
IF MISSING(MONTHS_SINCE) THEN FLAG = 1;
ELSE IF MONTHS_SINCE > 60 THEN FLAG =1;

ELSE FLAG =0;

RichardDeVen · Posted 05-07-2020 09:26 AM

You can use a SQL self join to select, within ID, the row with the nearest prior date.

I use the aliases EACH and SELF when doing self joins and further call it a 'triangular' join because the query contains an inequality operator (>) in the join criteria.

Example:

data have;
input ID $ Date: mmddyy10.; format date mmddyy10.;
datalines;
A  02/03/2010
A  03/05/2013
A  01/11/2019
B  06/08/2017
B  04/08/2019
C  05/09/2019
;

proc sql;
  create table want as
  select 
    *
    , intck('month', first_prior, date) as months_apart
    , year(date) = 2019 and calculated months_apart not between 0 and 60 as y2019_5yr_flag
  from 
    ( 
        select 
          each.id
        , each.date
        , max(self.date) as first_prior format=mmddyy10.
        from 
          have as each
        left join 
          have as self
        on
          each.id = self.id
        & each.date > self.date
        group by 
          each.id, each.date
    ) as subselect 
  order by 
    id, date
  ;

Result

View solution in original post

Ksharp · Posted 05-07-2020 08:04 AM

data have;
input CUSTOMER_ID  $    Sale_Date : mmddyy10.;
format sale_date mmddyy10.;
cards;
A                              02/03/2010      .         2010
A                              03/05/2013      .         2013
A                              01/11/2019     1          2019
B                              06/08/2017     .           2017
B                              04/08/2019     .           2019
C                              05/09/2019    1           2019
;
data want;
 set have;
 by CUSTOMER_ID;
 dif=intck('year',lag(sale_date),sale_date,'c');
 if first.customer_id then dif=9999;
 if last.customer_id and dif>=5 then flag=1;
drop dif;
run;

RichardDeVen · Posted 05-07-2020 09:26 AM

You can use a SQL self join to select, within ID, the row with the nearest prior date.

I use the aliases EACH and SELF when doing self joins and further call it a 'triangular' join because the query contains an inequality operator (>) in the join criteria.

Example:

data have;
input ID $ Date: mmddyy10.; format date mmddyy10.;
datalines;
A  02/03/2010
A  03/05/2013
A  01/11/2019
B  06/08/2017
B  04/08/2019
C  05/09/2019
;

proc sql;
  create table want as
  select 
    *
    , intck('month', first_prior, date) as months_apart
    , year(date) = 2019 and calculated months_apart not between 0 and 60 as y2019_5yr_flag
  from 
    ( 
        select 
          each.id
        , each.date
        , max(self.date) as first_prior format=mmddyy10.
        from 
          have as each
        left join 
          have as self
        on
          each.id = self.id
        & each.date > self.date
        group by 
          each.id, each.date
    ) as subselect 
  order by 
    id, date
  ;

Result

Kurt_Bremser · Posted 05-07-2020 09:27 AM

Try this:

data have;
input CUSTOMER_ID $ Sale_Date :ddmmyy10.;
format sale_date yymmddd10.;
datalines;
A 02/03/2010
A 03/05/2013
A 01/11/2019
B 06/08/2017
B 04/08/2019
C 05/09/2019
;

data want;
set have;
by customer_id;
dif = intck('year',lag(sale_date),sale_date,'c');
if first.customer_id
then do;
  if last.customer_id then flag = 1;
end;
else if dif > 5 then flag = 1;
drop dif;
run;

Note how I presented the input dataset in a data step with datalines; this leaves no ambiguities about attributes and contents, and allows everybody else to recreate the dataset as-is with a simple copy/paste and submit.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

deep3 · Posted 05-08-2020 03:31 AM

Thank you sir. This works and the run time is very optimal

How to flag a customer based on previous 5 yr rolling sales

Re: How to flag a customer based on previous 5 yr rolling sales

Re: How to flag a customer based on previous 5 yr rolling sales

Re: How to flag a customer based on previous 5 yr rolling sales

Re: How to flag a customer based on previous 5 yr rolling sales

Re: How to flag a customer based on previous 5 yr rolling sales

Registration is open

SAS Training: Just a Click Away