Hello,
I need to find if there are returning customers after 45days but within 90days.
1. how to calculate every 45days?
2. how to find returning customers after 45days but within 90days
sample data:
customer date
Alex 2020-01-01
James 2020-01-03
Alex 2020-02-20
James 2020-06-03
want data: (Alex was the only returning customer after 45 days but within 90days):
Period 1 | Period 2 | Period 3 | Period 4 | |
2020-01-01 to 2020-02-14 | 2020-02-15 to +45 Days | Period 2 Date to + 45 Days | Period 3 Date to + 45 Days | |
Returning Customers | 0 | 1 |
|
OK, here's an example of how to determine if a Customer's subsequent visit is within a 45 to 90 day window of their most recent past visit. I'm using a hash table to build a look up by customer. This current coding only works if you have unique customer names. If you have multiple customers with the same name, you'd have to do something more sophisticated like a customer id or some such. The hash table saves Day45 and Day90 from the first visit and those values are retrieved by the FIND() and then compared to the customer's current visit date. If the customer's current visit is withing the 45 to 90 day range, they are marked as "Y", they are in fact a Returnee.
DATA Customers_And_Dates;
INFILE DATALINES MISSOVER;
FORMAT Customer $8.
Date_Of_Visit YYMMDDD10.
Day45 YYMMDDD10.
Day90 YYMMDDD10.
;
INPUT Customer $
Date_of_Visit : ANYDTDTE10.
;
Day45 = INTNX('DAY', Date_of_Visit, 45);
Day90 = INTNX('DAY', Date_of_Visit, 90);
DATALINES;
Alex 2020-01-01
James 2020-01-03
Alex 2020-02-20
James 2020-06-03
;
RUN;
PROC SORT DATA = Customers_And_Dates;
BY Customer Date_of_Visit;
RUN;
DATA Returnees;
DROP _:;
SET Customers_And_Dates;
BY Customer;
IF _N_ = 1 THEN
DO;
DECLARE HASH Hsh_Cust_Visits (MULTIDATA: 'N');
Hsh_Cust_Visits.DefineKey ('Customer');
Hsh_Cust_Visits.DefineData ('Day45', 'Day90');
Hsh_Cust_Visits.DefineDone ();
END;
IF FIRST.Customer THEN
DO;
_RC = Hsh_Cust_Visits.ADD();
Returnee = 'N';
END;
ELSE
DO;
_Save45 = Day45;
_Save90 = Day90;
_RC = Hsh_Cust_Visits.FIND();
IF Day45 <= Date_of_Visit <= Day90 THEN
Returnee = 'Y';
ELSE
Returnee = 'N';
Day45 = _Save45;
Day90 = _Save90;
_RC = Hsh_Cust_Visits.REPLACE();
END;
RUN;
Results:
I think that's basically what you're looking for, yes?
Jim
Figuring out date intervals is pretty easy with the INTNX function. Here's code to read everything in and calculate Day45 and Day90. It'll take me a bit more to get the code to determine if the person is a returnee within the 45 to 90 day period.
Jim
DATA Returnees;
INFILE DATALINES MISSOVER;
FORMAT Customer $8.
Date_Of_Visit YYMMDDD10.
Day45 YYMMDDD10.
Day90 YYMMDDD10.
;
INPUT Customer $
Date_of_Visit : ANYDTDTE10.
;
Day45 = INTNX('DAY', Date_of_Visit, 45);
Day90 = INTNX('DAY', Date_of_Visit, 90);
DATALINES;
Alex 2020-01-01
James 2020-01-03
Alex 2020-02-20
James 2020-06-03
;
RUN;
OK, here's an example of how to determine if a Customer's subsequent visit is within a 45 to 90 day window of their most recent past visit. I'm using a hash table to build a look up by customer. This current coding only works if you have unique customer names. If you have multiple customers with the same name, you'd have to do something more sophisticated like a customer id or some such. The hash table saves Day45 and Day90 from the first visit and those values are retrieved by the FIND() and then compared to the customer's current visit date. If the customer's current visit is withing the 45 to 90 day range, they are marked as "Y", they are in fact a Returnee.
DATA Customers_And_Dates;
INFILE DATALINES MISSOVER;
FORMAT Customer $8.
Date_Of_Visit YYMMDDD10.
Day45 YYMMDDD10.
Day90 YYMMDDD10.
;
INPUT Customer $
Date_of_Visit : ANYDTDTE10.
;
Day45 = INTNX('DAY', Date_of_Visit, 45);
Day90 = INTNX('DAY', Date_of_Visit, 90);
DATALINES;
Alex 2020-01-01
James 2020-01-03
Alex 2020-02-20
James 2020-06-03
;
RUN;
PROC SORT DATA = Customers_And_Dates;
BY Customer Date_of_Visit;
RUN;
DATA Returnees;
DROP _:;
SET Customers_And_Dates;
BY Customer;
IF _N_ = 1 THEN
DO;
DECLARE HASH Hsh_Cust_Visits (MULTIDATA: 'N');
Hsh_Cust_Visits.DefineKey ('Customer');
Hsh_Cust_Visits.DefineData ('Day45', 'Day90');
Hsh_Cust_Visits.DefineDone ();
END;
IF FIRST.Customer THEN
DO;
_RC = Hsh_Cust_Visits.ADD();
Returnee = 'N';
END;
ELSE
DO;
_Save45 = Day45;
_Save90 = Day90;
_RC = Hsh_Cust_Visits.FIND();
IF Day45 <= Date_of_Visit <= Day90 THEN
Returnee = 'Y';
ELSE
Returnee = 'N';
Day45 = _Save45;
Day90 = _Save90;
_RC = Hsh_Cust_Visits.REPLACE();
END;
RUN;
Results:
I think that's basically what you're looking for, yes?
Jim
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.