Hi Experts,
I am trying to create a flag variable based on transaction date. If customer has bought something within 6 months of the last transaction date then it will be flagged as existing customer otherwise acquired customer.
e.g.
For a particular customer last trans_date is 2012-01-29 then I need to see sale within last 6 months of the trans_date to identify this customer as acquired or existing.
Thanks in advance for your suggestions.
Regards,
Naeem
use proc sql to group all sales by customer id then find the most recent purchase.
Hi,
Several ways of doing it. First you would need to merge the last_trans_date onto your current data. Then you could go with INTNX - this function does intervals between time points. So if current trans date < = last trans date + 6 months then Y else N.
Can run up some code tomorrow if still unclear.
Rob
Thanks jf and Rob.
Rob - If you could please provide some code whenever you get a chance that would be a great help.
Regards,
Naeem
Here is some code (sorry, spacing seems a bit weird on here). Have just done it quickly, so there is no checking done (i.e. if there are not previous transactions etc.) but it works. There are other methods, such as datdiff if you can specify a number of days rather than months.
---
/* Create some test data */
data previous_transactions;
attrib cust_id format=8.
tran_date format=date9.;
cust_id=1; tran_date='01mar13'd; output;
cust_id=1; tran_date='06jan13'd; output;
cust_id=1; tran_date='15may13'd; output;
cust_id=2; tran_date='01mar13'd; output;
cust_id=2; tran_date='03mar13'd; output;
run;
data new_transactions;
attrib cust_id format=8.
tran_date format=date9.;
cust_id=1; tran_date='30may13'd; output;
cust_id=2; tran_date='10oct13'd; output;
run;
/* New table with flag. */
proc sql;
create table new_transaction_with_flag as
select A.*,
/* If the maximum date from previous transactions + 6 months is after the date we are looking at then we know this is within 6 months */
case when intnx('month',B.TRAN_DATE,6)>=A.TRAN_DATE then "Y"
else "N" end as FLAG
from WORK.NEW_TRANSACTIONS A
/* Join on the maximum data from previous transactions */
left join (
select distinct
CUST_ID,
MAX(TRAN_DATE) as TRAN_DATE
from WORK.PREVIOUS_TRANSACTIONS
group by CUST_ID
) B
on A.CUST_ID=B.CUST_ID;
quit;
Thanks RW9 - Everything is perfect but I am looking for trans_date which is within past 6 months from the new transaction date. I am not taking into account future transactions for this analysis. How can I modify suggested code for this one please?
Regards,
Naeem
A couple of basic assumptions let a DATA step accomplish this fairly easily.
First, your dates have to be actual SAS dates.
Second, you have to be able to sort your data.
So the full solution could be as easy as this:
proc sort data=have;
by cust_id tran_date;
run;
data want;
set have;
by cust_id;
previous_purchase = lag(tran_date);
if first.cust_id then new_customer_flag=1;
else if (tran_date - previous_purchase) < 183 then new_customer_flag=0;
else new_customer_flag=1;
run;
If you don't like the 183-day cutoff, you can always use some of the more complex methods (such as INTNX) to define the 6-month window.
Hi Astounding,
Thanks for providing solution. Actually I've two different datasets where one dataset contains campaign data which was executed within a specific time period. My problem is to take trans_date from campaign data and compare this with most previous date from the bigger dataset that contains all past information regarding customers . If difference between two dates is within six months window then I need to flag customer as retention and if the difference is zero or greater than six months customer would be acquisition.
Solution provided by RW9 is very close to what I was looking for but this is taking maximum date which is most recent transaction date for a particular customer. I know lag function is not going to work in sql but not sure how to tackle this problem with data step.
Any suggestion about this will be highly appreciated.
Regards,
Naeem
Well, this is one way, checking for the existence of a record where the old transaction date is between new transaction date - 6 months and new transaction date.
proc sql;
create table new_transaction_with_flag as
select A.*,
/* So if there is a record with a date between the new transaction - 6 months and new transaction
then flag it */
case when exists(
select distinct THIS.CUST_ID
from WORK.PREVIOUS_TRANSACTIONS THIS
where THIS.CUST_ID=A.CUST_ID
and THIS.TRAN_DATE between intnx('month',A.TRAN_DATE,-6) and A.TRAN_DATE
) then "Y"
else "N" end as FLAG
from WORK.NEW_TRANSACTIONS A;
quit;
OK, just to make sure I understand the problem correctly ... it sounds like you only need one result per CUST_ID with the FLAG. You don't actually need to flag each purchase, you only need to flag each CUST_ID. If that's right, here's one way (assuming both data sets are sorted by CUST_ID and date):
data want;
set purchases
transaction (in=in2 rename=(transaction_date = purchase_date));
by CUST_ID PURCHASE_DATE;
prior_date = lag(purchase_date);
if in2 then do;
if first.cust_id then new_customer_flag=1;
else if (purchase_date - prior_date) > 182 then new_customer_flag=1;
else new_customer_flag=0;
output;
end;
keep cust_id new_customer_flag;
run;
I hope we're both trying to solve the same problem.
Thanks Guys - This is exactly that I was looking for. Really appreciate for suggesting solution.
Regards,
Naeem
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.