Flag variable

Reply
Trusted Advisor
Posts: 1,204

Flag variable


Hi Experts,

I am trying to create a flag variable based on transaction date. If customer has bought something within 6 months of the last transaction date then it will be flagged as existing customer otherwise acquired customer.

e.g.

For a particular customer last trans_date is 2012-01-29  then I need to see sale within last 6 months of the trans_date to identify this customer as acquired or existing.

Thanks in advance for your suggestions.

Regards,

Naeem

Contributor jf
Contributor
Posts: 22

Re: Flag variable

use proc sql to group all sales by customer id then find the most recent purchase.

Super User
Super User
Posts: 7,413

Re: Flag variable

Hi,

Several ways of doing it.  First you would need to merge the last_trans_date onto your current data.  Then you could go with INTNX - this function does intervals between time points.  So if current trans date < = last trans date + 6 months then Y else N.

Can run up some code tomorrow if still unclear.

Rob

Trusted Advisor
Posts: 1,204

Re: Flag variable

Thanks jf and Rob.

Rob - If you could please provide some code whenever you get a chance that would be a great help.

Regards,

Naeem

Super User
Super User
Posts: 7,413

Re: Flag variable

Here is some code (sorry, spacing seems a bit weird on here).  Have just done it quickly, so there is no checking done (i.e. if there are not previous transactions etc.) but it works.  There are other methods, such as datdiff if you can specify a number of days rather than months.

---

/* Create some test data */
data previous_transactions;
  attrib  cust_id format=8.
          tran_date format=date9.;
  cust_id=1; tran_date='01mar13'd; output;
  cust_id=1; tran_date='06jan13'd; output;
  cust_id=1; tran_date='15may13'd; output;
  cust_id=2; tran_date='01mar13'd; output;
  cust_id=2; tran_date='03mar13'd; output;
run;
data new_transactions;
  attrib cust_id format=8.
          tran_date format=date9.;
  cust_id=1; tran_date='30may13'd; output;
  cust_id=2; tran_date='10oct13'd; output;
run;

/* New table with flag. */
proc sql;
  create table new_transaction_with_flag as
  select  A.*,

/* If the maximum date from previous transactions + 6 months is after the date we are looking at then we know this is within 6 months */
          case  when intnx('month',B.TRAN_DATE,6)>=A.TRAN_DATE then "Y"
                    else "N" end as FLAG
  from    WORK.NEW_TRANSACTIONS A
  /* Join on the maximum data from previous transactions */
  left join (
              select  distinct
                      CUST_ID,
                      MAX(TRAN_DATE) as TRAN_DATE
              from    WORK.PREVIOUS_TRANSACTIONS
              group by CUST_ID
            ) B
  on      A.CUST_ID=B.CUST_ID;
quit;

Trusted Advisor
Posts: 1,204

Re: Flag variable

Thanks RW9 -  Everything is perfect but I am looking for trans_date which is within past 6 months from the new transaction date. I am not taking into account future transactions for this analysis. How can I modify suggested code for this one please?

Regards,

Naeem

Super User
Posts: 5,090

Re: Flag variable

A couple of basic assumptions let a DATA step accomplish this fairly easily.

First, your dates have to be actual SAS dates.

Second, you have to be able to sort your data.

So the full solution could be as easy as this:

proc sort data=have;

by cust_id tran_date;

run;

data want;

set have;

by cust_id;

previous_purchase = lag(tran_date);

if first.cust_id then new_customer_flag=1;

else if (tran_date - previous_purchase) < 183 then new_customer_flag=0;

else new_customer_flag=1;

run;

If you don't like the 183-day cutoff, you can always use some of the more complex methods (such as INTNX) to define the 6-month window.

Trusted Advisor
Posts: 1,204

Re: Flag variable

Hi Astounding,

Thanks for providing solution. Actually I've two different datasets where one dataset contains campaign data which was executed within a specific time period. My problem is to take trans_date from campaign data and compare this with most previous date from the bigger dataset that contains all past information regarding customers . If difference between two dates is within six months window then I need to flag customer as retention and if the difference is zero or greater than six months customer would be acquisition.

Solution provided by RW9 is very close to what I was looking for but this is taking maximum date which is most recent transaction date for a particular customer. I know lag function is not going to work in sql but not sure how to tackle this problem with data step.

Any suggestion about this will be highly appreciated.

Regards,

Naeem

Super User
Super User
Posts: 7,413

Re: Flag variable

Well, this is one way, checking for the existence of a record where the old transaction date is between new transaction date - 6 months and new transaction date.


proc sql;
  create table new_transaction_with_flag as
  select  A.*,

/* So if there is a record with a date between the new transaction - 6 months and new transaction
   then flag it */
          case  when exists(
                              select  distinct THIS.CUST_ID
                              from    WORK.PREVIOUS_TRANSACTIONS THIS
                              where   THIS.CUST_ID=A.CUST_ID
                                and   THIS.TRAN_DATE between intnx('month',A.TRAN_DATE,-6) and A.TRAN_DATE
                            ) then "Y"
                else "N" end as FLAG
  from    WORK.NEW_TRANSACTIONS A;
quit;

Super User
Posts: 5,090

Re: Flag variable

OK, just to make sure I understand the problem correctly ... it sounds like you only need one result per CUST_ID with the FLAG.  You don't actually need to flag each purchase, you only need to flag each CUST_ID.  If that's right, here's one way (assuming both data sets are sorted by CUST_ID and date):

data want;

   set purchases

         transaction (in=in2 rename=(transaction_date = purchase_date));

   by CUST_ID PURCHASE_DATE;

   prior_date = lag(purchase_date);

   if in2 then do;

      if first.cust_id then new_customer_flag=1;

      else if (purchase_date - prior_date) > 182 then new_customer_flag=1;

      else new_customer_flag=0;

      output;

   end;

   keep cust_id new_customer_flag;

run;

I hope we're both trying to solve the same problem.

Trusted Advisor
Posts: 1,204

Re: Flag variable

Thanks Guys -  This is exactly that I was looking for. Really appreciate for suggesting solution.

Regards,

Naeem

Ask a Question
Discussion stats
  • 10 replies
  • 4373 views
  • 0 likes
  • 4 in conversation