Question
Fluorite | Level 6
Hi All, I have a quick question about building a propensity model. I have identified customers who had no subscription and then moved to a subscription. Each of these customers has a sub_start_date, and I have about 2 years of data. Target=1 represents customers who were transacting normally and then moved to a subscription; target=0 represents customers who have not subscribed but are still transacting normally.

My question is how to calculate the variables to use in the model. For target=1, I can look at any behavior before their sub_date (transactions in the last week, last 4 weeks, etc., counted back from sub_date). But for target=0, how do I calculate the variables, given that these customers have no reference date (sub_date)?

Your help would be much appreciated. Thank you very much.

customer_id    min_sub_date
50001          29/09/2016
50008          30/11/2014
50087          23/11/2014
50103          01/04/2017
1 REPLY
DougWielenga
SAS Employee

If you only have the beginning subscription dates, you will need some additional information to help identify predictors of someone subscribing. Assuming you have such variables, you have several options for preparing your data. One approach is to start by setting an observation window and a target window. For example, you might use the data available for potential subscribers from January through March to predict who will subscribe during May or June. The skipped month (April) is intended to give you some time to take action on the people the model identifies as more likely to subscribe.

If you have monthly data available, you can record those variables as lag1_var (end of March), lag2_var (end of February), and lag3_var (end of January) to try to capture changes in behavior that might make someone more likely to respond/subscribe. For more distant time periods, you might average behaviors together (e.g. lag46_var for the average of the variables for October/November/December). You could then see how many of the people who had not subscribed by the end of the observation window ended up subscribing in the 2-month target period of May and June.

The beauty of this type of approach is that it relies on recent behavior to predict future behavior. Since you are using rolling time periods, you can validate the model's performance at any time by updating the time intervals, and you can score current data to project subscriptions that will occur in the 2-month period starting 30 days later. As new data becomes available, you update the corresponding lag variables so that you can score the newest data.
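A minimal sketch of the windowing idea above, in plain Python rather than SAS. With a fixed observation window, every customer shares the same reference date (the end of the window), so non-subscribers do not need a sub_date of their own. The function name `build_rows`, the tuple layout of the transactions, the `lag{n}_spend` variable names, and the sample customers are all assumptions made for illustration; they are not from the original post. The sketch also assumes that customers who subscribed before the target window opens (including during the gap month) are dropped, and that customers with no transactions in the observation window are omitted.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + (d.month - 1)


def build_rows(transactions, sub_dates, obs_end,
               gap_months=1, target_months=2, n_lags=3):
    """Build one modelling row per customer still unsubscribed at target start.

    transactions: list of (customer_id, date, amount)  (hypothetical layout)
    sub_dates:    dict customer_id -> first subscription date (absent if none)
    obs_end:      any date in the last month of the observation window
    """
    end = month_index(obs_end)
    target_first = end + gap_months + 1          # e.g. May when obs_end is March
    target_last = target_first + target_months - 1  # e.g. June

    # Sum spend per customer per lag; lag 1 is the last observed month.
    spend = defaultdict(lambda: defaultdict(float))
    for cust, d, amount in transactions:
        lag = end - month_index(d) + 1
        if 1 <= lag <= n_lags:
            spend[cust][lag] += amount

    rows = []
    for cust in sorted(spend):
        sub = sub_dates.get(cust)
        sub_month = month_index(sub) if sub else None
        # Assumption: drop anyone who subscribed before the target window opens.
        if sub_month is not None and sub_month < target_first:
            continue
        row = {"customer_id": cust}
        for lag in range(1, n_lags + 1):
            row[f"lag{lag}_spend"] = spend[cust].get(lag, 0.0)
        # target=1 only if the first subscription falls inside the target window.
        row["target"] = int(sub_month is not None and sub_month <= target_last)
        rows.append(row)
    return rows


# Made-up example: observe January-March 2016, skip April, target May-June.
transactions = [
    ("A", date(2016, 1, 10), 20.0),
    ("A", date(2016, 3, 5), 35.0),
    ("B", date(2016, 2, 14), 15.0),
    ("C", date(2016, 3, 20), 50.0),
]
sub_dates = {"A": date(2016, 5, 12), "C": date(2016, 2, 1)}

rows = build_rows(transactions, sub_dates, obs_end=date(2016, 3, 31))
for row in rows:
    print(row)
```

Calling `build_rows` again with a later `obs_end` rolls both windows forward, which is one way to build the validation and scoring sets described above.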

 

Hope this helps!

Doug 
