Question
Fluorite | Level 6
Hi all, I have a quick question about building a propensity model. I have identified customers who had no subscription and then moved to a subscription. Each of these customers has a sub_start_date, and I have about 2 years of data. Target=1 represents customers who were transacting normally and then moved to a subscription; target=0 represents customers who have not subscribed but are still transacting normally. My question is how to calculate the variables to use in the model. For target=1, I can look at any behavior before their sub_date (transactions in the last week, the last 4 weeks, etc., counted back from sub_date), but for target=0, how do I calculate the variables, given that they have no reference date (sub_date)? Your help would be much appreciated. Thank you very much.

customer_id  min_sub_date
50001        29/09/2016
50008        30/11/2014
50087        23/11/2014
50103        01/04/2017
1 REPLY 1
DougWielenga
SAS Employee

If you only have the beginning subscription dates, you will need some additional information to help identify predictors for someone subscribing. Assuming you have such variables, you have several options for preparing your data. One approach is to start by setting an observation window and a target window. For example, you might look at the data available for potential subscribers from January through March to predict who would subscribe during May or June. The skipped month (April) is intended to give you some time to take action on the people the model identifies as more likely to subscribe.

If you have monthly data available, you can record those variables as lag1_var (end of March), lag2_var (end of February), and lag3_var (end of January) to try to capture changes in behavior that might make someone more likely to respond/subscribe. For more distant time periods, you might average behaviors together (e.g. lag46_var for the average of the variables for October/November/December). You could then see how many of the people who had not subscribed before the target period began ended up subscribing in the 2-month period of May and June. Because every customer is measured against the same calendar windows, customers who never subscribed do not need a subscription date of their own as a reference point.

The beauty of this type of approach is that it relies on recent behavior to predict future behavior. Since you are using rolling time periods, you can validate the model's performance at any time by updating the time intervals, and you can score current data to project subscriptions that will occur in the 2-month period starting 30 days later. As new data becomes available, you update the corresponding lag variables so that you can score the newest data.
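As a rough illustration of the window approach described above, here is a minimal Python sketch that builds one modelling row per customer from a fixed observation window (January to March) and target window (May and June). The year 2017, the function name build_row, the variable lagN_spend, and the rule of dropping customers who subscribed before the target window are all assumptions made for the example, not something specified in the thread; the same logic could equally be written as a SAS DATA step.

```python
from datetime import date

# Hypothetical windows mirroring the January-to-June example (year is assumed).
OBS_MONTHS = [(2017, 1), (2017, 2), (2017, 3)]  # observation window
TARGET_START = date(2017, 5, 1)                 # target window: May and June
TARGET_END = date(2017, 6, 30)


def build_row(customer_id, transactions, sub_date):
    """Return one modelling row, or None if the customer is not eligible.

    transactions: list of (date, amount) tuples for this customer.
    sub_date: first subscription date, or None if the customer never subscribed.
    """
    # Assumption: anyone who subscribed before the target window opens
    # (including the April gap month) is left out of this snapshot.
    if sub_date is not None and sub_date < TARGET_START:
        return None

    # Total spend per observation month; months with no activity stay at zero.
    spend = {ym: 0.0 for ym in OBS_MONTHS}
    for tx_date, amount in transactions:
        key = (tx_date.year, tx_date.month)
        if key in spend:
            spend[key] += amount

    subscribed_in_target = (
        sub_date is not None and TARGET_START <= sub_date <= TARGET_END
    )
    return {
        "customer_id": customer_id,
        "lag1_spend": spend[(2017, 3)],  # March
        "lag2_spend": spend[(2017, 2)],  # February
        "lag3_spend": spend[(2017, 1)],  # January
        "target": int(subscribed_in_target),
    }


# Made-up customers, for illustration only.
row_a = build_row("A", [(date(2017, 1, 5), 5.0), (date(2017, 3, 10), 20.0)],
                  date(2017, 5, 15))   # subscribes in May
row_b = build_row("B", [(date(2017, 2, 1), 7.5)], None)  # never subscribes
row_c = build_row("C", [], date(2017, 4, 1))  # subscribes in the gap month
```

Note that customer B gets lag variables in exactly the same way as customer A: the reference date is the end of the observation window, not the subscription date. Rolling the windows forward (for example February to April predicting June and July) would give additional snapshots for validation.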

 

Hope this helps!

Doug 
