The issue you are describing illustrates the problem known as 'temporal infidelity'. This problem occurs when the relationships the model learned from the time periods available during development have shifted by the time the model is applied to new data. In general, a model will not perform as well on newer data as it did on your historical data.

You need to monitor both the amount and the nature of the change to assess when a model needs to be refit. Using out-of-time samples to validate your model is a reasonable practice and gives you a more realistic assessment of how the model will perform in production, but do not be surprised when it does not perform as well as it did on a random holdout. Simply including all of the data in training will make some of your metrics look better, but those metrics are misleading because they mask the temporal infidelity you seem to have identified.

Tools such as SAS Model Manager allow you to monitor the performance of a model over time so that you can refit the model when its performance has degraded too much.
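As a rough illustration of monitoring "the amount of the change," here is a minimal sketch of one common drift check, the population stability index (PSI), which compares the score distribution at build time with the distribution on newer data. This is a generic technique, not how SAS Model Manager computes drift internally, and the 0.25 threshold is only a widely used rule of thumb:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare the score distribution at build time (`expected`)
    with the distribution on newer data (`actual`). Larger PSI
    means more drift between the two samples."""
    # Bin edges come from the development sample's quantiles
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # cover out-of-range values
    exp_pct = np.histogram(expected, edges)[0] / len(expected)
    act_pct = np.histogram(actual, edges)[0] / len(actual)
    # Floor empty bins at a tiny proportion to avoid log(0)
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Hypothetical data: newer scores have drifted upward
rng = np.random.default_rng(0)
dev_scores = rng.normal(0.0, 1.0, 5000)  # scores at model build time
new_scores = rng.normal(0.6, 1.0, 5000)  # shifted scores on newer data

psi = population_stability_index(dev_scores, new_scores)
# Rule of thumb: PSI above ~0.25 is often taken as a signal to refit
print(f"PSI = {psi:.3f}",
      "-> consider refitting" if psi > 0.25 else "-> stable")
```

A check like this only tells you that the input or score distribution has moved; pairing it with out-of-time performance metrics (as described above) tells you whether that movement actually hurts the model.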
I hope this helps!
Doug