
Machine Learning and Explainable AI in Forecasting - Part II

Started ‎02-23-2022 · Modified ‎02-23-2022

Explainability Methods in Forecasting


In this second part of our blog series, we focus on SHAP[1] as a local explanation method and on how to apply it to the analytical base table (ABT) we created from the time series data. As we mentioned in part I, the challenge we are facing is that this method works well when the variables are independent of each other, so we have to find a way to adapt our methodology to overcome this issue.
But before we start, let's give you a short introduction to Shapley Values.


What are Shapley Values?

The concept of Shapley Values comes from game theory[2], where economists tried to solve the problem of distributing an award among multiple team members.


How can we fairly attribute each member's contribution? The solution by Lloyd Shapley satisfies the following properties:

  • EFFICIENCY: All individual awards should add up to the total earnings
  • DUMMY: If including an individual brings no additional earnings in any situation, then this individual should receive zero award
  • SYMMETRY: If two individuals each add the same amount of additional earnings when included, then they should receive the same award
  • ADDITIVITY: If including individual A increases the earnings by the same amount as including the two other individuals B and C together, then A should receive the sum of B's and C's awards.
The Shapley Value is the ONLY solution that satisfies all constraints! It is based on a weighted marginal contribution of a member among all possible coalitions.

But wait, what is a coalition and what is a marginal contribution?


What would be the weighted marginal contribution among all possible coalitions? Here is an example Shapley Value for member A:



Written as formula:

φ(A) = Σ over all coalitions S that exclude A of  [ |S|! · (p − |S| − 1)! / p! ] · ( v(S ∪ {A}) − v(S) )

In the formula above, p is the total number of members, the sum runs over all coalitions S that exclude the member of interest A, |S| is the number of members in such a coalition, and v(S) is the earning achieved by coalition S.

The weight is inversely proportional to the size of a coalition “group” where each “group” includes all coalitions with the same number of members.

So, in our example above we have 4 groups:
  • Group 1: adding 0 other people, size 1
  • Group 2: adding 1 other person, size 3
  • Group 3: adding 2 other people, size 3
  • Group 4: adding 3 other people, size 1

Each group ends up having the same total weight of 1/4, and all weights add up to 1.
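You can verify these group weights with a few lines of code (Python, purely for illustration; the weight of a single coalition with S other members out of p total members is S!·(p − S − 1)!/p!):

```python
from math import comb, factorial

p = 4  # total number of members
group_totals = []
for s in range(p):  # s = number of *other* members in the coalition
    n_coalitions = comb(p - 1, s)  # how many coalitions of that size exist
    w = factorial(s) * factorial(p - s - 1) / factorial(p)  # weight per coalition
    group_totals.append(n_coalitions * w)
    print(f"group s={s}: {n_coalitions} coalitions, weight {w:.4f} each, group total {n_coalitions * w:.2f}")

print(sum(group_totals))  # all weights together add up to 1
```

Each of the four group totals comes out to 0.25, matching the 1/4 per group described above.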

This approach can be transferred to explain the prediction for a (local) observation. Each feature value of the observation is a member in a game where the prediction is the award.

The calculation of Shapley Values is computationally expensive as it requires the evaluation of the model with all possible coalitions/combinations of features. There are faster approximation methods available, like SHAP[1].
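To make the exhaustive coalition enumeration concrete, here is a brute-force computation for a hypothetical three-member game (a Python sketch; the payout table v is made up for illustration):

```python
from itertools import combinations
from math import factorial

members = ["A", "B", "C"]

# Hypothetical earnings v for every possible coalition (frozenset -> payout)
v = {frozenset(): 0, frozenset("A"): 10, frozenset("B"): 20, frozenset("C"): 30,
     frozenset("AB"): 40, frozenset("AC"): 50, frozenset("BC"): 60,
     frozenset("ABC"): 90}

def shapley(member):
    """Weighted marginal contribution of `member` over all coalitions."""
    p = len(members)
    others = [m for m in members if m != member]
    total = 0.0
    for s in range(p):  # s = size of the coalition excluding `member`
        for coalition in combinations(others, s):
            c = frozenset(coalition)
            weight = factorial(s) * factorial(p - s - 1) / factorial(p)
            total += weight * (v[c | {member}] - v[c])  # marginal contribution
    return total

values = {m: shapley(m) for m in members}
print(values)                # {'A': 20.0, 'B': 30.0, 'C': 40.0}
print(sum(values.values()))  # 90.0 = v(ABC), the EFFICIENCY property
```

Note that the loop visits all 2^(p−1) coalitions per member, which is exactly why this exact computation becomes infeasible for models with many features.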

This SHAP method is implemented in the SAS action linearExplainer which is one action of the Explain Model action set.

A good explanation of the SHAP method can be found in the book Interpretable Machine Learning.

In the next section we will explain how to adapt the linearExplainer action for time series data.

Using the linearExplainer Action for Time Series Data


The standard KERNELSHAP preset implementation of the SAS action follows these steps:

  1. Pick a single observation (query)
  2. Generate random observations by sampling from each variable's distribution separately
  3. Apply the model score code that was generated by a previous step to the new observations
  4. Weight the observations based on their coalitions
  5. Run a weighted linear regression on the model's predictions
  6. Interpret the linear regression model coefficients  
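The six steps above can be sketched end to end in a few lines. The snippet below is purely illustrative (Python with a toy linear model standing in for the scored ASTORE model; all names and data are made up, and the real SAS action works on CAS tables), but it shows the core mechanics: enumerate coalitions, perturb observations, score, weight, and regress:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1234)

# Toy black-box model and data (stand-ins for the ASTORE model and the ABT)
def model(X):
    return 3 * X[:, 0] - 2 * X[:, 1] + X[:, 2]

background = rng.normal(size=(200, 3))  # pool used to fill in "absent" features
query = np.array([1.0, 0.5, -1.0])      # step 1: the single observation to explain
p = query.size

# steps 2-4: enumerate coalitions z (excluding all-0/all-1, whose kernel weight is infinite)
masks, weights, preds = [], [], []
for bits in range(1, 2 ** p - 1):
    z = np.array([(bits >> i) & 1 for i in range(p)])
    X = background.copy()
    X[:, z == 1] = query[z == 1]          # query value where present, background where absent
    preds.append(model(X).mean())          # step 3: score the model on perturbed observations
    s = z.sum()
    weights.append((p - 1) / (comb(p, s) * s * (p - s)))  # step 4: Kernel SHAP weight
    masks.append(z)

Z, w, y = np.array(masks), np.array(weights), np.array(preds)
# steps 5-6: weighted linear regression of predictions on coalition indicators;
# the fitted coefficients are the (approximate) Shapley values
Zc = np.column_stack([np.ones(len(Z)), Z])
W = np.diag(w)
coef = np.linalg.solve(Zc.T @ W @ Zc, Zc.T @ W @ y)
print(coef[1:])  # one attribution per feature
```

For this linear toy model the regression recovers each feature's contribution exactly, i.e. the coefficient times the feature's deviation from the background mean.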

Because of the way we built our analytical base table (ABT), transforming the data from transactional form to one row per subject, our features are not independent of each other. For details, please refer to part I of our blog series. If we applied step 2, generating random observations, the dependency among the features would be lost.

To preserve the dependency structure, it is possible to suppress the random sampling process 😊. Here is an example code:
proc cas;
   explainModel.linearExplainer result=shapr /
      table           = {name='PRICEDATA_ID', caslib='PUBLIC'}
      query           = {name='QUERY', caslib='CASUSER'}
      modelTable      = {name='GB_PRICEDATA_MODEL_ID', caslib='MODELS'}
      modelTableType  = 'ASTORE'
      predictedTarget = 'P_sale'
      seed            = 1234
      preset          = 'KERNELSHAP'
      dataGeneration  = {method='None'}
      inputs          = {{name="sale_lag3"},
                         {name="sale_lag2"},
                         {name="sale_lag1"},
                         {name="discount"},
                         {name="price"}};
run;
So, by adding the option "dataGeneration = {method='None'}", random sampling is suppressed and the model's score code is applied to the original observations.

This preserves the feature dependencies and lets you explain the predictions of machine learning models like Gradient Boosting or other tree-based algorithms like LightGBM in our forecasting case.
However, please note that 
  • the accuracy depends on how well the original data cover the coalitions,
  • the Shapley values of highly correlated features may bleed into each other,
  • this method can be seen as approximation of the Shapley coalition/cohort values in [3].

Note: If you are interested in a global explanation of your machine learning model for time series data, you can just adapt the preset parameter to 'GLOBALREG' to create a surrogate model for a global explanation of your model.
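The general idea behind such a global surrogate can be sketched as follows (a Python illustration only, not the SAS implementation; the black_box function and the plain least-squares surrogate are assumptions made for the example). An interpretable model is fit to the black-box model's predictions, and its coefficients are read as global effects:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy non-linear "black box" standing in for a gradient boosting model
def black_box(X):
    return np.tanh(X[:, 0]) + 0.5 * X[:, 1] ** 2

X = rng.normal(size=(500, 2))
y_hat = black_box(X)  # explain the model's predictions, not the true target

# Global surrogate: ordinary least squares regression on the predictions
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y_hat, rcond=None)
print(coef)  # intercept plus one global "effect" per feature
```

Here the surrogate attributes a clear positive global effect to the first feature (tanh is monotonically increasing), while the symmetric quadratic effect of the second feature averages out to roughly zero, a known limitation of linear surrogates.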

In our third and last part of this blog series, we will show how to explain forecasting models globally and locally in an application.
See you in part III!


[2] Shapley, Lloyd S. “A value for n-person games.” Contributions to the Theory of Games 2.28 (1953): 307-317




