
SAS for AI/ML Bias Detection and Mitigation in Customer Analytics


As marketers and advertisers use artificial intelligence (AI) and machine learning (ML) to elevate their brands, understanding how to detect and mitigate bias within predictive analytics is crucial. AI now plays a growing role in advancing the marketing and advertising mandate. The digital ecosystem constantly generates data on real-time customer engagement behavior, web and mobile interactions, and incremental revenue streams year after year. This gives brands the chance to apply a variety of analytical techniques to surface "ah-ha" insights for segmentation, targeting, recommendations, next-best-experiences and other forms of customer interaction.

 

At a fundamental level, AI/ML helps brands sharpen their view of customers and, subsequently, how they run their business. When using data science for these purposes, brands place immense trust in their modeling IP to provide useful insights that influence marketing treatments. Keep in mind that not everyone has the same level of data and analytical literacy, and some may view AI/ML as infallible innovation, free from the mistakes commonly attributed to human-driven decisions.

 

Image 1 - Fact or Fake

 

I hate to spoil the fun, but we need to be more prudent as practitioners. AI/ML can contain bias. Instead of ushering in a utopian era of fair decisions, AI/ML has the potential to exacerbate the impact of biases. As innovations help with everything from identifying attractive prospects to predicting who should receive a marketing stimulus, it is important to understand that every modeling application can affect separate segments of a customer population differently. When applied in martech, biased AI/ML can negate efforts to learn, understand and anticipate consumer behavior. Brands should improve their understanding of how AI bias affects them, how to detect it and, ultimately, how to mitigate it.

 

I'm a Marketer. Not a Data Scientist. Why Should I Care?

 

There are numerous ways in which bias can slip into customer data. Although marketers themselves may not build analytical models, it's hard to find a use case these days that doesn't benefit from propensities or probabilities. There are many perspectives to take into account when describing bias in data science: bias can enter during data collection, data processing, sampling, model building, and so on. When AI/ML is applied to inaccurate data, it can magnify the errors and introduce unintended bias into campaigns, personalization or testing. Beyond that, bias can severely impact KPIs, for example by failing to reach the correct audience or serving the wrong offers to a particular demographic. Ultimately, this means wasted money and resources, failure to reach relevant customers, and potential harm to a brand's reputation.

 

Perhaps I have your attention now? While most brands readily promote fairness in AI/ML as a principle, putting processes in place to apply it consistently remains an ongoing challenge. There are multiple dimensions for evaluating the fairness of AI/ML, and determining the correct approach depends on the use case.

 

As a practitioner of data-driven methods in martech, I find it astounding that in 2023 numerous brands are still acknowledging a trust problem with AI/ML. Left unaddressed, that distrust costs brands the innovation and derived insights that advanced analytical methods bring to decision-making. In short, I see two parallel needs at the moment:

 

  • Data scientists and analysts need to continue translating the output of AI/ML into business language and storytelling to reduce stakeholder intimidation. Remember, if your models are not put into action, what was the point of your effort?
  • Marketing and CX decision makers may not be passionate about statistics, but nearly every use case can be elevated through the use of propensities and probabilities. While AI/ML is marketed as next-level precision, it is never 100% accurate. Therefore, the translation of propensities and probabilities into business context must be interrogated, transparent and understood.

 

Detecting Bias for Fair AI/ML

 

In predictive modeling, bias occurs when a model’s prediction or performance differs for unique values of a given variable, referred to here as the "sensitive" variable.  Prediction bias is measured by calculating the difference in average model prediction for values of the sensitive variable. Performance bias is measured by calculating the difference in model performance for values of the sensitive variable.
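Concretely, in my own notation (not SAS's): if $\bar{p}_g$ denotes the average predicted event probability for the observations in group $g$ of the sensitive variable, then

$$ \text{prediction bias} = \max_{a,\,b} \lvert \bar{p}_a - \bar{p}_b \rvert $$

and performance bias is defined analogously, with $\bar{p}_g$ replaced by a performance metric (such as the true positive rate) computed within each group.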

 

SAS provides the assessBias action, which calculates performance statistics and average model predictions for each level of a nominal (categorical) variable. The difference in these statistics is then reported either as the maximum pairwise difference among all levels or as the difference of each level from a reference group, yielding a single number that represents model bias.
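For those who prefer code over clicks, the same assessment is exposed through the fairAITools action set. Here is a minimal PROC CAS sketch, assuming a scored table named HMEQ_SCORED with predicted probability columns P_BAD0 and P_BAD1 and an actual target BAD; the table and column names are my assumptions, and the exact parameter list can vary by release, so treat the fairAITools documentation as the authority:

/* Hedged sketch: assess bias on a scored table.         */
/* HMEQ_SCORED, P_BAD0/P_BAD1 and BAD are assumed names. */
proc cas;
   fairAITools.assessBias /
      table              = "HMEQ_SCORED",        /* scored input table            */
      modelTableType     = "NONE",               /* predictions already in table  */
      response           = "BAD",                /* actual target: 1 = default    */
      responseLevels     = {"0", "1"},
      predictedVariables = {"P_BAD0", "P_BAD1"}, /* one column per response level */
      sensitiveVariable  = "REASON";             /* variable assessed for bias    */
run;
quit;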

 

Turning this feature on is straightforward. As analysts prepare to build a model in SAS, they simply indicate which variable(s) they would like assessed for bias. This can be done in a single click.

 

Image 2 - Detect Bias Feature in SAS

 

For the purposes of this example, we'll create a simple pipeline with two Gradient Boosting models. The model node on the left will be used for bias detection.

 

Image 3 - Model Pipelining Example - Assessing for Bias

 

Once the pipeline has been trained, the auto-generated Bias Assessment report can be viewed by opening the model’s results and selecting “Fairness and Bias.”

 

Image 4 - Fairness & Bias Report

 

There are multiple plots in Image 4, and we’ll review each of them one at a time. Before we interpret, let's summarize the modeled data. It represents a financial services brand concerned with extending loans to its customers and the potential for default. The bias assessment will focus on a variable named REASON, which indicates why the customer applied for a loan. Its values are Debt Consolidation, Home Improvement or Missing.
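Before interpreting the plots, it can help to profile the sensitive variable directly. A quick sketch, assuming the input table is named HMEQ (the classic SAS home-equity sample; the article does not name the table):

/* Hedged sketch: cross-tabulate the sensitive variable and the target. */
proc freq data=hmeq;
   tables reason * bad / missing;   /* MISSING keeps blank REASON values visible as a level */
run;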

 

Image 5 - Prediction Bias Assessment

 

Prediction bias represents how much higher, on average, the model's predicted probability of the event is for one group than for another. In this modeling project, the target event level is “1”, indicating that a customer defaulted on their loan. The bars in this plot represent the target event's average predicted probability for each level of the variable REASON. Large differences in bar size indicate that the model predicts the event at considerably different rates for unique levels of REASON, and analysts should be aware of this before using the model.
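These bars can be reproduced from a scored table by averaging the predicted event probability within each group. A sketch, again assuming the hypothetical scored table HMEQ_SCORED with predicted probability column P_BAD1:

/* Hedged sketch: average predicted default probability by REASON level. */
proc means data=hmeq_scored mean maxdec=4;
   class reason / missing;   /* one row per REASON level, including Missing  */
   var p_bad1;               /* predicted probability of the event (default) */
run;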

 

The prediction bias plot in Image 5 shows that the average predicted probability for loans with a REASON of Home Improvement is higher than for Debt Consolidation, meaning that the model predicts a higher default probability for loans related to Home Improvement.

 

Image 6 - Prediction Bias Parity

 

The bar in the plot of Image 6 represents the maximum pairwise difference in the target event's average predicted probability between levels of the REASON variable. It shows that the maximum prediction difference is 7.6%, which occurs between the levels Home Improvement and Missing.

 

Image 7 - Performance Bias Plot

 

Performance bias represents how accurate the model is for one group compared to another. The bars in this plot, displayed in Image 7, represent the values of performance (or accuracy) metrics for each level of the sensitive variable REASON. The performance metrics included here are Multi-Class Log Loss (MCLL), True Positive Rate (TPR), and Best Kolmogorov-Smirnov Along ROC (maxKS). Higher values of TPR and maxKS, and lower values of MCLL, indicate a better fit. Large differences in bar size for any specific metric indicate that model performance (accuracy) is not consistent across all levels of the variable REASON, and analysts should be aware of this before using the model.
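Any of these per-group metrics can also be checked by hand. For example, here is a sketch computing TPR per REASON level, assuming hypothetical columns BAD (actual flag, numeric 0/1) and I_BAD (predicted class label) in the scored table:

/* Hedged sketch: True Positive Rate (TPR) within each REASON level. */
proc sql;
   select reason,
          sum(bad = 1 and i_bad = '1') / sum(bad = 1)
             as tpr format=percent8.1   /* true positives / actual positives */
   from hmeq_scored
   group by reason;
quit;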

 

Image 8 - Performance Bias Parity

 

The bars in this plot of Image 8 represent the maximum pairwise difference in performance metrics between levels of the REASON variable. For example, the bar for maxKS shows that the greatest difference in maxKS is 11.3%, which occurs between the levels Missing and Debt Consolidation. The maximum pairwise difference in True Positive Rate (TPR) is also known as the "Equal Opportunity" difference.
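In the same notation used earlier, the equal opportunity difference is

$$ \max_{a,\,b} \lvert \mathrm{TPR}_a - \mathrm{TPR}_b \rvert $$

where $\mathrm{TPR}_g$ is the true positive rate computed within group $g$ of the sensitive variable.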

 

Mitigating Bias for Fair AI/ML

 

Now that we have taken an introductory tour of assessing bias for sensitive variables, let's pivot and discuss mitigation. 

With the widespread use of AI/ML models in decision-making processes, there is increasing interest in the impact of these models. Bias can be embedded in both the training data and the trained model, either of which can lead to unfair or biased predictions for certain groups or populations. To reduce bias in the data and models, three categories of bias mitigation methods can be considered.

 

  • Preprocess methods, which transform the data prior to model training. These methods try to eliminate the correlations between the input features and the sensitive feature while retaining as much information from the input data as possible (a simple illustration follows this list).

  • In-process methods, which consider fairness constraints during the model training process. They actively adjust the model parameters during training in order to create a model that produces fair predictions and classifications.

  • Postprocess methods, which do not directly alter the input data or the model training process. They adjust the predictive outputs from the learned models to compensate for the bias in the model.
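As one concrete illustration of the preprocess idea (a classical decorrelation trick, not a specific SAS fairness feature): regress an input feature on the sensitive variable and carry the residual forward in place of the original feature, removing that feature's linear association with the sensitive groups. A minimal sketch, assuming a table named HMEQ with a numeric feature LOAN (both names are my assumptions):

/* Hedged sketch of a preprocess step: residualize LOAN against REASON. */
proc glm data=hmeq;
   class reason;                        /* sensitive variable                  */
   model loan = reason;                 /* explain LOAN by REASON              */
   output out=hmeq_prep r=loan_resid;   /* LOAN_RESID replaces LOAN downstream */
run;
quit;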

 

SAS provides the mitigateBias action, which implements the exponentiated gradient reduction (EGR) algorithm, an in-process method that considers fairness constraints during model training. The EGR algorithm has three main advantages.

 

  • First, it is model-agnostic, which means that analysts can apply it to a wide variety of ML models. The EGR algorithm works like a wrapper built around existing ML models. The wrapper iteratively reweights the data and then trains a new classifier on the reweighted data in each iteration. By doing so, it reduces the complex optimization problem of training an ML model with fairness constraints to a series of standard model training problems without the fairness constraints.
  • Second, the EGR algorithm supports various fairness measurements, such as demographic parity, equal opportunity, and equalized odds.
  • Finally, the EGR algorithm has strong theoretical support in terms of its optimization trade-off between fairness and accuracy. In theory, the EGR algorithm returns solutions that satisfy all the fairness constraints and also achieve the optimal error rate of the best fair solution.

 

Training a fair ML model is in general a nonlinear programming problem. In nonlinear optimization tasks, it isn't uncommon to get suboptimal solutions because the constraints are infeasible or because the algorithm is stuck in a local minimum. As with many other ML algorithms, the effectiveness of the EGR algorithm depends on many factors, such as the quality of the input data, the selection of fairness constraints, the quality of the ML models that it wraps around, and the values of the hyperparameters.

 

The mitigateBias action supports multilevel nominal target variables and multilevel nominal sensitive variables. Note that when analysts include a large number of levels in the sensitive variable, the EGR algorithm searches more complex optimization scenarios, which can result in suboptimal solutions. Therefore, the solutions from the EGR algorithm might not always be desirable when the number of levels in the sensitive variable is high. You should review and validate your sensitive variable, training models, and fairness constraints before applying the EGR algorithm.

 

In Image 5 above, we observed that the average predicted default probability for loans with a REASON of Home Improvement is higher than for Debt Consolidation. Let's apply the mitigateBias action and review the results.
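For programmatic users, the mitigation step follows the same action-set pattern as assessBias. The sketch below is heavily hedged: the fairAITools.mitigateBias action is real, but the parameter list shown is illustrative (in practice the action also needs to know how to train the underlying model and which fairness constraint to enforce), so treat it as a shape rather than a signature:

/* Hedged sketch: mitigate bias with EGR. Parameter names are assumed; */
/* consult the fairAITools documentation for the exact signature.      */
proc cas;
   fairAITools.mitigateBias /
      table             = "HMEQ",              /* training data (assumed name)         */
      response          = "BAD",
      responseLevels    = {"0", "1"},
      sensitiveVariable = "REASON",            /* groups whose predictions to equalize */
      biasMetric        = "demographicParity", /* assumed fairness constraint          */
      maxIters          = 10;                  /* assumed cap on EGR iterations        */
run;
quit;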

 

Image 9 - Model Pipelining Example - Mitigating for Bias

 

We simply shift our focus from the left modeling node to the right modeling node (which was run in parallel) to obtain comparative results. Just as in the previous example, once the pipeline has been trained, the auto-generated Bias Assessment report can be used to review the impact of the mitigateBias action by opening the model’s results and selecting “Fairness and Bias.”

 

Image 10 - Prediction Bias Assessment After Mitigation

 

Readers will now observe that the prediction bias plot in Image 10 shows loans with a REASON of Home Improvement or Debt Consolidation producing similar results. Assuming we are satisfied with the results as analysts, we can pivot to topics like model interpretability by reviewing PD, ICE, LIME Explanation and HyperSHAP Value plots.

 

Image 11 - Model Interpretability

 

...as well as general model assessment for Lift, Gain, ROC, F1, Accuracy and other essential diagnostics.

 

Image 12 - Model Assessment

 

Effective risk management is increasingly being brought to the front line rather than functioning in the back office. When using advanced analytics, it's essential to understand and measure fairness risk to avoid exploiting vulnerable customers.

 

For those who prefer to see live demos, check this out:

 

 

Although our demo story in this article focused on one example of a sensitive variable, consider other candidate variables that a brand could (and should) assess. Creating frameworks and processes to mitigate bias and address fairness risk means the practice can be extended to other models with rigor in the future.

 

As stated earlier, if your models are not put into action across the touchpoints that are relevant to your brand's strategy in orchestrating customer journeys...

 

Image 13 - SAS Customer Intelligence 360 - Multichannel Orchestration

 

What was the point of your effort? Don't let misaligned perceptions and distrust among business stakeholders hold back the innovation that AI/ML can bring to martech.

 

We look forward to what the future brings in our development process as we enable technology users to access the most recent SAS analytical developments. Learn more about how SAS can be applied for customer analytics, journey personalization and integrated marketing here.
