Mitigating Bias Using SAS Viya Fair AI Tools

Background

Issues with biased data collection and bias magnification by algorithms have plagued AI since its inception. Biased collection has been well understood since the study of statistics began. For example, data collected on only one segment of the people in one country may not extrapolate well to the full population of 7.9 billion people on Earth.

Historic biases can also be perpetuated through automated algorithms. Here’s one example.

The Diners Club cardboard credit card started in the United States in 1949. It was available only to a select group of 200 white men, essentially friends of the founders. Two years later, membership had grown to 42,000. Plastic credit cards issued by banks came out in 1958. Yet it wasn’t until the mid-1970s that women and minorities were able to get credit cards on their own. Even then, it was still much more difficult for them to get credit because, as a group, they lacked any credit history. Formulas that considered credit history were biased against those who had never been permitted to build one.

SAS Viya has been able to assess bias for a while. And even better, SAS Viya can now help you mitigate bias inside your modeling process! SAS Viya lets you:

Assess a sensitive variable for bias using the SAS Model Studio interface
Assess a sensitive variable for bias by programming in CASL, Lua, Python or R
Mitigate bias by programming in CASL, Lua, Python or R (NOTE: Adding this capability to SAS Model Studio is on the Roadmap!)
Explore your data to find bias using SAS Visual Analytics

Understanding the Fairness Metrics Used to Assess and Mitigate Bias in SAS Viya

The plots on the Fairness and Bias tab in the Results section highlight potential differences in model performance for different groups within specified sensitive variables.

The assessBias action provides a number of useful measures of bias. These include performance bias, performance bias parity, prediction bias, and prediction bias parity.

The mitigateBias action addresses bias using the exponentiated gradient reduction (EGR) algorithm. EGR runs multiple iterations. In each iteration, it can reweight and relabel the data and then train a new classification model. In this way, it incorporates fairness constraints directly into the modeling process!
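To give a feel for the iteration described above, here is a minimal plain-Python sketch of the multiplier update at the heart of the published exponentiated gradient reduction approach. It is not the SAS implementation: the function names, the two-constraint setup, and the example violation values are all assumptions made for illustration, and how the action's learningRate and bound settings map onto these quantities is not confirmed by this article.

```python
import math

def multipliers(theta, bound):
    """Turn running scores (theta) into non-negative constraint weights.

    The weights always sum to less than `bound`, so no single fairness
    constraint can dominate without limit. (Hypothetical helper.)
    """
    denom = 1.0 + sum(math.exp(t) for t in theta)
    return [bound * math.exp(t) / denom for t in theta]

def update_theta(theta, violations, learning_rate):
    """Raise the score of each constraint in proportion to its violation."""
    return [t + learning_rate * v for t, v in zip(theta, violations)]

# Two constraints, e.g. "group A selected too often" and "group B selected too often".
theta = [0.0, 0.0]
lam_start = multipliers(theta, bound=3.0)

# Assumed violations measured on the model trained in this iteration:
# the first constraint is violated, the second has slack.
theta = update_theta(theta, [0.2, -0.2], learning_rate=1.0)
lam_next = multipliers(theta, bound=3.0)

print(lam_start)  # equal weights before any violation is seen
print(lam_next)   # the violated constraint now carries more weight
```

In the full algorithm, these weights would drive the reweighting and relabeling of the training rows before the next model is fit; that step is omitted here to keep the sketch short.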

The SAS Viya mitigateBias action with EGR supports a number of fairness measures: demographic parity, equalized opportunity and equalized odds. It will take these measures into account in finding your best model, and will provide you detailed results of not only the model performance but also the bias metrics.

Let’s explain some of the fairness statistics used to assess and mitigate bias by using a simple example. Say we are using an algorithm to determine hiring rates. We want to check for any bias related to whether someone is Latinx or not. Let’s consider a binary sensitive variable LatinxStatus, which can be either LATINX or NONLATINX.

Performance bias parity compares the fit statistics of the model for the LATINX group versus the NONLATINX group. If the model fits one group well but not the other, that is a problem. Try some other models to find one that works well for both groups. It’s easy to compare models with SAS Viya, the fastest, most productive AI and analytics platform.
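As a rough illustration of comparing fit by group, the sketch below computes a misclassification rate for each group and the gap between them. The choice of misclassification rate, the function names, and the toy data are assumptions for this example; the assessBias action reports its own set of fit statistics.

```python
def misclassification_rate(actual, predicted):
    """Share of rows where the predicted label differs from the actual label."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

def performance_gap(actual, predicted, groups):
    """Per-group misclassification rates and the largest difference between them."""
    rates = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        rates[g] = misclassification_rate(
            [actual[i] for i in idx], [predicted[i] for i in idx]
        )
    return rates, max(rates.values()) - min(rates.values())

# Toy data (made up): 1 = hired, 0 = not hired
groups    = ["LATINX"] * 4 + ["NONLATINX"] * 4
actual    = [1, 0, 1, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 1]

rates, gap = performance_gap(actual, predicted, groups)
print(rates, gap)
```

A large gap suggests the model serves one group noticeably worse than the other.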

Prediction bias parity compares the average predicted probability of hire between the LATINX and NONLATINX groups. The measure is the difference between those two probabilities. This should be low; the best model would not show different average predictions among groups if the sensitive variable (or its surrogates…but that’s a topic for another day) does not play a role in predicting hiring.
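A small sketch of this comparison, assuming we already have a predicted probability of hire for each applicant. The function name and the probabilities are invented for illustration and are not output from SAS Viya.

```python
from statistics import mean

def prediction_gap(probabilities, groups):
    """Average predicted probability per group and the difference between groups."""
    by_group = {}
    for p, g in zip(probabilities, groups):
        by_group.setdefault(g, []).append(p)
    means = {g: mean(values) for g, values in by_group.items()}
    return means, max(means.values()) - min(means.values())

# Toy predicted probabilities of hire (made up)
groups = ["LATINX"] * 4 + ["NONLATINX"] * 4
probs  = [0.5, 0.25, 0.75, 0.5, 0.75, 0.5, 1.0, 0.75]

means, gap = prediction_gap(probs, groups)
print(means, gap)
```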

Demographic parity compares the selection rate of each category (LATINX and NONLATINX) of the sensitive variable LatinxStatus. The measure is the difference between those selection rates. Demographic parity can help an organization balance out historical biases that impact the data.
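The selection-rate comparison can be sketched in a few lines, taking "selected" to mean the model predicts a hire. The helper name and the toy decisions are assumptions for illustration only.

```python
def selection_rates(decisions, groups):
    """Share of each group that the model selects (decision == 1)."""
    rates = {}
    for g in sorted(set(groups)):
        picked = [d for d, x in zip(decisions, groups) if x == g]
        rates[g] = sum(picked) / len(picked)
    return rates

# Toy model decisions (made up): 1 = predicted hire, 0 = predicted no hire
groups    = ["LATINX"] * 4 + ["NONLATINX"] * 4
decisions = [1, 0, 0, 0, 1, 1, 0, 0]

rates = selection_rates(decisions, groups)
demographic_parity_gap = abs(rates["LATINX"] - rates["NONLATINX"])
print(rates, demographic_parity_gap)
```

Note that this measure looks only at the model's decisions, not at whether those decisions were correct.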

Equalized opportunity compares the true positive rate for the LATINX group versus the NONLATINX group. Be sure you are getting a high true positive rate for each group.
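To make the true positive rate comparison concrete, here is a minimal sketch that computes it per group from actual and predicted outcomes. The function name and data are hypothetical.

```python
def true_positive_rate(actual, predicted):
    """Of the people who truly were hired, the share the model predicted as hires."""
    predictions_for_positives = [p for a, p in zip(actual, predicted) if a == 1]
    return sum(predictions_for_positives) / len(predictions_for_positives)

# Toy data (made up): 1 = hired, 0 = not hired
latinx_actual,    latinx_pred    = [1, 1, 0, 0], [1, 0, 0, 0]
nonlatinx_actual, nonlatinx_pred = [1, 1, 0, 0], [1, 1, 1, 0]

tpr_latinx    = true_positive_rate(latinx_actual, latinx_pred)
tpr_nonlatinx = true_positive_rate(nonlatinx_actual, nonlatinx_pred)
equal_opportunity_gap = abs(tpr_latinx - tpr_nonlatinx)
print(tpr_latinx, tpr_nonlatinx, equal_opportunity_gap)
```

Here the model finds every qualified NONLATINX applicant but only half of the qualified LATINX applicants, which is exactly the kind of gap this measure is meant to surface.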

Equalized odds looks at the maximum difference in true positive rate (TPR) OR false positive rate (FPR) between the NONLATINX and LATINX groups. You want to be sure that both the TPR and the FPR are similar between the two groups.
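The "maximum of the two differences" idea can be sketched as follows. The helper name and the toy data are assumptions made for this example, not SAS Viya output.

```python
def rate_of_predicted_positives(actual, predicted, actual_value):
    """Share predicted positive among rows whose actual label equals actual_value.

    actual_value=1 gives the true positive rate; actual_value=0 gives the
    false positive rate.
    """
    selected = [p for a, p in zip(actual, predicted) if a == actual_value]
    return sum(selected) / len(selected)

# Toy data (made up): 1 = hired, 0 = not hired; both groups share the same actuals
actual         = [1, 1, 1, 1, 0, 0, 0, 0]
latinx_pred    = [1, 1, 1, 0, 0, 0, 0, 0]
nonlatinx_pred = [1, 1, 1, 1, 1, 1, 0, 0]

tpr_gap = abs(rate_of_predicted_positives(actual, latinx_pred, 1)
              - rate_of_predicted_positives(actual, nonlatinx_pred, 1))
fpr_gap = abs(rate_of_predicted_positives(actual, latinx_pred, 0)
              - rate_of_predicted_positives(actual, nonlatinx_pred, 0))

# Equalized odds takes the worse (larger) of the two gaps
equalized_odds_gap = max(tpr_gap, fpr_gap)
print(tpr_gap, fpr_gap, equalized_odds_gap)
```

In this toy case the false positive rates differ more than the true positive rates, so the FPR gap determines the equalized odds value.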

Watch how this works in action in SAS Viya:

To see a demonstration of assessing bias using programming, watch my December 2021 YouTube video, SAS Fair Artificial Intelligence Tools.
To see an example of assessing bias using SAS Model Studio (pipeline interface) and mitigating bias using CASL in SAS Studio, watch my April 2023 YouTube video, mitigateBias with SAS.

The screen capture below shows how sensitive variables can be designated in the Data pane of SAS Model Studio.

The screen capture below shows example results of the assessBias action with graphs of performance bias and performance bias parity, prediction bias and prediction bias parity, and bias metrics and bias parity metrics.

CODE examples:

Here is the code I used with screen shots of the results interspersed:

/* BethHeartSexMITIGATE */

/* Start a CAS session and assign a libref to the CASUSER caslib */
cas MySession sessopts=(caslib=casuser timeout=1800 locale="en_US");
libname casuser cas caslib="casuser";

/* Load SASHELP.HEART into CAS, dropping any earlier copy first */
proc casutil;
   droptable casdata="heartone" incaslib="casuser" quiet;
   load data=sashelp.heart outcaslib="casuser"
        casout="HEARTone" promote;
run;
quit;

proc contents data=casuser.heartone;
run;

proc print data=casuser.heartone (obs=100);
run;

/* Keep only the two causes of death used as target levels */
data casuser.hearttwo;
   set casuser.heartone;
   where DeathCause = "Cancer" or DeathCause = "Coronary Heart Disease";
run;

/* Shorten the label so the event level is simply "Heart" */
data casuser.heartthree;
   set casuser.hearttwo;
   if DeathCause = "Coronary Heart Disease" then DeathCause = "Heart";
run;

proc print data=casuser.heartthree (obs=50);
run;

Train a Gradient Boosting Model

proc cas;
   /* Train a gradient boosting model and save it as an analytic store */
   decisionTree.gbtreeTrain /
      inputs={"Systolic", "Diastolic", "Weight", "Height", "Sex"},
      maxLevel="5",
      saveState={name="gbtreeASTORE", replace="True"},
      seed=1234,
      table="HEARTthree",
      target="DeathCause";
run;

Assess Bias

proc cas;
   /* Score with the saved model and compare results across levels of Sex */
   fairAITools.assessBias /
      modelTable="gbtreeASTORE",
      modelTableType="ASTORE",
      event="Heart",
      predictedVariables={"P_DeathCauseCancer", "P_DeathCauseHeart"},
      response="DeathCause",
      responseLevels={"Cancer", "Heart"},
      sensitiveVariable="Sex",
      table="HEARTthree";
run;

Mitigate Bias

proc cas;
   /* Retrain the model repeatedly under a demographic parity constraint. */
   /* trainProgram is the CASL code that mitigateBias runs in each iteration. */
   fairAITools.mitigateBias /
      biasMetric="DEMOGRAPHICPARITY",
      event="Heart",
      learningRate="0.01",
      maxIters="10",
      predictedVariables={"P_DeathCauseCancer", "P_DeathCauseHeart"},
      response="DeathCause",
      responseLevels={"Cancer", "Heart"},
      sensitiveVariable="Sex",
      table="HEARTthree",
      tolerance="0.005",
      trainProgram="
         decisionTree.gbtreeTrain result=train_res /
            table=table,
            weight=weight,
            target=""DeathCause"",
            inputs= {
               ""Systolic"", ""Diastolic"", ""Weight"", ""Height"", ""Sex""
            },
            nominals={""DeathCause"",""Sex""},
            nBins=50,
            quantileBin=True,
            maxLevel=5,
            maxBranch=2,
            leafSize=5,
            missing=""USEINSEARCH"",
            minUseInSearch=1,
            binOrder=True,
            varImp=True,
            mergeBin=True,
            encodeName=True,
            nTree=15,
            seed=12345,
            ridge=1,
            savestate={
               name=""HEART_gb_astore"",
               replace=True
            }
         ;
         astore.score result=score_res /
            table=table,
            casout=casout,
            copyVars=copyVars,
            rstore=""HEART_gb_astore""
         ;
      ",
      tuneBound="True";
run;

A number of additional code samples are available in the documentation.

These code examples allow for various combinations of feature (input) and target (outcome) types, as shown in the slide image below, and repeated as text for anyone who cannot read the image:

Assess bias:
- For nominal target
- For interval target
- Using DATA step or DS2 scoring
- Using prescored data and reference level
- With no response variable
Mitigate bias:
- Mitigate demographic parity for a binary target with a binary or nominal sensitive input
- Mitigate equal opportunity for a binary target with a binary or nominal sensitive input

Always Always Always Explore Your Data

In this blog I’ve shown you how to assess and mitigate bias as part of the modeling process using coding and SAS Model Studio. But remember how much you can learn about your data and how much time and headaches you can save yourself down the road if you first explore your data using SAS Visual Analytics. Here’s an example using the same HEART data I used in the mitigateBias video I pointed you to.

(view in My Videos)

Release History

The assessBias action became available via programming (in CASL, Lua, Python or R) in October 2021. The following month (November 2021) the Fairness and Bias tab became available in SAS Model Studio. Most recently, in October 2022, the mitigateBias action became available via programming (in CASL, Lua, Python or R). This information is also shown in the table below.

FOR MORE INFORMATION

Videos

Tamara Fischer and Veronique Van Vlasselaer’s webinar on Creating Fair Machine Learning Models
Hiwot Tesfaye’s video Responsible AI in Practice using SAS Visual Analytics (length 22 minutes)
Joe Madden’s video SAS Viya 2022.10 Release October 2022 (very brief overview, starting at minute 1:57)

Blogs

Jagruti Kanjia and Ricky Tharrington’s blog Assess model bias in SAS Viya 4 (May 13, 2022)
Suneel Grover’s blog AI/ML Bias Detection and Mitigation in Customer Analytics

Find more articles from SAS Global Enablement and Learning here.
