
Mitigating Bias Using SAS Viya Fair AI Tools

Started ‎04-21-2023 | Modified ‎04-21-2023

Background

 

Issues with biased data collection and bias magnification by algorithms have plagued AI since its inception. Biased collection has been well understood since the study of statistics began. For example, data collected on only one segment of the people in one country may not extrapolate well to the full population of 7.9 billion people on earth.

 

Historic biases can also be perpetuated through automated algorithms. Here’s one example.

 

The Diners Club cardboard credit card launched in the United States in 1949. It was available only to a select group of 200 white men, essentially friends of the founders. Two years later membership had grown to 42,000. Plastic credit cards issued by banks arrived in 1958. Yet it wasn’t until the mid-1970s that women and minorities were able to get credit cards on their own. Even then it remained much more difficult for them to get credit, because as a group they lacked any credit history. Formulas that considered credit history were biased against those who had never been permitted to build one.

 

SAS Viya has been able to assess bias for a while. And even better, SAS Viya can now help you mitigate bias inside your modeling process! SAS Viya lets you:

 

  1. Assess a sensitive variable for bias using the SAS Model Studio interface
  2. Assess a sensitive variable for bias by programming in CASL, Lua, Python or R
  3. Mitigate bias by programming in CASL, Lua, Python or R (NOTE: Adding this capability to SAS Model Studio is on the roadmap!)
  4. Explore your data to find bias using SAS Visual Analytics

 

Understanding the Fairness Metrics Used to Assess and Mitigate Bias in SAS Viya

 

The plots on the Fairness and Bias tab in the Results section highlight potential differences in model performance for different groups within specified sensitive variables.

 

The assessBias action provides a number of useful measures of bias. These include performance bias, performance bias parity, prediction bias, and prediction bias parity.

 

image001.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

The mitigateBias action addresses bias using the exponentiated gradient reduction (EGR) algorithm. EGR runs multiple iterations; in each iteration it can reweight and relabel the data and then train a new classification model. So it actually incorporates fairness constraints within the modeling process!
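To build intuition for the reweight-and-retrain loop, here is a heavily simplified, pure-Python sketch. It is not the SAS implementation: the synthetic data, the toy threshold "classifier," and the single multiplier are all invented simplifications, and a real EGR run maintains a full vector of Lagrange multipliers and can also relabel examples.

```python
import math
import random

random.seed(0)

# Synthetic data: score x, binary sensitive group a, binary label y.
# Group a=1 has systematically lower scores, so a plain threshold
# model selects group a=0 more often -- a demographic-parity gap.
n = 2000
data = []
for _ in range(n):
    a = random.randint(0, 1)
    x = random.gauss(1.0 - a, 1.0)
    y = 1 if x + random.gauss(0.0, 0.5) > 0.5 else 0
    data.append((x, a, y))

def train_weighted(data, weights):
    """'Train' a classifier: pick the cutoff t minimizing the
    weighted 0/1 error of the rule 'predict 1 if x > t'."""
    best_t, best_err = 0.0, float("inf")
    for i in range(50):
        t = i / 10.0 - 2.0
        err = sum(w for (x, a, y), w in zip(data, weights)
                  if (1 if x > t else 0) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def selection_rates(data, t):
    """Per-group selection rates of the rule 'predict 1 if x > t'."""
    sel, tot = {0: 0, 1: 0}, {0: 0, 1: 0}
    for x, a, y in data:
        sel[a] += 1 if x > t else 0
        tot[a] += 1
    return sel[0] / tot[0], sel[1] / tot[1]

# Exponentiated-gradient loop: a multiplier lam tracks how badly the
# demographic-parity constraint is violated and is updated
# multiplicatively; the weights it induces upweight positive examples
# in the under-selected group before the next retrain.
lam, eta = 1.0, 2.0
weights = [1.0] * n
for _ in range(10):
    t = train_weighted(data, weights)
    r0, r1 = selection_rates(data, t)
    gap = r0 - r1                       # constraint violation
    lam *= math.exp(eta * gap)          # exponentiated-gradient update
    weights = [lam if (a == 1 and y == 1) else 1.0
               for x, a, y in data]
```

The key idea to take away is the multiplicative (exponentiated) update of the fairness multiplier and the retraining on reweighted data at every iteration.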

 

The SAS Viya mitigateBias action with EGR supports a number of fairness measures: demographic parity, equal opportunity and equalized odds. It takes these measures into account in finding your best model, and provides detailed results covering not only model performance but also the bias metrics.

 

image003.png

 

Let’s explain some of the fairness statistics used to assess and mitigate bias by using a simple example. Say we are using an algorithm to determine hiring rates. We want to check for any bias related to whether someone is Latinx or not. Let’s consider a binary sensitive variable LatinxStatus, which can be either LATINX or NONLATINX.

 

Performance bias parity compares the model’s fit statistics for the LATINX group versus the NONLATINX group. If the model does not fit both groups well, that is a problem; try other models until you find one that works well for both groups. It’s easy to compare models with SAS Viya.

 

Prediction bias parity compares the probability of hire between the LATINX and NONLATINX groups. The measure is the difference between those two probabilities. This should be low: the best model would not show different average predictions among groups if the sensitive variable (or its surrogates… but that’s a topic for another day) does not play a role in predicting hiring.

 

Demographic parity compares the selection rate of each category (LATINX and NONLATINX) of the sensitive variable LatinxStatus. The measure is the difference between those selection rates. Demographic parity can help an organization balance out historical biases that impact the data.

 

Equal opportunity compares the true positive rate for the LATINX group versus the NONLATINX group. Be sure you are getting a similarly high true positive rate for each group.

 

Equalized odds looks at the maximum difference in true positive rate OR false positive rate between the NONLATINX and LATINX groups. You want to be sure that both the TPR and FPR are similar between the two groups.
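Under the hiring example above (binary hire decision, binary LatinxStatus), these metrics reduce to simple arithmetic. The numbers below are invented purely to illustrate the calculations; with predicted probabilities instead of 0/1 decisions, prediction bias parity would be the analogous gap in average predicted probability.

```python
# Invented toy predictions: 1 = hired. group: 1 = LATINX, 0 = NONLATINX.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
group  = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

def group_rates(y_true, y_pred, group, g):
    """Selection rate, TPR and FPR for one level of the sensitive variable."""
    yt = [t for t, a in zip(y_true, group) if a == g]
    yp = [p for p, a in zip(y_pred, group) if a == g]
    sel = sum(yp) / len(yp)
    tp = sum(1 for t, p in zip(yt, yp) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(yt, yp) if t == 0 and p == 1)
    pos, neg = sum(yt), len(yt) - sum(yt)
    return sel, tp / pos, fp / neg

sel0, tpr0, fpr0 = group_rates(y_true, y_pred, group, 0)   # NONLATINX
sel1, tpr1, fpr1 = group_rates(y_true, y_pred, group, 1)   # LATINX

demographic_parity = abs(sel1 - sel0)    # selection-rate gap
equal_opportunity  = abs(tpr1 - tpr0)    # TPR gap
equalized_odds     = max(abs(tpr1 - tpr0), abs(fpr1 - fpr0))
```

With this toy data the selection rates are 0.4 (LATINX) versus 0.8 (NONLATINX), so the demographic parity gap is 0.4; the model would be flagged as selecting one group at twice the rate of the other.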

 

Watch how this works in action in SAS Viya:

 

 

The screen capture below shows how sensitive variables can be designated in the Data pane of SAS Model Studio.

 

image005.png

 

The screen capture below shows example results of the assessBias action with graphs of performance bias and performance bias parity, prediction bias and prediction bias parity, and bias metrics and bias parity metrics.

 

image007.png

 

Code Examples

 

Here is the code I used, with screenshots of the results interspersed:

 

/* BethHeartSexMITIGATE */

cas MySession sessopts=(caslib=casuser timeout=1800 locale="en_US");
libname casuser cas caslib="casuser";

/* Load sashelp.heart into CAS, replacing any earlier copy */
proc casutil;
   droptable casdata="heartone" incaslib="casuser" quiet;
   load data=sashelp.heart outcaslib="casuser"
        casout="HEARTone" promote;
run;
quit;

proc contents data=casuser.heartone;
run;

proc print data=casuser.heartone (obs=100);
run;

/* Keep only the two death causes of interest */
data casuser.hearttwo;
   set casuser.heartone;
   where DeathCause = "Cancer" or DeathCause = "Coronary Heart Disease";
run;

/* Shorten the long level name to "Heart" */
data casuser.heartthree;
   set casuser.hearttwo;
   if DeathCause = "Coronary Heart Disease" then DeathCause = "Heart";
run;

proc print data=casuser.heartthree (obs=50);
run;
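For anyone working from the Python interface, the same filter-and-recode preparation can be sketched client-side with pandas (an assumption here; in practice you would pull the table from CAS). The rows below are a hypothetical stand-in for sashelp.heart, where only the DeathCause column matters:

```python
import pandas as pd

# Hypothetical stand-in rows for sashelp.heart; only DeathCause matters here.
heart = pd.DataFrame({"DeathCause": [
    "Cancer", "Coronary Heart Disease", "Cerebral Vascular Disease",
    "Coronary Heart Disease", "Cancer", "Other"]})

# Keep only the two causes of interest (the WHERE step above)...
hearttwo = heart[heart["DeathCause"].isin(["Cancer", "Coronary Heart Disease"])]

# ...then shorten the long level to "Heart" (the IF/THEN step above).
heartthree = hearttwo.replace({"DeathCause": {"Coronary Heart Disease": "Heart"}})

print(sorted(heartthree["DeathCause"].unique()))   # -> ['Cancer', 'Heart']
```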

 

image009.png

 

 

image011.png


 

Train a Gradient Boosting Model

 

proc cas;
   decisionTree.gbtreeTrain /
      inputs={"Systolic", "Diastolic", "Weight", "Height", "Sex"},
      maxLevel="5",
      saveState={name="gbtreeASTORE", replace="True"},
      seed=1234,
      table="HEARTthree",
      target="DeathCause";
run;

 

image013.png

 

 

Assess Bias

 

proc cas;
   fairAITools.assessBias /
      modelTable="gbtreeASTORE",
      modelTableType="ASTORE",
      event="Heart",
      predictedVariables={"P_DeathCauseCancer", "P_DeathCauseHeart"},
      response="DeathCause",
      responseLevels={"Cancer", "Heart"},
      sensitiveVariable="Sex",
      table="HEARTthree";
run;

 

image015.png


 

image017.png

 

image019.png

 

 

Mitigate Bias

 

proc cas;
   fairAITools.mitigateBias /
      biasMetric="DEMOGRAPHICPARITY",
      event="Heart",
      learningRate="0.01",
      maxIters="10",
      predictedVariables={"P_DeathCauseCancer", "P_DeathCauseHeart"},
      response="DeathCause",
      responseLevels={"Cancer", "Heart"},
      sensitiveVariable="Sex",
      table="HEARTthree",
      tolerance="0.005",
      trainProgram="
         decisionTree.gbtreeTrain result=train_res /
            table=table,
            weight=weight,
            target=""DeathCause"",
            inputs={""Systolic"", ""Diastolic"", ""Weight"", ""Height"", ""Sex""},
            nominals={""DeathCause"", ""Sex""},
            nBins=50,
            quantileBin=True,
            maxLevel=5,
            maxBranch=2,
            leafSize=5,
            missing=""USEINSEARCH"",
            minUseInSearch=1,
            binOrder=True,
            varImp=True,
            mergeBin=True,
            encodeName=True,
            nTree=15,
            seed=12345,
            ridge=1,
            savestate={name=""HEART_gb_astore"", replace=True};
         astore.score result=score_res /
            table=table,
            casout=casout,
            copyVars=copyVars,
            rstore=""HEART_gb_astore"";
      ",
      tuneBound="True";
run;

 

image021.png

 

image023.png


 

image025.png

 

image027.png

 

A number of additional code samples are available in the documentation.

 

image029.png

 

These code examples allow for various combinations of feature (input) and target (outcome) types, as shown in the slide image below, and repeated as text for anyone who cannot read the image:

 

image031.png

  

  • Assess bias:
    • For nominal target
    • For interval target
    • Using DATA step or DS2 scoring
    • Using prescored data and reference level
    • With no response variable
  • Mitigate bias:
    • Mitigate demographic parity for a binary target with a binary or nominal sensitive input
    • Mitigate equal opportunity for a binary target with a binary or nominal sensitive input

 

Always Always Always Explore Your Data

 

In this blog I’ve shown you how to assess and mitigate bias as part of the modeling process, using both code and SAS Model Studio. But remember how much you can learn about your data, and how much time and how many headaches you can save yourself down the road, if you first explore your data using SAS Visual Analytics. Here’s an example using the same HEART data I used in the mitigateBias video I pointed you to.

 

 

 

Release History

 

The assessBias action became available via programming (in CASL, Lua, Python or R) in October 2021. The following month (November 2021) the Fairness and Bias tab became available in SAS Model Studio. Most recently, in October 2022, the mitigateBias action became available via programming (in CASL, Lua, Python or R). This information is also shown in the table below.

 

image033.png

 

 

For More Information

Videos

 

 

Blogs

 

Find more articles from SAS Global Enablement and Learning here.

