Enforce Responsible AI Best Practices: Trustworthy AI Life Cycle Workflow Available

SAS has just released an experimental version of our Trustworthy AI Life Cycle Workflow for use with SAS® Model Manager and SAS® Workflow Manager on SAS Viya 2024.01 and later. Our Trustworthy AI Life Cycle workflow enforces standards and best practices set by the AI Risk Management Framework defined by the National Institute of Standards and Technology (NIST). The workflow allows organizations to document their considerations of AI systems’ impact on human lives. It includes steps to ensure that the training data is representative of the impacted population and that model predictions and performance are similar across protected classes. These steps help ensure that the model is not causing disparate impact or harm to a specific group. Furthermore, you can ensure that your model remains accurate over time by creating human-in-the-loop tasks to act when additional attention is needed.

The aim of the workflow is to make NIST’s recommendations easier for organizations to adopt by specifying individual roles and expectations, gathering required documentation, outlining factors for consideration, and leveraging automation. The result of the workflow is a production model with documentation showing that the organization has done its due diligence: evidence that the model is fair and that the organization’s processes do not cause harm.

Given the growing landscape of Responsible and Trustworthy AI, this workflow has been marked as experimental to gather your feedback. As you use the sample workflow, we would like to know what works well and what can be improved upon. Feedback can be provided in the Issues tab of our SAS Model Management Resources GitHub page.

To get started using this workflow, perform the following steps:

Perform first-time configuration steps for SAS Workflow Manager, if not already configured.
Download the sample workflow definition from the SAS Model Management Resources GitHub page.
Import and activate the workflow definition using SAS Workflow Manager. Workflow administrators should review timer and role defaults to ensure the values match their organizational processes.
When a modeling project is available within SAS Model Manager, you can start the workflow. You will be prompted to specify the model owner.
The workflow consists of user tasks, which will appear within the Tasks category in SAS Model Manager. Tasks only appear for the users specified in the task role to ensure that the right individuals are providing documentation and approvals. The first task will be for the Model Owner. Note: Users might need to refresh the tab to see new tasks.
Several of the tasks involve providing documentation, aligning with NIST recommendations. To help compile this documentation, a companion template document is provided in the SAS Model Management Resources GitHub page, so be sure to download it and store it among the model files.

This workflow has several pieces to support the life cycle of a model. Let’s dive into each section.

Identify Stakeholders

Analytics is a team sport with data scientists, IT/engineering, risk analysts, data engineers, business representatives and other groups all coming together to build an AI system. Once the model owner is identified, they must then specify the other project roles. This workflow supports both single users and groups in the stakeholder roles. Additionally, users or groups can be specified for multiple roles. The roles used in the workflow are:

Model Owner – the decision maker for the analytics project. They will be responsible for documenting the purpose of the project and will provide approval or feedback as each piece of the project is assessed for Trustworthy AI practices.
Model Developer – the data scientist(s) that develops the model.
Model Engineer – the engineering or IT resource(s) that deploys the model.
Model Risk Owner – the risk analyst(s) or risk manager(s) that manages and documents the model risks.
Data Engineer – the engineering resource(s) that prepares the data for modeling.
Domain Expert – a resource that is available to address business or domain questions.

Verify Project

This section consists mostly of system tasks that confirm that the project metadata has been populated and that models exist within the project. If information or models are missing, users are prompted to correct the problem. Consider this section a completion check prior to advancing further.

Document Project

The stakeholders listed previously are responsible for documenting various pieces of the project in alignment with NIST’s AI Risk Management Framework. In this section, responsible parties will be prompted to provide documentation via the tasks in SAS Model Manager. To aid in this compilation, we recommend using the provided template.

The documentation compiled in this step examines the proposed model usage, end users, expected performance of the AI system, strategies for resolving issues, potential negative impacts, deployment strategies, data limitations, potential privacy concerns, testing strategies, and more. Users can add additional documentation to cover any additional needs or use cases not covered by the workflow. By compiling this documentation, the various stakeholders can align on potential risks posed by the project as well as confirm the project plan.

Assess Data

The section for assessing data addresses two key factors: privacy and bias. Data privacy and bias are not concerns for all analytical projects. For example, some predictive maintenance projects for machines rely on telemetry data and will not include information about individuals. For these cases, there is an option to bypass the data privacy and bias questions.

For use cases where Personally Identifiable Information (PII) is not required, the workflow prompts the data engineer to remove or mask PII. Otherwise, the risks of leveraging PII for modeling must be documented. Keeping PII within the analytical base table increases the risk of private information being leaked to bad actors.
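As a rough illustration of what removing or masking PII can look like before the data reaches the analytical base table, here is a minimal sketch in plain Python. The column names, the salt, and the `mask_record` helper are all hypothetical; they are not part of the workflow or of any SAS API.

```python
import hashlib

# Hypothetical column names; replace with the PII fields in your own table.
PII_DROP = {"name", "email"}   # removed outright
PII_HASH = {"customer_id"}     # replaced by a salted hash so rows stay joinable


def mask_record(record, salt):
    """Return a copy of `record` with PII columns dropped or hashed."""
    masked = {}
    for column, value in record.items():
        if column in PII_DROP:
            continue  # drop the column entirely
        if column in PII_HASH:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            masked[column] = digest[:16]  # truncated only for readability
        else:
            masked[column] = value
    return masked


rows = [
    {"customer_id": 101, "name": "Ada", "email": "ada@example.com", "balance": 250.0},
    {"customer_id": 102, "name": "Lin", "email": "lin@example.com", "balance": 90.5},
]
masked_rows = [mask_record(row, salt="project-secret") for row in rows]
print(masked_rows)
```

Keep in mind that salted hashing is pseudonymization rather than anonymization, so any residual risk should still be recorded in the project documentation.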

Next, the workflow examines potential data biases, such as the inclusion of protected class variables or proxy variables for modeling as well as the representativeness of the data. Potential bias risks must be addressed or documented before moving forward. Using training data that is not representative of the target population can create a model that is less accurate for specific groups, resulting in undue harm for those groups. Including proxy variables or protected class variables in the training data can create a model that treats groups differently. For some use cases, such as health care, these variables may be important predictors for disease risk. For others, such as models aimed to predict who should receive a service or a benefit, this can lead to discrimination.
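One simple way to check representativeness is to compare each group's share of the training data with its share of the target population. The sketch below assumes you have a reference distribution for the population; the group labels, shares, tolerance, and the `representation_gaps` helper are made up for illustration.

```python
from collections import Counter


def representation_gaps(training_groups, population_shares, tolerance=0.05):
    """Compare group shares in the training data with reference population shares.

    Returns {group: (training_share, population_share)} for every group whose
    absolute difference exceeds `tolerance`.
    """
    if not training_groups:
        raise ValueError("training_groups must not be empty")
    counts = Counter(training_groups)
    total = len(training_groups)
    gaps = {}
    for group, expected in population_shares.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[group] = (observed, expected)
    return gaps


# Hypothetical data: one group label per training row, plus reference shares.
training_groups = ["A"] * 70 + ["B"] * 25 + ["C"] * 5
population_shares = {"A": 0.55, "B": 0.28, "C": 0.17}

flagged = representation_gaps(training_groups, population_shares)
for group, (observed, expected) in flagged.items():
    print(f"{group}: training {observed:.0%} vs population {expected:.0%}")
```

Groups that are flagged are candidates for resampling, additional data collection, or at minimum a documented risk.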

Assess Model

Models are assessed against three factors: performance, fairness, and explainability. Performance data provides information on how well the model can predict the desired event, which helps us understand how useful the model may be for our AI system. For gathering model performance metrics on training, testing, and validation sets, SAS provides functionality in support of several modeling types. For models developed in the graphical pipelines of SAS Model Studio, this information is calculated automatically and will be registered with the model. For models developed programmatically in SAS, we recommend providing the scored data to the macro at registration time. Similarly, for Python and R models, this information can be calculated from the scored data using functions provided in the python-sasctl and r-sasctl open-source packages.

For models that require an evaluation of potential bias, the model developer is prompted to calculate and provide the model’s performance and average prediction for each group of the protected class. A model that performs worse on specific groups may point to problems with a lack of representation within the training data. Additionally, a model that is less accurate on specific groups may cause disparate impact or harm to those groups. Depending on the use case, a model that predicts different outcomes based on those groups may also cause undue harm. For models developed in SAS Model Studio, data scientists can select which variables to assess for bias in just a few clicks. Outside of SAS Model Studio, users can leverage the Assess Bias Action Set. Python-sasctl provides a function wrapping this action set for Python developers. When bias is detected in a model, model developers can leverage the Mitigate Bias Action Set to help mitigate model bias.
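To make the per-group comparison concrete, the following sketch computes accuracy and average predicted probability for each level of a protected class from scored data. It is plain Python written for illustration only; it is not the Assess Bias Action Set or the python-sasctl wrapper, and the data, the 0.5 cutoff, and the `group_metrics` helper are assumptions.

```python
from collections import defaultdict


def group_metrics(groups, actuals, probabilities, cutoff=0.5):
    """Accuracy and average predicted probability for each group."""
    buckets = defaultdict(lambda: {"n": 0, "correct": 0, "prob_sum": 0.0})
    for group, actual, prob in zip(groups, actuals, probabilities):
        bucket = buckets[group]
        bucket["n"] += 1
        # A prediction is correct when the thresholded score matches the label.
        bucket["correct"] += int((prob >= cutoff) == bool(actual))
        bucket["prob_sum"] += prob
    return {
        group: {
            "accuracy": b["correct"] / b["n"],
            "avg_prediction": b["prob_sum"] / b["n"],
        }
        for group, b in buckets.items()
    }


# Hypothetical scored data: protected-class level, true label, predicted probability.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
actuals = [1, 0, 1, 0, 1, 0, 1, 0]
probabilities = [0.9, 0.2, 0.8, 0.1, 0.4, 0.3, 0.7, 0.6]

metrics = group_metrics(groups, actuals, probabilities)
accuracies = [m["accuracy"] for m in metrics.values()]
accuracy_gap = max(accuracies) - min(accuracies)

for group, m in sorted(metrics.items()):
    print(group, round(m["accuracy"], 2), round(m["avg_prediction"], 2))
print("accuracy gap:", round(accuracy_gap, 2))
```

In this toy data the two groups receive the same average prediction but very different accuracy, which is why the workflow asks for both numbers rather than just one.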

The last factor that models are assessed against is explainability. Model explainability helps stakeholders understand how the inputs to a model affect the model’s prediction. By understanding the relationship between the inputs and prediction, data scientists and domain experts can determine if the model captures well-known connections between factors. Additionally, explainability can provide evidence that a model is fair by demonstrating that a model is using factors that are unrelated to protected group status. Some models lend themselves to easy explainability. For example, the form of regressions and decision trees directly defines how input variables are used to calculate the model prediction. Other models, such as gradient boosting or random forest, are hard to interpret and require additional methods to explain the model. Several model interpretability techniques are available.

In SAS Model Studio, users can specify to run model interpretability techniques in the Post Training Properties of the Supervised Machine Learning nodes. For programming based models, users can explore the Explain Model Action Set.
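For readers who want a feel for how a model-agnostic interpretability technique works, here is a minimal sketch of permutation importance in plain Python. It is chosen purely for illustration and is not a claim about which techniques SAS Model Studio or the Explain Model Action Set provide; the toy model, data, and helper names are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows where the model's prediction equals the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_features, repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


# Hypothetical black-box model: uses feature 0 only and ignores feature 1.
def model(row):
    return int(row[0] > 0.5)


data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [model(r) for r in rows]

importances = permutation_importance(model, rows, labels, n_features=2)
print(importances)
```

A feature the model ignores gets an importance of zero, while shuffling a feature the model depends on causes a clear drop in accuracy. If a protected-class variable or a suspected proxy shows high importance, that is worth raising with the domain expert.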

Select Champion Model

The champion model is the best candidate for production. Users can mark only a single model as the champion model but can mark multiple challenger models within the project in SAS Model Manager. This section of the workflow checks if a champion model exists, and if not, prompts the Model Owner to select a champion model before moving forward. When selecting a champion model, it is important to keep in mind performance and fairness assessment results from the previous steps. Additionally, the Model Owner should ensure the model qualifies based on qualities like the appropriateness of variables, explainability, and others. This model will undergo further testing and approval in the next step before deployment to production.

Test and Deploy Model

Prior to production, the champion model should be tested using the framework specified in the project documentation. These tests may involve ensuring that the model can score data within SAS Viya and can score data within the publishing destination. Models should also be tested against unseen values and unseen data sets, such as a testing or validation set, to confirm performance continues to meet expectations.

After a model is tested and approved, it can be deployed to production. To prevent a model from going to production before approvals are attained, we recommend leveraging permissions within SAS Viya to prohibit user publishing. Additionally, administrators may need to perform a one-time configuration of their publishing destinations if they are not using an out-of-the-box destination. SAS Model Manager supports a variety of destinations, including a REST API endpoint hosted within SAS Viya, called MAS. SAS Model Manager also supports containerizing models; containerized models also leverage a REST API endpoint but are hosted outside of SAS Viya. Other destinations include CAS and select databases.

Once the model is deployed, the model engineer will need to ensure that the model is incorporated in their business processes, whether that be using the REST API call within their web applications or creating a dashboard to review the results of batch scoring.

Monitor Model and Review Project

All models decay, but models don’t decay at the same rate. Model decay leads to a decrease in model accuracy, so any decision being made using that model will be wrong more often. SAS Model Manager supports Performance Monitoring to pinpoint when decay begins to occur. Performance Monitoring can be combined with Key Performance Indicators (KPIs) to create thresholds for alerting. For this section, data scientists or engineers should ensure project performance monitoring and KPIs are defined. When model performance no longer meets the predefined threshold, the model owner must decide which step to take next, whether it be to retrain the model, select a new champion model, end the workflow, or retire the project.
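The idea of a KPI threshold that triggers human review can be sketched in a few lines. The example below is a conceptual illustration in plain Python, not SAS Model Manager's Performance Monitoring; the KPI values, the 0.80 threshold, the two-period rule, and the `find_first_breach` helper are all assumptions.

```python
def find_first_breach(kpi_history, threshold, consecutive=2):
    """Return the period in which the KPI has been below `threshold` for
    `consecutive` periods in a row, or None if that never happens."""
    run = 0
    for period, value in kpi_history:
        run = run + 1 if value < threshold else 0
        if run >= consecutive:
            return period
    return None


# Hypothetical monthly KPI values (for example, AUC on newly scored data).
kpi_history = [
    ("2024-01", 0.84),
    ("2024-02", 0.83),
    ("2024-03", 0.79),
    ("2024-04", 0.81),
    ("2024-05", 0.78),
    ("2024-06", 0.76),
]

breach = find_first_breach(kpi_history, threshold=0.80)
if breach is not None:
    print(f"KPI breach confirmed in {breach}: model owner review needed")
```

Requiring more than one consecutive period below the threshold is one way to avoid alerting on a single noisy month; whether that trade-off is appropriate depends on how costly a delayed response would be.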

Recurring Audit

The NIST AI Risk Management Framework recommends reviewing the project on a recurring basis to validate that the project parameters, assumptions, stakeholders, use case, data, and model are still valid. This provides a regular audit of AI systems, allowing organizations to remove systems no longer in use. Retiring unused systems helps save on cloud costs, reduces the number of models managed, and reduces the attack surface for bad actors.

Once you have tried out and reviewed the workflow, we ask that you provide your feedback in the Issues tab of our SAS Model Management Resources GitHub page. Did this workflow fit well in your ModelOps processes and if not, what can we improve upon? Were the documentation and tests valuable in supporting your organization’s needs for Responsible AI? Are there factors that you would like to see considered for the next iteration? What additional resources can we provide to ease adoption? We will actively review provided feedback and incorporate what we can into the next iteration of this workflow!

Want to learn more about building Responsible AI applications? Check out the following resources: