SWAT Code Generation and Execution in SAS Viya with Azure OpenAI and LangChain

The purpose of this custom agent is to create an interactive, AI-powered assistant that simplifies generating and executing SAS code for users who may not be experts in SAS or Python, or who may know nothing about CAS and CAS actions. This can be particularly useful for data analysts and data scientists who want to use SAS Viya for complex data processing and analytics tasks without delving deeply into SWAT or SAS programming. The SWAT (SAS Scripting Wrapper for Analytics Transfer) package is a Python interface to SAS Cloud Analytic Services (CAS).

By leveraging the natural language understanding capabilities of an Azure OpenAI large language model (LLM), users can interact with SAS Viya in a conversational manner, asking questions or requesting actions on their data. The custom agent generates the necessary SWAT code from user prompts and executes it, so users can focus on the analytical aspects of their work rather than on code syntax and structure.

The Big Picture

This custom agent integrates multiple technologies, including SAS Viya, Azure OpenAI, and open-source Python packages like SWAT and LangChain.

The diagram depicts two circuits: the blue circuit and the red circuit.

In the blue circuit:

An LLM is at the core of the LangChain custom agent. I am using GPT-4 from Azure OpenAI.
The Generator tool inside the LangChain custom agent uses the same GPT-4 model to generate SWAT code (essentially Python) from user prompts. I specifically chose the 1106-Preview model version, whose training data extends to April 2023. I noticed this version "understands SAS better" and is naturally proficient at generating Python code, including SWAT. Choosing your model is as important as the programming behind the scenes.
The Executor tool receives the generated SWAT code, creates a SWAT connection, executes the generated Python code, and then closes the SWAT connection.

In the red circuit:

SWAT code execution logs and results are parsed and passed to the LangChain custom agent.
At this point the LLM decides if the execution is successful. If so, it formulates a final result and prints it on the screen.
If the execution results in errors or cannot run for any reason, a feedback loop starts.
The Generator tool receives new instructions, generates new SWAT code, and the whole process continues as in the "blue circuit".
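
The two circuits can be pictured as a generate, execute, retry loop. The following is only a minimal sketch under stated assumptions, not the agent's actual code: in the real agent the LLM itself decides whether to call the Generator again, whereas here a plain Python loop makes that decision. The names run_with_feedback, generate, execute and max_attempts are hypothetical, and the two callables stand in for the Generator and Executor tools.

```python
from typing import Callable, Optional, Tuple


def run_with_feedback(
    prompt: str,
    generate: Callable[[str, Optional[str]], str],
    execute: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Tuple[bool, str, int]:
    """Generate code, execute it, and retry with the error text as feedback.

    `generate(prompt, feedback)` returns code; `execute(code)` returns
    (succeeded, output_or_error). Both are hypothetical stand-ins.
    """
    feedback: Optional[str] = None
    output = ""
    for attempt in range(1, max_attempts + 1):
        code = generate(prompt, feedback)   # blue circuit: generate code
        ok, output = execute(code)          # blue circuit: execute it
        if ok:
            return True, output, attempt    # red circuit: success, stop here
        feedback = output                   # red circuit: feed the error back
    return False, output, max_attempts
```

Capping the number of attempts is a design choice of this sketch; it keeps a prompt that can never succeed from looping indefinitely.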

I like this approach, because of its simplicity. The custom agent requires only two custom tools, the generator and the executor.

Video

Watch the following video, where we explore the approach:

Possible Applications

The custom SWAT agent could enhance the user experience with SAS Viya in several ways. Here is a possible list of applications:

Automated Code Generation: This feature significantly streamlines the process of writing SWAT code. It's especially beneficial for users who may not be well-versed in SAS syntax, as it allows them to interact with SAS Viya through simple English prompts, making the analytics platform more user-friendly.

Power Under the Hood: This agent taps into the robust capabilities of the CAS engine for efficient data management tasks, such as loading, managing, and saving data from various sources. It can also utilize predefined connections to data sources known as caslibs, maximizing the utilization of existing infrastructure.

Accessibility and Inclusivity: By simplifying the complex aspects of programming, the agent lowers the barrier to entry for advanced analytics tools. This democratization of data science ensures that a broader audience can access and leverage powerful analytics capabilities.

Self-Correcting Feedback Loop: The agent reviews execution results and, when errors occur, restarts the generation and execution cycle. This self-correcting loop adds resilience and efficiency to the analytics process.

Interactive Analytics: The agent's conversational interface fosters an engaging and real-time interaction with the SAS Viya platform. Users can ask questions and receive immediate feedback, which enhances the analytical experience and accelerates decision-making.
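
To make the data management tasks listed under "Power Under the Hood" more concrete, here is a hedged sketch of the kind of SWAT code the agent might generate for a prompt such as "load cars.csv and show the first rows". It is an illustration, not output captured from the agent. The function name load_and_preview, the file name, and the table name are hypothetical; the function assumes an already established SWAT connection passed in as conn.

```python
def load_and_preview(conn, path, caslib="casuser", table_name="demo"):
    """Load a file from a caslib into CAS memory and return its first rows.

    `conn` is assumed to be an open SWAT connection, for example:
        import swat
        conn = swat.CAS("viya.example.com", 5570, "user", "password")
    (host, port, and credentials above are placeholders).
    """
    # table.loadTable action: read the file from the caslib into memory,
    # replacing any in-memory table that already has the same name.
    conn.loadtable(
        path=path,
        caslib=caslib,
        casout={"name": table_name, "caslib": caslib, "replace": True},
    )
    # CASTable is a client-side reference to the in-memory table.
    table = conn.CASTable(table_name, caslib=caslib)
    # head() fetches only the first rows back to the Python client.
    return table.head(5)
```

A call such as load_and_preview(conn, "cars.csv") would then load the file from the casuser caslib, assuming the file exists there.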

Memory and Plasticity

The agent displays "plasticity". The term "plasticity" is often associated with flexibility and adaptability. Here, it's used to describe the remarkable ability of an intelligent agent to adjust its actions and remember previous instructions.

Imagine you're working with an intelligent agent, and you've given it two tasks. The first task (Task A) is to load a table into memory, and the second task (Task B) is to query that table once it's loaded. If the agent encounters a problem with Task A, you can instruct it to try an alternative approach. Once Task A is successfully completed, the agent doesn't forget about Task B. Instead, it recalls that Task B was the next step and proceeds to perform it. This retention and sequential execution of tasks showcases the agent's memory capabilities.
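
One way to picture this retention: every new turn is sent to the LLM together with the earlier turns, so the original two-task request is still visible when Task A finally succeeds. The sketch below is a simplified illustration in plain Python and not the agent's actual memory component; the class name ChatMemory and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChatMemory:
    """Keeps earlier turns so they can be replayed to the LLM on each new turn."""

    turns: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, text: str) -> None:
        # Store one turn, e.g. ("user", "...") or ("assistant", "...").
        self.turns.append((role, text))

    def as_prompt(self, new_input: str) -> str:
        # Prepend the full history to the new user input.
        history = "\n".join(f"{role}: {text}" for role, text in self.turns)
        latest = f"user: {new_input}"
        return f"{history}\n{latest}" if history else latest


memory = ChatMemory()
memory.add("user", "Load table X into memory, then query it.")
memory.add("assistant", "Loading table X failed.")
# The original request (including Task B) is still part of the prompt.
prompt = memory.as_prompt("Try loading it from the CSV file instead.")
```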

This form of adaptability stems from the design of the LangChain agent. In this architecture, a large language model (LLM) acts as a conduit between various tools and the custom behaviors you've programmed into those tools. In this scenario, I was clear about needing a code rewrite if the initial attempt failed. The agent's ability to handle this instruction is a testament to the flexibility built into its operation. The instruction lives in the Executor tool's docstring, which LangChain passes to the LLM as the tool description:

from langchain.tools import tool  # import path may vary with the LangChain version

@tool
def execute_swat_code(swat_code: str) -> str:
    """Useful when you need to execute generated Python SWAT code.

    Input to this tool is correct Python SWAT code, ready to be executed in a SAS Viya environment CAS session.
    Assume an already established SWAT connection, called 'conn'.
    Identify the results, summarize and provide a short explanation.
    If an error is returned, you may need to rewrite the SWAT code, using the generate_SWAT_code tool.
    Provide a Final Result: Summarize and provide a short explanation of what was done and what the outcome was.
    If you encounter an issue, detail what the issue is.

    Args:
        swat_code: The SWAT Python code which will be submitted to a CAS session in SAS Viya.
    """

Conclusions

By integrating the intuitive language understanding of Azure OpenAI's GPT-4 with the practical functionality of SWAT and LangChain, this agent removes the barrier of complex code, allowing users to run simple data management tasks through plain conversation. The agent displays some "plasticity", its ability to remember and adapt to tasks, when given a memory of previous conversations. This approach may democratize data processing, empowering a broader range of professionals to make data-driven decisions swiftly and confidently.

I hope you enjoyed this article. And please contact me if you have any feedback or any ideas on how to improve the agent or take it one level further.

What's Next

Discover the inner workings of SWAT Code Generation and Execution in SAS Viya with Azure OpenAI and LangChain: Behind the Scenes. I'll unveil the custom agent Python program and detail the prerequisites for running it in your own environment.

In future posts, I will:

Add RAG to the custom agent. RAG, or Retrieval-Augmented Generation, is a technique used in natural language processing (NLP) that combines the power of language models with the ability to retrieve information from relevant documents. Specifically for SWAT, I would use Peter Styliadis' series and see whether it improves SWAT code generation.
Compare the code generation ability of the "vanilla" custom agent with that of the RAG-powered one and discuss the results.

Acknowledgements

A big thank you to the members of SAS Project Anemoi for selflessly sharing their GenAI experiences.

Additional Resources

Thank you for your time reading this post. If you liked the post, give it a thumbs up! Please comment and tell us what you think about having conversations with your data. If you wish to get more information, please write me an email.

Find more articles from SAS Global Enablement and Learning here.