Agentic AI doesn’t have to be complex. With SAS® Retrieval Agent Manager, our low‑code interface, connecting LLM/RAG workflows to MCP servers and tools is seamless.
💡 Picture this: a Risk Management team wants to assess the climate risk exposure of major financial institutions. A plain LLM query won’t cut it: it lacks context and domain expertise. That’s why we supercharge the LLM with a curated knowledge base: public ESG reports in PDF form. Now, business experts can query these documents in natural language, through a chat interface or an agent embedded in a web app, and get accurate, relevant insights at speed.
💡 If you don’t see the point of the ‘napkin’ reference in the title … read the amazing article by my colleague @Sundaresh1, “Back of the napkin: How can synthetic data enhance machine learning outcomes?” I learnt that apparently, "people have made MILLIONS starting with a back-of-the-napkin sketch …" so I thought, why not me? 😉
👉 And here it is: THE napkin…
👉 In this article you will learn, as a practitioner, how to move from this napkin sketch to a running LLM/RAG Agent in less than 15 minutes!
First, let’s connect to SAS® Retrieval Agent Manager, go to the ‘Sources’ menu, add a new Source (1) and name it (2). Select the type of source (3): ‘local’ to upload the files directly from your machine, ‘git’ to fetch the files from a Git repository URL, or ‘custom’ to use Python code to create and save files to the system or use your own package. If you have selected ‘local’, select the documents to be uploaded (4).
SAS® Retrieval Agent Manager: create a new source
Click ‘OK’ to save the source. Now your ‘Source’ is ready to be used.
SAS® Retrieval Agent Manager: new source created
Now, let’s move to the ‘Collections’ menu.
Create a new Collection, name it (1) and select the 'Source' you’ve just created (2).
SAS® Retrieval Agent Manager: new collection creation
Then open your new 'Collection' and create a new configuration (1). Each configuration is related to an embedding model used for converting the chunks of documents into vector representations called ‘embeddings’. Here we’re going to use IBM’s Granite Embedding model (3) designed for semantic search and RAG applications. Then choose the embeddings destination: here the vector database Weaviate (4).
SAS® Retrieval Agent Manager: create a new configuration for your collection
If necessary, you can configure the Settings and Text Extraction options. The latter option will give you access to the OCR (Optical Character Recognition for scanned or image-based content) and table extraction settings. Then run the vectorization (6). This step may take some time depending on the number of documents to be vectorized. Check the vectorization status to be sure the vectorization has been completed.
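To make the vectorization step more concrete, here is a minimal sketch of what happens conceptually: documents are split into chunks, and each chunk is turned into a vector. Everything in it is an assumption for illustration. `chunk_text`, `toy_embed` and `vectorize` are hypothetical helpers, the hashing "embedding" is only a stand-in for a real model such as Granite Embedding, and the in-memory list stands in for a vector database such as Weaviate. It is not how SAS® Retrieval Agent Manager is implemented.

```python
import hashlib
import math
import re


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping windows of words (one simple chunking strategy)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text, dim=64):
    """Hash every token into one of `dim` buckets, then L2-normalise.

    ASSUMPTION: a toy stand-in for a real embedding model.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def vectorize(text):
    """Return (chunk, vector) pairs; a list stands in for the vector database."""
    return [(chunk, toy_embed(chunk)) for chunk in chunk_text(text)]
```

A real embedding model captures meaning rather than token counts, but the shape of the pipeline (chunk, embed, store) is the same idea.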
Let’s move to the ‘Retrieval Settings’ menu.
If you want to improve response accuracy and efficiency, this is the place to be! How do you want your conversational agent to behave? Do you expect an ‘I don’t know’ type of response when there is not enough information in the context to answer a question? Do you want answers that provide relevant context, background and insights to explain the answer? Create a new system prompt (1), name it (2) and describe the expected behavior (3). You can also specify the query result limit (4), meaning the number of top-ranked chunks of text (Top K) retrieved as context for the answer.
SAS® Retrieval Agent Manager: retrieval settings configuration
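To illustrate how a system prompt and a Top K limit work together, here is a small sketch under stated assumptions. The similarity measure (cosine over word counts), the prompt wording and the helper names `top_k_chunks` and `build_prompt` are all hypothetical; the product uses embeddings and its own prompt assembly.

```python
import math
import re
from collections import Counter

# ASSUMPTION: example wording only, not a prompt shipped with the product.
SYSTEM_PROMPT = (
    "Answer only from the context provided. "
    "If the context does not contain the answer, reply 'I don't know'."
)


def bag_of_words(text):
    """Lower-case word counts, ignoring punctuation."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(question, chunks, k=3):
    """Keep the k chunks most similar to the question (the query result limit)."""
    q = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=3):
    """Combine the system prompt, the retrieved context and the question."""
    context = "\n".join(f"- {c}" for c in top_k_chunks(question, chunks, k))
    return f"{SYSTEM_PROMPT}\n\nContext:\n{context}\n\nQuestion: {question}"
```

A lower limit keeps the context focused and cheaper to process; a higher one reduces the chance of missing a relevant passage.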
One more task to do before going back to our Collection: create the user evaluation tests. They can be used on any collection in the system. Let’s move to the ‘User Eval Tests’ tab.
Evaluation tests are reusable scenarios, with questions and expected responses, used for manual evaluations. Let’s create a new one (1) for our ‘Climate Risks’ topic (2). In the ‘Prompt’ field, enter your question, in our case “What type of physical risks, acute hazards and chronic hazards are mentioned and related to climate risks?” (3). Then set the ‘Passing threshold’ (4) as a percentage and add a few assertions to validate the result (5).
SAS® Retrieval Agent Manager: user evaluation tests
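One way to picture how a prompt, a passing threshold and a set of assertions fit together is the sketch below. It rests on loud assumptions: assertions are modelled as phrases that must appear in the answer, and the score as the percentage of assertions that hold. The product relies on an Eval LLM for this, so `EvalTest` and `score_answer` are hypothetical names for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class EvalTest:
    topic: str
    prompt: str
    passing_threshold: float  # percentage between 0 and 100
    assertions: list = field(default_factory=list)  # phrases expected in the answer


def score_answer(test, answer):
    """Return (score in percent, passed?, result per assertion).

    ASSUMPTION: an assertion holds when its phrase occurs in the answer.
    """
    text = answer.lower()
    results = {a: a.lower() in text for a in test.assertions}
    score = 100.0 * sum(results.values()) / len(results) if results else 0.0
    return score, score >= test.passing_threshold, results


if __name__ == "__main__":
    test = EvalTest(
        topic="Climate Risks",
        prompt="What type of physical risks, acute hazards and chronic hazards "
               "are mentioned and related to climate risks?",
        passing_threshold=60.0,
        assertions=["acute", "chronic", "wildfire"],
    )
    answer = "The reports mention acute hazards such as floods and chronic hazards."
    print(score_answer(test, answer))
```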
It’s time to finalize the ‘Collection’ configuration and set the LLMs to be used within the ‘Collection’.
Go back to the ‘Collections’ menu, then open the ‘LLMs’ tab. Click the ‘+’ button to add an LLM from those already available (1). For the added model, choose the Retrieval settings you have previously created (2) or a default one.
SAS® Retrieval Agent Manager: LLMs used within a Collection
In the ‘User Eval’ pane, select an ‘Eval LLM’ from the list to use with user evaluation tasks.
In the ‘Auto Eval’ pane, select a Data generation LLM, a Critic LLM, and an Eval LLM to use with auto evaluation tasks. We’re not going to use it today, but the ‘Auto Eval’ functionality is pretty cool: it automatically generates questions and answers from the information in your Source. The questions are passed to the RAG pipeline, and the results are compared to the ground truth answers by a critic large language model (Critic LLM).
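The Auto Eval flow described above can be sketched as a simple loop. This is an illustration under assumptions, not the product’s implementation: the three callables are hypothetical placeholders for the Data generation LLM, the RAG pipeline and the Critic LLM, and the role of the Eval LLM is not modelled here.

```python
def auto_eval(source_chunks, generate_qa, rag_answer, critic):
    """Run a generate / answer / critique loop over chunks of the Source.

    ASSUMPTIONS (hypothetical interfaces):
      generate_qa(chunk) -> (question, ground_truth)   # Data generation LLM
      rag_answer(question) -> answer                   # RAG pipeline under test
      critic(question, ground_truth, answer) -> bool   # Critic LLM verdict
    """
    records = []
    for chunk in source_chunks:
        question, truth = generate_qa(chunk)
        answer = rag_answer(question)
        records.append({
            "question": question,
            "ground_truth": truth,
            "answer": answer,
            "correct": bool(critic(question, truth, answer)),
        })
    # Share of generated questions the pipeline answered correctly
    accuracy = sum(r["correct"] for r in records) / len(records) if records else 0.0
    return accuracy, records
```

In practice each callable would wrap a call to an LLM; plain functions are enough to show how the pieces connect.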
Now, go to the ‘User Eval’ tab and run the tests to benchmark your configurations.
On the ‘User Eval’ tab, create a New Evaluation (1). On the ‘Configuration’ tab, choose the configuration you’ve created (2). On the ‘Tests’ tab, select the User Eval Test to use (3). The job is created and runs.
SAS® Retrieval Agent Manager: User Eval Tests running within the Collection
Once completed, the ‘Score’ column (1) gives you an overview of the configuration performance and response quality. For each ‘User eval prompt’ you can view the answer and whether each assertion passed.
SAS® Retrieval Agent Manager: User Eval Tests results
Let’s imagine you’ve already created and compared different test scenarios. It’s time to designate your ‘Champion’ model. Go back to the ‘Summary’ and select the champion model (1).
SAS® Retrieval Agent Manager: Champion model selection
Well, you’re ready to chat with your Collection!
Here we’re going to use the Chat UI but the Python API is also an option. Let’s navigate to the ‘Chat’ view.
On the ‘Collections’ tab, select a configuration for the Collection you want to query (1), then the LLM to use (2) and the Retrieval settings (3). You’re ready to start a query (4). You can then manage your previous chats and view response details such as the response cost and duration (5).
SAS® Retrieval Agent Manager: Chat Interface
Well, we can call it a day at this point! But if you have more time and want to try more complex workflows, you can go for the Agents in the next section 😉
8. Create Agents and configure MCP Server Tools
First, what is the role of a RAM (Retrieval Agent Manager) Agent? The Agent automates complex workflows and integrates tasks such as post-processing or calling MCP tools. Furthermore, RAM’s Python API allows you to query collections and process responses from another application, such as a Streamlit web application. SAS® Retrieval Agent Manager monitors agent performance and provides logs and health metrics in real time.
SAS® Retrieval Agent Manager: Agent creation
On the ‘Code’ tab, select ‘Edit run.py’ to enter custom Python code, or click ‘Upload package.zip’ to select a ZIP file that contains custom code. The Python code for every agent must include the exec() function.
SAS® Retrieval Agent Manager: Agent code
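For orientation, here is a very rough sketch of what a `run.py` with an `exec()` function could look like. Only the requirement that `exec()` exists comes from this article. The argument, the payload keys (`question`, `draft_answer`), the return shape and the `postprocess` helper are all assumptions for illustration, so check the product documentation for the actual contract.

```python
# run.py (sketch only; the exec() signature and payload shape are ASSUMPTIONS)


def postprocess(answer, max_chars=500):
    """Example post-processing step: collapse whitespace and cap the length."""
    cleaned = " ".join(answer.split())
    if len(cleaned) <= max_chars:
        return cleaned
    return cleaned[: max_chars - 3] + "..."


def exec(payload=None):  # noqa: A001  (this name shadows Python's built-in exec)
    """Hypothetical agent entry point.

    ASSUMPTION: the agent receives a dict and returns a dict. A real agent
    would query its Collection or call MCP tools here; this sketch simply
    post-processes a draft answer passed in the payload.
    """
    payload = payload or {}
    question = payload.get("question", "")
    draft = payload.get("draft_answer", "")
    return {"question": question, "answer": postprocess(draft)}
```

Because the function is named `exec`, Python’s built-in of the same name is hidden inside this module, which is worth keeping in mind if your custom code relies on it.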
On the ‘LLM’ tab, select the default LLM that the agent should use (1) and the default Retrieval settings (2).
SAS® Retrieval Agent Manager: LLM used by the Agent
Then go to the ‘Collections’ tab and select at least one Collection with which the agent should interact.
Finally, on the ‘Tools’ Tab (1), select one or more MCP tools that the agent can use (2).
SAS® Retrieval Agent Manager: MCP Tools selection
We won't cover MCP tool servers in this article (which is already far too long, I know, I'm sorry! 😇); perhaps in another article someday. Let’s just say that to integrate MCP tools with your agents, you need to create an MCP tool server template in SAS® Retrieval Agent Manager (see the ‘Code Template’ tab and the ‘MCP Tools’ tab, if you are authorized).
Now let’s start our Agent by clicking the button in the upper right corner (1). The Agent is now running. To call your Agent from your Python web app, you need its ID. Click the three dots on the right of the table header, select ‘Choose Columns’, and then select ‘ID’ to display it.
SAS® Retrieval Agent Manager: Start your Agent
Now you’re ready to use your Agent in SAS® Retrieval Agent Manager (see the ‘Automation’ tab) or call it from another application.
👉 You can now query documents in plain language, via chat or embedded agents, and get accurate and relevant insights. Enjoy!