About Bogdan_Teleuca

Bogdan_Teleuca

@AndreasMenrath You're welcome. Not to mention you will have costs for your LLM... 😁 The point of the post is the prompt and the prompting technique. And the post definitely not intended for users who want to click. It's intended for automatic processes that can use code to replicate a SAS Studio flow. Obviously, I failed to explain that. I should do a better job.

Neca

Hi @Bogdan_Teleuca, appreciate your help and answers. Unfortunately, it is still not working as wished. Job dependency is by default "Completes successfully" and I even tried some other dependencies but none of them worked. I will continue to try wit different job flow attributes.

Bogdan_Teleuca · ‎09-02-2025

@touwen_k that's the ultra secure version. Try it. Working on 8080 inside the cluster should be ok for most of the use cases. If you have ultra sensitive data and you don't want users with cluster access to potentially snoop it, working with 8443 should be the way to go forward. Bear in mind that when you open 8443, 8080 stays open. The doc explains how to close it.

Bogdan_Teleuca · ‎08-29-2025

Welcome back, SAS Agentic AI explorers! Today, we’re going forward with Agentic AI workflows in SAS Intelligent Decisioning—focusing on the build. We'll detail how you can integrate Large Language Models (LLMs), deterministic models, code files, and custom rules into a governed, modular SAS Agentic AI workflow. Where We Are in the Series In Part 1, Register and Publish Models, we introduced code-wrapped LLMs and showed how you can register them in SAS Model Manager, then how to publish them as Docker images using SAS Container Runtime (SCR). In Part 2, SAS Agentic AI – Deploy and Score Models – The Big Picture, we compared deployment options, costs, and performance trade-offs in Azure. In Part 3.1, SAS Agentic AI – Deploy and Score Models – Containers, we got our hands dirty deploying Azure Container Instances. In Part 3.2, SAS Agentic AI – Deploy and Score Models – Apps, we discovered Azure Container Apps and Web Apps for scalable, secure LLM deployments. In Part 3.3, Deploy and Score Models – Kubernetes, we deployed an open source LLM to the same Azure Kubernetes Service cluster as SAS Viya. Why SAS Agentic AI Workflows? Here’s a very quick overview of agentic AI workflows with SAS Viya: SAS Agentic AI workflows give you a governed platform where LLMs, traditional machine learning, rule sets, and human review steps all play together. The Call LLM node is a key piece—it lets you connect any LLM via API. All you need is the container’s URL and a payload with prompts and options. SAS Agentic AI Accelerator standardizes the payload. Flexibility is built-in: swap models or endpoints, run experiments with Prompt Builder, and publish workflows to containers or push them to production. SAS tracks each step with visual diagrams and versioning, so you always know which logic drove the decision. SAS Agentic AI Workflow Lifecycle Here’s the typical lifecycle for your Agentic AI workflow: Build: Assemble your decision workflow using SAS Intelligent Decisioning. Integrate LLMs (via Call LLM node), deterministic models, code files, and rule sets. To prep prompts, use either prompt models (from Prompt Builder) or code files. Publish: Publish your workflow as a container image. Deploy: Deploy it to your target cloud or on-prem environment (e.g., Azure Container Instance, Kubernetes, or VM). Score: Run scoring using a REST API, passing inputs (data and prompts) and retrieving LLM or model-driven decisions. Integrate: Connect your workflow to real-time systems or integrate in applications. How to Build an Agent? Let’s focus on the build with an example. Overview Suppose your goal is to support credit officers and make their job easier when assessing client credit requests. Their main pain point? They perform steps across multiple systems and lose time personalizing rejection emails. Templates exist, but personalization is still semi-automated. They’re considering LLMs—but messages must remain compliant and tone-appropriate. That’s where you come in. You’re setting up a workflow that automates communication with human review checkpoints. Using SAS Intelligent Decisioning, you can build an Agentic AI workflow: Select any image to see a larger version. Mobile users: To view the images, select the "Full" version at the bottom of the page. Deterministic Models Start with their trusted, governed, versioned model to determine credit approval or rejection based on verified inputs. Branches Next, branch the logic: one for approval messages, one for rejection. Each branch uses different LLMs and prompts. In this context, prompts act as templates that drive the message structure, the tone. LLMs On the rejection branch, the credit team uses an open-source model hosted in their infrastructure. The Call LLM node is key—it connects any LLM via API. Just provide the container’s URL llmURL and a payload llmBody with prompts and options. How Do You Specify Prompts? With SAS, your LLM prompts can come from: Models: The result of Prompt Builder experiments using the LLM Portal Builder. You version your best prompt and manifest it as a portable model, ideal for governed, repeatable experiments. Prompt Builder streamlines prompt development from experimentation to deployment. With project organization, experiment tracking, and integration, teams can confidently develop, test, and operationalize LLM prompts in their business processes. We will discuss the Prompt Builder in a future post. Code File: A Python script that defines your prompt inline. Quick and direct, great for prototyping. You can swap it out later for a full model. Either way, connect them to the Call LLM node, which handles the API call to your deployed LLM endpoint. See deploy the LLM to a Private Azure Container Instance, or deploy the LLM to a Kubernetes pod, the choice is in your hands. Flexibility is built-in: swap models or endpoints, run experiments, and push workflows to production. SAS tracks each step with visual diagrams and versioning, so you always know which logic drove the decision. Evaluate Output Would you blindly trust the LLM output? No. Especially when the outcome is sensitive or could damage the client relationship. You could use a SAS sentiment analysis model to evaluate the LLM-generated message and make the user’s job easier. Human in the Loop Depending on the detected sentiment, add rule sets to determine whether human review is needed, or if the message can be sent as-is. Test Before You Deploy Before going live, score records and review LLM outputs to ensure they meet business requirements. This gives you full control over quality and output, no surprises in production. Visual Path Tracking: See What’s Happening SAS Intelligent Decisioning provides visual diagrams that show how records flow through your workflow. You’ll see: Which LLM processed each case. Where automation succeeded. Where human review was triggered. It’s transparency and governance, built in. Summary SAS Intelligent Decisioning lets you build Agentic AI workflows that combine models, rules, and LLMs. Modular design supports rule sets, prompt engineering, and flexible deployment. The Prompt Builder user interface allows you to save your most successful prompt engineering experiment as a model. The Call LLM node makes it easy to integrate any language model via API. The node is part of the SAS Agentic AI Accelerator and tailored to work with models registered from the SAS Model Manager LLM Model Project. Workflows are fully testable and traceable—ready for production. Discussion SAS is well-positioned to be a significant player in the Agentic AI space. The SAS Viya platform is already trusted by a wide range of companies and users, providing a solid foundation to build upon. In my view, our greatest potential lies in developing workflows specifically tailored to address distinct customer business problems and solving them exceptionally well. While generative AI models are widely accessible, SAS’s value is in being model agnostic, integrating these models within our robust, trusted platform. This allows customers to leverage cutting-edge AI in their existing environments, alongside traditional Machine Learning models developed over the years. What Should You Do Next? Stay tuned for upcoming posts detailing SAS Agentic AI Workflow Lifecycle, about publishing, deploying, scoring and integrating Agentic AI workflows in enterprise applications. Share your workflow stories, experiments, and best tips in the comments below! Try deploying LLMs and building an Agentic AI workflow using the recently updated Agentic AI – How to with SAS® Viya® workshop. If you liked this guide, give it a thumbs up! Acknowledgment Thanks to David Weik and Xin Ru Lee for sharing their time and resources. Additional Resources SAS Agentic AI Accelerator – Register and Publish Models. SAS Agentic AI – Deploy and Score Models – The Big Picture. SAS Agentic AI – Deploy and Score Models – Containers. SAS Agentic AI – Deploy and Score Models – Apps. SAS Agentic AI - Deploy and Score Models – Kubernetes. SAS Container Runtime – SAS Documentation. Want More Hands-On Guidance? SAS offers a full workshop with step-by-step exercises for deploying and scoring models using Agentic AI and SAS Viya on Azure. Access it on learn.sas.com in the SAS Decisioning Learning Subscription. This workshop environment provides step-by-step guidance and a bookable environment for creating agentic AI workflows. For further guidance, reach out for assistance. Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎07-18-2025

Welcome back to the SAS Agentic AI Accelerator series! If you’ve followed the journey so far—through registration, publishing, and container deployments—then you’re ready for the next stop: deploying your Large Language Models (LLMs) in Azure using Container Apps and Web Apps. If deploying with Docker containers felt like making espresso, get ready for an espresso macchiato. A bit more involved (and maybe some extra foam). Let’s get brewing! Where We Are In The Series In Part 1, Register and Publish Models, we introduced code-wrapped LLMs and showed how you can register them in SAS Model Manager, then how to publish them as Docker images using SAS Container Runtime (SCR). In Part 2, SAS Agentic AI – Deploy and Score Models – The Big Picture, we compared deployment options, costs, and performance trade-offs in Azure. In Part 3.1, SAS Agentic AI – Deploy and Score Models – Containers, we got our hands dirty deploying Azure Container Instances. Part 3.2 (this post), we’ll discover Azure Container Apps and Web Apps for scalable, secure LLM deployments. Example We’ll deploy the open-source Qwen-25-05b LLM (by Alibaba Cloud), included as a code-wrapper in the SAS Agentic AI Accelerator repo. “Qwen” is the short for 'Tongyi Qianwen' in Chinese, meaning “comprehensive understanding of a thousand questions.” The “05b” means 500 million parameters—think of them as brain cells. Code-wrappers come with the SAS Agentic AI Accelerator code repository. Code-wrappers standardize LLM inputs and outputs, can be switched easily into agentic AI workflows and are easy to deploy, thanks to SCR. Deploying Azure Container Apps Set Up Your Environment First, make sure you have the Azure CLI (Command Line Interface) with the containerapp extension: az extension add --name containerapp --upgrade Define Your Variables # Variables to set CONT="my-qwen-app" # Name/DNS label for your app RG="my-resource-group" # Azure Resource Group ACR_NAME="myacr" # Azure Container Registry name IMAGE="qwen_25_05b" # Image name IMAGE_TAG="latest" # Image tag/version ACR_PASS=$(az acr credential show -n $ACR_NAME --query "passwords[0].value" -o tsv)# ACR password or service principal LOCATION="westus3" # Azure region – choose one that suits you ENVIRONMENT="llms" # Azure Container App environment You’re defining basic settings—like the names of your Azure Container Registry, resource group, container app, and image. You’re also grabbing your registry password securely. Create the Container Apps Environment az containerapp env create --name $ENVIRONMENT --resource-group $RG --location $LOCATION This creates a secure environment in Azure where your container apps will live. Think of it as making a “home” for your deployments. Deploy Your LLM as a Container App az containerapp create \ --name $CONT \ --resource-group $RG \ --cpu 4.0 \ --memory 8.0Gi \ --environment $ENVIRONMENT \ --registry-server ${ACR_NAME}.azurecr.io \ --registry-username $ACR_NAME \ --registry-password $ACR_PASS \ --image "${ACR_NAME}.azurecr.io/$IMAGE:$IMAGE_TAG" \ --target-port 8080 \ --ingress external \ --query properties.configuration.ingress.fqdn This command launches your LLM model as a container app in Azure. It connects to your Docker image (containing SAS Container Runtime code), gives it CPU and memory, and makes it accessible over the web (via HTTPS). Check If It’s Alive az containerapp list --resource-group $RG --output table # Or to find your specific app az containerapp show --name $CONT --resource-group $RG --query properties.configuration.ingress.fqdn -o tsv This is listing the deployed apps and retrieving the Fully Qualified Domain Name (FQDN) of your app, which we’ll use during scoring. Select any image to see a larger version. Mobile users: To view the images, select the "Full" version at the bottom of the page. Scoring the Container App Azure Container Apps give you HTTPS endpoints out of the box, auto-scaling, and a slick way to isolate your resources. Think of this as your go-to for agile, event-driven workloads and secure external access. It’s a good compromise between Container Instances and Kubernetes Cluster. Now, let’s wake up your deployed LLM: FQDN=$(az containerapp show --name $CONT --resource-group $RG --query properties.configuration.ingress.fqdn -o tsv); echo $FQDN curl --location --request POST "https://${FQDN}/${IMAGE}" \ --header 'Content-Type: application/json' \ --header 'Accept: application/vnd.sas.microanalytic.module.step.output+json' \ --data-raw '{ "inputs": [ {"name":"userPrompt","value":"customer_name: Xin Little; loan_amount: 20000.0; customer_language: EN"}, {"name":"systemPrompt","value":"You are tasked with drafting an email to respond to a customer whose mortgage loan application has been accepted by the SAS AI Bank. You will be provided with customer_name, loan_amount, customer_language. Follow the guidelines for a professional, friendly response."}, {"name":"options","value":"{temperature:1,top_p:1,max_tokens:800}"} ] }' | jq This curl command sends a request to your deployed LLM, providing text and instructions. The LLM reads your input, processes it, and returns a response (such as a drafted email). Scaling Up: The Easy Way Need to handle more requests? Azure Container Apps make it effortless: az containerapp update \ --name $CONT \ --resource-group $RG \ --min-replicas 1 \ --max-replicas 10 \ --scale-rule-name http-scale \ --scale-rule-http-concurrency 1 You’re telling Azure to automatically add more app instances if traffic increases (auto-scaling). This helps your LLM handle multiple requests at once. Deploying to Azure Web Apps Azure Web Apps are a managed way to host your containerized LLMs. These shine when you need robust hosting for production APIs, HTTPS, deployment slots, and more “set-it-and-forget-it” operations. Set Up Your Web App First, create an App Service Plan (this controls the steam pressure of your espresso machine). Second, create the Web App for Containers. Third, set the SCR scoring port: 8080 or 8443 for TLS. Lastly, get your app’s endpoint to start scoring. # Create an App Service Plan az appservice plan create \ --name my-llm-plan \ --resource-group $RG \ --location $LOCATION \ --sku P1mv3 \ --is-linux # Create the Web App for Containers: az webapp create \ --resource-group $RG \ --plan my-llm-plan \ --name my-qwen-app \ --container-image-name ${ACR_NAME}.azurecr.io/$IMAGE:$IMAGE_TAG \ --container-registry-user $ACR_NAME \ --container-registry-password "$ACR_PASS" \ --https-only true # Set the port az webapp config appsettings set \ --name my-qwen-app \ --resource-group $RG \ --settings WEBSITES_PORT=8080 #Get your app’s endpoint APP_FQDN=$(az webapp show --resource-group $RG --name my-qwen-app --query defaultHostName -o tsv) Scoring the Web App The scoring call is nearly identical to the Azure Container Apps: curl --location --request POST "https://${APP_FQDN}/${IMAGE}" \ --header 'Content-Type: application/json' \ --header 'Accept: application/vnd.sas.microanalytic.module.step.output+json' \ --data-raw '{ "inputs": [ {"name":"userPrompt","value":"customer_name: Xin Little; loan_amount: 20000.0; customer_language: EN"}, {"name":"systemPrompt","value":"You are tasked with drafting an email to respond to a customer whose mortgage loan application has been accepted by the SAS AI Bank."}, {"name":"options","value":"{temperature:1,top_p:1,max_tokens:800}"} ] }' | jq The response times, however, couldn’t be more different. Performance and Scaling Notes I did not find the Azure Web Apps to perform well for scoring open source LLMs. Azure Web Apps are great; they might be suitable for proprietary LLM code wrappers where the deployed image is light. With open source LLMs that download gigabytes of data to a container, they don't really shine. I prefer the Azure Container Apps for cost and response time, for now. Container Apps: Fast deployment, auto-scaling, and HTTPS. Great for lightweight/medium models and variable workloads. Web Apps: Managed, reliable API hosting—good for high-availability, but expect higher costs and slower responses for large LLM images unless you scale up resources. Example: 2 CPU, 16 GB: ~90s/response 8 CPU, 64 GB: ~25s 16 CPU: ~15s 32 CPU: ~10s Security Corner Don’t skip this: your data (and your Security Officer) will thank you. Always use HTTPS for requests. Protect sensitive data and ACR credentials. For sensitive/internal workloads: use VNET, private endpoints, or internal-only ingress. Regularly check logs: Container Apps: az containerapp logs show --name $CONT --resource-group $RG Web Apps: az webapp log download --name my-qwen-app --resource-group $RG Recommendations For quick tests and demos, use Container Apps with public HTTPS—fast, simple, and safe for non-sensitive data. For internal production needs, use VNET (Virtual Network) integration or internal ingress, and plan scaling based on your LLM’s requirements. For robust, high-availability APIs, Web Apps deliver managed hosting and scaling but be prepared to adjust resources and costs. Monitor and tune performance: More CPUs and RAM usually help, but test and adjust for best results. Container Apps have a 2 CPU, 8GB RAM cap. Summary Azure Container Apps and Web Apps both give you flexible, secure options to deploy SAS Agentic AI Accelerator LLMs. Container Apps are ideal for experimentation and lighter workloads; Web Apps offer managed, stable hosting for API-heavy use cases, though faster responses come with a higher price tag. Choose based on your needs, balance security and cost, and don’t forget to experiment and optimize as you go. Thanks for following along! If you find this post helpful, give it a thumbs up, share your stories or questions in the comments, and let’s keep building better AI workflows together. Stay tuned for more! Additional Resources SAS Agentic AI Accelerator – Register and Publish Models. SAS Agentic AI – Deploy and Score Models – The Big Picture. SAS Agentic AI – Deploy and Score Models – Containers. SAS Container Runtime – SAS Documentation. Want More Hands-On Guidance? SAS offers a full workshop with step-by-step exercises for deploying and scoring models using Agentic AI and SAS Viya on Azure. Access it on learn.sas.com in the SAS Decisioning Learning Subscription. This workshop environment provides step-by-step guidance and a bookable environment for creating agentic AI workflows. For further guidance, reach out for assistance. Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎07-09-2025

it's great, thanks!

Bogdan_Teleuca · ‎07-09-2025

super useful, merci!

Bogdan_Teleuca · ‎07-09-2025

Welcome back to the SAS Agentic AI Accelerator series! If you’ve made it this far in the series, you’ve survived the high-level overviews and cost comparisons—give yourself a pat on the back (or at least a fresh cup of coffee). Now, let’s roll up our sleeves and actually deploy something.; Ever tried to deploy a Large Language Model (LLM) and felt like you were assembling IKEA furniture with missing instructions? Today, I’ll walk you through the nuts and bolts—so you can get your models running in Azure with minimal head-scratching. Today, we’re going to take all that theory and put it into practice: deploying a code-wrapped LLM as a container in Azure. We’ll start simple (public IP), then get secure (private IP), and make sure you know what to watch out for at every step. Where We Are In The Series In Part 1, Register and Publish Models, we introduced code-wrapped LLMs and showed how you can register them in SAS Model Manager, then how to publish them as Docker images using SAS Container Runtime (SCR). In Part 2, SAS Agentic AI – Deploy and Score Models – The Big Picture, we compared deployment options, costs, and performance trade-offs in Azure. In Part 3.1, (that’s this post), let’s get hands-on: actual deployment and scoring scripts, with extra tips on security. Example As an example, let’s walk through deploying an open source model: the phi-3-mini-4k LLM code wrapper. Code-wrappers come with the SAS Agentic AI Accelerator code repository. Code-wrappers standardize LLM inputs and outputs, can be switched easily in agentic AI workflows and are easy to deploy, thanks to SCR. This phi-3-mini-4k model, developed and released by Microsoft, is a lightweight large language model designed for efficiency and quick responses—think of it as a compact, agile AI that doesn’t need a supercomputer to run. If you’re wondering about the quirky name “phi-3-mini-4k,” you’re not alone—it sounds like it could be R2-D2’s distant cousin from the Star Wars universe! There’s just something about AI and robotics that inspires these metallic, alphanumeric names. Maybe it’s a subtle nod to our sci-fi dreams, or perhaps it’s just because “Bob the Bot” doesn’t sound as futuristic or impressive. Either way, let’s see how to get our own “phi” up and running in the cloud—no droids, no bots required! Deploy to Azure Container Instances (Public IP) In Azure, that’s the fastest way to test or demo your LLM. It’s not meant for production or anything sensitive. Michael Goddard wrote about Deploying SAS Container Runtime models on Azure Container Instances. So far the code-wrapped LLM follow the same guidelines. Deployment Script For Azure deployment scripts, you can use the Azure Command Line Interface (CLI). # Variables to set CONT="myprefix-phi" # Name/DNS label for your container RG="my-resource-group" # Azure Resource Group ACR_NAME="myacr" # Azure Container Registry name IMAGE="phi_3_mini_4k" # Image name IMAGE_TAG="latest" # Image tag/version ACR_PASS=$(az acr credential show -n $ACR_NAME --query "passwords[0].value" -o tsv)# ACR password or service principal LOCATION="westus3" # Azure region – choose one that suits you az container create -n $CONT -g $RG \ --image "${ACR_NAME}.azurecr.io/$IMAGE:$IMAGE_TAG" \ --registry-username $ACR_NAME \ --registry-password $ACR_PASS \ --ports 80 8080 \ --protocol TCP \ --dns-name-label $CONT \ --location $LOCATION \ --cpu 4 \ --memory 16 What’s Happening? You’re spinning up a container with your LLM, exposing ports for API access (SCR needs 8080), and giving it enough juice (4 CPUs, 16 GB RAM) to keep things snappy. But it’s public! Anyone with the endpoint can poke your model. Select any image to see a larger version. Mobile users: To view the images, select the "Full" version at the bottom of the page. Scoring Script Time to see if your LLM is awake! Here’s a sample curl command to send a scoring request: # Variables curl -X POST "http://${CONT}.${LOCATION}.azurecontainer.io:8080/${IMAGE}" \ -H 'Content-Type: application/json' \ -d '{ "inputs": [ {"name":"userPrompt","value":"customer_name: X Y; loan_amount: 20000.0; customer_language: EN"}, {"name":"systemPrompt","value":"You are tasked with drafting an email to respond to a customer whose mortgage loan application has been accepted by the SAS AI Bank. You will be provided with three pieces of information: customer_name, loan_amount, customer_language. Use the provided customer name and loan amount to personalize the email."}, {"name":"options", "value":"{temperature:0.7,top_p:1,max_tokens:800}"} ] }' | jq What’s Going On Here? userPrompt: The data you want your LLM to use (customer name, loan amount, etc.). systemPrompt: The “instructions manual” for your LLM. Think of it like setting your model’s GPS—so it doesn’t drive off into the weeds and start talking about quantum computing when you just wanted an email template. options: Controls for creativity, length, and other LLM behaviors. These three inputs are set by the LLM code wrapper from the SAS Agentic AI Accelerator. Deploy with Private IP (VNET Integration) Suitable for internal, secure deployments—perfect for real workflows where you care about data privacy. Deployment Script # New Variables CONTP="myprefix-phi-private" # Name/DNS label for your container VNET="SAS-Viya-azure-vnet" # Virtual Network name SUBNET="llm-subnet" # Subnet name # Step 1: Create a dedicated subnet for containers within your existing VNET az network vnet subnet create \ --resource-group $RG \ --vnet-name $VNET \ --name $SUBNET \ --address-prefix 192.168.3.0/26 \ # adapt it to match your VNET ip range --delegations Microsoft.ContainerInstance/containerGroups # Deploy the container with a private IP az container create \ --resource-group $RG \ --name $CONTP \ --image "${ACR_NAME}.azurecr.io/$IMAGE:$IMAGE_TAG" \ --registry-username $ACR_NAME \ --registry-password $ACR_PASS \ --ports 80 8080 \ --protocol TCP \ --location $LOCATION \ --vnet $VNET \ --subnet $SUBNET \ --ip-address private \ --cpu 4 \ --memory 16 # Retrieve the Private IP of your container az container show --resource-group $RG --name $CONTP --query "ipAddress.ip" --output tsv Make sure your SAS Viya and the deployed containers subnet are on the same VNET! Otherwise, your scoring requests will be like postcards sent to a house with no mailbox. Scoring Script (Private IP Version) The only thing changing in the scoring is the usage of a container IP instead of a DNS label (FQDN) for the public IP container: CONT_IP=$(az container show --resource-group $RG --name $CONTP --query "ipAddress.ip" --output tsv) echo "CONT_IP=${CONT_IP}" curl -X POST "http://${CONT_IP}:8080/${IMAGE}" ... It’s just the same scoring request as before, just use the container’s private IP address instead of the container’s DNS from the public example. Security Corner Don’t send PII (that stands for Personally Identifiable Information) over public endpoints. Ever. (Seriously. Somebody might be listening to your traffic.) No HTTPS by default: Traffic isn’t encrypted—even inside a VNET. For extra-sensitive data, put additional controls in place (private endpoints, network security groups, etc.). Lock down your ACR credentials: Treat them like your Netflix password. Or better. Summary Public IP deployments are fast for testing, but risky for anything sensitive. VNET-integrated (private) deployments are more acceptable for real-world use. Security matters—enforce it from the start. Troubleshooting is part of the journey. Don’t let the first error stop you. Prompt engineering is your secret weapon for great LLM responses. What Should You Do Next? Test out both deployment methods—see which fits your needs. Experiment with prompt engineering to get the best responses from your LLM. Share your best (or worst!) LLM deployment stories in the comments below. If it’s an embarrassing (technical, please) story you can use the Anonymous option. Read the next post in the series (coming soon). Additional Resources SAS Agentic AI Accelerator – Register and Publish Models. SAS Agentic AI – Deploy and Score Models – The Big Picture. SAS Container Runtime – SAS Documentation. Want More Hands-On Guidance? SAS offers a full workshop with step-by-step exercises for deploying and scoring models using Agentic AI and SAS Viya on Azure. Access it on learn.sas.com in the SAS Decisioning Learning Subscription. This workshop environment provides step-by-step guidance and a bookable environment for creating agentic AI workflows. If you liked the post, give it a thumbs up! Please comment and tell us what you think about the AI Decisioning Assistant. For further guidance, reach out for assistance. Let us know how this solution works for you! Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎07-09-2025

@Meijonk @Woitake Azure new VPN diagnostic tool: thanks for the tip! haven't tried it lately, but sounds like a good debugging tool. Routes can be tricky I fully agree.

Bogdan_Teleuca · ‎06-17-2025

Welcome back to the SAS Agentic AI Accelerator series! Today we’ll explore how to deploy and score code-wrapped Large Language Models (LLMs) in Azure, then call them from Agentic AI workflows inside SAS Viya. To keep things clear, the topic is split into two parts: The Big Picture – a high-level overview with a short video and comparison tables that help you choose a deployment method. Azure is our example cloud. The Nitty-Gritty – a hands-on guide with deployment and scoring scripts. Where We Are In The Series In Part 1, Register and Publish Models, we introduced code-wrapped LLMs and showed how to publish them with the SAS Container Runtime (SCR). The end result was Docker images in a container registry. Part 2 — this post — covers the deployment options. Deployment and Scoring Overview After registering and publishing an LLM code wrapper, you can deploy it as a Docker image in various environments. Once deployed, you can score using the SAS Container Runtime API. Select any image to see a larger version. Mobile users: To view the images, select the "Full" version at the bottom of the page. Deployment Options Here’s a quick overview of the deployment options in the Azure cloud. That’s not an exhaustive list. This list reflects only what I tested: Deployment Option Use Case Scalability Ease of Setup Security Azure Container Instances Lightweight, quick starts Low Simple Public or private IP (HTTP only) Azure Container Apps Event-driven, auto-scaling Medium Managed Public IP (HTTPS) Azure Web Apps Managed containers hosting Medium Managed Public IP (HTTPS) Kubernetes Pods Large-scale, fully orchestrated High Complex (requires YAML, must manage node resources) Flexible: Private / public IP (HTTP or HTTPS) Containers on Virtual Machines Legacy or custom configurations Medium Moderate Flexible (private/public) Key Considerations for Each Deployment Option As you can see, there are so many options available in the Azure cloud. The following table should help you choose the one for you needs. Deployment Option Feasibility Advantages Limitations Azure Container Instances Ideal for small open source LLMs like phi-3-mini-4k (4 vCPUs, 16 GB RAM). Easiest to launch. Limited to 4 CPUs, 16 GB RAM. Azure Container Apps Middle ground between Container Instances and Kubernetes clusters. Built-in HTTPS ingress and auto-scale. Capped at 2 CPUs and 8 GB RAM; may cause out-of-memory errors. Azure Web Apps Simple to deploy and scale. Works for lighter workloads. Supports deployment slots. Resource limits can bottleneck performance. Adding resources doesn’t always improve performance. Kubernetes Pods Production-grade control over resources, scaling, and isolation. Fine-grained control over resources, scaling, and isolation. Requires Kubernetes skills (that’s what customers are always telling us). Containers on Virtual Machines Highly flexible for legacy systems or custom configurations. Complete control over CPU, RAM, and disk. Higher cost and operational effort. Pricing Comparison Any deployment choice, in a cloud, has a cost. To help you evaluate the cost of each deployment option, the table below summarizes typical daily costs based on Azure pricing estimates. These values may vary depending on region, container size, and configuration. The estimates assume low, infrequent traffic, a few requests per day from your Agentic AI workflow to your deployed LLM. Deployment Option Estimated Daily Cost Details Azure Container Instances ~$3–$10/day Cost depends on CPU and memory allocation (e.g., 2 CPUs and 8 GB of memory). Azure Container Apps ~$5–$12/day Includes management costs, ingress, and scalability features. Azure Web Apps ~$8–$15/day Managed service costs include app hosting and container runtime fees. Kubernetes Pods ~$10–$20/day Costs vary based on cluster size, node configuration, and resource requirements. Containers on Virtual Machines ~$15–$25/day Includes VM hosting fees, container runtime costs, and storage costs for legacy systems. Observed Price Comparison I compiled my experiments for one week, by using one of our own Azure tenants. Here’s the actual cost, per day, for a phi-3-mini-4k LLM deployment in Azure Container Instances, Container Apps, Web Apps, and Kubernetes Service. Next to it I highlighted the response time, in seconds. Deployment Type Cost Components Estimated Cost Per Day (USD) Response Time (sec) Container Instances Compute Costs $5.60 48.31 Container Apps Base Pricing $3.22 45 App Service Plans (Web Apps) Premium v3 P1mv3 $4.32 90 Premium v3 P3mv3 $18.48 25 Premium v3 P4mv3 $35.52 15 Premium v3 P5mv3 $71.24 10 Kubernetes Deployment - Extra Node Compute Costs (Standard_D4as_v5, 4 vCPUs, 16GB) $14.40 (approx.) 42 Disk Costs $2.16 Total (Compute + Disk) $16.56 (approx.) 42 Compute Costs (Standard_D16as_v5, 16 vCPUs, 64GB) $57.60 (approx.) 43 Disk Costs $2.16 Total (Compute + Disk) $59.76 (approx.) 43 Findings: Container Instances: Listed first due to its simplicity and lower cost for isolated deployments: To deploy the small open-source LLMs, qwen-25-05-b and phi-3-mini-4k we found that 4 CPUs and 16 GB of memory are more appropriate. And these are small models. To deploy a larger open-source LLM, such as a LLaMA 2-7B (Large Language Model Meta AI), you may require 16–32 GB and 4–8 vCPUs and lots of disk space, 20 – 50 GB. You could deploy it on 4 vCPUs and 16 GB RAM but that means your model may be quite constrained and the response time will be quite high, if you get a response at all.Container Instances: Listed first due to its simplicity and lower cost for isolated deployments: Container Apps: Second, as it offers scalable, event-driven microservices at a competitive cost. The scalability feature is interesting, allowing the app to spin more pods for concurrent scoring requests. Built-in ingress (HTTPS endpoints) and scalability make this an excellent choice for lightweight open source model deployments. Web Apps: Listed next, reflecting managed hosting options with varying performance tiers. Choose configurations that allow at least 4 vCPUs and 16 GB RAM. We tried scaling up the App Service Plan gradually. As you can observe, throwing more resources at a model, doesn’t proportionally reduce the response time. There’s a fine balance between cost and performance. You can only find it by experimenting. Kubernetes Deployment: Last, as it is best suited for complex workflows requiring high scalability and orchestration. For the phi-3-mini-4k LLM we added a dedicated node and deployed the container in a pod. Scaling the node size didn’t seem to influence the response time. Perhaps other parameters such as disk type an IOPS should be fine-tuned. More work is needed. Discussion Some customers avoid proprietary LLMs from OpenAI, Google, or Azure because their data would leave their premises (or their cloud). They ask a way to use open-source, on-prem LLMs. After a month of testing, I’ve learned the contest isn’t equal: Proprietary cloud LLMs almost always win on cost, latency, and accuracy. Self-hosting shifts all compute costs to you, so each request costs more and takes longer. High latency limits daily throughput, pushing the per-request price even higher. That premium is the trade-off for keeping data inside your own walls. Summary The SAS Agentic AI Accelerator lets you deploy code-wrapped LLMs almost anywhere: Azure services, Kubernetes, or standalone VMs. Use the tables above to balance cost, performance, and operational effort. Stay tuned for Part 3, where we’ll dig into deployment scripts, scoring calls, and security tips. Acknowledgements Thanks to Mike Goddard (@MichaelGoddard) for guidance on SAS Container Runtime Kubernetes deployments. Additional Resources SAS Agentic AI Accelerator – Register and Publish Models. SAS Container Runtime – SAS Documentation. How to Publish a SAS Model to Azure with SCR: A Start-to-Finish Guide. Workshop Environment Agentic AI – How to with SAS Viya workshop now available on learn.sas.com to SAS Customers in the SAS Decisioning Learning Subscription and SAS Employees. This workshop environment provides step-by-step guidance and a bookable environment for creating agentic AI workflows. If you liked the post, give it a thumbs up! Please comment and tell us what you think about the AI Decisioning Assistant. For further guidance, reach out for assistance. Let us know how this solution works for you! Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎06-16-2025

The SAS Agentic AI Accelerator is designed to help businesses integrate Generative AI into their workflows efficiently. By enabling the registration of models, including proprietary and on-premises LLMs, it allows you to wrap these models in code. These code-wrapped models can later be deployed and used seamlessly in SAS Intelligent Decisioning, SAS Studio, and other applications, ensuring flexibility and governance. This post, the first in a series, explores how registration and publishing set the foundation for scalable agentic AI workflows. Overview The SAS Agentic AI Accelerator has been a key topic at SAS Innovate 2025, showcasing innovative ways to build agentic AI workflows. Developed by a team of SAS experts, this accelerator leverages SAS Viya products and capabilities to bring Generative AI into practical, governed use cases. If you missed the sessions, here’s a quick introduction to what the accelerator offers: Why an Accelerator? The SAS Agentic AI Accelerator can help companies adopt Generative AI in a structured and agile way, drastically reducing the time from prototype to production. Key benefits include: Enabling business users to create Generative AI use cases using low/no-code solutions. Integrating external large language models (LLMs) such as OpenAI’s GPT, Google’s Gemini, or open-source models from Hugging Face into workflows and agents. Combining non-deterministic LLMs with deterministic models created in SAS or Python. Governing and controlling LLM usage in workflows for secure deployment and effective monitoring. SAS Agentic AI Accelerator Components A foundational model repository in SAS Model Manager. A prompt builder user interface. A prompt model repository. Agentic AI workflows in SAS Intelligent Decisioning. Monitoring dashboards in SAS Visual Analytics. Scripts and custom steps to interact with LLMs. The SAS Agentic AI Accelerator is evolving rapidly, with components being added, updated, or removed as we speak. Building Agentic AI Workflows To create your own workflows using the SAS Agentic AI Accelerator, follow these steps: Set up your SAS Viya environment: Ensure your environment is configured for the accelerator. Register and publish models: Use scripts to register models in SAS Model Manager and publish them from here. Deploy and score models: Prepare models for execution in target environments. Build workflows: Design agentic AI workflows using SAS Intelligent Decisioning. Deploy workflows: Publish and score workflows for production use. Integrate into applications: Embed workflows into enterprise systems. Monitor usage: Keep track of workflow performance and effectiveness. 1. SAS Viya Environment For demonstration purposes, the SAS Viya Enterprise 2025.03 stable and LTS versions were used, deployed on Azure Cloud with configurations such as Python, Kaniko (for SAS Container Runtime), and Azure publishing destinations. While fine-tuning the environment for model publishing can be challenging, it’s entirely achievable. 2. Register and Publish Models The SAS Agentic AI Accelerator includes a code repository that simplifies model registration and publishing. Large Language Models (LLMs) wrapped in code can be registered as models in SAS Model Manager. Registration from the Git repository is facilitated by a script. Code Repositories SAS_LLM_UCF (owner: David Weik). The primary and most up-to-date repository. The repository is private, available only to SAS employees. If/when it is offered publicly, I will post an update. Agentic AI - How to with SAS Viya (owner: Education (GEL)). A subset of the original repository designed for educational workshops is available with the workshop. Code-Wrapped Large Language Models Code wrappers serve as deployment instructions for Large Language Models (LLMs). When wrapped in code, LLMs can be registered as models in SAS Model Manager. These wrappers standardize inputs and outputs, making it easier to integrate or replace models in workflows, regardless of their type or source. Why Code Wrappers Matter By standardizing inputs and outputs, code wrappers simplify the deployment process and ensure consistency and reusability across workflows. However, registered models cannot be scored directly in SAS Viya. They must first be deployed, typically in a container, such as a Docker environment, to enable execution and scoring. Governance Models registered in SAS Model Manager are governed, ensuring: Version control. Permission settings for access and usage. Tracking of model publishing destinations. Detailed documentation through model cards. Publishing Models When models are published to a container destination, such as Azure, the code wrappers are transformed into Docker images. These images are portable, allowing deployment across different cloud platforms or on-premises environments. Publishing also enables models to be used via REST APIs, which are essential for scalable integration into agentic AI workflows. REST APIs facilitate real-time communication between systems, ensuring seamless interaction with enterprise applications. While this post provides an overview of model publishing, we’ll dive deeper into the deployment process in a future article. Read SAS Agentic AI – Deploy and Score Models – The Big Picture where we explore how to deploy and score code-wrapped Large Language Models (LLMs) in Azure. Conclusion The SAS Agentic AI Accelerator simplifies the integration of Large Language Models into workflows through the use of code wrappers. These wrappers provide standardized inputs and outputs, allowing models to be registered, governed, and published as Docker images for deployment across various platforms. With the capabilities of SAS Model Manager, you can: Govern models effectively with version control and permissions. Publish models to containers for scalable, enterprise-grade applications. Acknowledgement Special thanks to: David Weik for invaluable insights and explanations. Xin Ru Lee (@XinRu) for sharing and assistance on numerous occasions. Workshop Environment Agentic AI – How to with SAS Viya workshop is now available on learn.sas.com to SAS Customers and SAS Employees. This workshop environment provides step-by-step guidance and a bookable environment for creating agentic AI workflows. For SAS Customers, the workshop is available in the SAS Decisioning Learning Subscription. Additional Resources SAS Agentic AI – Deploy and Score Models – The Big Picture. SAS Video Portal - From Idea to Production With LLMs and SAS Viya. If you liked the post, give it a thumbs up! Please comment and tell us what you think about the AI Decisioning Assistant. For further guidance, reach out for assistance. Let us know how this solution works for you! Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎04-13-2025

In a previous post, From Chat to Decision: Building an AI Assistant with SAS Intelligent Decisioning and Azure, we demonstrated how to combine conversational AI with decision-making logic to calculate customer risk scores. This post dives deeper into the mechanics of that solution, showing how SAS Intelligent Decisioning, Azure OpenAI, and Azure Logic Apps work together to deliver intelligent, automated decisions. Overview The AI assistant collects user inputs, such as demographic and financial data, processes them via an Azure Logic app, and sends a scoring request to the decision logic deployed in an Azure Container Instance (ACI). This decision logic calculates the risk score, which is then returned to the AI assistant in a user-friendly format. Solution Architecture Components Here’s how the components interact in this solution: AI Assistant in Azure AI Foundry: Collects user inputs via a conversational interface and generates a structured JSON payload that is sent to an Azure Logic app. Select any image to see a larger version. Mobile users: To view the images, select the "Full" version at the bottom of the page. Azure Logic app: Acts as middleware, handling the JSON payload from the AI assistant, sending it to the decision logic, running in a container, and formatting the response. SAS Intelligent Decisioning: The core business logic is built and tested here. It calculates the risk score based on the inputs provided. The decision is then published to Azure, where it becomes a Docker image. Azure Container Instance (ACI): Hosts the decision logic as a running Docker container, making it scalable and independent of a full SAS Viya environment. Key Features and Benefits This solution brings several innovations to the table: Deploying a SAS Intelligent Decisioning decision as a Docker image allows businesses to scale decision-making independently of SAS Viya environments. Azure OpenAI provides a powerful framework for building AI assistants that deliver actionable insights through natural language interactions. The solution takes advantage of a recent integration between Azure OpenAI and Azure Logic Apps. Azure Logic Apps enable smooth communication between the AI assistant, and the container instance where the decision logic is deployed. It provides low-code-no-code tools, without requiring custom code for complex workflows. Inputs for risk calculations are collected using conversations, formatted into JSON behind the scenes, results are converted to text, making easier for users to interact. Step-by-Step Guide to Building the Solution 1. Develop and Test the Decision Logic in SAS Intelligent Decisioning Start by building the decision logic that calculates the risk score. Notice the expected inputs and the returned outputs: 2. Create an Azure Publishing Destination To publish the decision logic for deployment: Create an Azure Container Registry (ACR) to store the decision logic as a Docker image. Configure the Azure publishing destination in SAS Intelligent Decisioning. For detailed steps, refer to How to Publish a SAS Model to Azure with SCR: A Start-to-Finish Guide. Publish the decision to the ACR. For example, it will appear as repository: risk_rating2_0:latest. 3. Deploy the Decision Logic to an Azure Container Instance Once the Docker image is in the ACR: Create an Azure Container Instance (ACI) and deploy the image. Note the container endpoint (e.g., http://CONT.LOCATION.azurecontainer.io:8080/risk_rating2_0) and the payload format required for scoring. To test the container, you can use the following curl command: curl --location --request POST "http://${CONT}.${LOCATION}.azurecontainer.io:8080/${IMAGE}" \ --header 'Content-Type: application/json' \ --header 'Accept: application/json' \ --data-raw '{"inputs" : [{"name": "age", "value" : 30}, {"name": "credit_score", "value" : 720}, {"name": "dti", "value": 40}, {"name": "employment_status", "value": "full-time"}, {"name": "income", "value" : 45000} ] }' 4. Integrate with Azure Logic Apps Azure Logic Apps bridge the gap between the assistant and the decision logic. Here's how to set it up: Trigger: HTTP Request Configure the Logic App to trigger on an HTTP request. The request body should adhere to a specific JSON schema, which includes user inputs such as age, employment status, income, credit score, and DTI. In Azure AI Foundry, when you attach to the AI assistant the Azure Logic App as a function: A JSON schema is generated: When the Logic app receives a JSON payload that adheres to that schema, the rest of the logic is triggered. Action: HTTP Request Send a POST request to the container endpoint using the JSON payload. Action: Parse JSON Response Use a predefined schema to parse the JSON response from the container. Action: Initialize Variable (Optional) The parsed outputs array from the JSON response is stored in a variable named risk_rating_outputs. This variable is of type array. Action: Create CSV Table (Optional) Extract the variables from the array and convert to a CSV format for readability. Action: Return Response Send the processed response back to the AI assistant. 5. Configure the AI Assistant in Azure OpenAI Foundry Create an Azure OpenAI resource in your Azure subscription. Deploy a large language model in Azure OpenAI. We used GPT-4o. Create an AI assistant with instructions tailored to the risk-rating task. Sample Instructions for the Assistant: You are an AI assistant that helps people calculate risk rating depending on inputs. Greet with: 'Hi, I can calculate your risk rating. For a precise calculation, I am going to need some inputs.' You will need some inputs for the calculation. Ask each person the following: 'What is your age?', 'What is your employment status? (choose only from this list: "unemployed", "part-time", "full-time")', 'What is your yearly income?', 'What is your current credit score?', 'What is your debt to income ratio or 'DTI'? Collect those values, keep track of the value you collected or not. Pass the collected values to the __ALA__sas-decisioning function. The function returns a csv table. Extract risk_rating (a string) from the raw csv table outputs. Provide a text with the calculated risk_rating. Provide additional data used for risk calculation, if asked. Answer with a message: 'Based on the inputs you provided, SAS Intelligent Decisioning calculated your risk rating: …' If the question is not related to a risk rating calculation say: "Please contact your Credit Officer. I do not have knowledge of what you are asking, or I am not authorized to respond." The assistant is programmed with instruction-based prompt engineering to ensure it gathers all required inputs systematically. It integrates with the Logic App via a function (e.g., __ALA__sas-decisioning) that passes the inputs to the container instance and processes the results. The instructions are an example of instruction-based prompt engineering, ensuring the assistant operates within a predefined domain and delivers accurate, task-specific responses while avoiding ambiguity or scope creep. It uses: Step-by-step task definition. Behavioral constraints. Structured inputs/outputs. Function integration: The assistant is programmed to interact with an external function (__ALA__sas-decisioning) and utilize its results for the task. Testing the Solution Before deploying the solution in production, test it end-to-end: Logic app testing: Use the following payload to test the Logic app: { "parameters": { "properties": { "age": 30, "employment_status": "full-time", "income": 45000, "credit_score": 720, "dti": 40 } } } Verify that the Logic app triggers correctly, sends the payload to the container instance, and returns a structured response: AI assistant testing: Interact with the Azure OpenAI assistant to ensure it collects inputs accurately, passes them to the Logic App, and displays the risk rating. Example Response: "Based on the inputs you provided, SAS Intelligent Decisioning calculated your risk rating: Low Risk." Ask details about the calculation: "Explain how you calculated my Low Risk. What is behind?" Conclusion This post has shown how to: Build an AI assistant that calculates risk ratings using conversational AI, SAS Intelligent Decisioning, and Azure resources, using a low-code, no-code approach. Use Azure Logic Apps to integrate API responses into workflows. By adopting a similar approach, businesses can deliver personalized, decision-driven insights at scale. Whether for financial risk assessment or other domains, this solution demonstrates how conversational AI can be combined with powerful decision-making tools to create intelligent, automated workflows. Additional Resources Video demonstrating the approach: From Chat to Decision: Building an AI Assistant with SAS Intelligent Decisioning and Azure. I was inspired by the following resources: Call Azure Logic Apps as Functions Using Azure OpenAI Assistants. Azure OpenAI Assistants API with Logic Apps. If you liked the post, give it a thumbs up! Please comment and tell us what you think about the AI Decisioning Assistant. For further guidance, reach out for assistance. Let us know how this solution works for you! Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎03-20-2025

What if smarter decisions came from natural conversations? Imagine an AI assistant that not only listens to you but drives critical workflows—flawlessly calculating risk ratings, explaining complex outcomes, and ensuring decisions align with trusted, replicable standards. By integrating Azure OpenAI conversational capabilities with SAS Intelligent Decisioning, the approach in the post blends the power of conversational AI and advanced analytics into a unified experience. From secure login to rule-based decisioning, this AI-powered system tackles complex decision-making with speed, transparency, and reliability. Want to know how this changes the game? Watch the demo, explore the innovation, and discover how smarter decisions may reshape the future! The Essential Imagine chatting with a GPT model—quick responses, dynamic interactions, and intuitive conversations. But when it comes to calculating something as critical as a risk rating, would you trust it blindly? Of course not! Instead, picture this: redirecting the conversation at the perfect moment to a robust internal system, like SAS Intelligent Decisioning, which handles risk calculations with proven reliability and precision. By integrating powerful AI tools like Azure OpenAI Assistants and SAS Intelligent Decisioning through a few Python-based applications, organizations gain the ability to automate complex decision-making workflows without much effort. Not to mention that AI becomes Trustworthy AI. Most Important Points Watch the demonstration—hopefully, the approach becomes self-explanatory. If it doesn’t, well, I guess I’ll need to sharpen my explanation skills! 😊 Understanding why this solution is impactful requires looking at its main features: Secure Login: Credit officers, like Shannon in the demo, log in securely using Azure credentials to ensure that sensitive data stays protected. Conversational AI Assistant: Users interact conversationally via Azure OpenAI, submitting financial and personal inputs such as age, income, credit score, and debt-to-income (DTI) ratio. Integrated Decisioning: The AI assistant passes the collected information to SAS Intelligent Decisioning, which calculates risk ratings using advanced rule sets and thresholds. Transparency Mode: Upon request, the assistant explains the exact calculations for each variable, ensuring credit officers can confidently justify decisions to customers. Scalable Deployment: Front-end and back-end applications (built with Python and Streamlit) deployed on Azure enable flexibility for cloud-based integration. This solution isn’t just limited to banking—it can help transform decision-making across other industries. What’s New and Novel About This Approach What makes this solution stand out is its emphasis on transparency and its ability to merge Azure OpenAI's conversational AI with real-time decisions powered by SAS Intelligent Decisioning. We explored an integration using Azure Logic Apps and a SAS Container Runtime in From Chat to Decision: Building an AI Assistant with SAS Intelligent Decisioning and Azure. However, this time, the integration is direct, the AI Assistant calls the decision deployed in SAS Micro Analytic Service, in SAS Viya. Here’s what sets it apart: Dynamic Interactions: The AI assistant doesn’t just chat—it collects and processes user inputs, performs complex calculations behind the scenes, and delivers clear explanations of the outcomes. Implicit Data Management: Users can provide inputs in nearly any format. Transforming that data into the correct JSON structure to score the decision is handled entirely by the Azure AI Assistant. It even translates JSON responses back into user-friendly paragraphs, eliminating the need for traditional ETL processes. Advanced Rule-Based Modeling: SAS Intelligent Decisioning enables trustable, replicable, governed risk calculations based on predefined rules, or custom logic. Transparency in Decision-Making: The system doesn’t just compute—it explains. By leveraging decision metadata, the assistant breaks down how calculations are derived, ensuring users understand the reasoning behind every outcome. Versatility Across Industries: While demonstrated in banking, this approach is adaptable to sectors like healthcare, insurance, or any field where risk and decision modeling are essential. Summary of the Components and Their Role Here’s a breakdown of the components involved: Frontend App (Streamlit): Provides an intuitive interface, backed by Python, connecting users with the AI-driven decision system. Backend App (FastAPI): Written in Python, the backend bridges Azure and SAS integrations, ensuring authentication, communication and data exchange. Azure OpenAI Assistant with function calling feature: Acts as the user-facing interface, facilitating conversational interactions, gathering inputs, and ensuring clear communication. SAS Intelligent Decisioning: Processes risk calculations using predefined rules deployed to the SAS Micro Analytic Service, which performs high-speed analytics accessible via REST APIs. Secure Infrastructure: By leveraging Azure authentication and SAS Viya authentication, sensitive customer data is safeguarded during processing. Conclusion The integration of an AI Assistant with a SAS Intelligent Decisioning decision deployed via SAS Micro Analytic Service, could very well represent the future of decision-making in one alternative universe. By combining conversational AI with advanced analytics, it delivers insights that are transparent, reliable, and lightning-fast. The demo highlighted how a credit officer can securely interact with the assistant, submit input data, receive a calculated risk rating, and even request detailed explanations—all seamlessly powered by Azure OpenAI and SAS Intelligent Decisioning. We also took a step further, by packaging it all as sleek web apps deployed in the Azure cloud for ultimate flexibility and scalability. Thank you for exploring this exciting technology with me. In future posts, we will explore what is hidden behind the frontend and the backend app. Stay tuned for more innovations with SAS Viya. If you liked the post, give it a thumbs up! Please comment and tell us what you think about the AI Decisioning Assistant. If you wish to get more information, please write me an email. Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎03-16-2025

Imagine a custom step in a SAS Studio flow that generates another custom step to handle specific tasks. A custom step is a file, therefore why not use a large language model (LLM) like GPT-4o from Azure OpenAI to generate one? Today, we’ll explore how to leverage this technology to create custom steps for data processing, using Python or SAS logic. Overview: The LLM Custom Step Generator The LLM - Custom Step Generator is a custom step that uses Azure OpenAI's GPT-4o to create fully functional custom steps. These steps can be used in SAS Studio flow to perform specific tasks, such as data anonymization, merging tables, or generating detailed documentation. The custom step has been published to the SAS Studio Custom Steps Public GitHub Repository. You can find it under LLM - Custom Step Generator with Azure OpenAI. The process is simple: Define the logic of the custom step through a detailed prompt. Provide the required environment and configuration files to establish a connection with your Azure OpenAI service and ground the model. Run the generator to output the custom step code. The result? A fully functional custom step tailored to your specific requirements. Watch the video demonstration to find out more. Example 1: Custom Step Anonymizing Personal Data in a Text File You have a text file containing sensitive personal information, such as names, email addresses, and physical addresses. Your goal is to anonymize this data using a custom step. Select any image to see a larger version. Mobile users: To view the images, select the "Full" version at the bottom of the page. Steps to Generate the Custom Step Define the Inputs: Prompt: Write a detailed description of the custom step logic. For instance: Create a custom step that reads an input file containing PIIs, identifies the data containing PIIs and anonymizes it. 1. input is a csv file input.csv. 2. output is output.csv. 3. the logic is written in Python. Provide the Prompt UI, the program and the full step file. .env File: This file contains the environment variables needed to access the Azure OpenAI API. Example: AZURE_OAI_ENDPOINT='https://my_endpoint.openai.azure.com/' # change my_endpoint AZURE_OAI_KEY='my_api_key' # change my_api_key AZURE_OAI_DEPLOYMENT='gpt-4o' System Message File: A file providing context to the LLM about what a custom step is, along with examples and guidelines. This is really the “secret sauce in the recipe”. In a nutshell it describes what a custom step is, provides a few Prompt UI, Programs written in SAS and Python and tweaks the output with important instructions. Define the Output Location: Specify where the generated step code should be saved. Run the Generator: Execute the LLM - Custom Step Generator with Azure OpenAI. The process takes about 15–30 seconds, depending on the complexity of the prompt. Review the Output: The generator produces a file containing the custom step code. This file includes the Python logic for anonymizing personal data, along with a prompt UI for configuring inputs and outputs. Test the Custom Step: Save the generated code as a .step file. Add the custom step to a workflow, select the input text file, and specify the output file. Run the workflow and verify the results. The custom step successfully anonymized the text file by replacing sensitive information with hashed values or other anonymized strings. The process demonstrated the accuracy of the LLM in generating Python logic based on the provided instructions. Example 2: Classical Data Management with SAS Logic You need to perform classic data management tasks, such as merging two tables, calculating the top-selling product per month, and creating a summary table of top products. Steps to Generate the Custom Step Define the Inputs: Prompt: Provide a detailed description of the logic: Create a custom step using SAS logic. The step has two table inputs, for example SASDM.PRDSAL2 and SASDM.PRDSAL3. The logic will merge the two tables. Then it will summarize the product sales by YEAR, MONTH, PRODUCT and sum up the ACTUAL sales. It will then create another data set NATIONAL_SALES in SASDM listing by YEAR, MONTH create a new column CHAMPION_PRODUCT equal with the top selling product. Use the same .env and system message files as in the first example. Run the Generator: Execute the generator with the updated prompt. Review the Output: The generator produces a .step file containing the SAS logic for the specified tasks. The step includes input configurations for the source tables and an output configuration for the result table. Test the Custom Step: Add the generated step to a workflow. Configure the input and output tables. Run the workflow and verify the results. Results The custom step successfully merged the tables, calculated the top-selling products, and created the summary table. The SAS logic was accurate and aligned with the prompt instructions. Key Components To replicate these examples, ensure you have the following: Azure OpenAI Resource: Create an Azure OpenAI resource and deploy a model like GPT-4o. Environment Configuration (.env File): Include the necessary API endpoint, key, and deployment name in a .env file. System Message File: Provide detailed instructions and examples to guide the LLM in generating the custom step. Python Dependencies: Install python-dotenv and requests to handle environment variables and API requests. Why Use the LLM Custom Step Generator? Efficiency: Automates the creation of custom steps, saving time and effort. Flexibility: Supports multiple programming languages, including Python and SAS. Accuracy: Leverages the power of GPT-4o to generate precise and (in most of the instances) functional code. Scalability: Can be used to generate a wide range of custom steps for various tasks. Where to Find the Custom Step The custom step is in the process of being published to the LLM - Custom Step Generator with Azure OpenAI, SAS Studio Custom Steps Public GitHub Repository. We will publish an update when it's done. Thank you for your patience. Conclusion The ability to generate custom steps using Azure OpenAI GPT-4o opens up new possibilities for data management, automation, and innovation. Whether you’re anonymizing data, managing tables, or documenting workflows, the LLM - Custom Step Generator provides a powerful and flexible solution. The examples presented here are just the beginning—what custom steps will you create next? Thank you for exploring this exciting technology with me. Stay tuned for more innovations in Data Management with SAS Viya. Thank you for your time reading this post. If you liked the post, give it a thumbs up! Please comment and tell us what you think about the LLM custom step generator. If you wish to get more information, please write me an email. Find more articles from SAS Global Enablement and Learning here.

Bogdan_Teleuca · ‎03-16-2025

@NickSmisdom I am not sure why the public key has to be described in SAS Viya. Anyway, this posts deserves a "review", as a few things changed in regards to authentication. Will put it on my to-do.

Online Status	Offline
Date Last Visited	a week ago

Re: Can GPT-5 Build SAS Studio Flows? A Hands-On Test Drive

Can GPT-5 Build SAS Studio Flows? A Hands-On Test Drive

Re: Go with the Job Flow in SAS Viya 3.5

Re: Go with the Job Flow in SAS Viya 3.5

Re: SAS Agentic AI – Deploy and Score Models – Kubernetes

SAS Agentic AI – Build Workflows in SAS Intelligent Decisioning

SAS Agentic AI – Deploy and Score Models – Kubernetes

SAS Agentic AI – Deploy and Score Models – Apps

Re: All about CORS and CSRF for developing web applications with the S...

Re: Configure Cross-Origin Resource Sharing for SAS Viya for REST API’...

AI Agents and Agentic AI: What’s the Difference?

A ModelContextProtocol Server(mcp) for Scoring with SAS Viya

All about CORS and CSRF for developing web applications with the SAS V...

Configure Cross-Origin Resource Sharing for SAS Viya for REST API’s an...

Using Azure OpenAI GPT Models in SAS Viya

SAS Agentic AI – Deploy and Score Models – Kubernetes

SAS Agentic AI – Deploy and Score Models – Containers

SAS Agentic AI – Build Workflows in SAS Intelligent Decisioning

SAS Agentic AI Accelerator – Register and Publish Models

SAS Agentic AI – Deploy and Score Models – Apps

Re: Can GPT-5 Build SAS Studio Flows? A Hands-On Test Drive

Re: Go with the Job Flow in SAS Viya 3.5

Re: SAS Agentic AI – Deploy and Score Models – Kubernetes

SAS Agentic AI – Build Workflows in SAS Intelligent Decisioning

SAS Agentic AI – Deploy and Score Models – Apps

Re: All about CORS and CSRF for developing web applications with the S...

Re: Configure Cross-Origin Resource Sharing for SAS Viya for REST API’...

SAS Agentic AI – Deploy and Score Models – Containers

Re: How to Connect SAS Viya in Azure to On-Prem with VPN Gateways - Pa...

SAS Agentic AI – Deploy and Score Models – The Big Picture

SAS Agentic AI Accelerator – Register and Publish Models

From Chat to Decision: A Blueprint for Intelligent AI Assistants

Smart Conversations, Smarter Decisions: Integrating SAS Intelligent De...

LLM Custom Step Generator in SAS Studio

Re: How to Clone a GitHub Repository in SAS Studio Using SSH

SAS Global Forum 2020