Welcome back to the SAS Agentic AI Accelerator series! In the previous posts, we deployed Large Language Models (LLMs) as secure, scalable services on Azure, using containers, managed apps, and Kubernetes. We then built agentic AI workflows.
Now it’s time to deploy something smarter.
In this post, we’ll take a SAS agent (an agentic AI workflow built in SAS Intelligent Decisioning), publish it as a container image to Azure Container Registry (ACR), deploy it to Kubernetes, and score it through a secure HTTPS endpoint.
By the end of this post, you will have a SAS agent running in production.
Before you start deploying, let’s look at what’s actually running.
How the Pieces Fit Together:
From the outside, it’s just a REST API.
Inside, it’s a fully governed decisioning system.
Suppose you developed the following agentic AI workflow: it assesses a loan request, formulates an approval or rejection message using LLM calls, assesses the sentiment of the response, and decides whether a human should review the message:
Assumption: you have configured a container publishing destination, such as Azure, in SAS Viya.
Start by publishing the Agentic AI workflow as a container image.
Steps in SAS Intelligent Decisioning:
After publishing completes, a new container image appears in your Azure Container Registry.
At this point, the SAS agent logic lives inside the container image.
Assumption: TLS certificates are already configured as Kubernetes secrets.
Rather than repeating those setup steps here, we’ll treat them as prerequisites.
Start here: SAS Agentic AI – Deploy and Score Models – Kubernetes, and complete the steps from “TLS Certificates Briefly” through “Create Your Deployment YAML”.
The deployment YAML:
The YAML is almost identical to the LLM deployment to Kubernetes from the previous post. The key difference: traffic remains encrypted end-to-end (Ingress → Service → Pod), eliminating the clear-text hop on port 8080. This matters because sensitive scoring requests and responses stay protected all the way to the container.
# Variables
RG=Resource_group
INGRESS_HOST=SAS_Viya_Ingress
echo $INGRESS_HOST
az login
ACR_NAME=Your_Azure_Container_Registry
# LLM image must be stored here as a container image
az acr login --name $ACR_NAME
## List ACR repositories or container images
az acr repository list --name $ACR_NAME --output table
## The variables are called LLM_, but they refer to the sas-agent.
## We're just reusing the template published in the previous post.
LLM=sas_agent1_0
LLMDASH=${LLM//_/-}
echo $LLM && echo $LLMDASH
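The underscore-to-dash substitution above is not cosmetic: Kubernetes object names must be valid DNS-1123 labels (lowercase alphanumerics and "-" only), so a name like sas_agent1_0 would be rejected. A quick local check, sketched with the same variable names as above:

```shell
# Kubernetes object names must be DNS-1123 labels (lowercase
# alphanumerics and '-'), which is why underscores are replaced.
LLM=sas_agent1_0
LLMDASH=${LLM//_/-}
if [[ $LLMDASH =~ ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ ]]; then
  echo "$LLMDASH is a valid Kubernetes name"
else
  echo "$LLMDASH is NOT a valid Kubernetes name"
fi
```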
# Create the deployment YAML file
cat > ~/project/deploy/models/${LLMDASH}-tls-deployment.yaml <<EOF
# ${LLMDASH} model deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: ${LLMDASH}
    workload/class: models
  name: ${LLMDASH}
spec:
  # modify replicas to support the requirements
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: ${LLMDASH}
  template:
    metadata:
      labels:
        app: ${LLMDASH}
        app.kubernetes.io/name: ${LLMDASH}
        workload/class: models
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.azure.com/mode
                operator: NotIn
                values:
                - system
              - key: node.kubernetes.io/name
                operator: In
                values:
                - llm
      containers:
      - name: ${LLMDASH}
        image: ${ACR_NAME}.azurecr.io/${LLM}:latest
        imagePullPolicy: Always # IfNotPresent or Always
        resources:
          requests: # Minimum amount of resources requested
            cpu: 1
            memory: 8Gi
          limits: # Maximum amount of resources allowed
            cpu: 4
            memory: 16Gi
        ports:
        - containerPort: 8080
          name: http # Name the port "http"
        - containerPort: 8443
          name: https # Name the port "https"
        env:
        - name: SAS_SCR_SSL_ENABLED
          value: "true"
        - name: SAS_SCR_SSL_CERTIFICATE
          value: /secrets/tls.crt
        - name: SAS_SCR_SSL_KEY
          value: /secrets/tls.key
        - name: SAS_SCR_LOG_LEVEL_SCR_IO
          value: TRACE
        volumeMounts:
        - name: tls
          mountPath: /secrets
      volumes:
      - name: tls
        secret:
          secretName: scr-certificate
          items: # Explicitly define the keys to mount
          - key: tls.crt
            path: tls.crt
          - key: tls.key
            path: tls.key
      tolerations:
      - key: workload/class
        operator: Equal
        value: models
        effect: NoSchedule
      - key: workload
        operator: Equal
        value: llm
        effect: NoSchedule
---
# TLS service definition
apiVersion: v1
kind: Service
metadata:
  name: ${LLMDASH}-tls-svc
  labels:
    app.kubernetes.io/name: ${LLMDASH}-tls-svc
spec:
  selector:
    app.kubernetes.io/name: ${LLMDASH}
    workload/class: models
  ports:
  - name: ${LLMDASH}-https
    port: 443
    protocol: TCP
    targetPort: 8443
  type: ClusterIP
---
# TLS ingress definition
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${LLMDASH}-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
  labels:
    app.kubernetes.io/name: ${LLMDASH}-ingress
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - ${INGRESS_HOST}
    secretName: scr-certificate
  rules:
  - host: ${INGRESS_HOST}
    http:
      paths:
      - path: /${LLM}
        pathType: Prefix
        backend:
          service:
            name: ${LLMDASH}-tls-svc
            port:
              number: 443
EOF
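A subtlety worth verifying: the `${...}` placeholders inside the heredoc only expand when the delimiter is unquoted (`<<EOF`); with a quoted delimiter (`<<'EOF'`) the literal placeholders land in the file and kubectl rejects the manifest. A minimal local sketch of the check, using a hypothetical registry name and no cluster:

```shell
# Demonstrate heredoc expansion, then verify that no literal ${...}
# placeholders remain in the generated file.
LLM=sas_agent1_0
LLMDASH=${LLM//_/-}
ACR_NAME=myregistry   # hypothetical registry name
FILE=$(mktemp)
cat > "$FILE" <<EOF
name: ${LLMDASH}
image: ${ACR_NAME}.azurecr.io/${LLM}:latest
EOF
if grep -q '\${' "$FILE"; then
  echo "WARNING: unexpanded variables remain in $FILE"
else
  echo "Manifest fully expanded"
fi
```

The same `grep` check can be run against the real manifest under ~/project/deploy/models/ before applying it.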
Deploy your model:
# Deploy (apply)
kubectl apply -f ~/project/deploy/models/${LLMDASH}-tls-deployment.yaml -n llm
kubectl get pods -n llm -o wide
kubectl get pods -n ingress-nginx
kubectl get svc -n llm
kubectl get ingress -n llm
LLM_POD_NAME=$(kubectl get pods -n llm --no-headers | awk '$1 ~ /^sas-agent/ {print $1; exit}')
# Loop until the pod is Ready
while true; do
STATUS=$(kubectl get pod $LLM_POD_NAME -n llm -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
if [ "$STATUS" == "True" ]; then
echo "Pod $LLM_POD_NAME is Ready."
break
else
echo "Waiting for pod $LLM_POD_NAME to become Ready..."
sleep 5
fi
done
# Check logs when ready
kubectl logs $LLM_POD_NAME -n llm
Once the pod is ready, the SAS agent is live and ready to take scoring requests!
The SAS agent is now available as a secure HTTPS endpoint.
With everything live, you can send HTTPS requests to your Kubernetes ingress endpoint and watch your SAS agent produce its magic!
# Score the SAS Agent
echo "Scoring URL: https://${INGRESS_HOST}/${LLM}"
curl --location --request POST "https://${INGRESS_HOST}/${LLM}" --header 'Content-Type: application/json' --header 'Accept: application/json' --data-raw '{"inputs":
[
{"name":"customer_id","value": 1012},
{"name":"customer_name","value": "Robert Little"},
{"name":"customer_language","value": "EN"},
{"name":"BAD","value": 0},
{"name":"LOAN","value": 2000},
{"name":"CLAGE","value": 147.133},
{"name":"CLNO","value": 9},
{"name":"DEBTINC","value": 19},
{"name":"DELINQ","value": 0},
{"name":"DEROG","value": 0},
{"name":"JOB","value": "Office"},
{"name":"MORTDUE","value": 64536},
{"name":"NINQ","value": 1},
{"name":"REASON","value": "HomeImp"},
{"name":"VALUE","value": 87400},
{"name":"YOJ","value": 11},
{"name":"high_value","value": 1}
]}' | jq
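Hand-writing JSON inside a shell string is easy to get wrong (quoting, commas, escaping). As an alternative sketch, assuming `jq` is available (it is already used above to pretty-print the response), the payload can be assembled programmatically; the names and values here mirror three of the hypothetical inputs from the request:

```shell
# Build the "inputs" payload with jq so quoting and escaping are
# handled automatically (abbreviated to three of the input variables).
PAYLOAD=$(jq -n \
  --argjson customer_id 1012 \
  --arg customer_name "Robert Little" \
  --argjson LOAN 2000 \
  '{inputs: [
     {name: "customer_id",   value: $customer_id},
     {name: "customer_name", value: $customer_name},
     {name: "LOAN",          value: $LOAN}
   ]}')
echo "$PAYLOAD"
```

The scoring call then becomes `curl ... --data-raw "$PAYLOAD"`, with no manual escaping of the string values.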
The response includes the agent’s outputs: the approval or rejection decision, the generated customer message, its sentiment, and whether a human should review it. All from a single API call.
If you get a response, such as the following sample, congratulations! You’ve just deployed a secure, scalable SAS agent using Kubernetes.
With this deployment, you’ve achieved something important:
This is the natural evolution from experimenting with LLMs to operationalizing Agentic AI.
SAS Viya handles the intelligence.
Kubernetes handles the scale and security.
You get a clean, auditable API that can power real business decisions.
In the world of Agentic AI, this is what production looks like.
Happy deploying!
Thanks to Michael Goddard for sharing his time and resources.
SAS offers a full workshop in the SAS Decisioning Learning Subscription, with step-by-step exercises for deploying and scoring models using Agentic AI and SAS Viya on Azure. Access it on learn.sas.com, where you can also book an environment for creating agentic AI workflows.
If you need further guidance, reach out for assistance.
Find more articles from SAS Global Enablement and Learning here.
The rapid growth of AI technologies is driving an AI skills gap and demand for AI talent. Ready to grow your AI literacy? SAS offers free ways to get started for beginners, business leaders, and analytics professionals of all skill levels. Your future self will thank you.