
SAS Agentic AI – Deploy and Score a SAS Agent on Kubernetes


Welcome back to the SAS Agentic AI Accelerator series! In the previous posts, we deployed Large Language Models (LLMs) as secure, scalable services on Azure, using containers, managed apps, and Kubernetes. We then built agentic AI workflows.

 

Now it’s time to deploy something smarter.

 

In this post, we’ll take a SAS agent, an agentic AI workflow built in SAS Intelligent Decisioning, publish it as a container image to Azure Container Registry (ACR), deploy it to Kubernetes, and score it through a secure HTTPS endpoint.

 

At the end of the post, you will have a SAS agent running in production.

 

 

Where You Are In The Series

 

 

 

Architecture Overview

 

Before you start deploying, let’s look at what’s actually running.

 

How the Pieces Fit Together:

 

  • SAS Intelligent Decisioning: Authors and governs the Agentic AI workflow (rules, models, LLM calls). Allows workflow publishing as a container image.
  • Azure Container Registry (ACR): Stores the published SAS agent as a container image.
  • Azure Kubernetes Service (AKS): Runs the SAS agent container in a pod. The example here is using AKS, but it can also be deployed in an on-prem Kubernetes cluster.
  • Ingress with TLS: Exposes the agent as a secure HTTPS REST endpoint.
  • SAS Container Runtime (SCR): Executes the agent workflow deployed to a pod and handles scoring requests.

 

From the outside, it’s just a REST API.

Inside, it’s a fully governed decisioning system.

01_BT_Agentic_AI_Deployment_Options.png

 


 

 

Publish the SAS Agent to Azure

 

Suppose you developed the following agentic AI workflow, which assesses a loan request, formulates an approval or rejection message using LLM calls, assesses the sentiment of the response, and decides whether a human should review the message:

 

02_BT_SAS_Agentic_AI_Workflow_Code_Files.png

 

Assumption: you have configured a container publishing destination, such as Azure, in SAS Viya.

 

Start by publishing the Agentic AI workflow as a container image.

 

Steps in SAS Intelligent Decisioning:

 

  1. Select Build Decisions to open SAS Intelligent Decisioning.
  2. From the Decisions tab, open SAS_Agent.
  3. Publish the decision to Azure as SAS_Agent1_0.

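As a quick sanity check (an addition to the steps above), you can list the tags that publishing created in ACR. `Your_Azure_Container_Registry` is a placeholder for your registry name, and the repository name is shown in lowercase because container registries store repository names in lowercase:

```shell
# Placeholder values - substitute your own registry name
ACR_NAME=Your_Azure_Container_Registry
REPO=sas_agent1_0   # registry repository names are lowercase

# List the tags for the freshly published SAS agent image
az acr repository show-tags --name "$ACR_NAME" --repository "$REPO" --output table \
  || echo "Repository not found - check the publishing log in SAS Intelligent Decisioning"
```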
 

03_BT_SAS_Agentic_AI_Workflow_Publish.png

 

After publishing completes, a new container image appears in your Azure Container Registry.

 

At this point, the SAS agent logic lives inside the container image.

 

 

TLS Certificates

 

Assumption: TLS certificates are already configured as Kubernetes secrets.

 

This includes:

 

  • Creating TLS certificates.
  • Storing them as Kubernetes secrets.
  • Preparing the AKS cluster and node pools.
  • Configuring HTTPS ingress.

 

Rather than repeating those steps here, we’ll treat them as prerequisites.

 

Start here: SAS Agentic AI – Deploy and Score Models – Kubernetes. Complete the steps from “TLS Certificates Briefly” through “Create Your Deployment YAML”.
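Before moving on, a quick optional check (not part of the original prerequisite post) that the TLS secret actually exists; the secret name `scr-certificate` and namespace `llm` match the values used by the deployment manifest later in this post:

```shell
SECRET_NAME=scr-certificate   # referenced by the deployment and the ingress
NS=llm                        # namespace used throughout this series

# Confirm the secret is present in the target namespace
kubectl get secret "$SECRET_NAME" -n "$NS" >/dev/null 2>&1 \
  && echo "TLS secret $SECRET_NAME found in namespace $NS" \
  || echo "TLS secret missing - complete the prerequisite post first"
```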

 

04_BT_TLS_Certificates.png

 

 

Deployment Manifest / YAML

 

The deployment YAML:

 

  • Runs the SAS agent container from ACR.
  • Mounts TLS certificates into the pod.
  • Exposes ports for HTTP/HTTPS.
  • Routes traffic through an NGINX ingress.

 

The YAML is almost identical to the LLM deployment to Kubernetes. Differences:

 

  • The container image name and the container path change.
  • The Service forwards traffic to the container’s 8443 port.
  • The Ingress tells NGINX to use HTTPS when talking to the backend.

 

Traffic remains encrypted end-to-end (Ingress → Service → Pod), eliminating the clear-text hop on 8080. This is considered more secure because sensitive scoring requests and responses are protected all the way to the container.

 

# Variables
RG=Resource_group
INGRESS_HOST=SAS_Viya_Ingress
echo $INGRESS_HOST
az login
ACR_NAME=Your_Azure_Container_Registry
# The published SAS agent image must be stored in this registry
az acr login --name $ACR_NAME

## List ACR repositories or container images
az acr repository list --name $ACR_NAME --output table

## The variables are called LLM_, but they refer to the sas-agent.
## We're just reusing the template published in the previous post.
LLM=sas_agent1_0
LLMDASH=${LLM//_/-}
echo $LLM && echo $LLMDASH

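The `${LLM//_/-}` expansion matters because Kubernetes object names must be valid DNS labels: lowercase alphanumerics and dashes, no underscores. A minimal, standalone illustration of that bash pattern substitution:

```shell
#!/usr/bin/env bash
# ${var//pattern/replacement} replaces ALL occurrences of pattern in var
LLM=sas_agent1_0
LLMDASH=${LLM//_/-}       # underscores are not allowed in Kubernetes object names
echo "$LLM -> $LLMDASH"   # prints: sas_agent1_0 -> sas-agent1-0
```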
 

05_BT_SAS_Agentic_AI_Workflow_Container_Image.png

 

# Create the deployment YAML file
# (unquoted EOF so the shell expands ${LLMDASH}, ${ACR_NAME}, and ${INGRESS_HOST})
cat > ~/project/deploy/models/${LLMDASH}-tls-deployment.yaml <<EOF
# ${LLMDASH} model deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: ${LLMDASH}
    workload/class: models
  name: ${LLMDASH}
spec:
  # modify replicas to support the requirements
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: ${LLMDASH}
  template:
    metadata:
      labels:
        app: ${LLMDASH}
        app.kubernetes.io/name: ${LLMDASH}
        workload/class: models
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.azure.com/mode
                operator: NotIn
                values:
                - system
              - key: node.kubernetes.io/name
                operator: In
                values:
                - llm
      containers:
        - name: ${LLMDASH}
          image: ${ACR_NAME}.azurecr.io/${LLM}:latest
          imagePullPolicy: Always  # IfNotPresent or Always
          resources:
            requests:  # Minimum amount of resources requested
              cpu: 1
              memory: 8Gi
            limits:  # Maximum amount of resources allowed
              cpu: 4
              memory: 16Gi
          ports:
            - containerPort: 8080
              name: http # Name the port "http"
            - containerPort: 8443
              name: https # Name the port "https"
          env:
          - name: SAS_SCR_SSL_ENABLED
            value: "true"
          - name: SAS_SCR_SSL_CERTIFICATE
            value: /secrets/tls.crt
          - name: SAS_SCR_SSL_KEY
            value: /secrets/tls.key
          - name: SAS_SCR_LOG_LEVEL_SCR_IO
            value: TRACE
          volumeMounts:
          - name: tls
            mountPath: /secrets
      volumes:
        - name: tls
          secret:
            secretName: scr-certificate
            items:  # Explicitly define the keys to mount
              - key: tls.crt
                path: tls.crt
              - key: tls.key
                path: tls.key
      tolerations:
      - key: workload/class
        operator: Equal
        value: models
        effect: NoSchedule
      - key: workload
        operator: Equal
        value: llm
        effect: NoSchedule
---
# TLS service definition
apiVersion: v1
kind: Service
metadata:
  name: ${LLMDASH}-tls-svc
  labels:
    app.kubernetes.io/name: ${LLMDASH}-tls-svc
spec:
  selector:
    app.kubernetes.io/name: ${LLMDASH}
    workload/class: models
  ports:
  - name: ${LLMDASH}-https
    port: 443
    protocol: TCP
    targetPort: 8443
  type: ClusterIP
---
# TLS ingress definition
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${LLMDASH}-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
  labels:
    app.kubernetes.io/name: ${LLMDASH}-ingress
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - ${INGRESS_HOST}
    secretName: scr-certificate
  rules:
  - host: ${INGRESS_HOST}
    http:
      paths:
      - path: /${LLM}
        pathType: Prefix
        backend:
          service:
            name: ${LLMDASH}-tls-svc
            port:
              number: 443

EOF

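Before applying the manifest, a dry run can catch schema typos without touching the cluster, and a grep can confirm the shell expanded the `${...}` variables rather than writing them literally. Both checks are additions to the original flow:

```shell
LLMDASH=${LLMDASH:-sas-agent1-0}   # falls back to this post's value if unset
FILE=~/project/deploy/models/${LLMDASH}-tls-deployment.yaml

# Warn if any ${...} placeholder survived un-expanded in the rendered YAML
grep -q '\${' "$FILE" 2>/dev/null && echo "WARNING: unexpanded variables in $FILE"

# Client-side dry run validates the manifest without creating anything
kubectl apply --dry-run=client -f "$FILE" \
  || echo "Dry run failed - inspect $FILE before applying for real"
```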
 

Apply and Go!

 

Deploy your model:

 

# Deploy (apply)
kubectl apply -f ${LLMDASH}-tls-deployment.yaml -n llm
kubectl get pods -n llm -o wide
kubectl get pods -n ingress-nginx
kubectl get svc -n llm
kubectl get ingress -n llm
LLM_POD_NAME=$(kubectl get pods -n llm --no-headers | awk '$1 ~ /^sas-agent/ {print $1; exit}')

# Loop until the pod is Ready
while true; do
  STATUS=$(kubectl get pod $LLM_POD_NAME -n llm -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
  if [ "$STATUS" == "True" ]; then
    echo "Pod $LLM_POD_NAME is Ready."
    break
  else
    echo "Waiting for pod $LLM_POD_NAME to become Ready..."
    sleep 5
  fi
done

# Check logs when ready
kubectl logs $LLM_POD_NAME -n llm 

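The polling loop above works; `kubectl wait` is a built-in alternative that blocks on the same Ready condition until it is met or a timeout expires:

```shell
NS=llm
TIMEOUT=300s

# Block until the pod reports Ready, or give up after the timeout
kubectl wait --for=condition=Ready pod/"$LLM_POD_NAME" -n "$NS" --timeout="$TIMEOUT" \
  || echo "Pod not Ready in $TIMEOUT - try 'kubectl describe pod $LLM_POD_NAME -n $NS'"
```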
 

Once the pod is ready, the SAS agent is live and ready to take scoring requests!

 

06_BT_SAS_Agentic_AI_Deployment_Kube_1-1024x524.png

 

 

Score the SAS Agent

 

The SAS agent is now available as a secure HTTPS endpoint.
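Optionally, before sending a scoring request, you can inspect the certificate the ingress presents. This check is an addition to the original post, and `SAS_Viya_Ingress` remains a placeholder for your actual ingress host:

```shell
INGRESS_HOST=${INGRESS_HOST:-SAS_Viya_Ingress}   # placeholder if not already set

# Show the negotiated protocol and the server certificate summary
openssl s_client -connect "${INGRESS_HOST}:443" -servername "$INGRESS_HOST" -brief \
  </dev/null 2>&1 | head -5
```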

 

With everything live, you can send HTTPS requests to your Kubernetes ingress endpoint and watch your SAS agent produce its magic!

 

# Score the SAS Agent
echo "Scoring endpoint: https://${INGRESS_HOST}/${LLM}"
curl --location --request POST "https://${INGRESS_HOST}/${LLM}" --header 'Content-Type: application/json'  --header 'Accept: application/json' --data-raw '{"inputs":
        [
        {"name":"customer_id","value": 1012},
        {"name":"customer_name","value": "Robert Little"},
        {"name":"customer_language","value": "EN"},
        {"name":"BAD","value": 0},
        {"name":"LOAN","value": 2000},
        {"name":"CLAGE","value": 147.133},
        {"name":"CLNO","value": 9},
        {"name":"DEBTINC","value": 19},
        {"name":"DELINQ","value": 0},
        {"name":"DEROG","value": 0},
        {"name":"JOB","value": "Office"},
        {"name":"MORTDUE","value": 64536},
        {"name":"NINQ","value": 1},
        {"name":"REASON","value": "HomeImp"},
        {"name":"VALUE","value": 87400},
        {"name":"YOJ","value": 11},
        {"name":"high_value","value": 1}
        ]}' | jq

 

The response includes:

 

  • Decision outcomes.
  • LLM‑generated content.
  • Scores and classifications.
  • Execution metadata.

 

All from a single API call.
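If the call fails, it helps to separate transport problems from scoring problems. This troubleshooting sketch (an addition, with a deliberately truncated input payload) captures the HTTP status apart from the response body; a non-200 status usually points at ingress, DNS, or TLS rather than at the agent itself:

```shell
INGRESS_HOST=${INGRESS_HOST:-SAS_Viya_Ingress}   # placeholders if not already set
LLM=${LLM:-sas_agent1_0}

# Write the body to a file and keep only the status code on stdout
HTTP_CODE=$(curl -s -o /tmp/score.json -w '%{http_code}' \
  --request POST "https://${INGRESS_HOST}/${LLM}" \
  --header 'Content-Type: application/json' \
  --data-raw '{"inputs":[{"name":"customer_id","value":1012}]}')
echo "HTTP status: ${HTTP_CODE}"   # 000 means the connection itself failed
```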

 

If you get a response, such as the following sample, congratulations! You’ve just deployed a secure, scalable SAS agent using Kubernetes.

 

07_BT_SAS_Agentic_AI_Kube_Score_1-1024x517.png

 

08_BT_SAS_Agentic_AI_Kube_Score_2-1024x516.png

 

09_BT_SAS_Agentic_AI_Kube_Score_3-1024x516.png

 

 

Why This Matters

 

With this deployment, you’ve achieved something important:

 

  • A governed SAS agent that calls LLMs and makes decisions.
  • Packaged as a container.
  • Running on Kubernetes, which provides enterprise‑grade scalability and security.
  • Exposed over a secure HTTPS endpoint.
  • Ready for enterprise workloads.

 

This is the natural evolution from experimenting with LLMs to operationalizing Agentic AI.

 

 

Summary

 

SAS Viya handles the intelligence.

Kubernetes handles the scale and security.

You get a clean, auditable API that can power real business decisions.

 

In the world of Agentic AI, this is what production looks like.

 

Happy deploying!

 

 

Acknowledgment

Thanks to Michael Goddard for sharing his time and resources.

 

Additional Resources

 

Want More Hands-On Guidance?

SAS offers a full workshop in the SAS Decisioning Learning Subscription with step-by-step exercises for deploying and scoring models using Agentic AI and SAS Viya on Azure.

 

Access it on learn.sas.com. The workshop provides step-by-step guidance and a bookable environment for creating agentic AI workflows.

 

10_BT_AgenticAI_Workshop-1024x496.png

 

 
For further guidance, reach out for assistance.

 

 

Find more articles from SAS Global Enablement and Learning here.
