Welcome back to the SAS Agentic AI Accelerator series! If you’ve made it this far in the series, you’ve survived the high-level overviews and cost comparisons—give yourself a pat on the back (or at least a fresh cup of coffee). Now, let’s roll up our sleeves and actually deploy something.
Ever tried to deploy a Large Language Model (LLM) and felt like you were assembling IKEA furniture with missing instructions? Today, I’ll walk you through the nuts and bolts—so you can get your models running in Azure with minimal head-scratching.
We’re going to take all that theory and put it into practice: deploying a code-wrapped LLM as a container in Azure. We’ll start simple (public IP), then get secure (private IP), and make sure you know what to watch out for at every step.
As an example, let’s walk through deploying an open source model with the phi-3-mini-4k LLM code wrapper. Code wrappers ship with the SAS Agentic AI Accelerator code repository; they standardize LLM inputs and outputs, can be swapped easily in agentic AI workflows, and are easy to deploy thanks to SAS Container Runtime (SCR).
This phi-3-mini-4k model, developed and released by Microsoft, is a lightweight large language model designed for efficiency and quick responses—think of it as a compact, agile AI that doesn’t need a supercomputer to run.
If you’re wondering about the quirky name “phi-3-mini-4k,” you’re not alone—it sounds like it could be R2-D2’s distant cousin from the Star Wars universe! There’s just something about AI and robotics that inspires these metallic, alphanumeric names. Maybe it’s a subtle nod to our sci-fi dreams, or perhaps it’s just because “Bob the Bot” doesn’t sound as futuristic or impressive.
Either way, let’s see how to get our own “phi” up and running in the cloud—no droids, no bots required!
Let’s start with a public IP deployment. In Azure, that’s the fastest way to test or demo your LLM, but it’s not meant for production or anything sensitive.
Michael Goddard wrote about Deploying SAS Container Runtime models on Azure Container Instances. So far, the code-wrapped LLMs follow the same guidelines.
For Azure deployment scripts, you can use the Azure Command Line Interface (CLI).
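Before running the script, it helps to do a quick pre-flight check: log in, point the CLI at the right subscription, and make sure the code-wrapper image is actually sitting in your Azure Container Registry. A minimal sketch, assuming you authenticate with the registry’s admin credentials (as the script below does) and the same placeholder names (myacr, phi_3_mini_4k):
# Log in and select the subscription (subscription name is a placeholder)
az login
az account set --subscription "my-subscription"
# The deployment below authenticates with the registry's admin credentials, so enable them
az acr update -n myacr --admin-enabled true
# Optional: confirm the code-wrapper image and tag exist in the registry
az acr repository show-tags -n myacr --repository phi_3_mini_4k -o table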
# Variables to set
CONT="myprefix-phi" # Name/DNS label for your container
RG="my-resource-group" # Azure Resource Group
ACR_NAME="myacr" # Azure Container Registry name
IMAGE="phi_3_mini_4k" # Image name
IMAGE_TAG="latest" # Image tag/version
ACR_PASS=$(az acr credential show -n $ACR_NAME --query "passwords[0].value" -o tsv) # ACR password or service principal
LOCATION="westus3" # Azure region – choose one that suits you
az container create -n $CONT -g $RG \
--image "${ACR_NAME}.azurecr.io/$IMAGE:$IMAGE_TAG" \
--registry-username $ACR_NAME \
--registry-password $ACR_PASS \
--ports 80 8080 \
--protocol TCP \
--dns-name-label $CONT \
--location $LOCATION \
--cpu 4 \
--memory 16
You’re spinning up a container with your LLM, exposing ports for API access (SCR needs 8080), and giving it enough juice (4 CPUs, 16 GB RAM) to keep things snappy. But it’s public! Anyone with the endpoint can poke your model.
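Before you start scoring, it’s worth confirming that the container actually reached the Running state and picking up the public FQDN that Azure built from your DNS name label. One quick way to check, using the same variables as above:
# Show provisioning state and the public FQDN assigned from the DNS name label
az container show -n $CONT -g $RG \
  --query "{fqdn:ipAddress.fqdn, state:instanceView.state}" -o table
# If the state is not 'Running', the container logs usually tell you why
az container logs -n $CONT -g $RG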
Time to see if your LLM is awake! Here’s a sample curl command to send a scoring request:
# Scoring request against the public endpoint (reuses the variables set above)
curl -X POST "http://${CONT}.${LOCATION}.azurecontainer.io:8080/${IMAGE}" \
-H 'Content-Type: application/json' \
-d '{
"inputs": [
{"name":"userPrompt","value":"customer_name: X Y; loan_amount: 20000.0; customer_language: EN"},
{"name":"systemPrompt","value":"You are tasked with drafting an email to respond to a customer whose mortgage loan application has been accepted by the SAS AI Bank. You will be provided with three pieces of information: customer_name, loan_amount, customer_language. Use the provided customer name and loan amount to personalize the email."},
{"name":"options", "value":"{temperature:0.7,top_p:1,max_tokens:800}"}
]
}' | jq
These three inputs (userPrompt, systemPrompt, and options) are defined by the LLM code wrapper from the SAS Agentic AI Accelerator.
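If you plan to fire off more than a couple of test prompts, hand-editing that JSON string gets old fast, and quoting mistakes are easy to make. Here’s a small convenience sketch that builds the same payload with jq; the endpoint and input names are the ones used above, and the prompt values are just placeholders:
# Hypothetical helper: build the request body with jq so quoting inside the prompts stays safe
USER_PROMPT="customer_name: X Y; loan_amount: 20000.0; customer_language: EN"
SYSTEM_PROMPT="You are tasked with drafting an email to respond to a customer whose mortgage loan application has been accepted by the SAS AI Bank."
OPTIONS='{temperature:0.7,top_p:1,max_tokens:800}'
curl -s -X POST "http://${CONT}.${LOCATION}.azurecontainer.io:8080/${IMAGE}" \
  -H 'Content-Type: application/json' \
  -d "$(jq -n --arg up "$USER_PROMPT" --arg sp "$SYSTEM_PROMPT" --arg op "$OPTIONS" \
        '{inputs: [
           {name: "userPrompt",   value: $up},
           {name: "systemPrompt", value: $sp},
           {name: "options",      value: $op}
         ]}')" | jq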
Next, let’s switch to a private IP. This setup is suitable for internal, secure deployments: perfect for real workflows where you care about data privacy.
# New Variables
CONTP="myprefix-phi-private" # Name/DNS label for your container
VNET="SAS-Viya-azure-vnet" # Virtual Network name
SUBNET="llm-subnet" # Subnet name
# Step 1: Create a dedicated subnet for containers within your existing VNET
# (adapt the address prefix below to match your VNET IP range)
az network vnet subnet create \
--resource-group $RG \
--vnet-name $VNET \
--name $SUBNET \
--address-prefix 192.168.3.0/26 \
--delegations Microsoft.ContainerInstance/containerGroups
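# Optional sanity check: confirm the new subnet is delegated to Azure Container Instances
# (you should see Microsoft.ContainerInstance/containerGroups in the output)
az network vnet subnet show \
--resource-group $RG \
--vnet-name $VNET \
--name $SUBNET \
--query "delegations[].serviceName" -o tsv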
# Step 2: Deploy the container with a private IP
az container create \
--resource-group $RG \
--name $CONTP \
--image "${ACR_NAME}.azurecr.io/$IMAGE:$IMAGE_TAG" \
--registry-username $ACR_NAME \
--registry-password $ACR_PASS \
--ports 80 8080 \
--protocol TCP \
--location $LOCATION \
--vnet $VNET \
--subnet $SUBNET \
--ip-address private \
--cpu 4 \
--memory 16
# Retrieve the Private IP of your container
az container show --resource-group $RG --name $CONTP --query "ipAddress.ip" --output tsv
Make sure your SAS Viya deployment and the container’s subnet are on the same VNET! Otherwise, your scoring requests will be like postcards sent to a house with no mailbox.
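A quick way to double-check is to list the subnets in the VNET and confirm that both your SAS Viya (AKS) subnets and the new LLM subnet show up there. This assumes the VNET and resource group are the ones referenced by the $VNET and $RG variables above:
# List all subnets in the VNET: the Viya subnets and the LLM subnet should appear side by side
az network vnet subnet list \
  --resource-group $RG \
  --vnet-name $VNET \
  --query "[].{name:name, prefix:addressPrefix}" -o table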
The only thing that changes in the scoring request is that you use the container’s private IP address instead of the DNS label (FQDN) you used for the public container:
CONT_IP=$(az container show --resource-group $RG --name $CONTP --query "ipAddress.ip" --output tsv)
echo "CONT_IP=${CONT_IP}"
curl -X POST "http://${CONT_IP}:8080/${IMAGE}" ...
Everything else in the request stays exactly the same as in the public example.
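If the request just hangs, the culprit is usually network plumbing rather than the model. Here’s a bash-only reachability sketch you can run from a machine inside the VNET (no extra tools beyond coreutils required):
# Test TCP connectivity to the container's scoring port from inside the VNET
timeout 5 bash -c "cat < /dev/null > /dev/tcp/${CONT_IP}/8080" \
  && echo "Port 8080 is reachable" \
  || echo "Port 8080 is NOT reachable - check subnets, peering, and NSG rules"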
SAS offers a full workshop with step-by-step exercises for deploying and scoring models using Agentic AI and SAS Viya on Azure.
Access it on learn.sas.com in the SAS Decisioning Learning Subscription; the workshop provides guided instructions and a bookable environment for creating agentic AI workflows.
If you liked the post, give it a thumbs up! Please comment and tell us what you think about the SAS Agentic AI Accelerator. If you need further guidance, reach out to us, and let us know how this solution works for you!
Find more articles from SAS Global Enablement and Learning here.