
Reduce the Azure bill when you are not using the SAS Viya environment

Started ‎01-13-2022 by
Modified ‎01-13-2022 by

The cost of the Cloud infrastructure has become a critical factor.


Initially, the standard pricing model in the Cloud was "Pay-As-You-Go" (cloud services are billed for actual usage, typically at a fixed hourly price).


The nice thing about this model is that you only pay for actual usage and can scale down resources when needed.

 

But Production environments often need to remain available 24/7 (for example to provide Analytics capabilities to all users around the world, all the time) and Cloud providers now offer different and more advantageous pricing models for this use case, such as "Prepaid/Fixed Subscriptions" (where cloud customers pay for services upfront) or "reserved instances".

 

However, "Pay-As-You-Go" might still be the right choice for demo or lab Viya environments that don't necessarily need to be up during the night or kept around all the time. You might also want to keep environments for "bursty" and batch scenarios that only run a few days each month.

 

In such cases, what is the best way to scale down the resources and reduce your costs?

 

In this article we'll look at a nice Azure feature that allows you to stop the entire AKS cluster where Viya is running and restart it later when needed.

 

Nodes autoscaling vs Cluster stop

 

If you have a dedicated cluster for SAS Viya and have configured node autoscaling in your Kubernetes cluster you have already improved the cost efficiency of your environment.

 

In that case, when you stop the Viya services, your infrastructure should automatically be scaled down to the minimum number of nodes configured for each node pool.

 

For example, if you provisioned your cluster with the viya4-iac GitHub tool and kept the default "node pools" settings:

 

rp_1_iac-example.png


 

 

Once you stop all your Viya services (there is a specific Kubernetes cronjob for that), Kubernetes detects that there are no more resource requests from the Viya pods and triggers the cluster autoscaler to decommission nodes until their number equals the configured min_nodes value in each node pool (0 in this case for the Viya node pools).
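
As a minimal sketch of that step, the stop cronjob can be run on demand by creating a one-off job from it. The cronjob name sas-stop-all and the namespace viya are assumptions here, so adjust them to your deployment; count_nodes is only a hypothetical helper to watch the node count go down afterwards.

```shell
# Sketch only: the namespace "viya" is a placeholder and the cronjob name
# "sas-stop-all" is assumed; adjust both to your deployment.
VIYA_NS="${VIYA_NS:-viya}"

stop_viya() {
  # Run the stop cronjob immediately by creating a one-off job from it.
  kubectl create job "sas-stop-all-$(date +%s)" \
    --from=cronjob/sas-stop-all --namespace "$VIYA_NS"
}

count_nodes() {
  # Count the nodes still registered; the total drops as the autoscaler
  # removes nodes that are no longer needed.
  kubectl get nodes --no-headers | wc -l
}
```

Calling stop_viya and then count_nodes a few times over the following minutes should show the Viya node pools shrinking; the scale-down is not immediate.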

 

Several SAS runtimes (like ESP or the compute server) can also benefit from autoscaling: the cluster keeps only the number of nodes required for the pods that correspond to the users' requests, which provides true cloud elasticity.

 

For example, we could set the Compute node pool range from 2 to 10. The system would start with 2 nodes but could accommodate an increase in demand: extra SAS compute sessions translate into additional container resource requests (expressed in CPU and memory) in the Kubernetes cluster, which trigger the Azure autoscaler.
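
A hedged sketch of how such a range could be applied with the az CLI on a node pool where autoscaling is already enabled. The default names viya-1-aks and viya-1-rg are the ones used in this article's own az command; set_pool_range is a hypothetical helper name.

```shell
# Sketch only: set_pool_range is a hypothetical helper. It changes the
# autoscaler bounds of a pool that ALREADY has autoscaling enabled.
set_pool_range() {
  # $1 = node pool name, $2 = minimum node count, $3 = maximum node count
  az aks nodepool update --update-cluster-autoscaler \
    --cluster-name "${AKS_CLUSTER:-viya-1-aks}" \
    --resource-group "${AKS_RG:-viya-1-rg}" \
    --name "$1" --min-count "$2" --max-count "$3"
}

# Usage: set_pool_range compute 2 10
```

If autoscaling is not yet enabled on the pool, the --enable-cluster-autoscaler flag is needed instead of --update-cluster-autoscaler.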

 

That already saves a lot of Cloud money because we shrink the computing resources when we don't need them.

 

Note that in our example we defined the autoscaling settings during the initial provisioning of the AKS cluster (with Terraform). But not all is lost for those who did not enable autoscaling during the initial setup: it can be enabled after the fact by running a command similar to the following in the Azure Cloud Shell:

az aks nodepool update --enable-cluster-autoscaler --cluster-name viya-1-aks --resource-group viya-1-rg --name compute --min-count 0 --max-count 5

Where viya-1-aks is the cluster name and viya-1-rg its associated resource group. In this example, the command only affects the "compute" node pool. 
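
To check the result, a small sketch that lists every node pool with its autoscaling settings. show_autoscaling is a hypothetical helper name; the queried fields are the ones the az CLI returns for node pools.

```shell
# Sketch only: show_autoscaling is a hypothetical helper name.
show_autoscaling() {
  # $1 = AKS cluster name, $2 = resource group
  # Print each node pool with its mode, autoscaling flag and node count bounds.
  az aks nodepool list --cluster-name "$1" --resource-group "$2" \
    --query "[].{pool:name, mode:mode, autoscale:enableAutoScaling, min:minCount, max:maxCount, nodes:count}" \
    --output table
}

# Usage: show_autoscaling viya-1-aks viya-1-rg
```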

 

However, even when autoscaling is enabled and all the Viya nodes are scaled down, the Kubernetes system pool (as well as the associated network components) remains active and continues to generate costs every hour.

 

A cleaner and more complete way to reduce the bill of the Kubernetes cluster in Azure, during a known period of inactivity, is to use the "AKS stop/start" feature.

 

Implementing the AKS stop/start with Viya

 

rp_2_az-aks-stop.png

 

When you use this feature, it is like "pausing" a video and resuming sometime later.

 

Note that the “stop” button is now also available in the Azure portal. However, using the az CLI command makes automating and scheduling the stop/start process much easier.
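
A minimal sketch of the az CLI side, reusing the viya-1-aks and viya-1-rg names from the autoscaling example; the function names are hypothetical wrappers around az aks stop, az aks start and az aks show.

```shell
# Sketch only: the cluster and resource group names come from the
# autoscaling example in this article; override them through the
# environment for your own cluster.
AKS_CLUSTER="${AKS_CLUSTER:-viya-1-aks}"
AKS_RG="${AKS_RG:-viya-1-rg}"

stop_cluster()  { az aks stop  --name "$AKS_CLUSTER" --resource-group "$AKS_RG"; }
start_cluster() { az aks start --name "$AKS_CLUSTER" --resource-group "$AKS_RG"; }

cluster_power_state() {
  # Prints the power state reported by AKS, for example Running or Stopped.
  az aks show --name "$AKS_CLUSTER" --resource-group "$AKS_RG" \
    --query powerState.code --output tsv
}
```

These functions can then be called from any scheduler, for example a job that stops the cluster in the evening and starts it again in the morning.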

 

Behind the scenes, it leverages the fact that AKS already backs up the cluster state for resiliency: the only state in the Kubernetes system is really the contents of etcd.

 

As noted in the official Azure documentation, there are some limitations, such as:

  • “The cluster state of a stopped AKS cluster is preserved for up to 12 months. If your cluster is stopped for more than 12 months, the cluster state cannot be recovered. For more information, see the AKS Support Policies.
  • You can only start or delete a stopped AKS cluster. To perform any operation like scale or upgrade, start your cluster first.
  • The customer provisioned PrivateEndpoints linked to private cluster need to be deleted and recreated again when you start a stopped AKS cluster.”

 

However, it remains a very handy feature and, according to our tests, it works well with the Viya environment.

 

Here is an example of the process (that can easily be automated):

  1. Use the az aks stop command to pause the cluster
  2. Optional (if you have them): Stop the jump and NFS VMs
  3. Go live your life 😊 
  4. Optional: Start the jump and NFS VMs
  5. Use the az aks start command to restart the cluster
  6. Let the cronjob start Viya
  7. Let autoscaling scale the Viya nodes back up as Viya starts
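
The steps above can be sketched as two shell functions. This is only an illustration under stated assumptions: the VM names viya-1-jump-vm and viya-1-nfs-vm are hypothetical placeholders, and the VMs are assumed to live in the same resource group as the cluster. The sketch uses az vm deallocate rather than az vm stop because a VM that is stopped but not deallocated is still billed for compute.

```shell
# Sketch only: the VM names below are hypothetical placeholders, and the
# VMs are assumed to be in the same resource group as the AKS cluster.
AKS_CLUSTER="${AKS_CLUSTER:-viya-1-aks}"
AKS_RG="${AKS_RG:-viya-1-rg}"
EXTRA_VMS="${EXTRA_VMS:-viya-1-jump-vm viya-1-nfs-vm}"

pause_env() {
  # Step 1: stop the AKS cluster (control plane and all node pools).
  az aks stop --name "$AKS_CLUSTER" --resource-group "$AKS_RG" || return 1
  # Step 2: deallocate the side VMs so their compute is no longer billed.
  for vm in $EXTRA_VMS; do
    az vm deallocate --name "$vm" --resource-group "$AKS_RG" || return 1
  done
}

resume_env() {
  # Step 4: bring the side VMs back first so the NFS server is reachable.
  for vm in $EXTRA_VMS; do
    az vm start --name "$vm" --resource-group "$AKS_RG" || return 1
  done
  # Step 5: restart the AKS cluster; Viya and the autoscaler do the rest.
  az aks start --name "$AKS_CLUSTER" --resource-group "$AKS_RG"
}
```

Running pause_env at the end of the day and resume_env the next morning covers steps 1, 2, 4 and 5; steps 6 and 7 then happen inside the cluster.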

 

Assuming any in-memory data (such as CAS output tables) has been properly saved, there is no need to take an extra backup or to stop the Viya services before running the AKS stop command.

 

Conclusion

 

Our SAS colleagues from various teams have experimented with several methods to keep the AKS costs down when the cluster is not used (using node autoscaling with 0 nodes for all the node pools, automating the stop/start of VM scale sets, etc.). However, the AKS stop feature seems to be the simplest and most efficient way to do it.

 

But the Kubernetes cluster is rarely the only Cloud Infrastructure piece used by the Viya environment.

 

Stopping the AKS cluster will not automatically stop the Jumphost or NFS server VMs, and "satellite" components like the NetApp Storage services or the Azure Postgres database will likely keep generating significant costs.
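
A quick way to see which VMs are still running after the cluster has been stopped is to list their power state. This is a small sketch; list_vm_power_states is a hypothetical helper, and the resource group to pass is whichever one holds your jump host and NFS server.

```shell
# Sketch only: list_vm_power_states is a hypothetical helper name.
list_vm_power_states() {
  # $1 = resource group that holds the jump host and NFS server VMs
  az vm list --resource-group "$1" --show-details \
    --query "[].{name:name, state:powerState}" --output table
}

# Usage: list_vm_power_states viya-1-rg
```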

 

So, a good practice would be to implement and test a true CI/CD process that also automates stopping these services whenever possible when the Viya environment is not used. For example, the standard Azure Postgres database cannot be stopped but the flexible server can, and work is in progress to officially support it in the Viya 4 IaC tool.
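
For the flexible server case, a minimal sketch with the az CLI could look like the following. The server name viya-1-pgsql is a hypothetical placeholder and the resource group is assumed to be the one used earlier in this article.

```shell
# Sketch only: the server name is a hypothetical placeholder; this applies
# to an Azure Database for PostgreSQL flexible server only.
PG_SERVER="${PG_SERVER:-viya-1-pgsql}"
PG_RG="${PG_RG:-viya-1-rg}"

stop_postgres()  { az postgres flexible-server stop  --name "$PG_SERVER" --resource-group "$PG_RG"; }
start_postgres() { az postgres flexible-server start --name "$PG_SERVER" --resource-group "$PG_RG"; }
```

The database should be started again before the AKS cluster is restarted, so that the Viya services find it when they come back up.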

 

Finally, this capability is quite unique to Azure. With the other managed Kubernetes services (such as GKE or EKS), you can't really stop the whole Kubernetes cluster like this, as the master nodes / control planes are directly managed by the Cloud providers (GCP or AWS).

 


Comments

Hi @RPoumarede 

 

Is there anything equivalent on AWS (EKS)?

We have set up SAS Viya 4 on EKS, and when I run the "sas-stop-all" job all SAS pods are killed and the node pools are scaled down (they have minimum servers = 0). But the EKS cluster is still running, and the "default" node in the cluster is still running the system pods like the ingress controller and other EKS components, so we are charged for this server... Any ideas?

 

Thanks,

Eyal

