BookmarkSubscribeRSS Feed

Choosing the Right Cloud-Based PostgreSQL for SAS Viya

Started ‎01-22-2024 by
Modified ‎01-22-2024 by
Views 243

The SAS Infrastructure Data Server is a fundamental component of the SAS Viya architecture, that can be deployed both in the same Kubernetes cluster of the other components or outside the cluster.

 

Are you looking to choose an external PostgreSQL instance for use with the SAS Viya platform? In this post, we’ll highlight useful considerations that can help choose the right instance when opting for an external, managed cloud instance.

A Quick Summary

 

Before diving into the topic at hand, let’s recap what are we talking about. If you’ve been following this space, you may recognize there have been a few posts on the subject of external (or internal) PostgreSQL:

 

 

You can refer to these articles for additional details, but, in short, we can say that the SAS Infrastructure Data Server stores SAS Viya user and system content, such as reports, authorization rules, attachments, user preferences, and so on. The SAS Infrastructure Data Server requires an instance of PostgreSQL. The SAS Viya platform deployment can install and configure its own open-source PostgreSQL database inside the SAS Viya Kubernetes cluster; this is referred to as an internal instance. If you want, you can choose instead to provide another PostgreSQL instance, which is called an external instance.

 

External PostgreSQL choices

 

For an external PostgreSQL server, the SAS Viya platform support has changed over time and with the different releases. For this reason, you are always encouraged to refer to the official documentation corresponding to your SAS Viya platform release. You can find the latest one here: https://go.documentation.sas.com/doc/en/itopscdc/default/itopssr/p05lfgkwib3zxbn1t6nyihexp12n.htm#n0...

 

As a general rule, with the SAS Viya platform, you can use managed PostgreSQL instances hosted by any supported cloud providers, or the Open Source version of PostgreSQL – for self-managed deployments:

 

Customer-managed PostgreSQL instance​

 

  • PostgreSQL (Open Source)​

 

Managed PostgreSQL instances hosted by cloud providers:​

 

  • Microsoft Azure Database for PostgreSQL - Single Server
  • Microsoft Azure Database for PostgreSQL - Flexible Server​
  • Amazon RDS for PostgreSQL​
  • Google Cloud SQL for PostgreSQL

 

Two of the communities.sas.com articles previously referenced focus on the provisioning and configuration of self-managed Open Source PostgreSQL. The rest of this post will focus on the cloud-hosted managed options.

 

What drives the choice of a specific managed instance?

 

The official documentation highlights all the requirements that should be met when using an external PostgreSQL instance. For example, starting with SAS Viya platform 2023.11, PostgreSQL 11 is not supported anymore: this automatically excludes any instance of “Microsoft Azure Database for PostgreSQL - Single Server”, which only supports PostgreSQL up to version 11 (and, by the way, has been put on the retirement path by Azure).

 

Cloud architects, when designing the database to use for each project, must balance business drivers that sometimes drive in opposite directions. For example, it’s quite difficult to satisfy scalable performance, high availability, and low cost, all at the same time. However, it’s possible to get good performance and minimal failure rates for a given cost when considering the specific software requirements.

 

While other solutions may require high I/O throughput or fast storage, the most important factor that drives proper instance sizing with the SAS Viya platform is that PostgreSQL should support at least 1024 “maximum number of connections”.  How can you satisfy this on cloud-managed instances? Let’s see what it means for each of the supported cloud vendors.

 

Azure Database for PostgreSQL

 

Azure Database for PostgreSQL powered by the PostgreSQL community edition is available in two deployment modes:

 

“Azure Database for PostgreSQL - Single Server” is on the retirement path and is scheduled for retirement by March 2025. We do not recommend that you choose this edition for any new deployment.

 

“Azure Database for PostgreSQL – Flexible Server” provides different instances that meet the 1024 concurrent connections requirements. For production use, Burstable instances are usually not recommended because of their lack of predictable performance.

 

er_1_20240118_01_AzureFlexServerLimits.png

Highlights of the minimum “Azure Database for PostgreSQL – Flexible Server”  instance sizes that support at least 1024 max user connections

Screenshot as of January 2024 of https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-limits#maximum-connectio....

 

The viya4-iac-azure utility can be used to provision a managed PostgreSQL server on Azure. By default, it creates an instance type GP_Standard_D16s_v3 (16vCPU, 64GiB RAM) that exceeds the SAS Viya requirement for 1,024 concurrent connections and is suited for baseline production use.

 

You can choose to further customize the maximum number of connections independently of the chosen instance size. Microsoft recommends against this custom setting, for the risk of exhausting the memory of the PostgreSQL instance, risking crashes or high latency.

 

Amazon RDS for PostgreSQL​

 

Amazon RDS for PostgreSQL lets you choose from a variety of instance types with varying combinations of CPU, memory, storage, and networking. The maximum number of concurrent connections is calculated based on the DB Instance memory, with a maximum of 5000.

 

er_2_20240118_02_AmazonRDSLimits.png

Highlights of the “Amazon RDS for PostgreSQL”  rules to calculate the max user connections parameter

Screenshot as of January 2024 of https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Limits.html#RDS_Limits.MaxConnections

 

The viya4-iac-aws utility can be used to provision a managed PostgreSQL server on Amazon. It defaults to an instance type db.m5.xlarge (4vCPU, 16GiB RAM) that meets SAS requirement for 1,024 connections: 16 GiB = 17,179,869,184 bytes, which, divided by 9,531,392 is equal to 1,802 connections allowed.

 

Google Cloud SQL for PostgreSQL

 

Google provides Cloud SQL for PostgreSQL as a managed service. The most important factor impacting image sizing is the server memory. The underlying instance needs more than 120GB of RAM to meet the 1024 connections requirements.

 

er_3_20240118_03_GoogleCloudSQLLimits.png

 

 Highlights of the “Google Cloud DQL for PostgreSQL”  rules to calculate the max user connections parameter

Screenshot as of January 2024 of https://cloud.google.com/sql/docs/postgres/flags#postgres-m

 

The viya4-iac-gcp utility can be used to provision a managed PostgreSQL server on Google Cloud. By default, it creates an instance type db-custom-8-30720 (8 vCPU and 30 GB of RAM): you can see in the above table that this supports 600 connections by default. You can choose to further customize the maximum number of connections independently of the selected instance size.

 

Additional Considerations

 

If you are deploying multiple SAS Viya instances, you can use multiple PostgreSQL servers with a single database in each (one server per deployment), or you can share a single PostgreSQL server with multiple databases inside (one database per deployment).

 

Some SAS offerings require an additional PostgreSQL cluster called SAS Common Data Store, or CDS PostgreSQL. It is configured separately from the required platform PostgreSQL cluster that supports SAS Infrastructure Data Server. You can find a list of offerings that require this separate PostgreSQL cluster at https://go.documentation.sas.com/doc/en/itopscdc/default/itopssr/p05lfgkwib3zxbn1t6nyihexp12n.htm#n0.... All of the considerations written in the previous sections apply to this instance, as well.

 

Conclusion

 

The SAS Viya platform requires an instance of PostgreSQL, and you can choose an external, cloud-based PostgreSQL instance for use with the platform. We have seen in this article different options and minimum instance sizes that support at least 1024 concurrent connections, which is the requirement for the SAS Viya platform.

Version history
Last update:
‎01-22-2024 11:37 AM
Updated by:
Contributors

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started