
Switch SAS Viya from an internal instance of PostgreSQL to an external instance


PostgreSQL is an open-source database management system that SAS uses as the SAS Infrastructure Data Server. The SAS Viya platform uses this component to store user content. The SAS Viya platform can use either an internal instance of PostgreSQL that it provides or an external instance. The PostgreSQL Data Transfer Tool allows you to switch between these two options. This post demonstrates the process of switching a Viya deployment from an internal PostgreSQL instance to an external instance using an Azure Flexible PostgreSQL server.

 

The process we will follow contains the following steps.

 

  1. Perform Prerequisites
  2. Prepare to use the PostgreSQL Data Transfer Tool
  3. Transfer data using the PostgreSQL Data Transfer Tool
  4. Switch SAS Viya to point to External PostgreSQL
  5. Validate that the switch worked
  6. Delete internal PostgreSQL.

 

This process is documented in the SAS Viya Operations guide. Please check out the documentation for the specifics of each cloud provider and offering. In this post, we will demonstrate the process for the SAS Infrastructure Data Server with SAS Viya Visual Analytics on Azure.

 

Perform prerequisites and prepare to use the PostgreSQL Data Transfer Tool

 

If you deployed with the internal instance of PostgreSQL, your deployment uses Crunchy Data PostgreSQL. Two versions of Crunchy PostgreSQL are used by SAS Viya, and the steps in the process vary based on the version used in the deployment. SAS Viya 2022.10 and later use Crunchy PostgreSQL 5; prior releases use version 4. Run the following code to get your software release and determine which version of Crunchy PostgreSQL is being used in your SAS Viya deployment. The output below shows that we are using Crunchy version 5. The rest of the process here applies to version 5; if you have an earlier version of Crunchy, check out the documentation.

 

kubectl get cm -o yaml | grep 'SAS_CADENCE_DISPLAY_NAME'

 

Expected output:

 

SAS_CADENCE_DISPLAY_NAME: Stable 2024.01
    {"apiVersion":"v1","data":{"SAS_BUILD_TYPE":"x64_oci_linux_2-docker","SAS_CADENCE_DISPLAY_NAME":"Stable 2024.01",....

 

It is recommended that the target instance have a database administrator user with the same name as the current SAS Infrastructure Data Server administrator. The default for SAS Viya is dbmsowner. To check the name on the current deployment, run the following code:

 

kubectl get secrets sas-crunchy-platform-postgres-pguser-dbmsowner -o template={{'.data.user'}} | base64 -d

 

Expected output:

 

dbmsowner

 

Some of the upcoming steps will be performed from inside the Crunchy PostgreSQL primary POD. To get the primary POD name and exec into the POD, run the following:

 

export SOURCE_CLUSTER=sas-crunchy-platform-postgres
PRIMARY=$(kubectl get pod --selector="postgres-operator.crunchydata.com/role=master,postgres-operator.crunchydata.com/cluster=$SOURCE_CLUSTER" -o jsonpath='{.items[*].metadata.name}' | awk '{print $1}')
echo Primary Crunchy POD is $PRIMARY
kubectl exec -it $PRIMARY -c database -- bash

 

Expected output:

 

Primary Crunchy POD is sas-crunchy-platform-postgres-00-c992-0

 

The next few steps will be performed from within the Crunchy PostgreSQL primary POD. First, let us check the size of the existing database. To account for the future growth of your PostgreSQL database, SAS recommends that your new external instance have twice the space that your existing internal instance uses. To determine the size of the existing PostgreSQL database from within the Crunchy primary POD, run this code:

 

psql -c "select pg_size_pretty ( pg_database_size('SharedServices'));"

 

Expected output:

 

 pg_size_pretty
----------------
 377 MB
(1 row)

 

The external instance of PostgreSQL must meet certain requirements. In addition to an admin user matching the name of the current administrator, the instance must have the appropriate extensions enabled. In the following steps, we will update the configuration of the external PostgreSQL instance. To perform these steps, we need to know the connection information for the Azure Flexible PostgreSQL server. Using that connection information, we will connect to the target instance from inside the primary Crunchy PostgreSQL POD. The connection information is stored in the following environment variables.

 

TARGET_HOST_NAME="sasgnn-p21077-default-flexpsql.postgres.database.azure.com"
TARGET_HOST_PORT="5432"
TARGET_USER_NAME="pgadmin"
TARGET_USER_PASSWORD="Aa0EmhCxZJYOJWBMdEf3uj1zr60"
DATASERVER_NAME="sas-platform-postgres"
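
As a quick, optional sanity check (not part of the documented procedure), you can confirm that the primary Crunchy POD can reach the target server before making any changes:

PGPASSWORD=$TARGET_USER_PASSWORD psql -h $TARGET_HOST_NAME -p $TARGET_HOST_PORT -U $TARGET_USER_NAME -d postgres -c "SELECT version();"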

 

Firstly, let's create the dbmsowner user in the Azure Flexible PostgreSQL server instance and grant the user the required privileges.

 

PGPASSWORD=$TARGET_USER_PASSWORD psql -h $TARGET_HOST_NAME -p $TARGET_HOST_PORT -U $TARGET_USER_NAME -d postgres -c "CREATE USER dbmsowner"
PGPASSWORD=$TARGET_USER_PASSWORD psql -h $TARGET_HOST_NAME -p $TARGET_HOST_PORT -U $TARGET_USER_NAME -d postgres -c "ALTER USER dbmsowner CREATEDB"
PGPASSWORD=$TARGET_USER_PASSWORD psql -h $TARGET_HOST_NAME -p $TARGET_HOST_PORT -U $TARGET_USER_NAME -d postgres -c "ALTER USER dbmsowner CREATEROLE"
PGPASSWORD=$TARGET_USER_PASSWORD psql -h $TARGET_HOST_NAME -p $TARGET_HOST_PORT -U $TARGET_USER_NAME -d postgres -c "ALTER USER dbmsowner WITH PASSWORD '$TARGET_USER_PASSWORD';"
PGPASSWORD=$TARGET_USER_PASSWORD psql -h $TARGET_HOST_NAME -p $TARGET_HOST_PORT -U $TARGET_USER_NAME -d postgres -c "GRANT CONNECT ON DATABASE postgres TO dbmsowner;"
PGPASSWORD=$TARGET_USER_PASSWORD psql -h $TARGET_HOST_NAME -p $TARGET_HOST_PORT -U $TARGET_USER_NAME -d postgres -c "GRANT azure_pg_admin TO dbmsowner;"

 

Now let's enable the extensions on the Azure PostgreSQL flexible server instance. In this step, we use the Azure CLI to enable these extensions and then restart the database server.
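
The commands below reference ${SUBSCRIPTIONID} and ${RG}. These are assumed to already hold your Azure subscription ID and the resource group that contains the flexible server; set them first, using placeholder values like these:

SUBSCRIPTIONID="00000000-0000-0000-0000-000000000000"   # placeholder: your Azure subscription ID
RG="my-resource-group"                                  # placeholder: resource group of the flexible server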

 

az account set --subscription ${SUBSCRIPTIONID}
az postgres flexible-server parameter set --resource-group  ${RG}  --server-name sasgnn-p21077-default-flexpsql --name azure.extensions --value LTREE,PG_CRON,PG_STAT_STATEMENTS,PG_TRGM,PGAUDIT,PGCRYPTO,PLPGSQL
az postgres flexible-server parameter set --resource-group  ${RG}  --server-name sasgnn-p21077-default-flexpsql --name shared_preload_libraries --value PG_CRON,PG_STAT_STATEMENTS,PGAUDIT
az postgres flexible-server restart --resource-group ${RG} --name sasgnn-p21077-default-flexpsql --subscription ${SUBSCRIPTIONID}

 

The target instance is now ready for data transfer. It is recommended that you perform a SAS Viya Backup before continuing with the process.
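
As a sketch, an ad hoc backup can be triggered from the standard SAS Viya backup CronJob (the CronJob name sas-scheduled-backup-job is assumed here; confirm it with kubectl get cronjobs in your namespace):

kubectl create job sas-backup-pre-transfer-`date +%s` --from cronjobs/sas-scheduled-backup-job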

 

Prepare and then Transfer Data using the PostgreSQL Data Transfer Tool

 

The PostgreSQL Data Transfer Tool is available for download from support.sas.com. The tool is delivered in a ZIP file and is installed by simply unzipping it to a directory of your choice. The tool is implemented as a Kubernetes job and the job is configured by updating the provided configuration files and manifests.
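
For example (the ZIP file name below is a placeholder; use the file you downloaded from support.sas.com and a directory of your choice):

mkdir -p /home/cloud-user/project/deploy/pgtransfer
unzip sas-postgres-data-transfer.zip -d /home/cloud-user/project/deploy/pgtransfer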

 

SAS provides a script to determine the values of the required variables and insert the values into the manifest files of the tool. The script requires that SAS Mirror Manager is deployed and that you have access to the license files in the SAS Viya deployment assets. In this step, run the script to see the required values.

 

NOTE: If you have CDS PostgreSQL data, you need to perform these steps for that database also.

 

cd /home/cloud-user/project/deploy/pgtransfer/scripts/
./manifest-values.sh --namespace gelcorp --mirrormgr /opt/mirrormgr/mirrormgr --deployment-data /home/cloud-user/project/deploy/license/SASViyaV4_${GELLOW_ORDER}_certs.zip

 

Expected output:

 

Token values
SAS-CERTFRAME-IMAGE-NAME = cr.sas.com/viya-4-x64_oci_linux_2-docker/sas-certframe:3.49.1-20240105.1704477691165
SAS-CUSTOMER-PROVIDED-CA-CERTIFICATES-CONFIGMAP-NAME = sas-customer-provided-ca-certificates-6ct58987ht
SAS-IMAGE-PULL-SECRETS-NAME = sas-image-pull-secrets-b7bt458c8f
SAS-INGRESS-CERTIFICATE-SECRET-NAME = sas-ingress-certificate
SAS-POSTGRES-DATA-TRANSFER-IMAGE-NAME = cr.sas.com/viya-4-x64_oci_linux_2-docker/sas-postgres-data-transfer:1.6.0-20240201.1706816594189
STORAGE-CAPACITY (cds) =
STORAGE-CAPACITY (platform) = 256Gi

 

If we are happy with the required values, we can run the script with the --apply parameter to update the PostgreSQL Data Transfer Tool manifests and insert the required values.

 

cd /home/cloud-user/project/deploy/pgtransfer/scripts/
./manifest-values.sh --namespace gelcorp --mirrormgr /opt/mirrormgr/mirrormgr --deployment-data /home/cloud-user/project/deploy/license/SASViyaV4_${GELLOW_ORDER}_certs.zip --apply

 

Expected output:

 

INFO: Skipping substitutions for cds - it was not found in the SAS Viya platform deployment.
INFO: Applying changes to '/home/cloud-user/project/deploy/pgtransfer/manifests/platform/pvc.yaml'.
INFO: Applying changes to '/home/cloud-user/project/deploy/pgtransfer/manifests/platform/job.yaml'.

 

Before running the tool to transfer data, populate the user_input.yaml file. The user_input.yaml is located in the install directory of the tool at /manifests/platform/user_input.yaml. The key information in this file is the connection information for the target database server.

 

The tool can also be run in phases using the "run" options. For example, if you want to run only the pre-check steps to verify that the target instance meets the requirements for transfer, set run.pre-check to true and the other two run variables to false. If any of the run phases are marked false, then in order to run that phase later, you must delete the job, modify the user_input.yaml file, and re-run the job.

 

target:
    database: "postgres"
    admin: "dbmsowner"
    password: "${TARGET_USER_PASSWORD}"
    fqdn: sasgnn-p21077-default-flexpsql.postgres.database.azure.com
    port: 5432
    sslmode: verify-full
run:
    pre-check: true
    transfer: true
    post-check: true
overwrite: false
status_update_timespan: 15m00s
sleep_after_completion: 0m00s
log_level: info

 

When you are ready to transfer data, you must stop the SAS Viya environment. The following code will kick off the SAS Viya stop job and wait for the PODs to terminate.

 

kubectl create job sas-stop-all-`date +%s` --from cronjobs/sas-stop-all
RUNNING_PODS=$(kubectl get pods  --field-selector=status.phase=Running  -o go-template='{{len .items}}')
until [ $RUNNING_PODS -lt 2 ]; do  \
      sleep 5; echo Running Pods: $RUNNING_PODS; \
      RUNNING_PODS=$(kubectl get pods  --field-selector=status.phase=Running  -o go-template='{{len .items}}'); \
done

 

Expected output:

 

Running Pods: 132
Running Pods: 87
Running Pods: 65
Running Pods: 31
Running Pods: 9
Running Pods: 4

 

Now we can apply the manifests to run the data transfer job. The PostgreSQL Data Transfer Tool uses two standard PostgreSQL utility programs: pg_dump to back up data from your source instance and pg_restore to load the backup to your target instance. During execution, the transfer tool performs the following tasks:

 

  • Starts the internal PostgreSQL server if it is not already running.
  • Performs a logical backup of each database in the source server.
  • Restores the logical backup into the target external PostgreSQL cluster, creating or re-creating the databases as necessary.

 

To start the transfer job, apply the manifests.

 

cd /home/cloud-user/project/deploy/pgtransfer
kustomize build manifests/platform -o platform-transfer.yaml
kubectl apply -f platform-transfer.yaml

 

Expected output:

 

serviceaccount/sas-platform-postgres-data-transfer-sa created
role.rbac.authorization.k8s.io/sas-platform-postgres-data-transfer-role created
rolebinding.rbac.authorization.k8s.io/sas-platform-postgres-data-transfer-rolebinding created
secret/sas-platform-postgres-data-transfer-input-f7k8hm7khk created
persistentvolumeclaim/sas-platform-postgres-data-transfer-backups created
job.batch/sas-platform-postgres-data-transfer created

 

To check the success of the job, review the log. The log should contain messages indicating the successful execution of the data transfer tool. You can grep for specific text like "successfully copied database".

 

kubectl logs job/sas-platform-postgres-data-transfer

 

Expected output:

 

status=success duration=515ms
level=0 logger=summary ts=2023-10-05T15:47:55.015215518Z msg="pre-check sasgnn-p21077-default-flexpsql.postgres.database.azure.com | check server versions" status=success duration=15.3ms
level=0 logger=summary ts=2023-10-05T15:47:55.015274819Z msg="pre-check sasgnn-p21077-default-flexpsql.postgres.database.azure.com | check shared preload libraries" status=success duration=3.95ms matchingEssentialSharedPreloadLibraries=2
level=0 logger=summary ts=2023-10-05T15:47:55.015286719Z msg="pre-check sasgnn-p21077-default-flexpsql.postgres.database.azure.com | check extensions availability" status=success duration=381ms
level=0 logger=summary ts=2023-10-05T15:47:55.015295719Z msg="transfer sas-crunchy-platform-postgres" status=success duration=1m3.2s
level=0 logger=summary ts=2023-10-05T15:47:55.015325319Z msg="transfer sas-crunchy-platform-postgres | backup SharedServices to /backups/sas-crunchy-platform-postgres/SharedServices" status=success duration=36.7s
level=0 logger=summary ts=2023-10-05T15:47:55.015372120Z msg="transfer sas-crunchy-platform-postgres | restore from /backups/sas-crunchy-platform-postgres/SharedServices" status=success duration=26s
level=0 logger=summary ts=2023-10-05T15:47:55.015394720Z msg="post-check SharedServices" status=success duration=14.3s verifiedDatabaseAttributes=3 totalVerifiedRoleDetails=5789 verifiedRoles=1 totalVerifiedTableColumns=6840 totalVerifiedTableIndexes=1027 tablesWithMatchingNumbersOfRows=821 tablesWithMatchingNumbersOfConstraints=821 totalVerifiedTables=821 verifiedSchemas=84
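
As mentioned above, you can also pipe the log through grep to pull out specific messages, for example:

kubectl logs job/sas-platform-postgres-data-transfer | grep -i "successfully copied database"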

 

NOTE: If the transfer fails, you can delete the job using kubectl delete job/sas-platform-postgres-data-transfer and run it again using kubectl apply.
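
For example:

kubectl delete job/sas-platform-postgres-data-transfer
kubectl apply -f platform-transfer.yaml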

 

Switch SAS Viya to point to External PostgreSQL

 

The data has been copied to the external PostgreSQL instance, but SAS Viya is still using the internal instance. Once we are sure that the data has been transferred successfully, we can switch the SAS Viya deployment over to use the new external PostgreSQL instance. To do this, follow the external PostgreSQL configuration steps in the SAS Viya Operations guide, but do not remove the entries for your internal PostgreSQL clusters yet; they are removed later when the internal cluster is deleted. To summarize the process:

 

  1. Copy /sas-bases/examples/postgres/postgres-user.env to $deploy/site-config/postgres/platform-postgres-user.env and add the PostgreSQL database user (dbmsowner) and password. Add a secret generator that refers to the file to the secretGenerator block of the kustomization.yaml (see the kustomization.yaml sketch after the transformer example below).
  2. Copy /sas-bases/examples/postgres/dataserver-transformer.yaml to $deploy/site-config/postgres/ and edit the file to update the values. Add the file to the transformers section of the kustomization.yaml. Here is an example of the dataserver-transformer.yaml.

 

apiVersion: builtin
kind: PatchTransformer
metadata:
  name: sas-platform-postgres-dataserver-transformer
patch: |-

  - op: replace
    path: /spec/users/0/credentials/input
    value:
      secretRef:
        name: platform-postgres-user
      usernameKey: username # For internal use, do not modify
      passwordKey: password # For internal use, do not modify

  - op: replace
    path: /spec/registrations/0/host
    value: sasgnn-p21077-default-flexpsql.postgres.database.azure.com

  - op: replace
    path: /spec/registrations/0/port
    value: 5432

  - op: replace
    path: /spec/ssl
    value: true
target:
  group: webinfdsvr.sas.com
  kind: DataServer
  version: v1beta1
  name: sas-platform-postgres
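
For reference, here is a minimal sketch of the corresponding platform-postgres-user.env file and kustomization.yaml entries. The values are placeholders; the secret name must match the secretRef name used in the transformer above.

# $deploy/site-config/postgres/platform-postgres-user.env (placeholder values)
username=dbmsowner
password=<dbmsowner-password>

# $deploy/kustomization.yaml (relevant fragments only; keep your existing entries)
transformers:
- site-config/postgres/dataserver-transformer.yaml

secretGenerator:
- name: platform-postgres-user
  envs:
  - site-config/postgres/platform-postgres-user.env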

 

Now that the manifests are updated, you can build and apply them using your chosen deployment method.
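
For a manual kustomize-based deployment, this might look like the following simplified sketch (the directory and namespace are those used in this example environment; adjust for your own deployment method, such as the SAS Viya deployment operator or sas-orchestration):

cd ~/project/deploy/${GELLOW_NAMESPACE}
kustomize build -o site.yaml
kubectl apply -n gelcorp -f site.yaml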

 

Now we will validate that the Viya environment points at the Azure Flexible PostgreSQL server by checking the database registration in the SAS Configuration Server (Consul). The serviceAddress in the output should match the host name of your target PostgreSQL server, for example: sasgnn-p21077-default-flexpsql.postgres.database.azure.com

 

WARNING: Do not continue to the next step until you have validated that the Consul registration has switched to external PostgreSQL.

 

HTTP_PROTOCOL=https
CONSUL_SVC_NAME=sas-postgres
kubectl exec -it sas-consul-server-0 -c sas-consul-server -- bash -c "export CONSUL_HTTP_ADDR=$HTTP_PROTOCOL://localhost:8500; /opt/sas/viya/home/bin/sas-bootstrap-config catalog service $CONSUL_SVC_NAME" 

 

Expected output:

 

{
"items": [
    {
        "address": "sasgnn-p02189-default-flexpsql.postgres.database.azure.com",
        "createIndex": 26687,
        "datacenter": "viya",
        "modifyIndex": 26687,
        "node": "sas-postgres",
        "nodeMeta": null,
        "serviceAddress": "sasgnn-p21077-default-flexpsql.postgres.database.azure.com",
        "serviceEnableTagOverride": false,
        "serviceID": "sas-postgres",
        "serviceMeta": {},
        "serviceName": "sas-postgres",
        "servicePort": 5432,
        "serviceTags": [
            "postgres",
            "public",
            "data-service-postgres",
            "ssl"
        ],
        "serviceWeights": {
            "passing": 1,
            "warning": 1
        },
        "taggedAddresses": null
    }
]
}

 

If the serviceAddress still points to the internal instance, or there are any other issues, you can review the data server operator custom resource and logs. There should be no errors in the data server operator logs, and the operator definition should point to the new Azure PostgreSQL flexible server. If you find errors, check your overlays and then re-build and apply.

 

kubectl describe dataservers.webinfdsvr.sas.com
kubectl logs -l app=sas-data-server-operator

 

To complete the process, restart the SAS Viya Platform by running the stop and then the start job and wait for SAS Viya to start.

 

kubectl create job sas-stop-all-`date +%s` --from cronjobs/sas-stop-all
RUNNING_PODS=$(kubectl get pods  --field-selector=status.phase=Running  -o go-template='{{len .items}}')
until [ $RUNNING_PODS -lt 2 ]; do  \
      sleep 5; echo Running Pods: $RUNNING_PODS; \
      RUNNING_PODS=$(kubectl get pods  --field-selector=status.phase=Running  -o go-template='{{len .items}}'); \
done
kubectl create job sas-start-all-`date +%s` --from cronjobs/sas-start-all
kubectl wait --for=condition=ready pod --selector='app.kubernetes.io/name=sas-readiness' --timeout=2700s

 

If you are satisfied that the data has been transferred successfully and SAS Viya is functioning, then you can shut down the internal PostgreSQL instance.

 

kubectl patch postgresclusters sas-crunchy-platform-postgres --type json --patch '[{"op":"replace", "path": "/spec/shutdown", "value": true}]'

 

Expected output:

 

postgrescluster.postgres-operator.crunchydata.com/sas-crunchy-platform-postgres patched
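
You can confirm that the internal PostgreSQL PODs have terminated by listing them with the same label selector used earlier to find the primary POD:

kubectl get pods -l postgres-operator.crunchydata.com/cluster=sas-crunchy-platform-postgres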

 

Validate

 

Before removing the internal PostgreSQL instance, it is important to validate that Viya is functioning and that the content and configuration have been transferred. In this step we use the ValidateViya tool from pyviyatools to execute some validation tests.

 

python3 /opt/pyviyatools/validateviya.py -c /mnt/workshop_files/initenv/validation/tests.json -d /tmp/testresults -o report-full

 

Review the HTML report output in /tmp/testresults. The report should show that the folders where content was loaded (gelcorp and gelcontent) and the caslibs are all available.

 

(Image: 01_GN_validate.png – ValidateViya validation report)

 

Delete internal PostgreSQL.

 

Now we can remove the internal PostgreSQL instance. A script is provided to delete the internal PostgreSQL servers and their Persistent Volume Claims (PVCs). Answer yes when prompted.

 

chmod 755 ~/project/deploy/pgtransfer/scripts/sas-postgres-delete.sh
~/project/deploy/pgtransfer/scripts/sas-postgres-delete.sh --namespace gelcorp

 

Finally, you can remove all references to the internal PostgreSQL from the kustomization.yaml.

 

ansible localhost -m lineinfile -a "path=~/project/deploy/${GELLOW_NAMESPACE}/kustomization.yaml regexp='crunchy' state='absent'"
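
To confirm that no crunchy references remain in the kustomization.yaml (no output means none remain):

grep -i crunchy ~/project/deploy/${GELLOW_NAMESPACE}/kustomization.yaml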

 

If all the Viya environments in your cluster have been switched to using an external PostgreSQL, you can delete the Crunchy Data cluster-wide resources.
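
Before deleting the CRDs, you can optionally confirm that no Crunchy PostgreSQL clusters remain in any namespace:

kubectl get postgresclusters.postgres-operator.crunchydata.com --all-namespaces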

 

kubectl delete crd pgupgrades.postgres-operator.crunchydata.com
kubectl delete crd postgresclusters.postgres-operator.crunchydata.com

 

Wrap Up

 

The PostgreSQL Data Transfer Tool for the SAS Viya Platform allows you to move content from an internal instance of PostgreSQL to an external instance. The transfer tool only copies the data in the internal PostgreSQL instance to an external PostgreSQL instance. Required post-transfer steps change the connection information so that the existing SAS Viya platform deployment uses the external instance instead of the internal one. Later steps remove the internal instance. In this post, we have walked through the process of switching from an internal PostgreSQL instance to an external instance using an Azure Flexible PostgreSQL server.

 

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Hi @GerryNelson ,

 

Thanks for this operation guide, detailed and well-researched. This is not an easy task, for sure! At the beginning, in the commands given for finding out the primary pod controller, there is an undefined environment variable SOURCE_CLUSTER:

 

 

export SOURCE_CLUSTER=sas-crunchy-platform-postgres
PRIMARY=$(kubectl get pod --selector="postgres-operator.crunchydata.com/role=master,postgres-operator.crunchydata.com/cluster=$SOURCE_CLUSTER" -o jsonpath='{.items[*].metadata.name}' | awk '{print $1}')
echo Primary Crunchy POD is $PRIMARY
kubectl exec -it $PRIMARY -c database -- bash

SOURCE_CLUSTER being defined in the following page:
https://documentation.sas.com/doc/fr/sasadmincdc/default/calsrvinf/n00000sasinfrdatasrv000admin.htm?...

 

Instead of following the steep path of DB switching described here, another option could be to re-install Viya from scratch a-A (internal) => B-b (external) using the export/import feature of Environment Manager or (better) the SAS Viya CLI transfer plugin, with additional steps as follows: (a) export all Postgres content to a JSON package, (b) import the previous package.

 

Best Regards,

Ronan

 

Hi @ronan thank you for pointing out that omission. I have fixed the post.

 

You are correct, using export and import between two environments would be an alternative approach. It could be useful in some cases; however, it might have its own issues around making sure you move all the content and related configuration.

