
CAS Server Topology Changes And CAS Table Balancing

Started ‎11-16-2023 by
Modified ‎11-16-2023 by

As of the stable 2023.09 release of the SAS Viya platform, the SAS Viya platform administrator can enable CAS table balancing, for massively parallel processing (MPP) CAS servers only. When it is enabled, CAS global tables, session tables, or both are automatically rebalanced each time the number of CAS server workers changes.

In this article, I will show you how to enable CAS table balancing and discuss this setting's impact.

  

The new CAS table balancing feature

 

When enabled by the SAS Viya platform administrator, the new CAS table balancing feature moves CAS table data blocks whenever the topology of an MPP CAS server changes. It rebalances the configured CAS tables when workers are added, and redistributes the impacted CAS tables when workers are removed.

This new feature allows the SAS Viya platform administrator to modify the MPP CAS server topology without having to reload the CAS tables.

Before this new feature

 

Before this new feature, a few points had to be taken into consideration when the number of workers of an MPP CAS server changed:

  • When adding workers:
    • No need to restart the CAS server (at the discretion of the SAS Viya platform administrator).
    • CAS tables loaded before the workers were added will be unbalanced because they remain on the original set of workers.
    • The new CAS server workers cannot participate in workload on these CAS tables.
    • Newly loaded CAS tables will automatically be distributed across all of the available workers.

 

  • When removing workers:
    • Existing loaded CAS tables could be impacted if no redundancy (number of copies) is set for them. CAS tables could become corrupt because of missing blocks of data.
    • Existing loaded CAS tables with redundancy set are impacted if the number of copies is more than the number of CAS workers.
    • The CAS server must be restarted.
    • CAS tables must be reloaded.
    • New CAS tables will be loaded as expected across all of the available workers.
    • The SAS Viya platform administrator must ensure that the CAS server provides enough resources to manage the required tables (most specifically memory and CAS_DISK_CACHE).


Note: This article is about CAS server topology changes. "Removing workers" refers to a deliberate topology change, not to a CAS server worker failure.

Why use this new CAS table balancing feature?

 

CAS table balancing is most appropriate when the MPP CAS server topology frequently changes. For example, if it is necessary to add a few more CAS server workers during a specific busy period to support more CAS server workload without impacting the user community, CAS table balancing should be considered. Likewise, after the busy period ends and the SAS Viya platform administrator wants to reduce the number of CAS workers, CAS table balancing can prevent the need to reload tables that are loaded into memory.

A few points that need to be considered when using CAS table balancing

 

Before enabling CAS table balancing, the SAS Viya platform administrator should consider the following:

 

  • When adding workers:
    • Only selected blocks of data for each rebalanced table are redistributed to the new workers. Because CAS tables are not fully reloaded from the source files, CAS table balancing is quicker.
    • Table redistribution policy options determine which tables are eligible for rebalancing. The SAS Viya platform administrator can set rules to permit CAS table redistribution at the CAS table, the CAS Library, the CAS server (tableRedistUpPolicy), or the CAS session (sessionTableRedistUpPolicy) level.

 

  • When removing workers:
    • Redistribution occurs but cannot be controlled: if CAS table balancing is enabled, the CAS tables are automatically redistributed, and there is no way to exclude specific CAS tables.
    • Only blocks of data from the removed nodes are redistributed. Because CAS tables are not fully reloaded from source files, CAS table balancing is quicker.
    • The memory and CAS_DISK_CACHE on the remaining nodes must be large enough to accommodate the redistributed blocks of data.


Whether you add or remove workers, the CAS server will not accept new work while the blocks of data are being rebalanced. Therefore, users may experience a brief interruption of service from the CAS server until the rebalancing is completed. For any current CAS sessions, the CAS server pauses work at an action boundary until the rebalancing is finished. At that point, workload for the session will continue.

How CAS table balancing works...

 

The following diagrams illustrate what happens during the CAS table balancing process.

 

Add CAS server workers

  • Diagram 1 - CAS Server Initial Topology. In green: A CAS table with redistribution policy options. In orange: A CAS table without redistribution policy options.
  • Diagram 2 - Add Two CAS Server Workers. In light blue: Two new CAS server workers to add.
  • Diagram 3 - CAS Table Balancing Process. Only the data blocks of the green CAS table are redistributed because of the redistribution policy options.
  • Diagram 4 - CAS Server New Topology. The two new CAS server workers are added, and the required CAS tables' data blocks are redistributed.

 

Remove CAS server workers

  • Diagram 1 - CAS Server Initial Topology. In green: A CAS table with redistribution policy options. In orange and red: CAS tables without redistribution policy options.
  • Diagram 2 - Remove Two CAS Server Workers. In light red: Two CAS server workers to remove.
  • Diagram 3 - CAS Table Balancing Process. All data blocks from the removed CAS server workers are redistributed. No redistribution policy options are used when scaling down.
  • Diagram 4 - CAS Server New Topology. The two CAS server workers are removed, and all CAS tables' data blocks are redistributed.

 


 

How to enable CAS table balancing

 

Two new CAS server environment variables are provided to enable the CAS table balancing feature.

By setting these new CAS server environment variables and the CAS table redistribution policies, the SAS Viya platform administrator ensures that the CAS tables do not need to be reloaded. Based on the settings, only a few of their data blocks are moved to other workers. This takes less time than reloading full CAS tables.

The CAS table balancing environment variables

 

The SAS Viya platform administrator can set two CAS server environment variables. They can be set independently of each other.

CAS_GLOBAL_TABLE_AUTO_BALANCE
  • Values: True or 1 to enable. False or 0, or unset (the default), to disable.
  • If True or 1, when adding workers: In-memory configured global CAS tables are rebalanced.
  • If True or 1, when removing workers: All in-memory impacted global CAS tables are redistributed.

CAS_SESSION_TABLE_AUTO_BALANCE
  • Values: True or 1 to enable. False or 0, or unset (the default), to disable.
  • If True or 1, when adding workers: In-memory impacted configured user sessions' CAS tables are rebalanced.
  • If True or 1, when removing workers: All in-memory impacted user sessions' CAS tables are redistributed.

 


The SAS Viya platform administrator must decide whether only the CAS global tables should be rebalanced, or the CAS session tables too. This choice affects the availability of the CAS server after its number of workers changes, because the rebalancing process takes more time as the number of loaded CAS tables increases.

The patchTransformer manifest

 

Currently, these two new CAS server environment variables can be set only by using a patchTransformer manifest. (However, a feature request exists to set these environment variables through future options of the create-cas-server.sh script.)

1. The SAS Viya platform administrator must copy a provided example from the SAS Viya platform deployment assets into the deployment sas-config directory. This file is named cas-add-environment-variables.yaml and is located in the sas-bases/examples/cas/configure/ directory.

 

2. Depending on which kind of CAS tables must be redistributed, the SAS Viya platform administrator needs to modify the copy of the /sas-config/cas-add-environment-variables.yaml manifest to set the values of the CAS_GLOBAL_TABLE_AUTO_BALANCE and CAS_SESSION_TABLE_AUTO_BALANCE CAS server environment variables.

 

# This block of code is for adding environment variables for the CAS server.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-add-environment-variables
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/env/-
    value:
      name: CAS_GLOBAL_TABLE_AUTO_BALANCE
      value: "true"
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/env/-
    value:
      name: CAS_SESSION_TABLE_AUTO_BALANCE
      value: "true"
target:
  group: viya.sas.com
  kind: CASDeployment
  # Uncomment this to apply to all CAS servers:
  name: .*
  # Uncomment this to apply to one particular named CAS server:
  #name: {{ NAME-OF-SERVER }}
  # Uncomment this to apply to the default CAS server:
  #labelSelector: "sas.com/cas-server-default"
  version: v1alpha1
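
As a variant of the manifest above, the following is a minimal sketch (not an official example) for the case where only global tables should be rebalanced, and only on the default CAS server. It assumes that leaving CAS_SESSION_TABLE_AUTO_BALANCE out of the patch keeps it unset, which is the disabled default described earlier, and it uses the labelSelector option that is commented out in the example file.

```yaml
# Sketch: enable table balancing for global tables only.
# CAS_SESSION_TABLE_AUTO_BALANCE is intentionally not added,
# so it stays unset (disabled by default).
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-add-environment-variables
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/env/-
    value:
      name: CAS_GLOBAL_TABLE_AUTO_BALANCE
      value: "true"
target:
  group: viya.sas.com
  kind: CASDeployment
  # Taken from the commented option in the example file; it replaces
  # "name: .*" so that only the default CAS server is patched.
  labelSelector: "sas.com/cas-server-default"
  version: v1alpha1
```

Only one of the target options (name or labelSelector) is meant to be active at a time, as the comments in the example file suggest.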

 

3. Then, this new /sas-config/cas-add-environment-variables.yaml manifest must be referenced in the transformers field of the kustomization.yaml file.
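
The reference from step 3 could look like the excerpt below. This is a sketch under two assumptions: the kustomization.yaml file sits at the root of the deployment directory, and the manifest was copied to the sas-config directory as described in step 1, so the path is written relative to that root. Any transformer entries already present in the file are left untouched and are only hinted at by the comment.

```yaml
# kustomization.yaml (excerpt)
transformers:
  # ... existing transformer entries stay as they are ...
  - sas-config/cas-add-environment-variables.yaml
```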

 

4. Finally, the SAS Viya platform administrator must build and apply the new SAS Viya platform deployment manifest, using the deployment method specific to their environment, to update the CAS server of the SAS Viya platform deployment.

Note: This assumes that only the CAS server configuration changes in the new SAS Viya platform deployment manifest.

 

 

CAS table balancing versus CAS state transfer

 

CAS table balancing is not the only option available to the SAS Viya platform administrator when changing the topology of a CAS server. Another feature, CAS server State Transfer, can be used to "restart" the CAS server after modifying its topology without having to reload the CAS tables and without losing the user CAS sessions.

 

Impact on Resources
  • Table Balancing: When adding workers, there are more CAS worker pods, which require more CPU/Memory/Disk and possibly more Kubernetes cluster nodes if CAS AUTORESOURCES is enabled. When removing workers, there are fewer CAS worker pods, which require less CPU/Memory/Disk and possibly fewer Kubernetes cluster nodes if CAS AUTORESOURCES is enabled.
  • State Transfer: During the process, two instances of the CAS server exist for a period of time.

Topology Changes
  • Table Balancing: For MPP CAS server only. Supported topology changes: add/remove workers only.
  • State Transfer: For both SMP and MPP CAS servers. Supported topology changes: SMP to MPP, MPP to SMP; add/remove workers; add/remove backup controller.

CAS Tables
  • Table Balancing: A few data blocks are redistributed.
  • State Transfer: CAS tables from CAS server instance 1 are saved and then reloaded in CAS server instance 2. All tables (global and session).

CAS Sessions
  • Table Balancing: Preserved.
  • State Transfer: Preserved and transferred.

Service Interruption
  • Table Balancing: During the redistribution process.
  • State Transfer: During the transfer process.

 

 

I hope this article has been helpful to you.

 

Special thanks to SAS Technical writers, Mike Carney (R&D), Logan Chandler (R&D), Richard Knight (R&D), @RPoumarede , and @RobCollum .

 


 

 
