BookmarkSubscribeRSS Feed

CAS Table Data refresh (CAS Table “Swap Out”)

Started ‎09-06-2017 by
Modified ‎09-06-2017 by
Views 3,372

With SAS Viya 3.2, what is the best approach to refresh a very large, stale CAS table (read FACT table) when the source data gets updated? The question is even more important when there is a narrow downtime window for complete refresh.

 

This post discusses a mechanism for swapping data into a CAS table from a staged session scoped CAS table, minimizing downtime while still doing a complete refresh. The mechanism leverages the basic concept of CAS where a user can load a table as session scoped table and later promote it to global scope for use in VA 8.1. The data transfer (PROMOTE) operation executes in CAS (memory) so it’s super-fast and efficient with minimum down time.  

 

To refresh a stale CAS table from source data quickly, follow these steps:

  • Load the new table version of source as a session CAS table (staging table)
  • Drop target CAS table
  • Promote the staged session CAS table to the target CAS table name

 

The following code demonstrates the data swap. For convenience purposes, I am using a .sas7bdat table to load CAS. The same data could have been in a third party database (hadoop-hive) with a view created using multiple tables to generate the flat FACT table. By using Data Accelerator and SAS EP, the source data could have parallel feed to CAS staged table. The initial step in the code loads the 10GB dataset table into a session CAS table (staging table). Then, it drops the target CAS table and promotes the staging table to the target CAS table name.

 

In this case, a 10GB (36M rows) table only takes ~ 8 sec to drop and ~ 0.23 sec to PROMOTE !!!    

 

Code:

 

  options msglevel=i ;

  cas mySession sessopts=(messagelevel=all) ;

  libname srcld "/opt/sas/data";

 

  * Load new version of source table into a session scope CAS table;

  proc casutil ;

    load data=srcld.order_fact_10G outcaslib="CASPATH"

    casout="order_fact_10G_stg" replace ;

  quit ;

 

  * Drop target CAS table (global) and PROMOTE session scoped CAS table with target CAS table name;

  PROC CASUTIL ;

    droptable casdata="order_fact_10G" incaslib="CASPATH" ;

    promote casdata="order_fact_10G_stg" casout="order_fact_10G"

    incaslib="CASPATH" outcaslib="CASPATH" ;

  quit ;

 

  cas mySession terminate ;    

 

 

Log from code execution:

 

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

INFO: Character variables have defaulted to a length of 200 at the places given by: (Line):(Column). Truncation can result.

23:1 RC

56

57 options msglevel=i ;

58 cas mySession sessopts=(messagelevel=all) ;

NOTE: The session MYSESSION connected successfully to Cloud Analytic Services intcas01.race.sas.com using port 5570. The UUID is c8bc1412-2942-5f40-b8b8-25b8962ee5d0. The user is sastest1 and the active caslib is CASUSER(sastest1).

NOTE: The SAS option SESSREF was updated with the value MYSESSION.

NOTE: The SAS macro _SESSREF_ was updated with the value MYSESSION.

NOTE: The session is using 3 workers.

NOTE: The CAS statement request to update one or more session options for session MYSESSION completed.

59

60 libname srcld "/opt/sas/data";

NOTE: Libref SRCLD was successfully assigned as follows:

Engine: V9

Physical Name: /opt/sas/data

61

62 * Load new version of source table into a session CAS table ;

63 proc casutil ;

NOTE: The UUID 'c8bc1412-2942-5f40-b8b8-25b8962ee5d0' is connected using session MYSESSION.

64 load data=srcld.order_fact_10G outcaslib="CASPATH"

casout="order_fact_10G_stg" replace ;

NOTE: SRCLD.ORDER_FACT_10G was successfully added to the "CASPATH" caslib as "order_fact_10G_stg".

65 quit ;

 

NOTE: PROCEDURE CASUTIL used (Total process time):

real time 4:06.37

cpu time 34.60 seconds

66

67 * Drop target live CAS table and PROMOTE the session CAS table with target CAS table name ;

68 PROC CASUTIL ;

NOTE: The UUID 'c8bc1412-2942-5f40-b8b8-25b8962ee5d0' is connected using session MYSESSION.

69 droptable casdata="order_fact_10G" incaslib="CASPATH" ;

NOTE: Cloud Analytic Services dropped table order_fact_10G from caslib CASPATH.

NOTE: The Cloud Analytic Services server processed the request in 6.089074 seconds.

70

71 promote casdata="order_fact_10G_stg" casout="order_fact_10G"

72 incaslib="CASPATH" outcaslib="CASPATH" ;

NOTE: The Cloud Analytic Services server processed the request in 0.239992 seconds.

NOTE: Cloud Analytic Services promoted table order_fact_10G_stg in caslib CASPATH.

73 quit ;

 

NOTE: PROCEDURE CASUTIL used (Total process time):

real time 6.35 seconds

cpu time 0.01 seconds

74

75 cas mySession terminate ;

NOTE: Deletion of the session MYSESSION was successful.

NOTE: Request to TERMINATE completed for session MYSESSION.

76

77 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

90  

 

 

 

A Note on Memory Requirements

The session CAS table (staging table) load and swap-out process briefly requires twice the normal CAS table memory storage since there will be two copies of the table in memory, the staged one and the one being updated. Once the CAS table is dropped and Staged CAS table promoted with target CAS table name, this requirement goes away.    

 

 

Reference documents CASUTIL Procedure

Version history
Last update:
‎09-06-2017 10:03 AM
Updated by:
Contributors

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Labels
Article Tags