BookmarkSubscribeRSS Feed

Data append to a partitioned CAS table

Started ‎09-17-2019 by
Modified ‎09-17-2019 by
Views 4,310

You may have partitioned a CAS table to improve the overall performance of Viya application. Over time, if there is a need for incremental data append to partitioned CAS table, the task is not simple. In this post, I discuss the data append process for a partitioned CAS table.

 

A CAS table can be partitioned on a specific data column to improve the CAS action performance. A partitioned CAS table avoids the auto partitioning and data movement among CAS worker nodes at run time for the by-group operation.

 

The data append to a partitioned CAS table is not supported. The data can be appended only to a standard CAS table (non-partitioned). When data is appended to a partitioned CAS table, it generates the following error message:

 

75    
76    data pubSPRE.DM_FACT_MEGA_CORP_GOLD( append=yes );
77       set pubSPRE.DM_FACT_MEGA_CORP_APND;
NOTE: The DATA step will run in multiple threads.
NOTE: Executing action 'dataStep.runBinary'.

ERROR: DATA step cannot append to table DM_FACT_MEGA_CORP_GOLD in caslib DM. 
Append is not supported for partitioned tables.
ERROR: The action stopped due to errors.

How to append data to a partitioned CAS table

The data can be appended to a partitioned CAS table using an intermediate staging table with the following steps. To append data to a partitioned CAS table you must have enough (thrice the size of the table) memory space (read CAS_DISK_CACHE) to accommodate two intermediate CAS table for a short period.

  1. Copy an existing in-memory partitioned CAS table to a staging CAS table without data partition.
  2. Append incremental data into the same staging CAS table.
  3. Create a new partitioned staging CAS table from appended staging CAS table.
  4. Drop the global in-memory partitioned CAS table.
  5. Promote the staging table (session scope) to global scope with the original table name.

When creating a staging table, do not use the original name for the target table name. When the same original name is used for target staging table, there will be two CAS tables with the same name, one in sessions scope and another in the global scope. When drop table action is submitted for same name table, it will drop the session scope staged table rather than the global scope table. The promote action will fail as there is no session-scope table. The global table will be unaffected; no new data appended. So, avoid confusion and use a new name (e.g. _stg) for staging table and stay away from getting same name CAS table in session scope and global scope.

 

The following code example describes the process of data append to an in-memory partitioned global CAS table. The data step used for creating a staging table and data append runs in CAS as both source and target is a CAS table. Step 3 creates a new staging table with partitioned data since data partition CAS action does not support in-place data partition.

 

CAS mySession  SESSOPTS=( CASLIB=casuser TIMEOUT=99 LOCALE="en_US" metrics=yes);

/* Note the number of rows in CAS table before data append */  
proc cas ;
   action table.tabledetails / name="DM_FACT_MEGA_CORP_GOLD" caslib="DM"; 
run ;

/* 1. Copy existing global CAS table into a session scope staging CAS table. */ 
libname pubSPRE cas caslib="DM";
data pubSPRE.DM_FACT_MEGA_CORP_STAGE;
   set pubSPRE.DM_FACT_MEGA_CORP_GOLD; 
run ;

/* 2. Incremental data append into staging table */ 
data pubSPRE.DM_FACT_MEGA_CORP_STAGE( append=yes );
   set pubSPRE.DM_FACT_MEGA_CORP_APND; 
run ;

/* 3. Create a new staging table with data partition from appended staging table */ 
proc cas ;
   table.partition /
      casout={name="DM_FACT_MEGA_CORP_STAGE_N", caslib="DM"},
      table={name="DM_FACT_MEGA_CORP_STAGE", caslib="DM", 
             groupby={{name="FacilityCity"}},orderby={{name="date"}}};
quit ;

/* 4. Drop the existing global CAS table. */ 
/* 5. Promote the partitioned staging CAS table to global CAS table with same original name */
Proc casutil; 
droptable casdata="DM_FACT_MEGA_CORP_GOLD" incaslib="DM"  quiet;
promote incaslib="DM" outcaslib="DM" casdata="DM_FACT_MEGA_CORP_STAGE_N" casout="DM_FACT_MEGA_CORP_GOLD" ; 
quit;

/* verify new CAS table have appended data with data partition. */  
proc cas ;
   action table.tabledetails / name="DM_FACT_MEGA_CORP_GOLD" caslib="DM"; 
run ;

proc cas;
      table.tabledetails result=r / level='partition'       
      name="DM_FACT_MEGA_CORP_GOLD" caslib="DM" ;
      saveresult r dataout=work.r ;
run ; quit ;

proc print data=work.r; 
var node blocks  rows DataSize 'Key(FacilityCity)'n;  
run;

CAS mySession  TERMINATE;

 

Log extract from code execution:

 

…………………..
………
75    
76    /* 1. Copy existing global CAS table into a session scope staging CAS table. */
77    libname pubSPRE cas caslib="DM";
NOTE: Libref PUBSPRE was successfully assigned as follows: 
      Engine:        CAS 
      Physical Name: 86b07d6f-2eab-7041-812d-fa73baed385a
78    
79    data pubSPRE.DM_FACT_MEGA_CORP_STAGE;
80       set pubSPRE.DM_FACT_MEGA_CORP_GOLD;
81    run ; 
NOTE: Running DATA step in Cloud Analytic Services.
NOTE: The DATA step will run in multiple threads.
NOTE: Executing action 'dataStep.runBinary'.
NOTE: There were 3722071 observations read from the table DM_FACT_MEGA_CORP_GOLD in caslib DM.
NOTE: The table DM_FACT_MEGA_CORP_STAGE in caslib DM has 3722071 observations and 39 variables.
NOTE: DATA statement used (Total process time):
      real time           4.00 seconds
      cpu time            0.25 seconds
      
82    
83    /* 2. Incremental data append into staging table */
84    data pubSPRE.DM_FACT_MEGA_CORP_STAGE( append=yes );
85       set pubSPRE.DM_FACT_MEGA_CORP_APND;
86    run ;
NOTE: Running DATA step in Cloud Analytic Services.
NOTE: The DATA step will run in multiple threads.
NOTE: Executing action 'dataStep.runBinary'.
NOTE: There were 3722071 observations read from the table DM_FACT_MEGA_CORP_APND in caslib DM.
NOTE: The table DM_FACT_MEGA_CORP_STAGE in caslib DM has 7444142 observations and 39 variables.
NOTE: The APPEND operation for table DM_FACT_MEGA_CORP_STAGE in caslib DM added 3722071 observations.
NOTE: DATA statement used (Total process time):
      real time           4.28 seconds
      cpu time            0.21 seconds
      
87    
88    /* 3. Create a new staging table with data partition from appended staging table */
89    proc cas ;
90       table.partition /
91          casout={name="DM_FACT_MEGA_CORP_STAGE_N", caslib="DM"},
92          table={name="DM_FACT_MEGA_CORP_STAGE", caslib="DM",
93                 groupby={{name="FacilityCity"}},orderby={{name="date"}}};
94    quit ;
NOTE: Active Session now MYSESSION.
NOTE: Executing action 'table.partition'.
NOTE: Action 'table.partition' used (Total process time):
{caslib=DM,tableName=DM_FACT_MEGA_CORP_STAGE_N,rowsTransferred=2790769,shuffleWaitTime=8.6317923069,minShuffleWaitTime=0,
maxShuffleWaitTime=0.7270908356,averageShuffleWaitTime=0.000617748}
NOTE: PROCEDURE CAS used (Total process time):
      real time           20.56 seconds
      cpu time            0.77 seconds
……
………………….

 

Result extract from code execution:

  • The number of rows before data append to a partitioned table

     

    apnd_partitioned_pics1.png

    Select any image to see a larger version.
    Mobile users: To view the images, select the "Full" version at the bottom of the page.

     

  • The number of rows after data append to a partitioned table

     

    apnd_partitioned_pics2.png

     

  • CAS table partition information after data append to a partitioned table

     

    apnd_partitioned_pics3.png

Version history
Last update:
‎09-17-2019 02:04 PM
Updated by:
Contributors

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Labels
Article Tags