
The CAS CAS library: copying data between CAS libraries and servers

Started ‎04-21-2022 by
Modified ‎04-21-2022 by
Views 3,021

New to SAS Viya, the CAS caslib (srcType=CAS) is a CAS library whose source is in-memory data in a different CAS library. In this post I will look at how and why you would use the CAS type of CAS library. It doesn’t seem that long ago that the CAS library was a new and sometimes confusing concept. I say it doesn’t seem that long ago, but it was. 😊 In 2017 I recorded a YouTube video, SAS Viya CAS Libraries (Caslibs) Simplified. The video is still relevant if you need a brief introduction to CAS libraries.

 

To recap, a CAS library provides access to data in a data source and to the in-memory tables that are loaded into CAS from that data source. A caslib’s data source can be a path on a file system, a database, a SAS 9.4 LASR library, and now another CAS library. Path-based CAS libraries are among the most commonly used: with a path-based caslib you can easily load sas7bdat or sashdat files from a path on the file system into memory for processing in Viya. With this new type of CAS library, we can define a caslib whose data source is the in-memory tables of a caslib on another CAS server. Data can then be loaded from a CAS library in one CAS server to a different CAS server and CAS library using native SAS technology.

 

Why would you want to do that? Well, up to this point it has not been easy to copy data between caslibs using native SAS tools. This is something that we can now do using a caslib with srcType=CAS. You could copy data between CAS libraries in two different deployments, or between CAS libraries in two different CAS servers in the same deployment.

 

In my case, I was working on a migration between two Viya 4 environments where I had to move in-memory tables from one CAS server to another. I have two Viya environments in two separate namespaces (from35 and target) in the same K8s cluster. I have some in-memory tables in the VAModels caslib in from35 that I want to copy to the VAModels caslib in the Viya environment in my target namespace. To do this you have to:

 

  1. Create a caslib in the target namespace, called "transfer", that has as its source the VAModels caslib in the from35 environment.
  2. Load the tables from this new transfer caslib into memory in the VAModels caslib in the target namespace.
  3. Persist the in-memory tables to the path of the VAModels caslib in the target namespace.

 

The diagram below shows the process of copying the data from the VAModels CAS library on one CAS server to the VAModels CAS library on a different CAS server. In this case, the two CAS servers are in two different Kubernetes namespaces.

 

gn_1_gnn-cascaslib-006.png


 

Step 1) Define a caslib with srcType=CAS in the target

 

The first step, performed in the target environment, is to define a CAS library with srcType=CAS. This CAS library will point at the CAS server and caslib in the source environment. CAS libraries can be defined in multiple ways (caslib statement, proc CAS, etc.). Here we define the CAS library using the table.addcaslib action with the following attributes:

 

  • srctype=CAS identifies the data source as another CAS library.
  • cashost=controller.sas-cas-server-cas-shared-default.from35 is the hostname of the CAS controller in the source environment.
  • caslib=VAModels is the name of the caslib that I want to copy from in the source environment.

 

The cashost setting can be retrieved from the hostknownby setting in the CAS configuration of the source environment. In SAS Environment Manager, go to the Servers area, assume the Super-User role, and then select Configuration > CAS Configuration.

 

gn_2_gnn-cascaslib-003.png

 

This code creates the caslib, displays information about the caslib, and lists the tables available through it. This process makes the in-memory tables in the VAModels caslib in the source environment (from35) available to the transfer caslib in the target environment (target).

 

cas transfersess;
proc cas;
/* add a caslib that has as its source a caslib on a different CAS server */
table.addcaslib /
       name="transfer",
       description="source for data transfer",
       dataSource={srctype="CAS",
                   user="geladm",
                   password="lnxsas",
                   cashost="controller.sas-cas-server-cas-shared-default.from35",
                   caslib="VAModels"
               };

table.caslibinfo / caslib="transfer";
table.fileinfo / caslib="transfer";
quit;

gn_3_gnn-cascaslib-002.png

 

 Step 2) Load to memory in the target caslib from the memory of the source caslib

 

The next step is to load the tables into memory. They are currently accessible through the source of the "transfer" CAS library; the code will load them into memory in the VAModels CAS library in the target.

 

To be clear about what is happening here: the transfer CAS library's data source is the in-memory space of a different CAS library on a different CAS server. As a result, we are loading tables from the memory of one CAS library and server into memory in another CAS library and server. In this case, the CAS servers are in two separate deployments, but they could be in the same deployment.

 

proc cas;
/* drop the current tables if they exist */
/* allows you to run the code multiple times */
table.droptable / caslib="VAModels" name="TRAIN1626187802213" quiet=TRUE;
table.droptable / caslib="VAModels" name="TRAIN1626187176671" quiet=TRUE;

/* load each table from the transfer caslib into the memory space of the VAModels caslib */
/* promote=TRUE gives the loaded table global scope */
table.loadtable / caslib="transfer" path="TRAIN1626187802213"
                  casout={caslib="VAModels",name="TRAIN1626187802213",promote=TRUE};
table.loadtable / caslib="transfer" path="TRAIN1626187176671"
                  casout={caslib="VAModels",name="TRAIN1626187176671",promote=TRUE};
quit;

gn_4_gnn-cascaslib-004.png
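
Before moving on, it can be worth confirming that the tables really did arrive in the target. The following is a minimal sketch, assuming the transfersess session from Step 1 is still active; it lists the in-memory tables in the target VAModels caslib and checks one of them by name. The result variable r is just an illustrative name.

```sas
proc cas;
/* list the in-memory tables in the target VAModels caslib */
table.tableinfo / caslib="VAModels";

/* check a single table: exists is 0 if not found, */
/* 1 for a session-scope table, 2 for a global-scope (promoted) table */
table.tableexists result=r / caslib="VAModels" name="TRAIN1626187802213";
print r;
quit;
```

Because the tables were loaded with promote=TRUE, a value of 2 is what I would expect to see here.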

 

Step 3) Persist the data in the target CAS library's source

 

As we know, in-memory data does not persist across restarts of the CAS server. The target VAModels caslib is a path-based caslib, so this step saves the in-memory tables to the source path of the VAModels caslib. Once this step is completed, the data will be available to load from the source of the VAModels CAS library in the target.

 

proc cas;
/* save the first table as a file in the source path of the target VAModels caslib */
table.save / caslib="VAModels" name="TRAIN1626187802213"
             table={caslib="VAModels",name="TRAIN1626187802213"} permission="GROUPWRITEPUBLICREAD" replace=True;
/* save the second table as a file in the source path of the target VAModels caslib */
table.save / caslib="VAModels" name="TRAIN1626187176671"
             table={caslib="VAModels",name="TRAIN1626187176671"} permission="GROUPWRITEPUBLICREAD" replace=True;
quit;

gn_5_gnn-cascaslib-005.png
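
Once the tables are persisted, the transfer caslib has done its job. As an optional follow-up (not part of the original three steps), the sketch below lists the files now in the source path of the target VAModels caslib, then drops the transfer caslib and ends the session. It assumes the transfer caslib was added with the default session scope, in which case terminating the session would remove it anyway; the explicit drop is only there to make the cleanup visible.

```sas
proc cas;
/* list the files in the source path of the target VAModels caslib */
table.fileinfo / caslib="VAModels";

/* remove the temporary transfer caslib; quiet suppresses the error if it is already gone */
table.dropcaslib / caslib="transfer" quiet=TRUE;
quit;

/* end the CAS session started in Step 1 */
cas transfersess terminate;
```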

 

 

Conclusion

 

The CAS CAS library is a useful new feature. I am sure you can think of other use cases it would support. In migration scenarios, it helps move data between CAS servers in support of migrated content. In Viya 4, moving data between deployments can be more complex and may require the assistance of a Kubernetes administrator. The CAS CAS library allows us to move data natively using SAS code, which can help keep the process under the control of the SAS Administrator.

 

Version history
Last update:
‎04-21-2022 08:36 PM