Q: Does CAS duplicate the tables on every node?
A: CAS partitions the data to spread it across the CAS nodes. By default, it makes a single replicate copy, also partitioned across the nodes, so that each node holds two unique partitions. If a CAS node goes down, processing on the data can continue because a copy of that node's data partition is located on another node. The number of replicate copies is configurable with the copies= option on the LOAD statement of PROC CASUTIL.
Q: When loading data into memory, does CAS also use the CAS cache because the file extension is different from SASHDAT?
A: The way that the SAS Viya Cloud Analytic Services (CAS) server manages data in memory is a complicated topic. The two links below are handy references. In general, for *.sas7bdat files, the CAS cache will be used.
4 Rules to Understand CAS Management of In-Memory Data
Directing CAS when to use its cache (or not)
Loading Data Using Graphical User Interface
The easiest way to load files that are already in a caslib is within the GUI. Use the Manage Data option in the application menu.
Navigate to Data Sources and then to the CASLIB of interest.
Click the table that is to be loaded into CAS, then click the lightning bolt in the upper-right corner.
Loading Data Using SAS Code
If you prefer SAS code, the example below loads a sas7bdat file into a CAS library and promotes it so that it is available to other CAS sessions, while also controlling the number of replicate copies (the default is copies=1).
* Start a CAS session called mySession;
cas mySession sessopts=(caslib=casuser timeout=1800 locale="en_US");
* Generate SAS librefs for caslibs;
caslib _all_ assign;
* Define a SAS libref where the SAS data sets are stored;
libname mydata "/nfs/Data/";
* Load to CAS (CASUSER or other caslib) and promote;
proc casutil;
   load data=mydata.modeldata
        outcaslib="CASUSER"
        casout="ModelData" promote copies=1;
run;
quit;
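After the load step above, it can help to confirm that the table is in memory and to see how it is distributed. The following is a minimal sketch, assuming the same session (mySession), caslib (CASUSER), and table name (ModelData) used in the load example. It uses the LIST TABLES statement of PROC CASUTIL and the table.tableDetails action. The level="node" setting is an assumption about the per-node summary you want, and the columns reported can vary by release.

```sas
* List the in-memory tables in the caslib (assumes CASUSER, as in the load step);
proc casutil;
   list tables incaslib="CASUSER";
quit;

* Show per-node details for the loaded table (assumes mySession and ModelData from the load step);
proc cas;
   session mySession;
   table.tableDetails /
      caslib="CASUSER"
      name="ModelData"
      level="node";
quit;
```

On a single-machine CAS server the per-node output has only one row, so the distribution across nodes is only informative on a distributed server.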