Directing CAS when to use its cache (or not)

We all know that SAS Cloud Analytic Services (CAS) is an in-memory analytics engine. Ideally, all of the data resides in RAM on the CAS host(s), which gives it the fastest possible route to the CPU for processing. To improve its flexibility in the enterprise, however, CAS can also use disk-based storage to cache data. That brings benefits such as improved data availability and a more robust memory utilization scheme.

The space set aside for this purpose is CAS_DISK_CACHE. But the CAS_DISK_CACHE isn't always used when CAS is working with in-memory data. There are circumstances that determine when it's used and when it's not. Most of the time, CAS will use its cache in a way that benefits your process objectives. But occasionally, there's a need to direct CAS to behave differently.

Let's take a look at the situations where you might want to override CAS' default caching behavior and how to accomplish it.

Default behaviors

The way that the SAS Viya Cloud Analytic Services (CAS) Server manages data in memory is a complicated topic. There are many twists and turns to cover all of the nuances and possibilities. However, most of that behavior can be distilled down to a few simple rules which adequately describe the majority of situations. See 4 Rules to Understand CAS Management of In-Memory Data for a fuller explanation, but for now, let's look at:

Rule № 2:

All CAS in-memory data is memory mapped to a locally-accessible backing store.

Very briefly, this means that the data loaded into CAS is typically backed with a memory map to local disk storage, and usually that's CAS_DISK_CACHE. But there's one notable exception: if you're loading data into CAS that is already in unencrypted SASHDAT format on disk that appears local to the CAS host, then CAS memory-maps to the source (and not to its cache).
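If "memory-mapped to a backing store" feels abstract, here's a minimal sketch of the general idea in Python. It is only an analogy, not how CAS is implemented: the temporary file is a hypothetical stand-in for a backing store (a SASHDAT source file or a CAS_DISK_CACHE file), and `demo_memory_map` is a name made up for this illustration.

```python
import mmap
import os
import tempfile


def demo_memory_map():
    """Map a file into memory, read it, change it, and check the file on disk."""
    # Hypothetical stand-in for a backing store on local disk.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"ROW1ROW2ROW3")
        os.close(fd)

        with open(path, "r+b") as f:
            # Length 0 maps the whole file; the default access is a shared,
            # writable mapping.
            with mmap.mmap(f.fileno(), 0) as view:
                first = bytes(view[0:4])   # read through memory, not read()
                view[0:4] = b"row1"        # change the in-memory view
                view.flush()               # push the change to the backing file

        with open(path, "rb") as f:
            on_disk = f.read()
        return first, on_disk
    finally:
        os.remove(path)


first_row, on_disk = demo_memory_map()
print(first_row, on_disk)
```

The point to take away is that the in-memory view and the file on disk are two faces of the same data, which is exactly why it matters which file CAS chooses to map.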

Why change the behavior?

To ensure robust data availability in spite of a hardware failure, CAS normally defines a backing store for all in-memory data. For PATH and DNFS types of caslib, CAS makes efficient use of the existing SASHDAT file as the backing store instead of relying on its own cache location.

The problem

There's one use case in particular where overriding CAS' default caching behavior can be helpful: when you have multiple, concurrent users of a PATH- or DNFS-sourced, unencrypted SASHDAT file, and one (or more) of those users needs to make changes to the data. This kind of situation often becomes a classic computer science challenge referred to as a race condition. If you've ever tried to edit a file alongside other people on a shared disk, you've experienced a similar race condition yourself.

Let's say you and I both want to work with an unencrypted SASHDAT file in a DNFS caslib. And then we both start our own CAS sessions, load the table, and get to work.

By default, when the first user (you) runs the loadTable action, your CAS session will memory-map to the SASHDAT file's source (not CAS_DISK_CACHE). Then when the second user (me) runs the loadTable action, my CAS session will use the same memory-map handles. This makes CAS very efficient: it effectively loads the table into RAM only once.

Knock, knock.
A race condition.
Who's there?

But now imagine that you make changes to the in-memory table, save them back to source, and quit your session. And then I make some different set of changes and write those back to the same source. I'll be the jerk who overwrote your changes and you won't know until later (that's the race condition).

The reality is slightly more gritty than this, though. It turns out that in real life, I wouldn't get a chance to save my changes. When you push your changes to the source, that effectively adds (and/or deletes) some of the memory maps CAS was using. Your session knows about those changes (of course), but my CAS session doesn't. And we've seen that when this happens, it can cause my CAS session, and others, to hang or otherwise become unresponsive. That's not the kind of experience we want to provide to our users.
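Setting the hanging sessions aside for a moment, the "last writer wins" race described above can be sketched in a few lines of Python. This is a toy model only: plain dictionaries stand in for the table, and `run_sessions` is a hypothetical name, not anything CAS provides.

```python
def run_sessions(source):
    """Two sessions load the same source, edit independently, and save in turn."""
    your_copy = dict(source)   # you load the table
    my_copy = dict(source)     # I load the same table

    your_copy["price"] = 120   # your change
    my_copy["qty"] = 7         # my (different) change

    source = dict(your_copy)   # you save first
    source = dict(my_copy)     # I save last and replace your version wholesale
    return source


final = run_sessions({"price": 100, "qty": 5})
print(final)  # your price change is gone
```

Neither session did anything wrong on its own. The loss comes purely from the ordering of two uncoordinated writes to the same source.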

The solution

Normally if you have a table in CAS for multiple people to see, it should be promoted to the global caslib. Updates can be made and the other consumers will see those as well. But in the scenario I'm describing here, we've got multiple users attempting to update the table at its source.

You should protect your PATH and DNFS sources of unencrypted SASHDAT so that each has only a single writer and, for best results, keep them read-only for general use. But if you cannot for some reason, then direct CAS to always rely on its cache as the backing store for unencrypted SASHDAT from PATH or DNFS caslibs.

Let's return to our scenario where you and I both have CAS sessions which loaded the same table from source into memory and this time, we've directed CAS to use its cache (not memory mapping to source).

This now means that we both have our own instance of the table in memory. Your instance is backed to the cache with its own memory maps. And so is mine. When you make changes to your in-memory table, it has no effect on mine. That's great! Better still, it eliminates that nasty hanging of CAS sessions.
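Here's a rough Python analogy for that isolation. Each "session" copies the source into its own cache file and memory-maps the copy. The file names, the `.cache` suffix, and the `load_via_cache` helper are all assumptions made for this sketch and say nothing about how CAS actually lays out CAS_DISK_CACHE.

```python
import mmap
import os
import shutil
import tempfile


def load_via_cache(source_path, cache_dir, session_name):
    """Copy the source into a per-session cache file and memory-map that copy."""
    cache_path = os.path.join(cache_dir, session_name + ".cache")
    shutil.copyfile(source_path, cache_path)
    f = open(cache_path, "r+b")
    return f, mmap.mmap(f.fileno(), 0)


with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "table.dat")
    with open(source_path, "wb") as src:
        src.write(b"AAAA")

    your_file, your_table = load_via_cache(source_path, workdir, "you")
    my_file, my_table = load_via_cache(source_path, workdir, "me")

    your_table[0:4] = b"BBBB"           # you edit your in-memory instance
    yours_after = bytes(your_table[0:4])
    mine_after = bytes(my_table[0:4])   # my instance is unaffected

    with open(source_path, "rb") as src:
        source_after = src.read()       # and the source is untouched too

    # Close the maps and files before the temporary directory is removed.
    for mapped, handle in ((your_table, your_file), (my_table, my_file)):
        mapped.close()
        handle.close()

print(yours_after, mine_after, source_after)
```

The trade-off is visible even in this toy: every session pays for its own copy on disk in exchange for not stepping on anyone else's memory maps.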

But to be clear, we haven't really eliminated the race condition if we both intend to save our final changes back to the same source. To do that, we must coordinate with each other and employ other common data management strategies to be good stewards of the data and system.

Make it happen

Specific to loading SASHDAT data using either PATH or DNFS type of caslibs, we are able to override CAS' default behavior with its cache through the use of three parameters.

Those parameters rely on two values in particular:

INPLACEPREFERRED:
Directs the CAS server to follow its default behavior as described by Rule № 2 above.
CASDISKCACHE:
Directs the CAS server to always rely on the CAS_DISK_CACHE as the backing store for in-memory data loaded from unencrypted SASHDAT files using either PATH or DNFS type of caslibs.

The three parameters where these values are used form a hierarchy, and a setting at a more specific level overrides the one above it: entire CAS server > per CAS library > each CAS table.

First

To globally set the CAS server's behavior, modify the casconfig_usermods.lua file on the CAS Controller host in directory /opt/sas/viya/config/etc/cas/default:

env.CAS_LOADTABLEBACKINGSTORE="INPLACEPREFERRED" | "CASDISKCACHE"

Specifies the backing store to use for all SASHDAT tables in PATH and DNFS caslibs on this CAS server.

If not specified, then the default is INPLACEPREFERRED.

For reference, see Controlling Use of CAS_DISK_CACHE in the SAS® Viya® 3.5: System Programming Guide.

Second

When adding a caslib with a srcType of either PATH or DNFS, provide this source-specific option:

loadTableBackingStore="CASDISKCACHE" | "INHERIT" | "INPLACEPREFERRED"

Specifies the backing store to use for all SASHDAT tables in this caslib.

If not specified, then the default is INHERIT which means this caslib will follow the global setting indicated by env.CAS_LOADTABLEBACKINGSTORE above.

Here's an example:

caslib mydata sessref=mysess path="/path/to/data"
       datasource=(srctype="DNFS", loadTableBackingStore="CASDISKCACHE");

For reference, see addCaslib Action in the SAS® Viya® 3.5: System Programming Guide.

Third

When loading an unencrypted SASHDAT table in a PATH or DNFS type of caslib, provide this importOption:

backingStore="CASDISKCACHE" | "INHERIT" | "INPLACEPREFERRED"

Specifies the backing store to use for this SASHDAT table.

If not specified, then the default is INHERIT which means this table load will follow the caslib setting indicated by the loadTableBackingStore caslib option above.

Here's an example:

proc casutil;
  /* Loads from the active caslib; backingStore applies to this table load only */
  load casdata="ImportantData.sashdat" 
       importOptions=(filetype="HDAT", backingStore="CASDISKCACHE")
       casout="ImportantData" ;
run;
quit;

For reference, see Common Parameter: importOptions in the SAS® Viya® 3.5: System Programming Guide.
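To see how the three levels interact, here's a small Python model of the precedence described above. It's a sketch of the documented rules as I've laid them out in this post, not CAS code: the function name `resolve_backing_store` and its argument names are hypothetical.

```python
CONCRETE = ("INPLACEPREFERRED", "CASDISKCACHE")


def resolve_backing_store(server=None, caslib="INHERIT", table="INHERIT"):
    """Return the backing store that ends up in effect for one table load.

    server: value of env.CAS_LOADTABLEBACKINGSTORE, or None if not set
    caslib: value of the loadTableBackingStore caslib option
    table:  value of the backingStore importOption
    """
    if server is not None and server not in CONCRETE:
        raise ValueError("server must be one of %s" % (CONCRETE,))
    for name, value in (("caslib", caslib), ("table", table)):
        if value != "INHERIT" and value not in CONCRETE:
            raise ValueError("%s must be INHERIT or one of %s" % (name, CONCRETE))

    # The server level has no INHERIT; unset means INPLACEPREFERRED.
    effective = server if server is not None else "INPLACEPREFERRED"
    # Each more specific level overrides the one above unless it says INHERIT.
    for value in (caslib, table):
        if value != "INHERIT":
            effective = value
    return effective


nothing_set = resolve_backing_store()
server_only = resolve_backing_store(server="CASDISKCACHE")
caslib_wins = resolve_backing_store(server="INPLACEPREFERRED", caslib="CASDISKCACHE")
table_wins = resolve_backing_store(server="CASDISKCACHE", table="INPLACEPREFERRED")
print(nothing_set, server_only, caslib_wins, table_wins)
```

So a single table load can opt back in to in-place mapping even when the whole server is set to CASDISKCACHE, and the other way around.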

Other considerations

There are a few other considerations to deal with.

SASHDAT exceptions

There are a couple of exceptions where the value of the backingStore parameter has no effect when a SASHDAT file is loaded:

Encrypted SASHDAT: Decryption must take place in memory, and so the unencrypted data is backed to the CAS cache.
Filtered data with a where parameter: The subset results of the where processing after the load are placed in the CAS cache.

HDFS

When the Apache Hadoop Distributed File System is installed symmetrically alongside CAS, then unencrypted SASHDAT is locally accessible and so CAS will memory-map to the source blocks directly. It will not use its cache.

As a matter of fact, the backingStore option is silently ignored (i.e. no syntax errors in the log) when it is specified for any caslib type other than PATH or DNFS.
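Pulling the exceptions so far into one place, here's a tiny Python checklist for whether the backing store setting can make any difference to a given load. The function and parameter names are invented for this sketch, and it only encodes the cases called out in this post.

```python
def backing_store_setting_applies(src_type, encrypted=False, has_where=False):
    """True if the backing store setting can change where a SASHDAT load is backed."""
    if src_type.upper() not in ("PATH", "DNFS"):
        return False   # silently ignored for every other caslib type
    if encrypted:
        return False   # decrypted data is always backed to the CAS cache
    if has_where:
        return False   # filtered results are always placed in the CAS cache
    return True


plain_dnfs = backing_store_setting_applies("DNFS")
encrypted_path = backing_store_setting_applies("PATH", encrypted=True)
filtered_path = backing_store_setting_applies("path", has_where=True)
hdfs = backing_store_setting_applies("HDFS")
print(plain_dnfs, encrypted_path, filtered_path, hdfs)
```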

Block copies

When CAS memory-maps to SASHDAT directly at the source, then COPIES=0 (i.e. no replicated blocks for failover are placed in the CAS cache). However, when you specify the CAS cache as the backing store, then the usual default for external data sources, COPIES=1, goes into effect. This means CAS maps in-memory data to the cache *and* also places duplicate copies of the blocks in cache to protect against the unexpected loss of a worker.
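For capacity planning, that difference is worth a back-of-the-envelope number. The Python below is a rough estimate under my own simplifying assumptions: that in-place mapping really applies (unencrypted, unfiltered SASHDAT), and that the space used across all of the workers' caches is roughly the table size times one plus the number of copies. The function names are hypothetical, and real usage will vary.

```python
def default_copies(backing_store):
    """Replicated block copies implied by the effective backing store."""
    if backing_store == "INPLACEPREFERRED":
        return 0   # mapped to the SASHDAT source; nothing replicated in cache
    if backing_store == "CASDISKCACHE":
        return 1   # usual default when the CAS cache is the backing store
    raise ValueError("unknown backing store: %r" % (backing_store,))


def estimated_cache_gb(table_gb, backing_store):
    """Rough total cache space across workers for one loaded table (assumption)."""
    if backing_store == "INPLACEPREFERRED":
        return 0.0   # assumes in-place mapping applies, so the cache is not used
    return float(table_gb * (1 + default_copies(backing_store)))


in_place_gb = estimated_cache_gb(50, "INPLACEPREFERRED")
cached_gb = estimated_cache_gb(50, "CASDISKCACHE")
print(in_place_gb, cached_gb)
```

In other words, flipping a 50 GB table from in-place to cache-backed could plausibly consume on the order of 100 GB of cache per loaded instance, so size CAS_DISK_CACHE with that in mind.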

Coda

Designing your system and processes for multi-user concurrent access to tables which change requires planning, effort, and communication. CAS offers several options for dealing with this in how it operates, the procedures for working with tables, and configuration parameters.

H/T

Thanks to Andy Bouts, Principal Pre-Sales Solutions Architect, for sharing his hard-won experience and insights with these use cases in the field.