SAS Cloud Analytic Services (CAS) offers lightning-fast analytics of in-memory data. However, relying exclusively on RAM for all data management and operations is often a short road with an abrupt end. CAS extends its capabilities by also taking advantage of the persistent storage options available. In particular, CAS can be configured to use one or more local disks as a caching space to help improve fault tolerance, data availability, and memory utilization. That disk space is referred to as CAS_DISK_CACHE.
CAS operates as a high-performance in-memory analytics engine.
In-memory is emphasized because that mode of operation provides the substantial performance value which is CAS’ trademark. Since CAS is designed to perform operations in memory, running out of memory is, like jail, a place we don't want to be. But CAS offers us a Get Out of Jail Free card with its CAS_DISK_CACHE.
One of the many new features CAS has over LASR is the ability to use a cache as a backing store for in-memory data. The CAS cache provides flexibility for in-memory operations, helping ensure that CAS can maintain fault tolerance, data availability, and efficient memory utilization even as tables grow beyond available RAM.
What is a backing store for CAS tables in memory?
The data in CAS tables in memory are organized as SASHDAT blocks. And CAS will memory-map every SASHDAT block in memory to a location on disk. That disk might be the CAS cache. Or not.
If the original SASHDAT blocks are loaded from a mappable location, then that original source will be the backing store (caslib srctypes corresponding to Path, DNFS, and symmetrically co-located HDFS). If instead the table comes from anywhere else (or from a non-SASHDAT source), then CAS will memory-map that table's SASHDAT blocks in memory to the location specified for CAS_DISK_CACHE. Regardless of source, any new in-memory blocks created as output from CAS operations are always mapped to the CAS cache.
CAS relies on the operating system to manage the movement of data to/from the CAS cache via the memory map. This means the OS determines if and when to write (that is, page out) the in-memory SASHDAT data to the cache on disk. Even if the data is on disk, it's possible - ideal, really - that CAS might not ever need to read it back (that is, page in) from the cache. So for a system with plenty of RAM, the SASHDAT blocks may always be available to CAS in memory until the tables themselves are explicitly dropped.
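The memory-map mechanism CAS leans on here is ordinary operating-system behavior, which we can sketch outside of CAS entirely. In this Python illustration (nothing CAS-specific - the temporary file simply stands in for a file in the cache), the program only ever touches memory, and the OS handles moving the page to and from the backing file:

```python
import mmap
import os
import tempfile

# Create a small file on disk - a stand-in for a file in CAS_DISK_CACHE.
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * mmap.PAGESIZE)

# Memory-map the file: from here on, the OS decides when the in-memory
# page is written out to (paged out) or read back from (paged in) the file.
with mmap.mmap(fd, mmap.PAGESIZE) as mm:
    mm[0:5] = b"BLOCK"   # modify the data purely in memory
    mm.flush()           # ask the OS to write the dirty page to disk

# The change is durable in the backing file - no explicit write() was needed.
with open(path, "rb") as f:
    recovered = f.read(5)
print(recovered)  # b'BLOCK'

os.close(fd)
os.remove(path)
```

Absent the explicit flush, whether and when the OS writes the dirty page out on its own is entirely its decision - which is exactly the behavior described above for SASHDAT blocks and the CAS cache.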
The cache does provide a useful service and yet, as a rule, we prefer CAS to work with in-memory data as much as possible. If data must be fetched (that is, paged in) from disk, then expect performance to degrade by orders of magnitude for that operation. The CAS cache is therefore helpful when nominal RAM resources have been exhausted, but it is not performance-equivalent to RAM.
The location for the CAS cache (env.CAS_DISK_CACHE) can be specified in several places:
| When specified | Effect |
| --- | --- |
| At install | The default used by CAS |
| At startup on the CAS Controller | The default used by CAS; the Controller notifies the Workers |
| At startup on the CAS Controller | Overrides the default; the Controller notifies the Workers |
| At startup on each CAS host | Overrides the value from the Controller, per worker |
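In practice, each of those specifications boils down to a single setting. As a sketch, a startup override looks like the following (the configuration file it belongs in and the cache path shown are illustrative assumptions for a given deployment):

```lua
-- In a CAS configuration file read at startup
-- (file name and cache path are illustrative assumptions)
env.CAS_DISK_CACHE = '/cas/cache'
```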
So, in a way, the CAS cache might be thought of as similar to an escape tunnel. It's there if we need it, but hopefully we won't. If we're digging a tunnel, but don't really plan to use it much, then we probably will spend as little time and effort on that tunnel's construction as we can get away with.
CAS is a very flexible product and can be used in many ways. Often, a site will use CAS in ways not originally envisioned at the time of purchase. We want to ensure that kind of flexibility of use is maintained. Also, as a high-performance product, many assumptions about CAS operations are based on the idea that hardware has been dedicated primarily, even exclusively, to CAS. If CAS is sharing its host(s) with other enterprise products, then we may need to adjust our approach.
So let's look at some tunnel - I mean, CAS_DISK_CACHE considerations.
There are some places on the CAS host which we can use as the CAS cache without any modifications at all. With minimal effort, we can get things working, but it's possible that we won't like it for long.
Out of the box, CAS configuration defaults to using /tmp as the location of CAS_DISK_CACHE. While /tmp is practically guaranteed to exist on most systems, it is not an ideal location to use for critical software operations. It might be sized too small (as little as 1 GB). And if /tmp fills up, then the operating system won't be able to run correctly, affecting everything else on the host. Furthermore, even though the files in the CAS cache are indeed temporary, we want to ensure CAS itself makes the determination as to when they can be released.

Using /tmp as a short-term solution for working with small volumes of data and limited users is okay. But as a rule, we do not recommend using /tmp – instead, we should specify a different location.
The /dev/shm location is a special construct in Linux referring to shared memory: it’s not a physical location on disk. Instead, it provides a mechanism for programs to share data with each other through the use of RAM. By specifying /dev/shm as the location for the CAS cache, the operating system will not need to perform any memory-mapped copying to slow disk.
➤ Notice that we need to be careful using /dev/shm for the CAS cache because if RAM is exhausted, then the OS will swap out to its paging file. If that occurs, then all of a sudden CAS will transition from running very fast to very, very slow - or worse: if CAS is running in a cgroup, then the largest RAM-consuming process will be killed.
With a little bit of effort and planning, we can improve the route to CAS_DISK_CACHE so it can function well in a wider range of scenarios.
"In a hole in the ground there lived a hobbit. Not a nasty, dirty, wet hole, filled with the ends of worms and an oozy smell, nor yet a dry, bare, sandy hole with nothing in it to sit down on or to eat: it was a hobbit-hole, and that means comfort."
-- Gratuitous reference to The Hobbit: There and Back Again
If the site has requirements where CAS must load, manage, and operate on data which combined is larger than the RAM available, then the CAS cache really should rely on physical disk. After all, this concept has been de rigueur for storage considerations in computer science for decades. With physical disk, we can ensure that if circumstances occur where available RAM is not sufficient, then CAS has a dedicated location with plenty of space to store inactive data.
Most of the time, users of enterprise Linux servers aren't concerned with physical disks themselves. They just know about directories. Directories are free. Create them wherever you can. And this approach can work for CAS_DISK_CACHE. But at some point, things get real. If CAS fills up the root or user file systems, causing unexpected problems on the box, then the Linux administrator might want to have a discussion about giving CAS dedicated disk resources for its cache.
➤ When provisioning physical disk for the CAS cache, implement a single file system per disk. We prefer XFS; ext4 is acceptable. Avoid ext3 and any kind of NFS-based storage, as the disk metadata and journaling may be overwhelmed by requests.
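For illustration, "a single file system per disk" might look like the following /etc/fstab entries (device names and mount points here are assumptions, not a prescription):

```
# /etc/fstab (illustrative): one XFS file system per physical cache disk
/dev/sdb1   /cas/cache1   xfs   defaults,noatime   0 0
/dev/sdc1   /cas/cache2   xfs   defaults,noatime   0 0
```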
The acronym JBOD means Just a Bunch Of Disks. It's a reference to one or more disks directly attached to the host with a basic intermediary connection. It's almost (but not quite) as if we don't care how they're attached.
We typically want JBOD to be local-attached storage with enough disk space to match the size of RAM or more. Never less. Usually 1 - 3 disks are sufficient. When multiple disks are used, CAS has multiple I/O channels to access data and will create its files (with associated memory mappings) on those disks in round-robin order.
The challenges of working with individual disks are well known. Each is a single point of failure and throughput is constrained. And the more disks we add to the system, the more likely a disk failure becomes. If CAS loses a disk from its cache and that was the only place where that segment of data lived, then the associated table(s) are incomplete - meaning losing a small part of a table is effectively the same as losing the entire table. Those files in the CAS cache have a temporary lifespan, but CAS does need them.
➤ Most enterprise IT teams don't expect to work with JBOD disks in this manner due to these challenges. So if positioning this as an option for CAS cache with your customer, take time to discuss with them. Perhaps a RAID solution (below) is more acceptable.
Specified as a colon-delimited list of one or more full directory paths:
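For example, a three-disk JBOD cache might be specified like this (the mount-point paths are illustrative assumptions); CAS will then create its cache files across the listed paths in round-robin order:

```lua
-- Colon-delimited list: CAS round-robins cache files across these paths
env.CAS_DISK_CACHE = '/cas/cache1:/cas/cache2:/cas/cache3'
```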
Having an escape tunnel is a nice feature. Furnishing it with nice appointments when you expect to use it regularly is a good idea. But there are cases where having a motorcycle on rails can help speed things up, too.
A RAID is a Redundant Array of Independent Disks. And there are many RAID levels with different functionality. As is typical for SAS solutions, we're focused here on RAID 5 where we have a group of 3 or more disks which work together as a single logical disk. Two-thirds of the space is for data and one-third is for parity. (For a 9-disk pack, only one-ninth is used for parity.) If one disk breaks, the others can continue operations. This eliminates individual disks as single points of failure and also provides some improvement to throughput as well.
➤ Enterprise IT teams usually like this RAID approach. It's a standard offering with racked systems; easy to understand and justify. And our needs for CAS cache here are relatively modest. A single RAID array for the CAS cache should be fine.
Avoid shared storage solutions for CAS cache
We want to keep the storage for the cache local to the CAS host - so do not cross the line over to shared storage solutions/appliances for CAS cache. Shared storage may appear convenient in situations where capacity is easily available, but the approach CAS uses with its cache negates some of the perceived benefits. Furthermore, some storage solutions, like IBM Spectrum Scale (a.k.a. GPFS), are not compatible with CAS_DISK_CACHE - it just won't work.
The CAS cache is a completely separate conversation from using DNFS. The DNFS technology used by CAS was explicitly designed for shared storage solutions and works great with them.
For more information about RAID and other disk provisioning considerations for CAS, see Tony Brown's paper from SAS Global Forum 2019 titled, Engineering CAS Performance: Hardware, Network, and Storage Considerations for CAS Servers.
Specified as a colon-delimited list of one or more full directory paths:
There are circumstances where CAS can be directed to operate in a way which (inadvertently) maximizes its reliance on the CAS cache. These situations need to be identified and recommendations made to either operate differently or tweak the configuration to reduce the impact.
The use of an in-line workflow to iteratively process scoring models, where the output from one run is appended with a new column and used as input to another run, may cause CAS to hit its cache excessively. So if CAS is relying on persistent storage for its cache, then expect a commensurate slowdown in performance since loading data from physical disk is much slower than from RAM.
➤ If the iterative in-line approach to running scoring models is necessary, then consider:

- Directing CAS to use /dev/shm for its cache
- Setting cas.MAXTABLEMEM=0 so that CAS won't cache very-temporary process files
- Specifying COPIES=0 in program code so CAS won't make any failover copies as its default behavior
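The first two of those items are configuration settings; the third is specified in the program itself as an option when the table is loaded. As a hedged sketch of the configuration side (values per your deployment):

```lua
-- Illustrative CAS configuration for the iterative scoring scenario
env.CAS_DISK_CACHE = '/dev/shm'   -- keep the cache in shared memory
cas.MAXTABLEMEM = 0               -- don't cache very-temporary process files
```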
Having an escape tunnel like CAS_DISK_CACHE is a fine idea and it should be built correctly. But don't lose sight of the fact that CAS itself is meant to act as a superhighway of performance. It is very flexible and extremely scalable. Remember that for CAS, we prefer to invest in scaling out horizontally with more CPU and RAM as problems get larger. So don't go too crazy with effort and resources trying to optimize every bit of performance out of CAS_DISK_CACHE. Often that effort can be better spent on CAS' primary mode of operation: in memory.
CAS cache offers very useful functionality to help in situations where primary performance resources (notably RAM) have been exhausted. Allowing a process to complete successfully, albeit slowly, is often preferable to outright failure. But slow is not the objective for CAS, which is designed to operate at maximum performance and efficiency. Understanding how CAS can be configured for performance and resilience within the spirit of its design objectives is important for proper use.
Before you can get serious about your decision on where to place the CAS_DISK_CACHE, you need to understand exactly what your processes will be asking CAS to do. With careful planning, a process can be optimized for CAS to reduce its reliance on the cache. On the other hand, some processes may require extensive use of the cache.
I recommend getting acquainted with the SAS® Cloud Analytic Services 3.4: Fundamentals document for more information about how CAS works with data in memory and on disk.
Just a quick shout out to Brian Bowman, Tony Brown, Gordon Keener, Steve Krueger, Barbara Walters, and many others for their work to identify, discuss, and explain the concepts here.