How SAS Watchdog restricts filesystem access for Open Source programs

SAS Watchdog is an optional component for the SAS Viya platform's Programming Run-time, which architects and administrators should know about. It has been documented in the READMEs and HTML doc that come with your SAS Viya platform deployment assets for a long time now, but in the Stable 2023.01 release of the SAS Viya platform, that doc has been improved and SAS Watchdog is covered in the Help Center documentation too.

If you enable SAS Watchdog, it helps keep customer environments secure for users who run their own open source code written in Python, R or Java from within a SAS Compute, Connect or Batch programming session.

In this post, I want to tell you what SAS Watchdog is, how it works, and why you would choose to enable it (or not). This is partly just because it's really clever, but also because SAS Watchdog is an essential tool that can keep your customers' data safe when they run code written in other programming languages from within a SAS program.

Most importantly, sas-watchdog restricts access to the filesystem using the same allowlist that SAS's LOCKDOWN feature uses. In a future post, we'll look at how to manage both the SAS Programming Run-time allowlist that is relevant here and the CAS allowlist, which is a separate list that implements a similar concept for SAS CAS Language processing.

Why you need a way to restrict filesystem access from the SAS Programming Run-time

SAS Watchdog is needed because of the way two features of the SAS programming language need to work together: LOCKDOWN, and SAS's ability to run user-written code in other languages.

LOCKDOWN prevents unauthorized access to the filesystem from the SAS programming language

The LOCKDOWN feature of the SAS programming language prevents native SAS code running in a compute, connect or batch programming session from accessing filesystem paths which should be off-limits. LOCKDOWN also limits access to certain language features and access methods, but in this post we are more concerned with its function of limiting filesystem access. After initialization, any attempt to access the filesystem from any SAS language feature - a procedure, data step or path-based library - will be checked against an allowlist of filesystem paths by the SAS programming run-time.

If the directory or file is in or below a path in the allowlist, access is allowed. If it is not, and access is attempted after the programming session enters its locked-down state, the access is prevented.
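The "in or below a path in the allowlist" rule can be illustrated with a minimal Python sketch. This is not how the SAS programming run-time implements the check; the function name `is_allowed` and the example allowlist entries are hypothetical, and the sketch only normalizes paths textually (it does not resolve symbolic links).

```python
import posixpath
from pathlib import PurePosixPath

def is_allowed(requested, allowlist):
    """Return True if `requested` is an allowlisted path or lies below one."""
    # Collapse "." and ".." first so "/data/shared/../x" cannot slip through.
    target = PurePosixPath(posixpath.normpath(requested))
    for entry in allowlist:
        root = PurePosixPath(posixpath.normpath(entry))
        # Allowed if it is the entry itself, or the entry is one of its ancestors.
        if target == root or root in target.parents:
            return True
    return False

# Hypothetical allowlist entries, for illustration only.
ALLOWLIST = ["/data/shared", "/opt/sas/viya/home"]

print(is_allowed("/data/shared/sales/q1.csv", ALLOWLIST))
print(is_allowed("/data/shared_private/q1.csv", ALLOWLIST))
```

Comparing whole path components, rather than raw string prefixes, is what keeps a sibling directory such as /data/shared_private from matching an allowlist entry of /data/shared.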

There is an exception to this: autoexec code which is managed by SAS administrators, and is processed during the compute (or connect or batch) server/process initialization can define file references and path-based library references to paths outside the allowlist. These references have to be set up during initialization, before the SAS programming session enters its locked-down state. By the time end-user-written code can begin to run, the SAS programming session has already entered a locked-down state and no new file or library references to paths outside the allowlist are allowed.

The SAS Programming Run-time supports calling other programming languages

One of the most exciting features of the SAS programming run-time is that it supports running user-written code in other languages: Python, R and Java.

@ScottMcCauley wrote two great posts recently: Configure Python Integration with SAS Viya and Configuring R Integration with SAS Viya. Together, the two posts (and the documentation they link to) show how the SAS Configurator for Open Source and other steps can be used to enable SAS programs to run Python (e.g. via proc python;), and R (e.g. via proc IML;). SAS has also long supported running Java code via the data step Java Object.

[Figure: Code in other languages can be run from SAS programming run-time sessions]

When code in another language is executed, the compute (or connect or batch) server uses another SAS process called sasels to launch processes running the respective programming language's own run-time binaries as the user who started the SAS programming session. It passes the code to be executed (or a filepath pointing to a file containing the code) to the respective programming language's executable.

Of course, the Python, R and Java languages are all fully-featured. They also have features which allow them to access the filesystem. And obviously those language runtimes do not have the SAS programming run-time's LOCKDOWN feature. Plus, the top-level processes called when another language's code is executed are also capable of launching additional processes themselves.

So, what prevents a user from running a proc python; statement that executes Python code to read a file outside the SAS Programming Run-time's allowlist? The same question applies to R and Java code.
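To make the question concrete, here is a sketch of the kind of Python a user could submit. The path is hypothetical and the helper name `try_read` is mine. Nothing in the Python language itself consults the SAS allowlist: the open succeeds unless the operating system refuses it, in which case Python raises PermissionError.

```python
def try_read(path):
    """Attempt to open `path` for reading and report what happened."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)          # touch the file to prove we really got access
        return "allowed"
    except PermissionError:
        return "denied"             # the operating system refused the open
    except FileNotFoundError:
        return "missing"

# Hypothetical path outside the allowlist, for illustration only.
print(try_read("/some/path/outside/the/allowlist.txt"))
```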

Admittedly, because the launched process runs inside a Kubernetes pod, the only filesystem volumes it can reach are those mounted into the pod. The container runtime (Docker, containerd etc.) and Kubernetes together prevent a process from accessing the host file system, other pods, other hosts etc. But still, it is not desirable to give launched processes running arbitrary user-written code access to all filesystem paths mounted into the pod!

SAS Watchdog

SAS Watchdog's job - if it is enabled - is to prevent user-written code in one of those languages, started inside a compute (or connect or batch) programming session from exploring the compute (or connect or batch) pod's filesystem, and from accessing files mounted to it which they have no business accessing.

Once you enable it, SAS Watchdog runs as a process inside its own separate sidecar container, in each SAS Programming Run-time pod. In Kubernetes a 'sidecar container' is an additional container in a pod (beside the main container), which stays running after a pod's init containers have finished their initialization.

However, SAS Watchdog is disabled by default.

How SAS Watchdog works

[Figure: SAS Watchdog overview]

Follow the numbered annotations in the diagram above and the list below to see how SAS Watchdog works.

1. If it is enabled, SAS Watchdog runs with elevated privileges that allow it to register itself with a LINUX kernel API called fanotify, part of the operating system running inside the pod. Once SAS Watchdog has registered itself with fanotify, any request to access the filesystem - such as a file open request - coming from any process launched by the sasels process will trigger an 'open event' notification which will be sent to SAS Watchdog.
2. When a process launched by sasels - including any Python, R or Java processes, and also any sub-processes they spawn - requests access to the filesystem, fanotify sees that a file open request has been made by a watched process. The LINUX kernel does not grant or deny the request yet.
3. Fanotify sends an event to the SAS Watchdog process, and waits for it to respond.
4. The SAS Watchdog process consults the list of filesystem paths in the allowlist, and determines whether or not the requested file is in or below a path on the allowlist.
5. If the file or directory requested is in the allowlist, SAS Watchdog responds to the open event to indicate that it allows the request. If not, it responds saying it denies the request.
6. If SAS Watchdog allowed the request, the LINUX kernel grants the file open request and the calling process gets access to the file or directory. If SAS Watchdog denies the request, the kernel denies that request to access the file.
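The hold, ask, decide and respond sequence in the steps above can be modelled with a small Python simulation. It makes no real fanotify calls and is not SAS Watchdog's implementation; `OpenEvent`, `decide` and `kernel_open` are hypothetical names used only to show the order in which things happen.

```python
import posixpath
from dataclasses import dataclass

ALLOW, DENY = "allow", "deny"

@dataclass
class OpenEvent:
    pid: int     # the watched process that asked to open something
    path: str    # the file or directory it asked for

def under_allowlist(path, allowlist):
    """True if `path` is an allowlist entry or lies below one."""
    path = posixpath.normpath(path)
    for entry in allowlist:
        entry = posixpath.normpath(entry)
        if path == entry or path.startswith(entry.rstrip("/") + "/"):
            return True
    return False

def decide(event, allowlist):
    """Steps 4 and 5: the watcher consults the allowlist and picks a response."""
    return ALLOW if under_allowlist(event.path, allowlist) else DENY

def kernel_open(event, allowlist):
    """Steps 2, 3 and 6: hold the request, ask the watcher, act on its answer."""
    response = decide(event, allowlist)   # the request waits here for the answer
    if response == DENY:
        raise PermissionError(event.path)
    return f"opened {event.path}"

# Hypothetical allowlist and process ID, for illustration only.
allowlist = ["/data/shared"]
print(kernel_open(OpenEvent(pid=4242, path="/data/shared/report.csv"), allowlist))
```

The point of the design is that the requesting process never sees the file until the watcher has answered, so a denied request fails at open time rather than being detected afterwards.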

The allowlist that watchdog uses is the exact same one that LOCKDOWN uses. Inside the pod, it's read from the same file by both processes. So if you change the list of paths in the LOCKDOWN paths list, there is nothing else you need to do to have your change affect the SAS Watchdog process.

How can you enable SAS Watchdog

SAS Watchdog is disabled by default, but can be enabled in your SAS Viya platform deployment in the same way as so many other SAS Viya platform features: with an overlay that you copy to your site-config directory and reference from your kustomization.yaml. Along with many other overlays, it is then used by kustomize to modify your site.yaml file before site.yaml is applied to your Kubernetes cluster, following whichever SAS Viya platform deployment method you use at your site.

Specifically, read the instructions in:

the README file at $deploy/sas-bases/overlays/sas-programming-environment/watchdog/README.md (for Markdown format) or
the web page at $deploy/sas-bases/docs/configuring_sas_compute_server_to_use_sas_watchdog.htm (for HTML format)

Using the instructions that are shipped with the deployment assets in your specific release of the SAS Viya platform ensures you have the right instructions for that specific release. But the general gist of those instructions is that you add a reference to the sas-programming-environment/watchdog overlay to the transformers block of the base kustomization.yaml file ($deploy/kustomization.yaml), like this:

...
transformers:
...
- sas-bases/overlays/sas-programming-environment/watchdog
- sas-bases/overlays/required/transformers.yaml
...

NOTE: The reference to the sas-programming-environment/watchdog overlay MUST come before the required transformers.yaml, as seen in the example above.
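If you want to sanity-check that ordering before you build site.yaml, a small script can help. The sketch below is a hypothetical helper, not a SAS-supplied tool. It assumes you have already read the transformers entries from kustomization.yaml into a Python list (for example with a YAML parser of your choice), and it only compares the positions of the two entries.

```python
WATCHDOG = "sas-bases/overlays/sas-programming-environment/watchdog"
REQUIRED = "sas-bases/overlays/required/transformers.yaml"

def watchdog_order_ok(transformers):
    """Return True if the watchdog overlay is listed before the required transformers."""
    if WATCHDOG not in transformers or REQUIRED not in transformers:
        return False                      # one of the two entries is missing
    return transformers.index(WATCHDOG) < transformers.index(REQUIRED)

# Example list in the same order as the snippet above (other entries omitted).
example = [WATCHDOG, REQUIRED]
print(watchdog_order_ok(example))
```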

If you are running the SAS Viya platform on an OpenShift environment, there are additional steps in the instructions for applying a Security Context Constraint that is required for SAS Watchdog to work.

Check your release's instructions, at the paths given above, to be sure you are following the right process for your release. These instructions were updated and refreshed for SAS Viya platform Stable cadence release 2023.01.

And take a look at the new documentation for SAS Watchdog in the SAS Viya Platform Administration Guide.

Why might you choose not to enable SAS Watchdog?

The SAS Watchdog process in its sidecar container has to run with elevated privileges in order to register itself with the fanotify API. Some Kubernetes administrators, or their parent organization, may not approve of this. Discuss this with your architect and customer during the architecture phase of an implementation project.

If you disable access to the JAVA Data Step Object using LOCKDOWN, and do not enable Python and R code to be run from within SAS Programming Run-time servers (SAS Compute, SAS/CONNECT and SAS Batch), there is no need to enable SAS Watchdog, and doing so would consume a little bit of system resource for no additional benefit.

Besides those two reasons, I cannot immediately think of any reason why you would not enable SAS Watchdog. But feel free to contact me or add a comment below if you can! See you next time!

Find more articles from SAS Global Enablement and Learning here.