This article explains the principles of file system security for teams, projects or departments and their data on a shared file system used with SAS Viya 4. It explains how to segregate read and write authorization to data within a team and beyond. It builds on classic Unix (POSIX) file security mechanisms and on how Viya treats users and groups. It can be applied to POSIX-compliant file systems such as NFS (incl. Azure NetApp Files), Lustre, Ceph and GPFS. It also explains a practical approach to managing file system permissions with the SAS Viya CLI that does not require Kubernetes API permissions.
When working with SAS as an analyst in a team it is a common need to save prepared data so your teammates can use it for reporting, ad-hoc analysis or model training. For a typical SAS programmer, it is easier to let SAS auto-create output tables as files (.sas7bdat, parquet etc.), instead of writing explicit SQL DDL statements to create and define output tables in an external database. For this reason, new customers on Viya, and existing SAS customers migrating to Viya, often require a shared filesystem with a secure folder structure for reading and writing prepared data.
Requirements met in this example:
1) Teams should have a folder structure where only the team’s own members can read and write data and create subdirectories.
2) When a team member creates new directories and files, they must inherit the permissions of their parent directory.
3) Some users need read-only access to specific other teams’ data.
4) Some users are members of multiple teams.
This example has two teams: Sales and Finance. An administrator has created a batch user account “dataadmin” that is needed when creating new team folders.
OS Directory | Owner | Group | Permission Attributes (owner group other)* | Notes |
/viya-share/sasdata | dataadmin | sasdata | rwx r-x --- | Every new team should get a top folder and a data subfolder like those shown for sales and finance here. |
/viya-share/sasdata/sales | dataadmin | sales-reader | rwx r-x --- | Only sales-reader can access this folder and below. |
/viya-share/sasdata/sales/data | dataadmin | sales-readwriter | rwx rws r-x | Read-only access is granted to sales-reader through the read-only permission for “other”, combined with the sales-reader group on the folder above. When sales-readwriter members create subdirectories in SAS Studio or another tool, they inherit the sales-readwriter group because of the “s” permission attribute on group*. |
/viya-share/sasdata/finance | dataadmin | finance-reader | rwx r-x --- | |
/viya-share/sasdata/finance/data | dataadmin | finance-readwriter | rwx rws r-x | |
* Attribute abbreviations: d=directory, r=read, w=write, x=execute file or list directory, s=automatically set group id on child files and folders from parent directory. s also implies x (execute file or list directory).
User | Group Memberships |
anita | sasdata, sales-reader, sales-readwriter |
hugh | sasdata, finance-reader, finance-readwriter, sales-reader |
hans | sasdata, finance-reader, finance-readwriter, sales-reader, sales-readwriter |
dataadmin | sasdata, finance-reader, finance-readwriter, sales-reader, sales-readwriter (all groups used on the shared filesystem)* |
* dataadmin can be a service user or batch user that is only used for managing file system security.
Linux permission attributes (such as rwx) translate to a four-digit octal number. For example, 0777 indicates rwx rwx rwx. A Linux process’s current umask removes (masks) permissions when it creates new files and directories. The basis is 0777 for new directories and, for most programs, 0666 for new files, because execute permission is normally not requested for a new file. Umask 0002 removes write permission for “other” by clearing the write bit (value 2) in the last octal digit. New directories therefore get 0775 (rwx rwx r-x) and new files get 0664 (rw- rw- r--).
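The masking arithmetic can be checked directly in a shell. This is a minimal sketch that assumes the usual base modes of 0777 for directories and 0666 for regular files; the function name apply_umask is made up for illustration.

```shell
#!/usr/bin/env bash
# apply_umask BASE MASK: print BASE with the MASK bits cleared, as 4-digit octal.
# Both arguments are octal strings with a leading zero, e.g. 0777 and 0002.
apply_umask() {
  printf '%04o\n' "$(( $1 & ~$2 ))"
}

apply_umask 0777 0002   # new directory: prints 0775 (rwx rwx r-x)
apply_umask 0666 0002   # new file:      prints 0664 (rw- rw- r--)
apply_umask 0777 0027   # stricter mask: prints 0750 (rwx r-x ---)
```

The leading zero matters: bash arithmetic treats such numbers as octal, which is what makes the bitwise AND line up with the permission digits.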
For an explanation of Linux permissions and numbers, refer to Unix / Linux – File Permission / Access Modes.
You might wonder why I use rws instead of rwx for group permissions in the example above. This refers to the “set group id” feature of POSIX filesystems.
The “s” in place of x means that new files and subdirectories inherit the group of the directory. This is a key feature when a group of users works in the same directory, because it lets users read and write files that other group members have produced. If “set group id” is not used, new files and subdirectories instead get the primary group of the user who creates them. The primary group for Active Directory users is by default just “domain users”, meaning all authenticated users. To avoid giving all authenticated users access to new files, it is therefore important to use the set group id feature.
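The inheritance can be observed in a throwaway directory. The sketch below assumes a Linux system, where a subdirectory created inside a set-group-id directory also gets the bit; the scratch path and file name are placeholders, not part of the real folder structure. In a scratch directory your primary group is usually already the directory's group, so the visible effect is the “s” bit appearing on the new subdirectory; on the shared file system the group name itself would differ from your primary group.

```shell
#!/usr/bin/env bash
# Demonstrate set-group-id inheritance in a scratch directory (Linux behaviour).
scratch=$(mktemp -d)                      # stand-in for a team folder
mkdir "$scratch/data"
chmod u=rwx,g=rwxs,o=rx "$scratch/data"   # rwx rws r-x

# Create a subdirectory and a file as a team member would, with umask 0002.
( umask 0002; mkdir "$scratch/data/sub"; : > "$scratch/data/table.sas7bdat" )

# Show numeric owner/group and mode for the three entries.
ls -ldn "$scratch/data" "$scratch/data/sub" "$scratch/data/table.sas7bdat"

# Record the results before cleaning up.
dir_gid=$(ls -ldn "$scratch/data" | awk '{print $4}')
file_gid=$(ls -ln "$scratch/data/table.sas7bdat" | awk '{print $4}')
sub_setgid=no
if [ -g "$scratch/data/sub" ]; then sub_setgid=yes; fi
echo "directory gid=$dir_gid file gid=$file_gid subdirectory setgid=$sub_setgid"
rm -rf "$scratch"
```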
In octal notation, this bit is set with the number 2 in front. This means that the security combination rwx, rws, rwx translates to 2777, instead of just 0777.
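To translate a symbolic owner/group/other string into the octal form that chmod accepts, a small helper can do the arithmetic. This is an illustrative sketch only: sym_to_octal is a made-up name, and it handles just the set-group-id “s” used in this article, not setuid, the sticky bit or a capital “S”.

```shell
#!/usr/bin/env bash
# sym_to_octal "rwx rws r-x": print the four-digit octal mode for a symbolic
# owner/group/other string. Only "s" in the group triplet is treated as a
# special bit; other special bits are outside the scope of this sketch.
sym_to_octal() {
  local sym=${1// /}        # remove spaces: "rwx rws r-x" -> "rwxrwsr-x"
  local special=0 digits="" i trip val
  for i in 0 3 6; do
    trip=${sym:i:3}
    val=0
    if [ "${trip:0:1}" = "r" ]; then val=$(( val + 4 )); fi
    if [ "${trip:1:1}" = "w" ]; then val=$(( val + 2 )); fi
    case "${trip:2:1}" in
      x) val=$(( val + 1 )) ;;
      s) val=$(( val + 1 ))          # "s" implies execute as well
         if [ "$i" -eq 3 ]; then special=2; fi ;;
    esac
    digits="$digits$val"
  done
  printf '%s%s\n' "$special" "$digits"
}

sym_to_octal "rwx r-x ---"   # prints 0750
sym_to_octal "rwx rws r-x"   # prints 2775
sym_to_octal "rwx rws rwx"   # prints 2777
```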
Filesystem permissions take effect across most Viya analytical engines including:
1) SAS Compute (incl. Enhanced Compute Engine and DuckDB)
2) CAS*
3) Python
4) R
This means users can switch freely between engines, or one user can use Python while another uses SAS Compute, and authorization to data files is applied transparently.
* By default, CAS accesses files as a fixed user, “sas”, instead of as the end user. For SAS environments where both CAS and other engines are used in the same team, I recommend configuring CAS to access files as the end user with the CASALLHOSTACCOUNTS setting. CAS can then read and write files in a common folder structure like the one shown above.
POSIX attributes such as the user id number (UID), the group id number (GID) and secondary GIDs are vital elements of the above security model, because Unix stores user and group information only as numbers in the file system’s internal metadata. Not all types of external identity providers can supply POSIX attributes to Viya. For example, a SCIM-based identity provider cannot supply UID and GID numbers.
By default, SAS Viya Platform 2023.04 and later releases provide a generated user id (UID) and allow a SAS Administrator to provide group id (GID) values through the REST API or with the Viya CLI. Viya can also be configured to generate GID values. For details, see:
SAS Viya and POSIX attributes (UID and GID).
Figure 1: Editing one of the rules to grant dataadmin permissions to create batch jobs.
Figure 2: Obtaining GID number for sasdata group.
sas-viya -k --output json identities show-group --id sasdata --show-advanced
2. We will use the Viya CLI batch plug-in to start a SAS Compute pod and submit Linux commands that create directories and set permissions. This does not require SAS Administrator membership, just read/write access to the shared file system directory that you want to use as the starting point for the folder structure (e.g. /viya-share/).
sas-viya batch jobs submit-cmd --context default-cmd --cmd "df" --wait --watch-output
192.168.2.4:/export/sas-viya/data 395191296 38588416 336454656 11% /viya-share/
./sas-viya -k batch jobs submit-cmd --context default-cmd --cmd "mkdir /viya-share/sasdata" --wait --watch-output
./sas-viya -k batch jobs submit-cmd --context default-cmd --cmd "chgrp 1676916821 /viya-share/sasdata" --wait --watch-output
./sas-viya -k batch jobs submit-cmd --context default-cmd --cmd "chmod 2750 /viya-share/sasdata" --wait --watch-output
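The same pattern can be repeated for each team folder. The sketch below only prints the CLI calls so they can be reviewed before anything is run; team_folder_cmds is a hypothetical helper, the two GIDs in the usage line are made-up placeholders, and the modes 0750 and 2775 are my reading of the sales rows in the permission table. It assumes the submitting user (for example dataadmin) is a member of both groups, since a non-root user can only chgrp to a group they belong to.

```shell
#!/usr/bin/env bash
# Print (not run) the CLI calls that would create one team's folders.
# Arguments: root directory, team name, reader GID, readwriter GID.
team_folder_cmds() {
  local root=$1 team=$2 reader_gid=$3 readwriter_gid=$4
  local dir="$root/$team"
  local cmd
  for cmd in \
    "mkdir -p $dir/data" \
    "chgrp $reader_gid $dir" \
    "chmod 0750 $dir" \
    "chgrp $readwriter_gid $dir/data" \
    "chmod 2775 $dir/data"
  do
    printf './sas-viya -k batch jobs submit-cmd --context default-cmd --cmd "%s" --wait --watch-output\n' "$cmd"
  done
}

# Placeholder GIDs: look up the real values for your groups first.
team_folder_cmds /viya-share/sasdata sales 1000001 1000002
```

Printing instead of executing keeps the helper safe to experiment with; once the output looks right, the lines can be pasted into a terminal one at a time.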
The NFS protocol is often used for mounting shared file systems from Network Attached Storage, NFS servers and cloud storage such as Azure Files, Azure NetApp Files and AWS Elastic File System. By default, an NFS client, such as a SAS pod, sends the current user’s list of group memberships to the NFS server. However, without Kerberos, the NFS protocol allows the client to transmit a maximum of 16 groups. If a user is a member of more than 16 groups, only the user’s first 16 groups are transmitted, and the NFS server denies access to a directory if the group needed to access it is among those the client omitted.
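A quick way to see whether a user is near the limit is to count the groups reported inside the pod. This is a rough sketch: group_limit_status is a made-up helper, and `id -G` lists the primary group together with the secondary ones, so treat the count as a conservative indicator rather than an exact match for what the NFS client sends.

```shell
#!/usr/bin/env bash
# Limit described above for NFS without Kerberos.
max_groups=16

# group_limit_status COUNT: report whether COUNT groups fit within the limit.
group_limit_status() {
  if [ "$1" -gt "$max_groups" ]; then
    echo "over limit: $1 groups, only the first $max_groups are sent"
  else
    echo "ok: $1 groups"
  fi
}

# id -G lists every numeric group id of the current user.
count=$(( $(id -G | wc -w) ))
group_limit_status "$count"
```

To check the groups as seen by a SAS Compute pod, `id -G` can be submitted through the same `batch jobs submit-cmd` call used elsewhere in this article.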
At least four options exist for overcoming this limitation:
1) Design a simple security model with no more than 16 groups per user. Be aware that if Viya is configured to generate group IDs (GIDs), then both external groups and custom (internal) groups in Viya will have GIDs and count against the limit.
2) Integrate your NFS server or NFS cloud service with the identity provider that is also used by SAS Viya, and enable it to look up a user's groups itself. It can then obtain all groups for a user. I have seen examples of this with Linux NFS server, Hitachi NAS and Azure NetApp Files.
3) Combine the group on the directory, as shown above, with secondary group or user permissions set through an Access Control List (ACL) to reduce the total number of groups needed. Use ACLs sparingly, as the authorization model easily becomes complex. Note that not all NFS implementations support ACLs.
4) Use a more advanced file system without a 16-group limit, such as Lustre. Lustre also performs better than NFS with SAS workloads. Lustre can be bought as a managed file system with full Kubernetes support if your SAS environment runs on Amazon or Azure: Amazon FSx for Lustre and Azure Managed Lustre. I have not tested group limits for Ceph with Red Hat OpenShift Data Foundation using the Ceph protocol, but would be glad to hear any results if you have.