Today I would like to illustrate row-level authorization (or if you prefer, row-level security) in SAS Viya, with some examples. We'll use the fictional GELCorp organization from some of the SAS Global Enablement and Learning (GEL) team's Administration workshops, with a couple of small modifications. Our example uses two row-level grants for two different groups on the Select permission for the same CAS table. We'll see how they affect four users, variously members of one group, the other, both, and neither.
If you would rather watch and listen than read, my YouTube video, How to Apply Row-Level Security in SAS Viya, on the SAS Technical Insights & Expertise Series channel uses the same examples.
Many of our workshop examples feature two business groups, HR and Sales. We'll use those same two groups for our row-level grants.
We will therefore choose to use the fictional users Helena in HR and Sophia in Sales, but any members of HR and Sales would work equally well. For this illustration I wanted two more users: one in both HR and Sales, and one in neither.
If you know our GELCorp set-up from those workshops, you may remember that the administrators geladm and sasadm are both in both HR and Sales. That suited the workshops well, but for these examples,
If you want to 'try this at home' using our workshop image, please make similar changes.
Now we are ready for our demonstration: Please welcome... to the stage... Helena, Barney, Sophia and geladm!
Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.
As the diagram shows:
Here is a simplified representation of the HR_SUMMARY CAS table to which we are going to apply row-level authorization:
It is not a very big table in IT terms - the CSV source file is 539 KB on disk, but with 10 columns and 14,999 rows, it's more than big enough for this example. I've not shown details of the other rows or columns in the picture because they're not very important to this example. The only thing I'd add is that there isn't just one row for each combination of department and salary_level, as the table might imply: there are hundreds of rows for each combination, and thousands more rows for departments which are neither hr nor sales.
Here are the relevant permissions, or CAS access controls, set on our HR_SUMMARY table:
Authenticated users can neither see the table (ReadInfo is not authorized), nor the data in it (Select is not authorized).
HR and Sales can both see the table (ReadInfo is authorized), and each has a different row-level grant of the Select permission. We'll look at the detail of these in a moment.
The only permission which may raise an eyebrow is the grant of Select to SAS Administrators, which HR_SUMMARY inherits from its parent caslib hrdl.
In real customer projects, where data is real and can be sensitive or personal, it's a good idea to separate the responsibilities of SAS Administrators from those of users who work with data. You should consider not granting your SAS Administrators access to data. (The CAS Superuser role does not have access to see data by default). Instead, design and implement a separate data administrator role, perhaps one for the whole organization or one per line of business, to look after your data. In our workshop environment we do have a Data Administrators group (Delilah and Douglas are members). But we decided to leave SAS Administrators with full control of data in the hrdl (and salesdl) caslibs. That's something I would reconsider when we next review the GELCorp security model design.
The filter expression for HR is upcase(department) in ('SUB::SAS.IdentityGroups').
The filter expression for Sales is department = 'sales'.
When Helena accesses data in the HR_SUMMARY table, the CAS authorization system checks her group memberships. The only relevant ones are HR and Authenticated Users. (Splitting hairs a little, I would argue that Authenticated Users is a principal type, not a group, but for our purposes here it behaves much the same way as a group). CAS also looks at the access controls on the table.
It finds that the relevant effective permissions which apply to Helena are for Authenticated Users and HR:
The row-level filter expression for HR says that members of group ID='HR' should only be given access to rows where the uppercased value in the department column of the table is found in a comma-separated list of the current user's group memberships (i.e. in the string resulting from the SQL-like substitution value SUB::SAS.IdentityGroups). For Helena, the only relevant group in her list of group memberships is 'HR'; there are no rows in the table where the value in the department column matches any of the other groups Helena belongs to. So, she only sees the rows for 'department=hr' in HR_SUMMARY.
Here is an example report as seen by Helena:
Sophia's experience is similar to Helena's. When she accesses data in the HR_SUMMARY table, CAS notices her relevant group memberships, Sales and (split hairs aside) Authenticated Users. As before, it finds that the effective permissions which are relevant to Sophia are those for Authenticated Users and Sales, and again, only the ones from Sales have are important because there are not really any access controls for Authenticated Users, just an absence of them.
Sophia can see the table, and can select some rows in the table as determined by the filter expression for Sales which says that they should only have access to rows where department = 'sales'. Note that no string substitution is required in this filter expression. This is handy when the ID of the defining group for this row-level grant ('sales') is not the same as the values in the column you wish to filter on. In this case, there is a value of 'sales', so they are the same. But we could easily show members of the 'Managers' group data where department = 'management' despite the slight spelling difference, or let them see rows for several departments, by modifying the filter expression as desired.
Here is the same report, as seen by Sophia:
Here is where things start to get a bit more interesting.
Barney is in both HR and Sales, so both row-level grants of the Select permission apply to him. As explained in SAS Viya 3.4 Administration > Security > Authorization > CAS Authorization > Concepts > Multiple Fil..., with some caveats, 'If there is an identity precedence tie (the user is a member of multiple groups, each of which has a filter), the user can access any row that meets any of the filters.'. Put another way, a user to whom multiple row-level grants applies sees the union of rows granted by all of the filters.
Because Barney is in both HR and Sales, both row-level filter expressions apply to him when he tries to access data in HR_SUMMARY, and he sees rows for both 'hr' and 'sales':
Our last example user, geladm, is in neither HR nor Sales (at least, not since I took this user out of both groups while preparing these examples). Neither of the row-level grants have any effect on geladm, and the only relevant access controls are grants of ReadInfo and Select which SAS Administrators inherit on this table from the parent caslib, hrdl:
This means that geladm sees results which are based on all the rows in HR_SUMMARY:
It's good that administrators or data managers can restrict the rows in a CAS table that users can see, using row-level grants. And once a global caslib which points to a path on the filesystem is defined, the same CAS server (e.g. cas-shared-default) will not allow you to create another CAS library which uses the same filesystem path.
Because our example user Helena does not have host account access to the Viya server hosting SAS Studio and the SAS Compute Service in our collection, nor on the CAS servers, we switch to another user for this next step. Delilah is also a member of HR, and has an OS account which has HR as one of her secondary groups:
[root@intviya02 data]# id Delilah uid=4002(Delilah) gid=2003(sasusers) groups=2003(sasusers),3007(powerusers),3001(HR) [root@intviya02 data]#
The problem is that the source CSV file (hr_summary.csv) for the CAS table (hrdl.hr_summary) to which we have applied row-level grants in CAS, is in a filesystem directory which is accessible to members of the HR group, and the hr_summary.csv file is readable to members of the group HR:
[root@intviya02 data]# ls -al total 1244 drwxrws---. 2 sas HR 86 Mar 22 10:23 . drwxrws---. 7 sas HR 72 Sep 6 2017 .. -rwxrwx---. 1 sasadm HR 589824 Sep 8 2017 hrdata.sas7bdat -rwxrwx---. 1 sasadm HR 551791 Oct 25 2017 hr_summary.csv -rwxrwx---. 1 sasadm HR 131072 Sep 7 2017 performance_lookup.sas7bdat [root@intviya02 data]#
First, let's show that Delilah is subject to the same row-level security as anyone else in the HR group, when she tries to access the HR_SUMMARY table in CAS. Also, she cannot just declare a new caslib pointing to the same path in this, the only CAS server in our deployment:
The first part of that code works, and only shows Delilah rows for HR:
And, as expected, we get errors for the second part of that code fragment, because CAS will not let Delilah define a new caslib, even a session-level caslib, pointing to the same path as the existing hrdl caslib from the same CAS server:
But this is not sufficient. We must also protect the source data using filesystem permissions.
If read access to the source data is not suitably secured on the filesystem, a user with host account access like Delilah can simply use base SAS code in e.g. SAS Studio, and can access the whole table, without the row-level security. If you allow this, it rather negates the value of the row-level grant in CAS!
Here is an example of some SAS code which demonstrates this alternative route to access the data using an INFILE statement, along with its successful results.
Here are the results, showing Delilah bypassed the CAS authorization system entirely, and can see data for all departments in hr_summary.csv:
Running ordinary SAS code is clearly not a 'back door' or a 'hack' of some obscure vulnerability. It is a perfectly normal 'front door' which, for the purposes of illustrating this alternative way to get to the source data, we have simply failed to lock. My point is simply that securing your data in CAS alone is not enough.
Members of the HR group who have host accounts can get to the source file, and read it from a SAS session, bypassing CAS completely. (Having read the table in a SAS session, they would be free to copy the data anywhere they have write access).
The solution is straightforward and quite obvious. When applying row-level security to a table in CAS, a SAS or data administrator (preferably a group they belong to, which is part of their host account at the operating system level) who are allowed to see all of the data in the table should load the table into memory in CAS.
The source data should be secured on the filesystem, or in the source database, in such a way that users who are not members of that SAS or data administrator group have no access to the source data.
Row level-grants should only ever be used to grant a principal (a group or a user) access to selected rows in a CAS table which that principal cannot access, in CAS or in any other way.
One final thought. Over several years of working with authorization model design, and occasionally touching on row-level access in a peripheral way to my main project work, I have seen the same two examples of identity-based substitutions over and over again - always something like these two:
FacilityRegion In ('SUB::SAS.IdentityGroups')
The overall filter expression syntax is SQL-like, and the documentation shows lots of other comparitors and operators you can use to build more complex expressions than these. (Just because you can, doesn't mean you should though - I'm a big fan of keeping it simple!) However, I haven't seen the 'SUB::variable' syntax [edited 25/03/19] anywhere outside of SAS 9 and CAS, where it has appeared in OLAP MDX examples, BI filters, and I think it is supported in LASR too. If you know where that form of syntax comes from, please add a comment below or email me directly.
To learn more about row-level access on CAS, see the documentation, which is well written, well organised and thorough.
See you next time!
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.