By Using Metadata-Bound Libraries, and a Comparison with Other Techniques
The need for applying row level access to data is old, and increasingly elevating due to that fact we want more people to use the data and reports to good things. But when offering more detailed data, it usually comes with the side effect that we need to consider privacy and other legal issues. We need ways to ensure that information only is available to whom it is authorized.
Historically in SAS, this usually meant that we have to build the access control within our applications. With time, this has been easier with the help of meta data, OLAP cubes and Information Maps. But what about users who should access data directly and in more flexible ways? If this is the case, we can’t hide behind applications, or techniques such as Information Maps.
So, if we can’t rely on authorization near the user, we need to put authorization closer to the data. Here’s a list of few of the methods that are somewhat easy to use for SAS applications. They are (but not limited to):
Independent of solution, we wish to look for methods that supports filtering based on groups memberships (and in some occasions the user id).
In this scenario, we will rely on methods present in the data base. Some having better support for this than others, so it’s hard give a common receipt how this can/will work. As an example SQL Server have no out of the box concept for this. Restricting access involves securing tables/columns, and use of views and Stored Procedures.
The Scalable Performance Data Server (SPDS) have had from the start a quite extensive security model, including management of users and groups, Access Control Lists and a security model that can be applied on tables and columns. And for quite some time, even row level filtering. It is enforced by using user/group symbol resolution, which is used in server side defined views.
There are some automatically defined and assigned symbols:
Let me exemplify:
connect to sasspds (dbq='Test' host='zen' serv='5400' user='Lisa' password='abcd1234');
from connection to sasspds
where sex = "@SPDSGRP");
disconnect from sasspds;
Lisa is a registered user in SPD Server, and belongs to a group “F”. After automatic symbolic substitution the where clause is transformed to:
where sex = "F"
This logic can be applied to a view. And of course, you need to secure the underlying table for direct access by the users. Another even “tighter” way is to use it as a where constraint on the table:
proc spdo lib=test;
constraint add class;
Where (sex = “@SPDSGRP”);
constraint describe class;
Follow the link under “References” for details.
This techniques is secure, and quite flexible. I.e. you can combine a separate organization table in a view (think star schema).
If you don’t already have SPD Server licensed, it will of course be an additional cost involved in fees and time learning the product. Also, there is no easy way to batch load/synchronize users and groups. So if you have vast amount of users with individual access patterns, the maintenance can be substantial. For some use cases, you could perhaps use “group accounts” in SPD Server, and hold individual user information within the Metadata Server alone.
The SAS Federation Server originated from the Data Flux product line. Federation means that it combines access to multiple data sources in a central point of administration. It has many features, and one is row level access restrictions for specific users and groups.
Another security mechanism allows you to set up rules for masking sensitive data, such as SSN.
Like the SPD Server case, if you don’t have the license yet, you need to do some initial investments. But if are looking for a data virtualization tool anyway, it’s definitely worth evaluating.
As the title implies, using a metadata-bound library will create an unbreakable connection between a physical table and its metadata registration. This means that any authorization defined in metadata will be enforced, even if you aren’t initially logged on to the metadata server. Moving the table by brute force doesn’t help – without its mother metadata server, the table is inaccessible. When binding a library to metadata, each table will get a corresponding “secured library object” created in the metadata under the folder /System/Secured Libraries/
Then, you are ready to apply row level access patterns, or “Providing Fine-Grained Access Using Condition Permissions” as SAS calls it in the documentation. This feature is quite similar to the symbolic substitution technique used in SPD Server. It’s called identity driven properties, and is maintained by the metadata server. When you define them as part of a condition that you set on the secured library object, the properties are evaluated at query time.
The properties are called in the form SUB::SAS.Propertyname
Example of property names are Userid, ExternalIdentity and IdentityGroups. For a complete list, follow the link under “References” bellow. The steps to make this work are:
Open Management Console, navigate to the /System/Secured Libraries folder.
Right click, and chose “New” - “Secured Library”
Give a name, the same name as the library you wish to bind is not a bad choice.
For this, you need to define a data set password (PW=), and an encryption key. If you need strong encryption, AES is available. Be sure to keep those in a safe place for future maintenance. These hasn’t to be supplied by the end users at query time.
Also, be aware of that the user that creates the metadata-bound library must have both write metadata rights, and the necessary right to rewrite all physical data in the secured library.
When choosing Finish, all data sets in the specified location will be rewritten using the specified data set password(s) and encrypted if specified. Also, all data sets in the location will also get an individual metadata object under the Secured Library folder.
This step is necessary unless you plan to do filtering on a user id level.
Define groups which names have the same value as the column values in your data set you wish to use for sub setting. Be aware that values returned from identity properties when used in Visual Analytics (see below) is returned in upper case, so it would be a best practice to define them as such, for clarity.
Assign user to respective group according to their authorization.
If you have a lot of users and groups, this step can be automated by using User Import Macros, see references at the end.
Find the secured library object for the table you wish to control. Open its properties, chose Authorization. Now, we want to define this condition for every user that can access the table. It could be SASUSERS, or some other group that holds all users authorized to view the data. Select that group, and grant the permission “Select”.
Now, the button “Conditions” is clickable. By doing so, a small dialog box is opened were you can define your subset criteria. Enter without the keyword “where”:
The syntax can be a bit confusing. This because if you are a member of multiple groups, SAS will automatically create your IN list for you and enclose each group name within quotes:
At this point, when a user access this table, the above defined filter will be applied automatically each time the data is accessed. The metadata-binding will prevent any user from bypassing this filter.
Footnote: Identity properties can also be utilized in Visual Analytics and in Information Maps. In Visual Analytics they are defined within the Visual Analytics Administrator application, or in batch mode. When used in Information Maps call them by the corresponding macro variables instead (e.g. &SAS.userid).
Yet another way to limit access is to use the metadata user id to limit access to data via a view. The metadata user id is available in SAS sessions as the pseudo-variable _Metadata_Authenticated_UserId_. It resolves in the same way as the identity property SAS.Userid described earlier. By using this in a where-clause in the view, you can limit access to data – in multiple tables if your view has join in it.
You need of course secure the underlying data for direct access. So I think this concept is a bit more complicated than and not as flexible as the metadata-bound/identity property one. But on the other hand, it offers possibility to limit access to columns, which is not possible with metadata-bound libraries in the current release.
There are a variety of ways to secure data on row level in SAS. If you wish to secure data created for direct access, metadata-bounds libraries is probably the most straight forward solution. Given that a substantial part of SAS sites license metadata server.
A limitation is that this is supported only for the Base SAS engine. Users that have the data in question stored in a SPDE library need to consider to move to a Base library. If you are using a star schema this doesn’t need to cause performance issues, since you can chose only to move the dimension tables that you are using for conditional permission.
For users with their data elsewhere should probably first evaluate techniques available to them in that environment, i.e. the RDBMS of choice.
My colleague Fredrik Englund for proofreading.
SAS R&D which provided valuable feedback, and for the mediation.