
Working with Masked Data in SAS Visual Investigator REST APIs (Part 1)

Started ‎02-14-2024 by
Modified ‎02-15-2024 by

SAS Visual Investigator LTS 2023.10 introduced support for data masking. When an administrator configures a field of an object to be masked, the value contained in that field is obscured from users in the normal view for that object. If a user is authorized to view the value, that user can request the unmasked value specifically.

The SAS Help Center documents data masking for the Visual Investigator user interface, but how do consumers of SAS REST APIs work with data masking?

This three-part series of blog posts details recent enhancements to the Datahub REST API that enable users to work with masked data.

Part one of this series covers the following:

  • the new REST representations that accommodate masked data
  • an example of requesting one of these new representations
  • how to “unmask” a value (that is, request an unmasked value from Datahub)

Part two covers:

  • how to update a masked value using a PUT request
  • how to clear a masked value using a PUT request

Part three covers:

  • how to update a masked value using a PATCH request

Understanding Masked Data Representations

If you have worked with the Datahub REST API in the past, you are probably familiar with the JSON representation of a document object. A simple “person” object might look like this:

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z"
}

In requests for object data like this, Datahub uses the Accept and Accept-Item headers in the request to decide what representation to send in response.

Before LTS 2023.10, there was only one representation for document objects, so the values of these headers didn’t matter. However, LTS 2023.10 introduced a new way of representing objects with masked field values. I’ll call the old representation the “legacy” representation and the new representation the “masked” representation. In LTS 2023.10, the legacy representation is still the default representation used when the Accept or Accept-Item media type is not specified, or when the media type is application/json. To request the new masked representation, the consumer must set the Accept or Accept-Item header to a masked media type. (Which media type to use, and whether the header is Accept or Accept-Item, depends on the endpoint. I’ll go into more detail later.)

Notice the fieldValues property: it contains a map where the map keys are field names, and map values are the data values for each of the fields. This map is where data can now be masked.

 

[!info] Link and transaction objects also have this fieldValues map, and those values can also be masked. While this blog post only addresses masking of document object fields, very similar principles and techniques can be applied to link and transaction object fields. Refer to the Datahub REST API Documentation for details on media types and API endpoints for these objects.

 

Now that VI supports data masking, let’s imagine that a customer adds a field to the “person” entity type that contains sensitive data, such as a Social Security number (SSN). Let’s suppose they configure this field to be masked so that only authorized users can reveal SSNs, and populate the SSNs for all their “person” data. The new masked representation is actually very similar to the legacy representation:

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": "•••••••••",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      }
    }
  }
}

Notice that the ssn field in the fieldValues map contains the mask value, which is nine dot characters. This is, of course, not the real value, which has been masked out.

The other difference you may notice is that there is now a fieldRestrictions map containing some helpful metadata about the object’s masked data. Here, you can see the representation indicates that the ssn field in the person entity type is masked, and that the current user (the one making the REST request) is authorized to reveal the value.
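
To make the shape of the fieldRestrictions map concrete, here is a minimal Python sketch that lists the masked fields in a representation and whether the current user may reveal them. The function name masked_fields is my own, not part of any SAS client library, and the sample is a trimmed copy of the representation shown above.

```python
def masked_fields(document):
    """Map (entity_type, field_name) -> True if the current user may reveal it."""
    found = {}
    # A legacy representation has no fieldRestrictions map, so fall back to {}.
    restrictions = document.get("fieldRestrictions") or {}
    for entity_type, fields in restrictions.items():
        for field_name, restriction in fields.items():
            masked = restriction.get("masked")
            if masked is not None:
                found[(entity_type, field_name)] = bool(
                    masked.get("currentUserIsAuthorizedToReveal", False)
                )
    return found


# Trimmed copy of the masked "person" representation shown above.
sample = {
    "objectTypeName": "person",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "fieldValues": {"first_name": "John", "ssn": "•••••••••"},
    "fieldRestrictions": {
        "person": {"ssn": {"masked": {"currentUserIsAuthorizedToReveal": True}}}
    },
}

print(masked_fields(sample))  # {('person', 'ssn'): True}
```

Keying the result by entity type as well as field name leaves room for child entity types, which get their own entries in the same top-level map.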

 

[!info] When an entity type has a child entity type, if one of the child entity type’s fields has masking enabled, the child entity type will also have an entry in this top-level fieldRestrictions object.

 

So, what happens when legacy clients request a legacy representation of an object containing masked data? Datahub handles this case by omitting masked values: they appear the same way a null or empty value does.

To receive the new masked representation, specify the media type application/vnd.sas.investigation.data.masked.document (for plain documents) or application/vnd.sas.investigation.data.masked.enriched.document (for enriched documents) in the Accept or Accept-Item header (which header depends on the endpoint). In the next section I’ll show an example.

Requesting a Masked Representation

This section will show how to use curl to request the masked representation of a document object from Datahub using the Datahub REST API. You might be using a REST client other than curl, such as Postman, Insomnia, or Visual Studio Code’s Thunder Client extension. If you are, adapting these instructions for your client should be straightforward.

These instructions assume you have the following already set up:

  1. a VI system to send requests to
  2. credentials to log in to that system (and you know how to fetch a bearer token)
  3. an entity type configured with a masked field
  4. a few documents of that entity type

If you aren’t sure how to perform this setup, refer to the Visual Investigator Administrator’s Guide and User’s Guide.

Before we begin, we need to request a bearer token in order to authenticate our GET request to the Datahub service. How you do this depends on what kind of VI deployment and environment you’re working with. Explaining how to do this is outside the scope of this post, but you can find detailed instructions in the Visual Investigator Administrator’s Guide. I’ll assume you’ve requested a bearer token and it’s stored in the $TOKEN shell variable.

I will also assume you have the Visual Investigator’s hostname stored in the $VI_BASE shell variable. Additionally, you will notice that I often pipe curl’s output to jq, a utility for processing JSON data. You can learn more about jq at its project website.

The endpoint to request a single document is:

GET /svi-datahub/documents/{entityType}/{documentId}

So, since I have a person entity type, and a person with ID 2f21e644-089a-47d8-a503-bbdd4d8dac3d in the database, I can request the legacy representation with the following curl command:

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" | jq

The Datahub service responds with the legacy representation of the document object, since the Accept header was not specified:

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "pin": null,
    "ssn": null,
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z"
}

 

By specifying the new masked media type in the request’s Accept header, I can request the new masked representation. The following curl command sets the Accept header to the masked media type for enriched documents:

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq

Since the Accept header specified a masked media type, the response is a masked representation:

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": "•••••••••",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      }
    }
  }
}
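
If you would rather script this than use curl, the same request can be sketched with Python’s standard library. This is only a sketch under a few assumptions: the helper names build_document_request and fetch_json are mine, and base_url must include the scheme (for example https://), because urllib rejects a bare hostname.

```python
import json
import urllib.parse
import urllib.request

# Media type taken from the curl example above.
MASKED_ENRICHED_DOCUMENT = (
    "application/vnd.sas.investigation.data.masked.enriched.document+json"
)


def build_document_request(base_url, token, entity_type, document_id, masked=True):
    """Build (but do not send) a GET request for a single document."""
    path = "/svi-datahub/documents/{}/{}".format(
        urllib.parse.quote(entity_type, safe=""),
        urllib.parse.quote(document_id, safe=""),
    )
    headers = {"Authorization": "Bearer " + token}
    if masked:
        # Without this header, Datahub answers with the legacy representation.
        headers["Accept"] = MASKED_ENRICHED_DOCUMENT
    return urllib.request.Request(base_url.rstrip("/") + path, headers=headers)


def fetch_json(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Passing the result of build_document_request (with your own base URL, token, "person", and the document ID) to fetch_json should return a dictionary equivalent to the masked representation above. Passing masked=False omits the Accept header, which corresponds to the first curl command.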

 

 

Fetching a Collection of Masked Representations

When requesting collections of objects from a SAS REST service, the Accept header specifies the media type for the collection itself (probably application/vnd.sas.collection), and the Accept-Item header specifies the media type for items within the collection. Therefore, to request a collection of documents containing masked data, we should send the following request:

curl $VI_BASE/svi-datahub/documents/person -H "Authorization: Bearer $TOKEN" -H "Accept-Item: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq

I won’t show the response here because it is somewhat long, but if you make this request for yourself you will see that the items within the response collection are masked representations, because the request’s Accept-Item header was a masked media type.
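
For scripted clients, the collection request differs from the single-document request only in the URL and in which header carries the masked media type. The Python sketch below uses helper names of my own choosing. It assumes that base_url includes the scheme and that the response lists its documents under an "items" key, as SAS collections usually do; check the Datahub REST API documentation for the exact response shape.

```python
import urllib.parse
import urllib.request

# Media type taken from the curl example above.
MASKED_ENRICHED_DOCUMENT = (
    "application/vnd.sas.investigation.data.masked.enriched.document+json"
)


def build_collection_request(base_url, token, entity_type):
    """Build (but do not send) a GET request for a collection of documents."""
    url = "{}/svi-datahub/documents/{}".format(
        base_url.rstrip("/"), urllib.parse.quote(entity_type, safe="")
    )
    headers = {
        "Authorization": "Bearer " + token,
        # Accept-Item (not Accept) selects the representation of each item.
        "Accept-Item": MASKED_ENRICHED_DOCUMENT,
    }
    return urllib.request.Request(url, headers=headers)


def ids_with_restrictions(collection):
    """Return the IDs of items that carry a non-empty fieldRestrictions map."""
    # Assumption: the decoded collection holds its documents under "items".
    return [
        item["id"]
        for item in collection.get("items", [])
        if item.get("fieldRestrictions")
    ]
```

Once the response body is decoded from JSON, ids_with_restrictions gives a quick way to see which documents in the page contain masked fields.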

Requesting an Unmasked Value

Using the above requests, we can obtain a masked representation of a document object. But how do we retrieve the actual value of the masked field, ssn?

In LTS 2023.10, the Datahub service introduced a new family of REST endpoints to serve this purpose. For fields of top-level document objects, the new endpoint is:

GET /svi-datahub/documents/{entityTypeName}/{documentId}/fields/{fieldName}

So, in order to request the example person’s Social Security Number unmasked, we can use the following curl command:

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d/fields/ssn -H "Authorization: Bearer $TOKEN" | jq

Note that since this endpoint only works with one media type, we can omit the Accept header and the endpoint will serve the default media type, which is application/vnd.sas.investigation.data.object.field.value.

Here is an example response:

{
    "raw": "012-34-5678"
}

This response indicates that the actual value of the ssn field is 012-34-5678.
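
The unmasking call can be sketched in Python as well. The helper names below are my own, and base_url is assumed to include the scheme. The sketch builds the field endpoint URL, sends the token, and pulls the value out of the "raw" property shown in the example response.

```python
import json
import urllib.parse
import urllib.request


def field_value_url(base_url, entity_type, document_id, field_name):
    """Build the URL of the endpoint that serves one unmasked field value."""
    # Percent-encode each path segment in case a name has reserved characters.
    quoted = [
        urllib.parse.quote(part, safe="")
        for part in (entity_type, document_id, field_name)
    ]
    return "{}/svi-datahub/documents/{}/{}/fields/{}".format(
        base_url.rstrip("/"), *quoted
    )


def extract_raw(body):
    """Return the unmasked value from a field-value response body."""
    return json.loads(body)["raw"]


def reveal_field(base_url, token, entity_type, document_id, field_name):
    """Request one unmasked value; the user must be authorized to reveal it."""
    request = urllib.request.Request(
        field_value_url(base_url, entity_type, document_id, field_name),
        headers={"Authorization": "Bearer " + token},
    )
    with urllib.request.urlopen(request) as response:
        return extract_raw(response.read())
```

With urllib, an error status from the service raises urllib.error.HTTPError, so callers may want to catch that exception around reveal_field.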

There is also an endpoint for unmasking multiple fields at once. This is useful if the entity type has multiple masked fields and you want to reveal all their values using one request. You can find more details in the Datahub REST API documentation.

Conclusion

In this part, we explored the differences between masked and legacy representations, and learned how they are controlled by media types in the REST request. We also showed examples of requesting masked representations (both single instances and collections), as well as requesting unmasked values. In the next part, we’ll learn how to update documents with masked fields using PUT requests.

Version history
Last update:
‎02-15-2024 01:45 PM