This article is part two of a three-part series on the new data masking capabilities in SAS Visual Investigator LTS 2023.10. In part one, we learned about masked representations and legacy representations in Datahub's REST API, and we looked at examples for fetching documents with masked data as well as fetching masked values themselves. In this part, we'll look at some examples of using PUT requests to update documents containing masked data.
How does Datahub handle masked values in a PUT request?
Updating documents containing masked field values can get complicated. Let's say you fetch a document with a masked ssn value, using the new masked representation I demonstrated in the previous post. Then you update the first_name field and make a PUT request to the server to save the new version of the document.
In this scenario, however, the PUT request that you send to the server has its ssn value masked out to •••••••••. So when Datahub receives this PUT request, will it save the literal string ••••••••• to the database as the ssn? What if a legacy client unaware of field masking requests the legacy representation and saves back a null value for ssn? If a client PUTs a null value for a masked field, how can Datahub know whether the client intends for the new value to be null or intends to leave the value unchanged?
Datahub attempts to handle these cases intelligently, using the Content-Type header of the PUT request. If the client sends a masked media type in the Content-Type header, Datahub assumes the incoming representation was fetched using a masked media type, and that incoming ••••••••• values are just masked values that should not be changed in the database. Because the placeholder stands for "unchanged", a null is free to mean "clear this field": masked fields can be cleared by sending null as the new value.
However, if the Content-Type of the incoming PUT request is a legacy media type, Datahub assumes the original representation was fetched with the legacy media type, in which masked values are set to null. This implies that an incoming null value for a masked field means the client intends to leave the value the same. In this case, Datahub will leave the masked value unchanged.
One ramification of this behavior is that masked fields cannot be set to null using the legacy media type for Content-Type in a PUT request. (This includes leaving the Content-Type unset.) If you want to set a masked field value to null in a PUT request, you must set the Content-Type to a masked media type.
(Conversely, you can't set a masked field value to the literal string ••••••••• using a masked media type in the Content-Type header. Datahub will ignore the incoming new value. If you need to do that for some reason, you can do it by sending the value in a PUT request with a legacy media type in the Content-Type header.)
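To make the rules above concrete, here is a small Python model of the decision for a single masked field. It is only an illustration of the behavior described in this article, not Datahub's implementation. The function name resolve_masked_field is hypothetical, and treating any media type that contains ".masked." as a masked media type is an assumption made for this sketch.

```python
from typing import Optional

MASK = "•••••••••"  # placeholder shown for masked values in this article
MASKED_TYPE = "application/vnd.sas.investigation.data.masked.enriched.document+json"
LEGACY_TYPE = "application/json"


def resolve_masked_field(content_type: Optional[str],
                         incoming: Optional[str],
                         stored: Optional[str]) -> Optional[str]:
    """Return the value kept for one masked field after a PUT (illustrative model)."""
    # Assumption for this sketch: masked media types contain ".masked.".
    is_masked_type = content_type is not None and ".masked." in content_type
    if is_masked_type:
        if incoming == MASK:
            return stored   # placeholder echoed back: leave the value unchanged
        return incoming     # null clears the field; anything else replaces it
    # Legacy media type, or no Content-Type header at all.
    if incoming is None:
        return stored       # null means the client never saw the value: unchanged
    return incoming         # any string, even the literal placeholder, is saved


# Each call below models one PUT against a stored ssn of "123-45-6789".
print(resolve_masked_field(MASKED_TYPE, MASK, "123-45-6789"))  # unchanged
print(resolve_masked_field(MASKED_TYPE, None, "123-45-6789"))  # cleared
print(resolve_masked_field(LEGACY_TYPE, None, "123-45-6789"))  # unchanged
```

The two branches are mirror images: each media type reserves one incoming value to mean "unchanged", and that reserved value is the one thing it cannot store.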
If that is hard to keep track of, don't worry. The basic rule is that you should set the Content-Type header to whatever media type you used in requesting the original representation, with the caveat that you can't set a masked field's value to null using the legacy media type.
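A client could encode that rule of thumb in a small helper. The sketch below is one possible shape for it. The function name content_type_for_put and its parameters are hypothetical, and the ".masked." check is an assumption about how masked media types are named.

```python
MASKED_TYPE = "application/vnd.sas.investigation.data.masked.enriched.document+json"
LEGACY_TYPE = "application/json"


def content_type_for_put(fetched_with: str, clears_masked_field: bool = False) -> str:
    """Pick the Content-Type for a PUT from the media type used to fetch the document."""
    # Assumption for this sketch: masked media types contain ".masked.".
    if clears_masked_field and ".masked." not in fetched_with:
        raise ValueError(
            "A masked field cannot be set to null with a legacy media type; "
            "fetch and PUT the masked representation instead."
        )
    return fetched_with  # reuse whatever media type the document was fetched with


print(content_type_for_put(MASKED_TYPE, clears_masked_field=True))
print(content_type_for_put(LEGACY_TYPE))
```

Raising an error for the one unsupported combination surfaces the caveat at the point where the request is built, rather than leaving a silently ignored null to be discovered later.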
Let's take a look at some examples.
PUT with masked representations
Let's re-use our Person example from earlier. Also, let's suppose that the Person entity type has two masked fields, ssn and pin. Let's say we request the masked representation using curl:
curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq > my-person.json
The above command saves the response into a file named my-person.json. The resulting file might look like this:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "John",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"ssn": "•••••••••",
"pin": "•••••••••",
"last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
"last_updated_by_user_id": "viuser",
"version": 4
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2020-04-13T19:31:37.097Z",
"validFrom": "2020-01-05T00:00:00.000Z",
"fieldRestrictions": {
"person": {
"ssn": {
"masked": {
"currentUserIsAuthorizedToReveal": true
}
},
"pin": {
"masked": {
"currentUserIsAuthorizedToReveal": false
}
}
}
}
}
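If you are scripting against this representation, the fieldRestrictions section tells you which fields are masked and whether the current user may reveal them. The Python sketch below reads that section from a trimmed copy of the document above. The function name masked_fields is hypothetical, and the sketch assumes that fieldRestrictions is keyed by the document's objectTypeName, as it is in this example.

```python
def masked_fields(document: dict) -> dict:
    """Map each masked field name to whether the current user may reveal it."""
    object_type = document["objectTypeName"]
    # Assumption: restrictions are keyed by the document's own object type name.
    restrictions = document.get("fieldRestrictions", {}).get(object_type, {})
    return {
        name: rules["masked"]["currentUserIsAuthorizedToReveal"]
        for name, rules in restrictions.items()
        if "masked" in rules
    }


# Trimmed copy of the representation shown above.
person = {
    "objectTypeName": "person",
    "fieldValues": {"first_name": "John", "ssn": "•••••••••", "pin": "•••••••••"},
    "fieldRestrictions": {
        "person": {
            "ssn": {"masked": {"currentUserIsAuthorizedToReveal": True}},
            "pin": {"masked": {"currentUserIsAuthorizedToReveal": False}},
        }
    },
}

print(masked_fields(person))
```

A legacy representation has no fieldRestrictions section, so the same function returns an empty dictionary for it.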
We can edit the JSON in the file to set a new first_name and a new pin:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "James",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"ssn": "•••••••••",
"pin": "9898",
"last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
"last_updated_by_user_id": "viuser",
"version": 4
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2020-04-13T19:31:37.097Z",
"validFrom": "2020-01-05T00:00:00.000Z",
"fieldRestrictions": {
"person": {
"ssn": {
"masked": {
"currentUserIsAuthorizedToReveal": true
}
},
"pin": {
"masked": {
"currentUserIsAuthorizedToReveal": false
}
}
}
}
}
Notice that in this scenario, the current user is not authorized to reveal the pin value, but they are still authorized to save a new value. These two permissions operate independently of one another.
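The same edit can be scripted instead of made by hand. The following Python sketch changes only the fields being updated and leaves every other entry, including the masked ssn placeholder, exactly as it was fetched. The helper name with_field_values is hypothetical, and the document is a trimmed copy of the one above.

```python
import copy


def with_field_values(document: dict, **updates) -> dict:
    """Return a copy of the document with some fieldValues entries replaced."""
    updated = copy.deepcopy(document)        # never mutate the fetched document
    updated["fieldValues"].update(updates)   # untouched fields keep their values
    return updated


# Trimmed copy of the masked representation fetched earlier.
fetched = {
    "objectTypeName": "person",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "fieldValues": {
        "first_name": "John",
        "ssn": "•••••••••",
        "pin": "•••••••••",
        "version": 4,
    },
}

edited = with_field_values(fetched, first_name="James", pin="9898")
print(edited["fieldValues"])
```

With a file on disk, json.load and json.dump from the standard library can read my-person.json into a dictionary and write the edited copy back before it is uploaded.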
Before we can save this new version, we have to lock the document for editing:
curl "$VI_BASE/svi-datahub/locks/documents?type=person&id=2f21e644-089a-47d8-a503-bbdd4d8dac3d" -X POST -H "Authorization: Bearer $TOKEN"
Once we have the lock, we can save this new version by sending it to the server with curl:
curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -X PUT --upload-file my-person.json -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/vnd.sas.investigation.data.masked.enriched.document+json" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq
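For readers who prefer a script to curl, the lock and PUT calls can be sketched with Python's standard urllib module. The two builder functions below only construct the requests and are hypothetical helpers. The base URL and token are placeholders, and the masked media type is used for both Content-Type and Accept, as the rule described earlier requires for a masked representation.

```python
import json
import urllib.parse
import urllib.request

MASKED_TYPE = "application/vnd.sas.investigation.data.masked.enriched.document+json"


def build_lock_request(base_url: str, token: str, entity_type: str, doc_id: str):
    """Build the POST request that locks a document for editing."""
    query = urllib.parse.urlencode({"type": entity_type, "id": doc_id})
    return urllib.request.Request(
        f"{base_url}/svi-datahub/locks/documents?{query}",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def build_put_request(base_url: str, token: str, document: dict,
                      media_type: str = MASKED_TYPE):
    """Build the PUT request that saves a new version of a document."""
    url = (f"{base_url}/svi-datahub/documents/"
           f"{document['objectTypeName']}/{document['id']}")
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": media_type,  # must match the media type used to fetch
            "Accept": media_type,
        },
    )


# Placeholder values for illustration; nothing is sent by the lines below.
doc = {
    "objectTypeName": "person",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "fieldValues": {"first_name": "James", "ssn": "•••••••••", "pin": "9898"},
}
lock_request = build_lock_request("https://vi.example.com", "TOKEN", "person", doc["id"])
put_request = build_put_request("https://vi.example.com", "TOKEN", doc)
print(lock_request.get_method(), lock_request.full_url)
print(put_request.get_method(), put_request.full_url)
```

Passing each request to urllib.request.urlopen, lock first and then PUT, would send it to the server.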
The response from the PUT endpoint will be the new version of the document. Since a masked media type was sent in the Accept header, the response representation will be masked:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "James",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"ssn": "•••••••••",
"pin": "•••••••••",
"last_updated_at_dttm": "2024-01-09T17:01:11.567Z",
"last_updated_by_user_id": "viuser",
"version": 5
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2024-01-09T17:01:11.567Z",
"validFrom": "2020-01-05T00:00:00.000Z",
"fieldRestrictions": {
"person": {
"ssn": {
"masked": {
"currentUserIsAuthorizedToReveal": true
}
},
"pin": {
"masked": {
"currentUserIsAuthorizedToReveal": false
}
}
}
}
}
In the response we can see that the first_name field has been updated. While the ssn and pin fields still appear as •••••••••, the value of the ssn field remains unchanged in the database, and the value of the pin field has been updated to 9898.
PUT with legacy representations
Let's try the same exercise but with legacy media types.
First we'll request the legacy representation using curl:
curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" -H "Accept: application/json" | jq > my-person.json
The response might look something like this:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "John",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
"last_updated_by_user_id": "viuser",
"pin": null,
"ssn": null,
"version": 4
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2020-04-13T19:31:37.097Z",
"validFrom": "2020-01-05T00:00:00.000Z"
}
Remember that since we requested the legacy media type, the masked field values (ssn and pin) appear as null in the response, even though values are present in the database.
To set some new values, we'll adjust the value of first_name in the field values map, and we'll set a new value for pin:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "James",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
"last_updated_by_user_id": "viuser",
"pin": "9898",
"ssn": null,
"version": 4
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2020-04-13T19:31:37.097Z",
"validFrom": "2020-01-05T00:00:00.000Z"
}
Before we can save the new version we must ensure we hold a lock on the document:
curl "$VI_BASE/svi-datahub/locks/documents?type=person&id=2f21e644-089a-47d8-a503-bbdd4d8dac3d" -X POST -H "Authorization: Bearer $TOKEN"
Now, with the document locked for editing, we can save the new version. When we send this PUT request with curl, we need to be sure to set the legacy media type for the Content-Type header:
curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -X PUT --upload-file my-person.json -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "Accept: application/json" | jq
The response will be the legacy representation of the new version saved to the database:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "James",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"last_updated_at_dttm": "2024-01-09T17:01:11.567Z",
"last_updated_by_user_id": "viuser",
"pin": null,
"ssn": null,
"version": 5
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2024-01-09T17:01:11.567Z",
"validFrom": "2020-01-05T00:00:00.000Z"
}
In the response, we can see that the first_name field has been updated to James. Since this is a legacy representation, the ssn and pin values appear as null, but we know that the value for ssn has not been changed, and the pin is now set to 9898.
PUT to set a masked value to null
Remember that this must be done using the masked representation.
First, we'll request the latest masked representation:
curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq > my-person.json
The response, saved to disk in the file my-person.json, might look like this:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "John",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"ssn": "•••••••••",
"pin": "•••••••••",
"last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
"last_updated_by_user_id": "viuser",
"version": 4
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2020-04-13T19:31:37.097Z",
"validFrom": "2020-01-05T00:00:00.000Z",
"fieldRestrictions": {
"person": {
"ssn": {
"masked": {
"currentUserIsAuthorizedToReveal": true
}
},
"pin": {
"masked": {
"currentUserIsAuthorizedToReveal": false
}
}
}
}
}
We can edit the JSON to clear out the value of ssn:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "John",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"ssn": null,
"pin": "•••••••••",
"last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
"last_updated_by_user_id": "viuser",
"version": 4
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2020-04-13T19:31:37.097Z",
"validFrom": "2020-01-05T00:00:00.000Z",
"fieldRestrictions": {
"person": {
"ssn": {
"masked": {
"currentUserIsAuthorizedToReveal": true
}
},
"pin": {
"masked": {
"currentUserIsAuthorizedToReveal": false
}
}
}
}
}
Note that the Datahub REST API does not distinguish between a null field value and a field value that is omitted from the field values map. At this step you could also send a representation where the ssn field is missing from the field values map, and the result would be the same.
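The equivalence between a null field value and an omitted one can be modeled in a few lines of Python. This is only an illustration of the behavior described above, not Datahub code, and the function name effective_field_values is hypothetical.

```python
def effective_field_values(field_values: dict) -> dict:
    """Drop null entries, so that null and omitted fields compare as equal."""
    return {name: value for name, value in field_values.items() if value is not None}


# Two ways of clearing ssn: send it as null, or leave it out entirely.
ssn_sent_as_null = {"first_name": "John", "ssn": None, "pin": "•••••••••"}
ssn_left_out = {"first_name": "John", "pin": "•••••••••"}

print(effective_field_values(ssn_sent_as_null) == effective_field_values(ssn_left_out))
```

Either payload clears ssn when it is sent with a masked media type in the Content-Type header.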
Next, we will lock the document for editing:
curl "$VI_BASE/svi-datahub/locks/documents?type=person&id=2f21e644-089a-47d8-a503-bbdd4d8dac3d" -X POST -H "Authorization: Bearer $TOKEN"
Finally, we will save the new version with a PUT request through curl:
curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -X PUT --upload-file my-person.json -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/vnd.sas.investigation.data.masked.enriched.document+json" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq
Remember that since we fetched the original representation with the masked media type, we have to PUT our update with the masked media type in the Content-Type header.
The response might look something like this:
{
"objectTypeName": "person",
"objectTypeId": 100515,
"objectTypeVersion": 4,
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"fieldValues": {
"birthday": "2020-01-05T00:00:00Z",
"created_at_dttm": "2020-04-13T19:17:47.84Z",
"created_by_user_id": "viuser",
"first_name": "John",
"id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
"last_name": "Smith",
"ssn": "•••••••••",
"pin": "•••••••••",
"last_updated_at_dttm": "2024-01-09T17:01:11.567Z",
"last_updated_by_user_id": "viuser",
"version": 5
},
"createdAt": "2020-04-13T19:17:47.840Z",
"lastUpdatedAt": "2024-01-09T17:01:11.567Z",
"validFrom": "2020-01-05T00:00:00.000Z",
"fieldRestrictions": {
"person": {
"ssn": {
"masked": {
"currentUserIsAuthorizedToReveal": true
}
},
"pin": {
"masked": {
"currentUserIsAuthorizedToReveal": false
}
}
}
}
}
Since the Accept header for the PUT request was the masked media type, Datahub responded with a masked representation. This means that the value of ssn (which is null) is masked, so it still appears as ••••••••• in this representation. However, we can request to reveal the value of ssn for this person using curl:
curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d/fields/ssn -H "Authorization: Bearer $TOKEN" | jq
The response will be:
{}
This response may seem counterintuitive, but it is expected. Since Datahub often omits properties with null values from response JSON, this response is equivalent to:
{
"raw": null
}
So we can see that the actual value in the database is null. Datahub has cleared the old ssn value.
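A client that reads the reveal response should therefore treat a missing raw property the same as an explicit null. A minimal Python sketch, assuming the revealed value is returned under the raw property as in the example above (the function name revealed_value is hypothetical):

```python
from typing import Optional


def revealed_value(response_body: dict) -> Optional[str]:
    """Return the revealed value, treating a missing "raw" property as null."""
    return response_body.get("raw")


print(revealed_value({}))             # empty body, as returned for a cleared field
print(revealed_value({"raw": None}))  # the equivalent explicit form
```

Using dict.get rather than indexing avoids a KeyError on the empty response body.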
In this article, we discussed how the Datahub service handles masked values when processing PUT requests. We also looked at some examples of using PUT requests to update masked data using both masked representations and legacy representations. Finally, we saw an example of how to clear a masked value from an object using a PUT request. In the next part, we will look at PATCH requests, an alternative to PUT requests that enables a different approach to updating documents with masked fields.