
Working with Masked Data in SAS Visual Investigator REST APIs (Part 2)


This article is part two of a three-part series on the new data masking capabilities in SAS Visual Investigator LTS 2023.10. In part one, we learned about masked representations and legacy representations in Datahub's REST API, and we looked at examples of fetching documents with masked data as well as fetching the masked values themselves. In this part, we'll look at some examples of using PUT requests to update documents containing masked data.

 

How are masked values handled in a PUT request?

 

Updating documents containing masked field values can get complicated. Let's say you fetch a document with a masked ssn value, using the new masked representation I demonstrated in the previous post. Then you update the first_name field and make a PUT request to the server to save the new version of the document.

 

In this scenario, however, the PUT request that you send to the server has its ssn value masked as •••••••••. When Datahub receives this PUT request, will it save the literal string ••••••••• to the database as the ssn? What if a legacy client that is unaware of field masking requests the legacy representation and saves back a null value for ssn? If a client PUTs a null value for a masked field, how can Datahub know whether the client intends the new value to be null or intends to leave the value unchanged?

 

Datahub attempts to handle these cases intelligently, using the Content-Type header of the PUT request. If the client sends a masked media type in the Content-Type header, Datahub assumes that the incoming representation was fetched using a masked media type, and that incoming ••••••••• values are just masked placeholders that should not be changed in the database. Because the placeholder already means "leave unchanged", a null can mean "clear the value", so masked fields can be cleared by sending null as the new value.

 

However, if the Content-Type of the incoming PUT request is a legacy media type, Datahub assumes the original representation was fetched with the legacy media type, in which masked values are set to null. An incoming null for a masked field therefore means that the client intends to leave the value as it is, and Datahub leaves the masked value unchanged.

 

One ramification of this behavior is that masked fields cannot be set to null using the legacy media type for Content-Type in a PUT request. (This includes leaving the Content-Type unset.) If you want to set a masked field value to null in a PUT request, you must set the Content-Type to a masked media type.

 

(Conversely, you can't set a masked field value to the literal string ••••••••• using a masked media type in the Content-Type header. Datahub will ignore the incoming new value. If you need to do that for some reason, you can do it by sending the value in a PUT request with a legacy media type in the Content-Type header.)

 

If that is hard to keep track of, don't worry. The basic rule is that you should set the Content-Type header to whatever media type you used in requesting the original representation, with the caveat that you can't set a masked field's value to null using the legacy media type.
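
The rules above can be condensed into a small decision function. The following Python sketch is only a model of the behavior described in this article, not Datahub's implementation; the function name, the boolean flag, and the stored ssn value are made up for illustration.

```python
MASK = "\u2022" * 9  # nine bullets, the placeholder shown in masked representations


def resolve_masked_field(masked_content_type, incoming, stored):
    """Model which value ends up stored for one masked field after a PUT.

    masked_content_type: True if the PUT used a masked media type in
    Content-Type, False for a legacy media type or no Content-Type at all.
    """
    if masked_content_type:
        # The placeholder means "leave unchanged"; anything else,
        # including None, is taken as the new value.
        return stored if incoming == MASK else incoming
    # Legacy: None means "leave unchanged"; anything else, including
    # the literal placeholder string, is taken as the new value.
    return stored if incoming is None else incoming


stored_ssn = "123-45-6789"  # made-up value for illustration
print(resolve_masked_field(True, MASK, stored_ssn))   # 123-45-6789 (unchanged)
print(resolve_masked_field(True, None, stored_ssn))   # None (cleared)
print(resolve_masked_field(False, None, stored_ssn))  # 123-45-6789 (unchanged)
```

The two branches are mirror images: each media type reserves one incoming value as "leave unchanged", which is why null cannot be saved through the legacy media type and the placeholder string cannot be saved through the masked one.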

 

Let's take a look at some examples.

 

Using PUT with masked representations

 

Let's re-use our Person example from earlier. Also, let's suppose that the Person entity type has two masked fields, ssn and pin. Let's say we request the masked representation using curl:

 

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq > my-person.json

 

The above command saves the response into a file named my-person.json. The resulting file might look like this:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": "•••••••••",
    "pin": "•••••••••",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      },
      "pin": {
        "masked": {
          "currentUserIsAuthorizedToReveal": false
        }
      }
    }
  }
}
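
A client can read the fieldRestrictions block to find out which fields are masked and whether the current user may reveal them. This is a minimal Python sketch that assumes the response has the shape shown above; the helper name masked_fields is hypothetical and the sample document is trimmed to the relevant keys.

```python
def masked_fields(document):
    """Map each masked field name to whether the current user may reveal it."""
    result = {}
    # fieldRestrictions is keyed by entity type name, then by field name.
    for entity_fields in document.get("fieldRestrictions", {}).values():
        for field_name, restrictions in entity_fields.items():
            masked = restrictions.get("masked")
            if masked is not None:
                result[field_name] = bool(
                    masked.get("currentUserIsAuthorizedToReveal", False)
                )
    return result


sample = {
    "objectTypeName": "person",
    "fieldRestrictions": {
        "person": {
            "ssn": {"masked": {"currentUserIsAuthorizedToReveal": True}},
            "pin": {"masked": {"currentUserIsAuthorizedToReveal": False}},
        }
    },
}
print(masked_fields(sample))  # {'ssn': True, 'pin': False}
```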

 

We can edit the JSON in the file to set a new first_name and a new pin:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "James",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": "•••••••••",
    "pin": "9898",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      },
      "pin": {
        "masked": {
          "currentUserIsAuthorizedToReveal": false
        }
      }
    }
  }
}
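
The same edit can be scripted rather than made by hand. In this Python sketch an in-memory dictionary, trimmed to a few fields, stands in for the contents of my-person.json so that the example is self-contained; the helper name with_updates is hypothetical.

```python
import copy
import json

MASK = "\u2022" * 9  # the masked placeholder as it appears in the response


def with_updates(document, **field_updates):
    """Return a copy of the document with the given fieldValues replaced."""
    updated = copy.deepcopy(document)
    updated["fieldValues"].update(field_updates)
    return updated


original = {
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "fieldValues": {"first_name": "John", "ssn": MASK, "pin": MASK, "version": 4},
}
# Change first_name and pin; ssn keeps its placeholder so it is left alone.
edited = with_updates(original, first_name="James", pin="9898")
print(json.dumps(edited["fieldValues"], ensure_ascii=False, indent=2))
```

Working on a copy keeps the fetched representation available for comparison, and leaving the ssn placeholder untouched is what tells Datahub not to change that field when the masked media type is used.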

 

Notice that in this scenario the current user is not authorized to reveal the pin value but is still authorized to save a new one. These two permissions operate independently of one another.

 

Before we can save this new version, we have to lock the document for editing:

 

curl "$VI_BASE/svi-datahub/locks/documents?type=person&id=2f21e644-089a-47d8-a503-bbdd4d8dac3d" -X POST -H "Authorization: Bearer $TOKEN"
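
For clients that do not use curl, the lock request can be assembled in code. This Python sketch only builds the request and does not send it; the base URL and token are placeholders, and the helper name build_lock_request is hypothetical. The path and query parameters are the ones used in the curl command above.

```python
import urllib.parse
import urllib.request


def build_lock_request(base_url, token, entity_type, document_id):
    """Build (but do not send) the POST that locks a document for editing."""
    query = urllib.parse.urlencode({"type": entity_type, "id": document_id})
    return urllib.request.Request(
        f"{base_url}/svi-datahub/locks/documents?{query}",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


lock_request = build_lock_request(
    "https://vi.example.com",  # placeholder for $VI_BASE
    "example-token",           # placeholder for $TOKEN
    "person",
    "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
)
print(lock_request.get_method(), lock_request.full_url)
# urllib.request.urlopen(lock_request) would send it; omitted so the sketch runs offline.
```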

 

Once we have the lock, we can save this new version by sending it to the server with curl:

 

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -X PUT --upload-file my-person.json -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/vnd.sas.investigation.data.masked.enriched.document+json" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq
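
A programmatic client can apply the basic rule from earlier, sending the same media type in Content-Type that was used to fetch the document, by passing one media type to a helper that sets both headers. This Python sketch builds the request without sending it; the base URL, token, and helper name are placeholders, and the document is trimmed for brevity (a real PUT would send the full representation).

```python
import json
import urllib.request

MASKED_MEDIA_TYPE = (
    "application/vnd.sas.investigation.data.masked.enriched.document+json"
)


def build_put_request(base_url, token, entity_type, document, media_type):
    """Build (but do not send) a PUT that saves `document`."""
    url = f"{base_url}/svi-datahub/documents/{entity_type}/{document['id']}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            # Same media type that was used in the Accept header of the GET.
            "Content-Type": media_type,
            "Accept": media_type,
        },
    )


document = {
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "fieldValues": {"first_name": "James", "ssn": "\u2022" * 9, "pin": "9898"},
}
put_request = build_put_request(
    "https://vi.example.com", "example-token", "person", document, MASKED_MEDIA_TYPE
)
print(put_request.get_method(), put_request.full_url)
```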

 

The response from the PUT endpoint will be the new version of the document. Since a masked media type was sent in the Accept header, the response representation will be masked:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "James",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": "•••••••••",
    "pin": "•••••••••",
    "last_updated_at_dttm": "2024-01-09T17:01:11.567Z",
    "last_updated_by_user_id": "viuser",
    "version": 5
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2024-01-09T17:01:11.567Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      },
      "pin": {
        "masked": {
          "currentUserIsAuthorizedToReveal": false
        }
      }
    }
  }
}

 

In the response we can see that the first_name field has been updated. While the ssn and pin fields still appear as •••••••••, the value of the ssn field remains unchanged in the database, and the value of the pin field has been updated to 9898.

 

Using PUT with legacy representations

 

Let's try the same exercise but with legacy media types.

 

First we'll request the legacy representation using curl:

 

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" -H "Accept: application/json" | jq > my-person.json

 

The response might look something like this:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "pin": null,
    "ssn": null,
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z"
}

 

Remember that since we requested the legacy media type, the masked field values (ssn and pin) are returned as null in the response, even though values are stored in the database.

 

To set some new values, we'll adjust the value of first_name in the field values map, and we'll set a new value for pin:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "James",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "pin": "9898",
    "ssn": null,
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z"
}

 

Before we can save the new version we must ensure we hold a lock on the document:

 

curl "$VI_BASE/svi-datahub/locks/documents?type=person&id=2f21e644-089a-47d8-a503-bbdd4d8dac3d" -X POST -H "Authorization: Bearer $TOKEN"

 

Now, with the document locked for editing, we can save the new version. When we send this PUT request with curl, we need to be sure to set the legacy media type for the Content-Type header:

 

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -X PUT --upload-file my-person.json -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "Accept: application/json" | jq

 

The response will be the legacy representation of the new version saved to the database:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "James",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "last_updated_at_dttm": "2024-01-09T17:01:11.567Z",
    "last_updated_by_user_id": "viuser",
    "pin": null,
    "ssn": null,
    "version": 5
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2024-01-09T17:01:11.567Z",
  "validFrom": "2020-01-05T00:00:00.000Z"
}

 

In the response, we can see that the first_name field has been updated to James. Since this is a legacy representation, the ssn and pin values are returned as null, but we know that the stored value for ssn has not been changed, and the pin is now set to 9898.

 

Using PUT to set a masked value to null

 

Remember that this must be done using the masked representation.

 

First, we'll request the latest masked representation:

 

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -H "Authorization: Bearer $TOKEN" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq > my-person.json

 

The response, saved to disk in the file my-person.json, might look like this:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": "•••••••••",
    "pin": "•••••••••",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      },
      "pin": {
        "masked": {
          "currentUserIsAuthorizedToReveal": false
        }
      }
    }
  }
}

 

We can edit the JSON to clear out the value of ssn:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": null,
    "pin": "•••••••••",
    "last_updated_at_dttm": "2020-04-13T19:31:37.097Z",
    "last_updated_by_user_id": "viuser",
    "version": 4
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2020-04-13T19:31:37.097Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      },
      "pin": {
        "masked": {
          "currentUserIsAuthorizedToReveal": false
        }
      }
    }
  }
}

 

Note that the Datahub REST API does not distinguish between a null field value and a field value that is omitted from the field values map. At this step you could also send a representation where the ssn field is missing from the field values map, and the result would be the same.
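
This equivalence can be illustrated with a short Python sketch. It only models the statement above, that a null entry and a missing entry are treated the same; the normalize helper is hypothetical and the field maps are trimmed for brevity.

```python
MASK = "\u2022" * 9  # the masked placeholder


def normalize(field_values):
    """Drop null entries so that null and missing fields compare equal."""
    return {name: value for name, value in field_values.items() if value is not None}


with_null = {"first_name": "John", "ssn": None, "pin": MASK}
with_omission = {"first_name": "John", "pin": MASK}

print(normalize(with_null) == normalize(with_omission))  # True
```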

 

Next, we will lock the document for editing:

 

curl "$VI_BASE/svi-datahub/locks/documents?type=person&id=2f21e644-089a-47d8-a503-bbdd4d8dac3d" -X POST -H "Authorization: Bearer $TOKEN"

 

Finally, we will save the new version with a PUT request through curl:

 

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d -X PUT --upload-file my-person.json -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/vnd.sas.investigation.data.masked.enriched.document+json" -H "Accept: application/vnd.sas.investigation.data.masked.enriched.document+json" | jq

 

Remember that since we fetched the original representation with the masked media type, we have to PUT our update with the masked media type in the Content-Type header.

 

The response might look something like this:

 

{
  "objectTypeName": "person",
  "objectTypeId": 100515,
  "objectTypeVersion": 4,
  "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
  "fieldValues": {
    "birthday": "2020-01-05T00:00:00Z",
    "created_at_dttm": "2020-04-13T19:17:47.84Z",
    "created_by_user_id": "viuser",
    "first_name": "John",
    "id": "2f21e644-089a-47d8-a503-bbdd4d8dac3d",
    "last_name": "Smith",
    "ssn": "•••••••••",
    "pin": "•••••••••",
    "last_updated_at_dttm": "2024-01-09T17:01:11.567Z",
    "last_updated_by_user_id": "viuser",
    "version": 5
  },
  "createdAt": "2020-04-13T19:17:47.840Z",
  "lastUpdatedAt": "2024-01-09T17:01:11.567Z",
  "validFrom": "2020-01-05T00:00:00.000Z",
  "fieldRestrictions": {
    "person": {
      "ssn": {
        "masked": {
          "currentUserIsAuthorizedToReveal": true
        }
      },
      "pin": {
        "masked": {
          "currentUserIsAuthorizedToReveal": false
        }
      }
    }
  }
}

 

Because the Accept header of the PUT request was the masked media type, Datahub responded with a masked representation. The ssn field is still a masked field, so it appears as ••••••••• in this representation even though its stored value is now null. However, we can ask Datahub to reveal the value of ssn for this person using curl:

 

curl $VI_BASE/svi-datahub/documents/person/2f21e644-089a-47d8-a503-bbdd4d8dac3d/fields/ssn -H "Authorization: Bearer $TOKEN" | jq

 

The response will be:

 

{}

 

This response may seem counterintuitive, but it is expected. Since Datahub often omits properties with null values from response JSON, this response is equivalent to:

 

{
    "raw": null
}

 

So we can see that the actual value in the database is null. Datahub has cleared the old ssn value.
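
A client that parses the reveal response should therefore treat a missing raw property the same as an explicit null. This is a minimal Python sketch; the helper name revealed_value is hypothetical, and the non-null example body is an assumed shape with a made-up value.

```python
import json


def revealed_value(response_body):
    """Return the revealed value, or None if it is null or the property is absent."""
    return json.loads(response_body).get("raw")


print(revealed_value("{}"))               # None
print(revealed_value('{"raw": null}'))    # None
print(revealed_value('{"raw": "9898"}'))  # 9898
```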

 

Conclusion

 

In this article, we discussed how the Datahub service handles masked values when processing PUT requests. We also looked at some examples of using PUT requests to update masked data using both masked representations and legacy representations. Finally, we saw an example of how to clear a masked value from an object using a PUT request. In the next part, we will look at PATCH requests, an alternative to PUT requests that enables a different approach to updating documents with masked fields.
