SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Dynamic data decryption

Reply
Occasional Contributor
Posts: 5

Dynamic data decryption

Hi everybody. My apologies for my lack of knowledge in SAS but I am new to this.

I am leading a regulatory project that has a side effect on a SAS environment.

Our company has decided to encrypt some sensitive data in a source database. The side effect is that these data are also exported to the company data warehouse. This data warehouse is accessed/analysed/queried using SAS enterprise guide.

It has been decided to keep these data encrypted in the data warehouse. This cannot be challenged.

Users use SAS Enterprise guide to query the data warehouse in sql.

We have corporate APIs available to get the data decrypted.

I would like to understand what is that he best approach the get the result sets of the sql queries dynamically decrypted using our APIs.

It doesn’t seem that SAS enterprise guide is a good candidate for executing the transformation. I kind of understand that it is possible to use integration scripts or programs in connection with sas metadata server. But this is too new to me.

I would be more than grateful if somebody could point me to the right direction, so that I can understand how I can achieve this.

Our APIs.can be called using java, c or rest services. We can write integration components if necessary.

Many thanks in advance

Gilles
Super Contributor
Posts: 487

Re: Dynamic data decryption

For bulk work I would suggest to take the C road. Have a look at the MODULE() set of functions/call routines in association with the SASCBTBL specification. This would allow for small to large volume decryption in datastep and SQL alike. Java would be my second choice. REST is also possible but does not scale well. 

 

Regards,

- Jan

Occasional Contributor
Posts: 5

Re: Dynamic data decryption

Posted in reply to jklaverstijn

Thank you for your answer. The bulk work will be performed by the ETL when the data are exported to the data warehouse. So no bulk need in SAS.

 

But if I understand correctly the same approach can be used to dynamically decrypt the result of the SQL queries. I will have a look at this.

 

Thanks and regards

 

Gilles

Super User
Posts: 5,548

Re: Dynamic data decryption

I assume that just special PII fields are being encrypted. Also assuming you have been granted the right to decrypt for your specific application.
So the question is do you really need those decrypted for your analysis/reporting needs?

The API's you describe sound more suited for transactional use rather than data transport. I would start to challenge your DW to provide views that let you consume decrypted data.
Data never sleeps
Occasional Contributor
Posts: 5

Re: Dynamic data decryption

Hi,

 

You are right. Only few PII need to be encrypted, and this should not impact the analytics. But there are loads of SQL queries that are used by the end users to generate reports on their customers that are going to be impacted. These are the ones which will need to be decrypted.

 

The tokenisation tool used is very business oriented, and is very strong at anonymising the end user reports. 

 

The objective is to find the best integration with SAS.

 

Regards

 

Gilles

Occasional Contributor
Posts: 5

Re: Dynamic data decryption

[ Edited ]

With your feedbacks I started digging more precisely into SAS capabilities and also found the SAS Enterprise Guide Addin Task capability

 

COM Add-in

 

I was wondering if it would be worth considering a SAS Guide Com Add-In as well.

 

This add-in could get the results of the SQL queries, and decrypt them dynamically. This would be initiated on the client side.

 

Do you think this would be do-able?

 

Or even, would it be possible to extend the behaviour of the query builder?

 

Many thanks

 

Gilles

 

Respected Advisor
Posts: 4,235

Re: Dynamic data decryption

[ Edited ]

@Gilles74

I'm very much with @LinusH on this one. Also that SAS would likely be able to interface with the existing API's - especially REST - these API's appear very much built for transactional processes which is not like SAS is normally used.

 

SAS EG is just the client and could for example send code for execution to the server which then queries the database, enriches and aggregates the data and creates a report which gets written somewhere to a network location or directly sent via email attachment. That's why decryption must happen on the server side.

 

How I see this you need to solve two main challenges here:

1. How can you make the data available for SAS queries so that they remains useful?

2. What do you need to do so that also SAS processes are compliant?

 

As for 1

I can understand that the DW guys want to store the data at rest encrypted. What they "should" be doing is provide views which decrypt the data and then control who's got access to the decrypted columns in these views.

 

As for 2

That's in my mind going to be the bigger challenge. It's very simple and common to replicate/transform data to SAS tables for processing and though whatever decryption path gets chosen, you'll end up with non-decrypted sensitive data in - eventually permanent - SAS files. 

If these sensitive fields are just used for reporting - i.e. as classification variables - on a aggregated level then such reports can also get developed on development data. You would still need a business process for hand-over into production where these reports can be run against real decrypted data; and "someone" would need to create the development data if they don't exist already.

A bigger challenge would be if such sensitive fields are also used for analytics. There the modelers need the real unencrypted data and to comply with GDPR/PDP requirements you might end up with the need for a brand new locked down SAS environment for model development and testing (i.e. one where you can't send emails from, where you can't write to shared drives, where also the EG client is in an environment like Cytrix which you can lock-down, etc..). If this is what you need to solve then may-be contact SAS directly and get some of their consulting time as this must be due to GDPR compliance requirements something not only your company has to address.

Occasional Contributor
Posts: 5

Re: Dynamic data decryption

Thanks.

 

I understand your points. It is very clear.

 

The objective is also to make tis DW GDPR compliant, and the encryption/decryption solution is fit for purpose. The GDPR aspects you mention are also part of the project. It is also true that the solution is more transaction oriented.

 

But this is how SAS is being used at the moment in the company, for the project we are implementing.

 

The users are given access to SAS Enterprise Guide to execute SQL queries they write themselves. This is the use case. So the use is transactional. The decryption must happen on a transaction basis, when data are returned encrypted to the users.

 

(PS: having the data encrypted in the DW is a requirement. So having predefined decrypted view is not an option)

 

I agree that it is better that decryption happens on the server side.

 

In this case, do you think it is possible to have a generic approach based on modules, these modules calling the encryption/decryption APIs? I mean generic, because in this use case, every SQL queries are in scope, because they potentially embark sensitive data.

 

Many thanks

 

Gilles

 

 

Highlighted
Super Contributor
Posts: 487

Re: Dynamic data decryption

I am currently involved in a similar project. We do not deal with encrypting as much as tokenization or protection of data items as we call it. Data items can be protected and unprotected in many ways. Sometimes unprotection needs to be done only partially, eg. only first n digits of a social security number, part of a postal code or whatever. Everything on a need-to-know basis. So leaving everything up to the data warehouse is not always an option, especially when SAS plays an important role in the data logistics in and out of that dw. We have a dw in Teradata but data in SAS are also rather wide spread and they do not escape GDPR or similar rules.

 

Another aspect that hits home severely is data quality. One can live with having a SSN in mixed formats (num versus char) until protection/encryption comes in and values are no longer match and relationships between tables get lost. Solving these issues at the root can take years. Having the ability for SAS to participate in this protection game is VERY important. Having the API's encrypt()/decrypt()/protect()/unprotect() functions available in SAS datastep and SQL can be very helpful. Hence my advice to surface that API using mentioned tooling. 

 

The plethora of challenges that arises from data quality and authorizations make it too complicated to be addressed by just whipping up a couple of views and get on with it.

Ask a Question
Discussion stats
  • 8 replies
  • 266 views
  • 4 likes
  • 4 in conversation