HeatherNewton
Quartz | Level 8
Hi, regarding production data masking: how do people do it in SAS? Is there material about generating random numbers with a particular seed, etc.?
8 REPLIES 8
Kurt_Bremser
Super User

When you need to consistently anonymize data, you need to create a lookup dataset which contains the translation, and add new observations when new items arrive with new data. The best tool for this is the hash object, as long as the lookup can fit into memory.

For a practical example, show us some usable example data where you need to mask one or more columns.

HeatherNewton
Quartz | Level 8
Let's say I want to mask credit card number, account no, date of birth, name, address, email address, phone no. Ideally the hashed result keeps its data type and format, so no additional changes to schemas etc. are required.
SASKiwi
PROC Star

There's an official standard for the masking and encryption of credit card data called the Payment Card Industry Data Security Standard (PCI DSS). Here is a useful link if you want to know more: https://listings.pcisecuritystandards.org/documents/PCI_DSS-QRG-v3_2_1.pdf

 

Please bear in mind that masking is different from encryption. Masking just hides part or all of a data value, while encryption applies a complex algorithm to convert the data value into something completely different that cannot be easily reversed. The PCI DSS masking standard is to display at most the first 6 and last 4 digits of a credit card number (which is normally 16 digits long). This is why you often see something like 1234 56** **** 1234 on printed credit card payment receipts.
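
To illustrate the first-6/last-4 masking described above, here is a minimal sketch. The dataset name, variable names and card numbers are made up for the example, and it assumes the card number is stored as a 16-character string without spaces.

```sas
/* Hypothetical sample data: these card numbers are invented. */
data cards_masked;
  input card_no $16.;
  length masked $16;
  masked = card_no;
  /* Keep characters 1-6 and 13-16, overwrite the middle six with asterisks. */
  /* REPEAT('*', 5) returns the original character plus 5 copies = 6 stars.  */
  if length(card_no) = 16 then substr(masked, 7, 6) = repeat('*', 5);
  /* Unexpected length: hide the whole value rather than guess positions. */
  else masked = repeat('*', 15);
  datalines;
1234567890121234
4000123412345678
;

proc print data=cards_masked;
run;
```

The masked variable keeps the type and length of the source column, but the original value cannot be recovered from it, so this only suits cases where the middle digits are never needed again.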

LinusH
Tourmaline | Level 20

Sounds like you are more into tokenization/encryption rather than masking (hiding characters).

I don't know where you work, but at larger organisations, chances are that there are already functions in place to do this, in initiatives for creating test data or protecting production data. Maybe you could look around?

 

For masking, some SW vendors offer this OOTB, like in SAS Federation Server, or Snowflake to mention a few.

 

If you need to solve this yourself, there are functions in SAS that you could use, like the different hash functions (md5, sha256 etc). These are not format preserving (meaning you need to change your table schema and potentially the programs that use this data). If you need format preservation, I suggest looking for software that does this for you (Fortanix is one).
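
A small sketch of why the hash functions are not format preserving, using the MD5 and SHA256 functions with made-up account numbers (the dataset and variable names are assumptions for illustration only):

```sas
/* Hypothetical sample data: a 10-character account number column. */
data hashed;
  length acct_no $10 md5_hex $32 sha_hex $64;
  input acct_no $;
  /* MD5 returns 16 bytes of binary data; $HEX32. renders them as 32 hex characters. */
  md5_hex = put(md5(strip(acct_no)), $hex32.);
  /* SHA256 returns 32 bytes; $HEX64. renders them as 64 hex characters. */
  sha_hex = put(sha256(strip(acct_no)), $hex64.);
  datalines;
A100234
B200987
;

proc print data=hashed;
run;
```

The same input always gives the same digest, so joins on the masked column still work, but a $10 source column would have to grow to $32 or $64 to hold the result.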

 

Data never sleeps
Patrick
Opal | Level 21

@LinusH The challenge I've always faced with masking approaches using md5/sha or similar is that the masked string very often doesn't fit into the source variable's length.

I normally use the approach @Kurt_Bremser proposes: not only is it really simple to increment a sequence number, it's also a suitable approach for numerical variables, and it doesn't require any changes to variable attributes (type and length).
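
A minimal sketch of that lookup approach with a hash object and a sequence number. All dataset names, variable names and values here are hypothetical, and it assumes the lookup fits into memory.

```sas
/* Hypothetical sample data with a repeated account number. */
data have;
  input account_no :$10. amount;
  datalines;
ACC0000917 120.50
ACC0004412 75.00
ACC0000917 310.25
;

data want (drop=rc next_id real_no masked_no);
  if _n_ = 1 then do;
    length real_no masked_no $10;
    declare hash h();
    h.defineKey('real_no');
    h.defineData('real_no', 'masked_no');
    h.defineDone();
    call missing(real_no, masked_no);
  end;
  set have end=last;
  real_no = account_no;
  /* Reuse the existing translation if this value has been seen before. */
  rc = h.find();
  if rc ne 0 then do;
    /* New value: assign the next sequence number, same type and length. */
    next_id + 1;
    masked_no = cats('M', put(next_id, z9.));
    rc = h.add();
  end;
  account_no = masked_no;
  /* Save the translation table so it can be protected and reused later. */
  if last then rc = h.output(dataset: 'lookup');
run;
```

Both rows for ACC0000917 receive the same masked value, and account_no stays a $10 character variable. For repeated runs, one option would be to load the saved lookup through the hash object's dataset: argument and continue the counter from its highest existing value.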

Patrick
Opal | Level 21

@Ksharp Thanks for sharing these links. Really useful if I ever have to generate alphanumeric masked strings. 

...and what one of the blogs mentioned that I wouldn't have thought about for generating such strings: "Make sure there are no objectionable words in the set! " 😄 

Ksharp
Super User

Maybe @Rick_SAS knows a way to do this.
Here is a simple example from me, which adds an offset to every character.

 

 

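/* Shift each character's code by an offset that depends on its position. */
data have;
  infile cards truncover;
  input have $80.;
  want = have;
  do i = 1 to length(have);
    /* mod(i,5) cycles through 1,2,3,4,0, so every 5th character stays unchanged */
    substr(want, i, 1) = byte(rank(substr(have, i, 1)) + mod(i, 5));
  end;
  drop i;
cards;
Thanks for sharing these links.
Make sure there are no objectionable words in the set! 
relating to production data masking, how do people do it in SAS
;

proc print data=have;
run;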
