BookmarkSubscribeRSS Feed
HeatherNewton
Quartz | Level 8
string = COLLATE(1,256);
high4=substr(string,55);
APPLICANT_ID=translate(sup_id_no,high4,string);
sup_id_no=translate(sup_id_no,high4,string);
EMAIL=translate(EMAIL,high4,string);

looks like the above code is doing some masking or decryption... am I right? What exactly are they trying to do?  Are they reversible?

 

8 REPLIES 8
LinusH
Tourmaline | Level 20

I don't have SAS at my fingertips to test this, but, yes, it looks like some kind of masking.

I think you need more advanced algorithms to call it encryption 🙂

It looks like that you are shifting characters to another character 55 positions away in the ASCII collating sequence.

I guess it would be reversible by shifting it back.

So from a security perspective, it doensn't really meet up to any standards.

Data never sleeps
ChrisHemedinger
Community Manager

Agree with @LinusH. The result looks like some "unprintable" character values, but you can see for yourself by examining the hexadecimal view:

 

data orig;
 sup_id_no = "1234567";
 EMAIL = "SUPPLIER@company.com";
 put sup_id_no= $hex256. email= $hex256.;
run;

data shift(keep=sup_id_no EMAIL);
  set orig;
  string = COLLATE(1,256);
  high4=substr(string,55);
  sup_id_no=translate(sup_id_no,high4,string);
  EMAIL=translate(EMAIL,high4,string);
  put sup_id_no= $hex256. email= $hex256.;
run;

 

Before:
sup_id_no=31323334353637 EMAIL=535550504C49455240636F6D70616E792E636F6D

After:
sup_id_no=6768696A6B6C6D EMAIL=898B8686827F7B887699A5A3A697A4AF6499A5A3

It hides the values from clear-text viewing, but it's not encryption per se.

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.
HeatherNewton
Quartz | Level 8

if I have these kind of masking in sas dataset and I am converting sas dataset to csv and have csv load into DB2

 

do I need to apply encoding from sas dataset to csv

 

also encoding from csv to data in db2?

 

I tried without in both steps and the data in DB2 cannot show though it get loaded in successfully

and in csv it looks like alphabets and chinese characters

 

what do I need to do, please assist

thanks

Kurt_Bremser
Super User

Get in touch with the people responsible for privacy/data security within your organization. Set up a masking/encryption process which complies with your organization's rules (this current masking is much too easy to crack to be of any real value) and does not produce random unreadable characters (what you see is the UTF representation of multi-byte sequences you create). Then apply this process in your new data warehouse setup.

HeatherNewton
Quartz | Level 8

yes we are doing it properly but still need to input production data into new database (for SIT purpose) where I encounter encoding problem. I saw in sas document, collate will output some european language when value is over a certain value and it is kind of the ISO8859 standard. How can I know what code page to use when loading these data into DB2?                        

 

andreas_lds
Jade | Level 19

If the definition of high4 is known, reversing the masking is easy: just use translate and swap "high4" and "string".

HeatherNewton
Quartz | Level 8
Does these steps by any chance change the encoding to something not utf-8 ? After i loaded data in db2 using codepage 1208, still some characters cannot show even if I gave it lots of space.. what could the tranaformation collate, high4 have done?
Patrick
Opal | Level 21

As already stated by others this is a rather poor masking algorithm that likely doesn't meet your companies compliance standards.

 

The algorithm just shifts characters ending up with target characters that might be not printable. And if they are not printable then they are not printable whether they are stored in SAS or DB2 and whatever encoding and codepage gets used (unless conversion is wrong).

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1029 views
  • 1 like
  • 6 in conversation