Data have;
input cust_ID $;
length encode decode $ 200;
encode=put(cust_ID,$hex64.);
decode=input(encode,$hex64.);
cards;
33948398
54897733
98376333
;
run;
proc print;run;
Just make a new dataset that has the two id's.
The masked id can probably just be generated sequentially.
data masked_cust_id ;
set have;
by cust_id;
if first.cust_id;
masked_cust_id + 1 ;
keep cust_id masked_cust_id;
run;
Then when you want to send out the actual data merge the two and drop the real id.
data for_export ;
merge masked_cust_id real_data(in=in2);
by cust_id;
if in2;
drop cust_id;
run;
And when you get back data merge again and add back the real id.
data want ;
merge masked_cust_id for_import(in=in2);
by masked_cust_id;
if in2;
run;
Sorry, you didnt create data sets-Real_Data and For_Import
@Ronein wrote:
Thank you for the reply. However I forgot to mention that let's assume that it is forbidden by the rules to create such backup data set that convert fake customer id to real customer id. I would like to find other solution that convert real customer id to fake customer id and then can convert it back from fake customer id to real customer id,thanks
Then you are probably out of luck. You would need to create some type of algorithm for encrypting the values that you need to be able to reverse. That means someone could break it and convert the values back to the original ids.
It is probably safer to store the mapping in a secure location than to use a breakable algorithm.
Here is how we do it using the SAS MD5 function. This produces a unique and reproducable key that can't be (easily) unencrypted so you need to keep a table of the original and encrypted keys to match back to.
data want;
Cust_ID = '12345678';
Cust_ID_Encrypted = put(md5(Cust_ID),$hex16.);
put _all_;
run;
Okay , however when I get back the data set then I want to have the ability to add back the real cust_ID.
How can I do it?
You didn't add this step here and it is critical step
Data tbl_have;
input cust_ID X Y Z;
cards;
33948398 10 20 30
54897733 11 12 13
98376333 40 60 80
;
run;
data want;
set tbl_have;
Cust_ID_Encrypted = put(md5(Cust_ID),$hex16.);
run;
@Ronein - When they send back their data with Cust_ID_Encrypted, just match it back to the Want dataset which has both the encrypted and unencrypted customer IDs.
You either need to keep
- a translation dataset
or
- the encryption/decryption code and the key
in a safe location. Both methods therefore provide the same level of security. Since a randomly created key mapping kept in a dataset is much easier to implement, use this method.
I see that SHA256 and MD5 are for encryption without decryption.
I am looking for function that can do botyh encryption and decryption (So when I reciever the data set back I will be able to add the original customer ID)
SAS does not provide a two-way encryption/decryption function like AES, so you would have to roll your own for an implementation of one such (reasonably secure) algorithm.
Using the RAND function and keeping a mapping dataset in a safe place (read: the location of the original data) works, is easy to implement, and does not violate data security/protection.
Thanks,
1- I suspect that using RAND function to generate fake customer ID can lead to duplicate problem.
Please remember that Customer ID should be unique value for each customer .
Then, using sequence number as fake customer number can be better.
What do you think?
2-I understand that in SAS there is no built in function that provide a two-way encryption/decryption.
My question is maybe anyone created such algorithm via SAS code ?
Using a sequence weakens your encryption, as the original order becomes part of the encryption algorithm.
It also forces you to keep a mapping dataset, as the order may (will) change when new keys are inserted in your source data.
To prevent duplicates, you use the following method:
When you receive data back, use the mapping dataset in a hash with new key as key and original key as data.
To your question #2: I seriously doubt anyone has gone to the length of creating a reliable two-way encryption method, and if, they'll want money for their work.
Data have;
input cust_ID $;
length encode decode $ 200;
encode=put(cust_ID,$hex64.);
decode=input(encode,$hex64.);
cards;
33948398
54897733
98376333
;
run;
proc print;run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.