06-24-2015 03:17 PM
For an analytics project I am working on, I want to import a CSV file from a local drive into a SAS AWS environment and at the same time encrypt some columns (with sensitive information) of that file. So I am looking for a piece of code that imports the CSV file and, in the same step, replaces the values of some columns with arbitrary values.
Does anyone have experience with this and can provide code showing how to do it?
Thanks in advance!
06-24-2015 05:18 PM
If you read the CSV file with a data step then you can add any other code to that data step you would like.
If you use PROC IMPORT, the generated data step code will be in the log, and you can copy and paste it into the editor to make changes.
Make any desired changes to informats and formats as indicated. Then add any variable creation or modification code after the INPUT statement.
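As a rough sketch of what such a data step can look like: the file contents, the variable names (cust_id, ssn, amount) and the rounding step below are all made up for illustration, and the code PROC IMPORT writes to your log will differ in detail.

```sas
/* Hypothetical sample file so the sketch runs on its own */
filename demo temp;

data _null_;
    file demo;
    put 'cust_id,ssn,amount';
    put '1001,123-45-6789,250.00';
    put '1002,987-65-4321,99.50';
run;

/* Read the CSV: DSD treats commas as delimiters, FIRSTOBS=2 skips the header row */
data work.customers;
    infile demo dsd firstobs=2 truncover;
    informat cust_id 8. ssn $11. amount 8.;
    format amount 10.2;
    input cust_id ssn amount;

    /* Variable creation or modification code goes after the INPUT statement */
    amount = round(amount, 1);
run;
```

For a real file you would point the INFILE statement at the actual path instead of the temporary fileref used here.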
Suppose I want to change a text field that looks like: 123-45-6789 and decide that keeping the last 7 characters is sufficient for any purpose (I know this isn't encrypted but that's a whole other issue).
I could add:
string = substr(string, 5); to keep the values from the 5th character onward.
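A small self-contained version of that idea, using inline data instead of a CSV file (the dataset name work.masked and the sample values are only illustrative):

```sas
data work.masked;
    length string $11;
    input string $;
    /* Keep everything from the 5th character onward */
    string = substr(string, 5);   /* '123-45-6789' becomes '45-6789' */
    datalines;
123-45-6789
987-65-4321
;
run;
```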
If the values are actually numeric then you can do any numeric operation you want.
Note that SAS includes a number of random number functions whose results can be added to, subtracted from, or otherwise combined with numeric values. You can also use random numbers to generate strings of characters with the BYTE function.
Issues to consider: Do you have values that must be transformed to the same value all the time? If so, random isn't the way to go. If you have a specific algorithm for modifications in mind, it can likely be translated into SAS DATA step code.
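One possible way to do both, as a sketch only: the seed, the noise range, the 6-character code length and all variable names are assumptions picked for the example, and the letter codes 65 to 90 assume an ASCII system.

```sas
data work.scrambled;
    call streaminit(12345);            /* fixed seed so the run is repeatable */
    length rand_code $6;
    do id = 1 to 3;
        value = 100 * id;
        /* Add uniform noise between -5 and 5 to a numeric value */
        noisy_value = value + (rand('uniform') * 10 - 5);
        /* Build a string of 6 random uppercase letters;      */
        /* BYTE(65) through BYTE(90) are 'A' through 'Z' in ASCII */
        rand_code = '';
        do i = 1 to 6;
            substr(rand_code, i, 1) = byte(65 + floor(rand('uniform') * 26));
        end;
        output;
    end;
    drop i;
run;
```

Values produced this way are not tied to the original value, so they are not suitable for a column that has to be matched against another dataset later.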
06-25-2015 04:15 AM
Thanks for your comment!
Yes, the issue is that I have to encrypt the key column. I need to link this dataset at a later time to another dataset using the encrypted key column, so the encryption must always map the same value to the same result. So, if anyone has some code for this kind of encryption that I can put in either a DATA step or the PROC IMPORT code, that would be great!
09-19-2015 07:15 AM
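You could apply the SAS MD5 digest function to the original values to create new encrypted keys, then drop the plain-text ones in DATA steps once you've uploaded the plaintext key data. MD5 is a one-way hash rather than reversible encryption, but the same input always produces the same digest, so the hashed keys can still be used to link the datasets.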
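A minimal sketch of that approach, assuming a character key variable named key (the dataset name, variable names and sample values are hypothetical):

```sas
data work.keys_hashed;
    length key $11 key_md5 $32;
    input key $;
    /* MD5 returns 16 raw bytes; $HEX32. renders them as 32 hex characters. */
    /* STRIP removes leading and trailing blanks so padding does not change the digest. */
    key_md5 = put(md5(strip(key)), $hex32.);
    drop key;     /* remove the plain-text key from the output dataset */
    datalines;
123-45-6789
987-65-4321
123-45-6789
;
run;
```

The first and third rows end up with the same key_md5 value. Applying the identical expression to the key column of the other dataset should give matching values to join on, provided the keys are cleaned the same way in both places.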