I thought I'd share a technique for anonymizing Personally Identifiable Information (PII) while maintaining field lengths, by replacing each value with a random number that has a fixed number of digits.
I found this macro at https://blogs.sas.com/content/iml/2015/10/05/random-integers-sas.html
%macro RandBetween(min, max);
(&min + floor((1+&max-&min)*rand("uniform")))
%mend;
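For anyone who wants to see why the field length is preserved: rand("uniform") returns a value strictly between 0 and 1, so the macro yields an integer from min to max inclusive, and a range of 10**(n-1) to (10**n)-1 always has n digits. Here is a minimal sketch that checks this for the 13-digit case. The seed, the loop count, and the variable names id and len are my own choices for illustration, not part of the original post.

```sas
/* Same macro as above, repeated so this snippet runs on its own */
%macro RandBetween(min, max);
(&min + floor((1+&max-&min)*rand("uniform")))
%mend;

data _null_;
  call streaminit(12345);                    /* arbitrary seed, only to make the run repeatable */
  do i = 1 to 5;
    id  = %RandBetween(10**12, (10**13)-1);  /* random integer in the 13-digit range */
    len = length(strip(put(id, 16.)));       /* count the digits; should be 13 every time */
    put id= 16. len=;                        /* write both values to the log */
  end;
run;
```

The log should show len=13 on every line, whatever the seed.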
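This is what I came up with to randomize PII (claim numbers, Medicaid IDs, NPIs). For an n-digit field the range runs from 10**(n-1) to (10**n)-1, so set the exponents to match each field's length: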
proc sql;
  create table foon as
  SELECT %RandBetween(10**12, (10**13)-1) as CLCL_ID,        /* random 13-digit number */
         t1.hcas,
         t1.hcss,
         t1.hcep,
         t1.hcaps,
         %RandBetween(10**13, (10**14)-1) as MEDICAID_ID,    /* random 14-digit number */
         t1.lbpi,
         %RandBetween(10**11, (10**12)-1) as BILL_PROV_NPI,  /* random 12-digit number */
         %RandBetween(10**11, (10**12)-1) as SERV_PROV_NPI,  /* random 12-digit number */
         t1.MIN_of_Line_FROM_DT,
         t1.MAX_of_Line_TO_DT,
         t1.CLCL_RECD_DT,
         t1.dcabtmco,
         t1.dcpbtm,
         t1.SUM_of_CDML_CHG_AMT,
         t1.SUM_of_CDML_PR_PYMT_AMT
  FROM WORK.FOO t1;
QUIT;
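One thing to keep in mind: each value is drawn independently for every row, so the same original ID appearing on several rows will get a different random number each time, and two rows could in principle end up with the same random ID. If that matters for your use, a quick sanity check along these lines may help. It assumes the foon table created by the query above; the column aliases n_rows and n_distinct_clcl_id are just names I picked.

```sas
/* Compare the row count with the number of distinct random claim IDs. */
/* Assumes the foon table created by the query above already exists.   */
proc sql;
  select count(*)                as n_rows,
         count(distinct CLCL_ID) as n_distinct_clcl_id
  from foon;
quit;
```

If the two counts differ, at least one random CLCL_ID was generated more than once.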
Just thought I'd pass it on...maybe it'll be of use to others.
Thank you for this contribution.
To make it more useful, you could change the title to something like:
Here is a method to anonymise Personally Identifiable Information (PII)
This highlights that you are proposing a solution (rather than asking a question), and explains PII, which will not be a term familiar to everyone.
Thanks again 🙂
In the meantime a good title is the best way imho.