Just did something like that. The solution I used went more or less like this:
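/* Lookup table mapping each real ID to its anonymized ID */
proc sql;
  create table anonymization_id (id num, anon_id num);
  create unique index id on anonymization_id(id);
quit;

data patient1_anon(drop=id) anonymization_id;
  set patient1;
  /* Look up the current ID in the lookup table via its index */
  modify anonymization_id key=id/unique nobs=n_id;
  if _iorc_ then do; /* nonzero _iorc_: assuming that the ID was not found */
    n_id+1;                   /* next unused anonymized ID */
    anon_id=n_id;
    output anonymization_id;  /* append the new ID pair to the lookup table */
    _error_=0;                /* clear the error flag set by the failed lookup */
  end;
  output patient1_anon;       /* patient record with anon_id, real ID dropped */
run;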
To anonymize the next patient table, use a similar data step but keep the same anonymization_id table (do not re-create it). You will end up with one table holding all the anonymizations used, plus anonymized versions of the patient data tables. Just remember to write or copy them to a permanent library, not WORK.
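For a second table, the step might look roughly like this. It is only a sketch: patient2 and patient2_anon are made-up names, and it assumes the ID variable in that table is also called id and that the PROC SQL step is not run again.

```sas
/* Hypothetical second table: patient2 and patient2_anon are assumed names */
data patient2_anon(drop=id) anonymization_id;
  set patient2;
  /* Reuse the existing lookup table, so IDs already seen keep their anon_id */
  modify anonymization_id key=id/unique nobs=n_id;
  if _iorc_ then do;          /* assuming that the ID was not found */
    n_id+1;
    anon_id=n_id;
    output anonymization_id;  /* only IDs not seen before are appended */
    _error_=0;
  end;
  output patient2_anon;
run;
```

IDs that already appeared in patient1 are found through the index and get the anon_id assigned earlier. Only IDs that are new in patient2 add rows to anonymization_id.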
If you need to anonymize the text variable NAME as well, you could just insert something like
name=cats('Dummy',anon_id);
before the second output statement; the patients' names will then be anonymized as well.
By using the same anonymization table for all the tables containing patients' IDs, you will get the same translation from real to anonymized ID in all the tables.