I have a dataset with almost more than 1000 subjects and they have subjid in the format of 303-101-120, 303-100-130, 303-103-140........ I need to mask the middle value with some other 3 digit random number and also the last 3 digit value with some other 3 digit value. Is there any possible way i can do that in few steps as its very hectic to change that manually. i have provided a sample code for an example, i have 1000+usubjid. Any help please
data ndsn;
infile datalines;
input subjid study site usubjid $15.;
datalines;
120 303 100 303-100-120
121 303 100 303-100-121
122 303 101 303-101-122
123 303 101 303-101-123
124 303 102 303-102-124
125 303 102 303-102-125
126 303 103 303-103-126
127 303 103 303-103-127
128 303 104 303-104-128
129 303 104 303-104-129
;
run;
In the code above, they will be assigned new random values.
We can control that with the Call Streaminit. If you run the code below now and in a year, the results will be the same
data want;
set ndsn;
call streaminit(123);
newid = catx('-', study, put(rand('integer', 1, 999), z3.),
put(rand('integer', 1, 999), z3.));
run;
When you say 'mask'.. Do you want the overwrite the original values with something else (like 'xxx') or do you want to create a format that does not change the original value, but can make the value appear different?
Here is a simple solution. This creates a new variable, so you can hide the 'old' one as you wish.
I simply generate two random integers and insert them with preceding zeros.
data want;
set ndsn;
newid = catx('-', study, put(rand('integer', 1, 999), z3.),
put(rand('integer', 1, 999), z3.));
run;
In the code above, they will be assigned new random values.
We can control that with the Call Streaminit. If you run the code below now and in a year, the results will be the same
data want;
set ndsn;
call streaminit(123);
newid = catx('-', study, put(rand('integer', 1, 999), z3.),
put(rand('integer', 1, 999), z3.));
run;
@Ravindra_ - If you want a masked key which is certain to be the same every time, then using a hash technique is a more reliable approach:
data test;
usubjid = '303-100-120';
usubjid_key = put(md5(cats('usubjid',usubjid)),$hex10.);
put _all_;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.