I have a dataset with almost more than 1000 subjects and they have subjid in the format of 303-101-120, 303-100-130, 303-103-140........ I need to mask the middle value with some other 3 digit random number and also the last 3 digit value with some other 3 digit value. Is there any possible way i can do that in few steps as its very hectic to change that manually. i have provided a sample code for an example, i have 1000+usubjid. Any help please
data ndsn;
infile datalines;
input subjid study site usubjid $15.;
datalines;
120 303 100 303-100-120
121 303 100 303-100-121
122 303 101 303-101-122
123 303 101 303-101-123
124 303 102 303-102-124
125 303 102 303-102-125
126 303 103 303-103-126
127 303 103 303-103-127
128 303 104 303-104-128
129 303 104 303-104-129
;
run;
In the code above, they will be assigned new random values.
We can control that with the Call Streaminit. If you run the code below now and in a year, the results will be the same
data want;
set ndsn;
call streaminit(123);
newid = catx('-', study, put(rand('integer', 1, 999), z3.),
put(rand('integer', 1, 999), z3.));
run;
When you say 'mask'.. Do you want the overwrite the original values with something else (like 'xxx') or do you want to create a format that does not change the original value, but can make the value appear different?
Here is a simple solution. This creates a new variable, so you can hide the 'old' one as you wish.
I simply generate two random integers and insert them with preceding zeros.
data want;
set ndsn;
newid = catx('-', study, put(rand('integer', 1, 999), z3.),
put(rand('integer', 1, 999), z3.));
run;
In the code above, they will be assigned new random values.
We can control that with the Call Streaminit. If you run the code below now and in a year, the results will be the same
data want;
set ndsn;
call streaminit(123);
newid = catx('-', study, put(rand('integer', 1, 999), z3.),
put(rand('integer', 1, 999), z3.));
run;
@Ravindra_ - If you want a masked key which is certain to be the same every time, then using a hash technique is a more reliable approach:
data test;
usubjid = '303-100-120';
usubjid_key = put(md5(cats('usubjid',usubjid)),$hex10.);
put _all_;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.