BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ravindra_
Quartz | Level 8

I have a dataset with almost more than 1000 subjects and they have subjid in the format of 303-101-120, 303-100-130, 303-103-140........ I need to mask the middle value with some other 3 digit random number and also the last 3 digit value with some other 3 digit value. Is there any possible way i can do that in few steps as its very hectic to change that manually. i have provided a sample code for an example, i have 1000+usubjid. Any help please

data ndsn;
infile datalines;
input subjid  study  site  usubjid $15.;
datalines;
120 303 100 303-100-120
121 303 100 303-100-121
122 303 101 303-101-122
123 303 101 303-101-123
124 303 102 303-102-124
125 303 102 303-102-125
126 303 103 303-103-126
127 303 103 303-103-127
128 303 104 303-104-128
129 303 104 303-104-129
;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

In the code above, they will be assigned new random values. 

 

We can control that with the Call Streaminit. If you run the code below now and in a year, the results will be the same

 

data want;
   set ndsn;
   call streaminit(123);
   newid = catx('-', study, put(rand('integer', 1, 999), z3.), 
                            put(rand('integer', 1, 999), z3.));
run;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

When you say 'mask'.. Do you want the overwrite the original values with something else (like 'xxx') or do you want to create a format that does not change the original value, but can make the value appear different?

Ravindra_
Quartz | Level 8
The value should appear different, for example, if the usubjid is 303-100-104 then we need to make it appear as 403-200-304, few stake holders should not see the original value but the format of the variable should remain same.
PeterClemmensen
Tourmaline | Level 20

Here is a simple solution. This creates a new variable, so you can hide the 'old' one as you wish.

 

I simply generate two random integers and insert them with preceding zeros.

 

data want;
   set ndsn;
   newid = catx('-', study, put(rand('integer', 1, 999), z3.), 
                            put(rand('integer', 1, 999), z3.));
run;
Ravindra_
Quartz | Level 8
Thanks a lot for this, it really helps, but one last thing, if i run the same code again after few months when the number of subjects got increased, will the previous random number that was assigned to the old subject, will it be the same or will it get assigned with a new random value
PeterClemmensen
Tourmaline | Level 20

In the code above, they will be assigned new random values. 

 

We can control that with the Call Streaminit. If you run the code below now and in a year, the results will be the same

 

data want;
   set ndsn;
   call streaminit(123);
   newid = catx('-', study, put(rand('integer', 1, 999), z3.), 
                            put(rand('integer', 1, 999), z3.));
run;
Ravindra_
Quartz | Level 8
thanks a lot for this help. This had given me a solution i was looking for hours
PeterClemmensen
Tourmaline | Level 20

Then I'll charge you two hours only 😉 

 

Have a nice day.

SASKiwi
PROC Star

@Ravindra_  - If you want a masked key which is certain to be the same every time, then using a hash technique is a more reliable approach:

data test;
  usubjid = '303-100-120';
  usubjid_key = put(md5(cats('usubjid',usubjid)),$hex10.);
  put _all_;
run;

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1621 views
  • 1 like
  • 3 in conversation