<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Random Integer Number Generator to pair with ID numbers in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593189#M170196</link>
    <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/293211"&gt;@Beargrad04&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;Why do you need a 10-digit integer for your anonymous ID? Do you intend to do some calculations with it?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If not, and it is just an anonymous ID and nothing else, why muck around with random integers when you can simply associate the existing ID with its 16-byte MD5 digest:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;  anonID = put (md5 (cats(id)), $16.) ;
  format anonID $hex32. ;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;CATS is used just in case ID is numeric, so that the formula works for both numeric and character IDs. Attaching the $hex32. format to anonID is optional, but it makes anonID prettier to look at because the formatted image contains only hex digits (0-9, A-F) rather than a bunch of special characters.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This approach has a number of advantages:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Every ID will always have one and only one corresponding anonID and vice versa.&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;It's practically impossible to reconstruct ID from anonID, so if you send anonID to whomever it concerns, the real IDs are perfectly hidden.&lt;/LI&gt;
&lt;LI&gt;When you get a new ID you don't have anonID for yet, you don't have to do any lookups to make sure an already existing anonID is not reused.&lt;/LI&gt;
&lt;LI&gt;You don't even have to keep ID*anonID cross-reference because you can always regenerate the anonIDs from the IDs using the above formula.&amp;nbsp;&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;Due to the nature of MD5 (it's a one-way hash function), anonIDs will be highly random (if that matters and/or is part of your specs).&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 01 Oct 2019 19:17:07 GMT</pubDate>
    <dc:creator>hashman</dc:creator>
    <dc:date>2019-10-01T19:17:07Z</dc:date>
    <item>
      <title>Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593136#M170168</link>
      <description>&lt;P&gt;I'm trying to take my dataset of students with their ID and create a 10-digit integer "NEWid" that will allow me to create a key that pairs their "NEWid" with the student ID and make their data/info I disseminate anonymous.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've read through forums but nothing seems to be paired with what I'm looking for. Can anyone please help me?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 17:20:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593136#M170168</guid>
      <dc:creator>Beargrad04</dc:creator>
      <dc:date>2019-10-01T17:20:58Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593138#M170169</link>
      <description>&lt;P&gt;What have you tried (SAS code)?&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 17:24:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593138#M170169</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2019-10-01T17:24:18Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593162#M170182</link>
      <description>&lt;P&gt;&lt;A href="https://gist.github.com/statgeek/fd94b0b6e78815430c1340e8c19f8644" target="_blank"&gt;https://gist.github.com/statgeek/fd94b0b6e78815430c1340e8c19f8644&lt;/A&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/*This program demonstrates how to create a basic anonymized 
key for a unique identifier. Ensure you set the value in CALL
STREAMINIT()/RANDOM_SEED macro variable to ensure you can 
replicate the keys if needed*/

%let random_seed = 30;

*list of unique values;
proc sql; 
create table unique_list as
select distinct name
from sashelp.class;
quit;

*add random values;
data random_values;
set unique_list;
call streaminit(&amp;amp;random_seed.);
rand = rand('normal', 50, 10);
run;

*sort;
proc sort data=random_values;
by rand;
run;

*Assign ID to N, note this is a character format;
data ID_key_pair;
set random_values;
label = put(_n_, z5.);

fmtname = 'anon_fmt';
type='C';
start=name;
run;

*Create a format;
proc format cntlin=id_key_pair;
run;

*Create dataset with anonymized IDs;
data want;
set sashelp.class;
RandomID = put(name, $anon_fmt.);
*drop name;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 01 Oct 2019 18:08:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593162#M170182</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-10-01T18:08:55Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593170#M170188</link>
      <description>&lt;P&gt;I hadn't because I didn't know where to start tbh.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I currently have an excel spreadsheet that I will then import and pair with each cohort. It takes longer but it's the only way I know that works and I was trying to cut corners to expedite my process.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have about 30000 people per dataset.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 18:27:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593170#M170188</guid>
      <dc:creator>Beargrad04</dc:creator>
      <dc:date>2019-10-01T18:27:50Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593172#M170190</link>
      <description>&lt;P&gt;This is confusing to me because I am not an expert in SAS.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This seems like what I would like to do. My question is where do I set parameters for each dataset I have so that a new "NEWid" field is created, and is paired with the exact number of people in my data file? I will have about 30000 and 9 datasets.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 18:29:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593172#M170190</guid>
      <dc:creator>Beargrad04</dc:creator>
      <dc:date>2019-10-01T18:29:31Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593180#M170195</link>
      <description>&lt;P&gt;Did you run the code and see what was happening?&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/293211"&gt;@Beargrad04&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;This is confusing to me because I am not an expert in SAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This seems like what I would like to do. My question is where do I set parameters for each dataset I have so that a new "NEWid" field is created, and is paired with the exact number of people in my data file? I will have about 30000 and 9 datasets.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 18:53:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593180#M170195</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-10-01T18:53:07Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593189#M170196</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/293211"&gt;@Beargrad04&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;Why do you need a 10-digit integer for your anonymous ID? Do you intend to do some calculations with it?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If not, and it is just an anonymous ID and nothing else, why muck around with random integers when you can simply associate the existing ID with its 16-byte MD5 digest:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;  anonID = put (md5 (cats(id)), $16.) ;
  format anonID $hex32. ;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;CATS is used just in case ID is numeric, so that the formula works for both numeric and character IDs. Attaching the $hex32. format to anonID is optional, but it makes anonID prettier to look at because the formatted image contains only hex digits (0-9, A-F) rather than a bunch of special characters.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This approach has a number of advantages:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Every ID will always have one and only one corresponding anonID and vice versa.&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;It's practically impossible to reconstruct ID from anonID, so if you send anonID to whomever it concerns, the real IDs are perfectly hidden.&lt;/LI&gt;
&lt;LI&gt;When you get a new ID you don't have anonID for yet, you don't have to do any lookups to make sure an already existing anonID is not reused.&lt;/LI&gt;
&lt;LI&gt;You don't even have to keep ID*anonID cross-reference because you can always regenerate the anonIDs from the IDs using the above formula.&amp;nbsp;&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;Due to the nature of MD5 (it's a one-way hash function), anonIDs will be highly random (if that matters and/or is part of your specs).&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 19:17:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593189#M170196</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-10-01T19:17:07Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593190#M170197</link>
      <description>&lt;P&gt;FYI - MD5 hash in real life released data.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.theguardian.com/technology/2014/jun/27/new-york-taxi-details-anonymised-data-researchers-warn" target="_blank"&gt;https://www.theguardian.com/technology/2014/jun/27/new-york-taxi-details-anonymised-data-researchers-warn&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Panduragan realised that the medallion and licence numbers both have a very specific format. Medallions only take one of three formats – either 5X55, XX555 or XXX555 – while licences are all six-digit or seven-digit numbers starting with a five. That means that there are only 2m possible license numbers, and 22m possible medallion numbers.&lt;/P&gt;
&lt;P&gt;That let Panduragan reverse-engineer the anonymised data to find out which trips were carried out by which drivers, and in which taxis.&lt;STRONG&gt;&lt;FONT color="#000000"&gt; The data had been anonymised by hashing, a cryptographic function which is supposed to be "one-way": it's very easy to find the hash of a given piece of data, and very hard – mathematically impossible, in theory – to find the piece of data which resulted in a given hash &lt;/FONT&gt;(for instance, the MD5 hash, the particular type used by NYC, of the data "Alex" is a08372b70196c21a9229cf04db6b7ceb). &lt;/STRONG&gt;As the same piece of data always results in the same hash, such functions are frequently used to anonymise just this sort of data.&lt;/P&gt;
&lt;P&gt;But once Panduragan had narrowed the possible entries down to 24m different numbers, it was the matter of only minutes to determine which numbers were associated with which pieces of anonymised data.&lt;/P&gt;
&lt;P&gt;"Modern computers are fast: so fast that computing the 24m hashes took less than two minutes," he said. "It took a while longer to de-anonymise the entire dataset, but… [I] had it done within an hour.&lt;/P&gt;
&lt;P&gt;"There’s a ton of resources on NYC Taxi and Limousine commission, including a mapping from licence number to driver name, and a way to look up owners of medallions. I haven’t linked them here but it’s easy to find using a quick Google search… This anonymisation is so poor that anyone could, with less than two hours work, figure which driver drove every single trip in this entire dataset. It would even be easy to calculate drivers' gross income or infer where they live."&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000000"&gt;&lt;STRONG&gt;Paduragan points out that there are a number of ways that the city could have more successfully anonymised the data. The first is if they hadn't tried to be so smart: rather than going through the effort of hashing the data, if they had simply assigned random numbers to each licence plate, it would have been much more difficult to work backwards.&lt;/STRONG&gt; &lt;/FONT&gt;New York's Taxi and Limousine Commission was asked for comment, but didn't respond by publication time.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Tue, 01 Oct 2019 19:22:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593190#M170197</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-10-01T19:22:31Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593195#M170199</link>
      <description>&lt;P&gt;So on the one hand I have a method that is supposed to be beneficial; on the other hand that same method can be decrypted and is not as randomized as randomized digits I'm trying to find out how to do.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As it stands, it sounds as if the excel import process is the way to go so far.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I import a set of random ID numbers paired with my IDs I want made anonymous, then I merge the file with said IDs and data with the anonymous ID imported by the said ID. That way my anonymous ID is paired with both files correctly and synched up with the rest of the fields I want to report.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 19:47:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593195#M170199</guid>
      <dc:creator>Beargrad04</dc:creator>
      <dc:date>2019-10-01T19:47:48Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593196#M170200</link>
      <description>&lt;P&gt;By the way I appreciate all of your prompt responses and efforts to help. I know it can read a bit snippy on here, but on the contrary I'm very appreciative! Thank you all!&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 19:49:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593196#M170200</guid>
      <dc:creator>Beargrad04</dc:creator>
      <dc:date>2019-10-01T19:49:12Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593198#M170202</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"&lt;SPAN&gt;Panduragan realised that the medallion and licence numbers both have a very specific format. Medallions only take one of three formats – either 5X55, XX555 or XXX555 – while licences are all six-digit or seven-digit numbers starting with a five. That means that there are only 2m possible license numbers, and 22m possible medallion numbers ...&amp;nbsp;There’s a ton of resources on NYC Taxi and Limousine commission, including a mapping from licence number to driver name, and a way to look up owners of medallions."&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That's the key. If you don't know the nature of the original ID and/or its pool of possible distinct values is practically unlimited, good luck working it backwards. Plus, instead of using MD5, they could've used SHA256, which takes about 30 times longer to compute. Besides, the author admits that even if the anonymous IDs were picked randomly, he could still reconstruct the original IDs because he has so much information about their structure.&lt;/P&gt;
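&lt;P&gt;As a rough sketch of the SHA256 variant, assuming a SAS release in which the SHA256 function is available, and using SASHELP.CLASS with NAME standing in for the real ID (both are assumptions for illustration only):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  set sashelp.class (rename=(name=id)) ;  /* stand-in for the real file */
  length anonID $32 ;                     /* a SHA-256 digest is 32 bytes long */
  anonID = sha256 (cats (id)) ;           /* CATS so numeric IDs work too */
  format anonID $hex64. ;                 /* optional: show it as 64 hex digits */
run ;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The only changes from the MD5 formula are the function name, the 32-byte length and the $hex64. format.&lt;/P&gt;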
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another key is that he also knew that MD5 was used but once. Suppose that instead of using the single MD5(ID) they merely used MD5(MD5(MD5(ID))) and he didn't know the number of nestings (which, by the way, can be made varying depending on ID) ... you get the rest of the picture.&amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 19:52:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593198#M170202</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-10-01T19:52:08Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593200#M170203</link>
      <description>&lt;P&gt;I wouldn't call point 4. an advantage, unless you combine the ID with some secret password before calling MD5.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 22:44:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593200#M170203</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2019-10-01T22:44:39Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593214#M170208</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/293211"&gt;@Beargrad04&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;So on the one hand I have a method that is supposed to be beneficial; on the other hand that same method can be decrypted and is not as randomized as randomized digits I'm trying to find out how to do.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As it stands, it sounds as if the excel import process is the way to go so far.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So I import a set of random ID numbers paired with my IDs I want made anonymous, then I merge the file with said IDs and data with the anonymous ID imported by the said ID. That way my anonymous ID is paired with both files correctly and synched up with the rest of the fields I want to report.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Let's go through what &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;did, using an example data set.&lt;/P&gt;
&lt;P&gt;This step gets a unique list of names to pair with new IDs. If your data has no duplicate personal ID values, this step is not needed. BUT you really should verify that you have no duplicates before you start anything; with 30K+ records in an Excel sheet, how did you verify that there are none? You would use the name of your data set instead of SASHELP.CLASS.&lt;/P&gt;
&lt;PRE&gt;proc sql; 
create table unique_list as
select distinct name
from sashelp.class;
quit;
&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This step adds a random number with the RAND function. If your data has no duplicates, you could place the name of the data set that needs random numbers on the SET statement. There are different RAND function options: if you really need an integer, you could use the FLOOR function on the result of the RAND function, or something like rand('integer',1000000000,9999999999) to directly generate values with 10 significant digits.&lt;/P&gt;
&lt;P&gt;This is the FIRST step you need if your list is 100% unique.&lt;/P&gt;
&lt;PRE&gt;data random_values;
set unique_list;
call streaminit(&amp;amp;random_seed.);
rand = rand('normal', 50, 10);
run;&lt;/PRE&gt;
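&lt;P&gt;To illustrate the FLOOR remark above, here is a minimal sketch that draws a 10-digit integer per record. It assumes the UNIQUE_LIST data set from the first step; the names ID_KEY, NEWID and the hash object USED are made up for the example. Independent random draws can occasionally repeat, so the sketch redraws whenever a value has already been handed out.&lt;/P&gt;
&lt;PRE&gt;data id_key (keep=name NewID);
  if _n_ = 1 then do;
    call streaminit(30);             /* fixed seed so the key can be regenerated */
    declare hash used();             /* remembers the NewID values already assigned */
    used.definekey('NewID');
    used.definedone();
  end;
  set unique_list;
  do until (used.check() ne 0);      /* repeat the draw if this NewID is already taken */
    NewID = 1e9 + floor(rand('uniform') * 9e9);  /* 1000000000 to 9999999999 */
  end;
  used.add();
  format NewID 10.;
run;
&lt;/PRE&gt;
&lt;P&gt;With 10-digit values already in hand, the sort step that follows would only matter if you still wanted sequential ids in random order.&lt;/P&gt;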
&lt;P&gt;This scrambles your list of ids into a random order based on the random number result. It is needed here because Reeza's example does not actually have an existing ID number, only the name, and she wants to create an id that is not based on the alphabetical order of the names in the SASHELP.CLASS data set. The code below will create an "id" value based on the random order.&lt;/P&gt;
&lt;PRE&gt;proc sort data=random_values;
by rand;
run;
&lt;/PRE&gt;
&lt;P&gt;Important note at this time: You likely should make sure that the data set Random_values is stored in a permanent library as you might need it later.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This step adds the Id, a simple numeric order assigned after the sort. You would not need that if you already have an ID value. The additional pieces prepare for creating a Format, which is one of the ways SAS can look up values quickly. A format displays one value, the "START" value, as the text in "LABEL". In your case you would want the RAND value as the LABEL and your existing ID variable as the START. If your ID is actually a numeric value (a bad idea, but folks do that) then change Type='C' to Type='N' so SAS understands it will be manipulating a numeric value. This would be the SECOND step that you need.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data ID_key_pair;
set random_values;
label = put(_n_, z5.);

fmtname = 'anon_fmt';
type='C';
start=name;
run;
&lt;/PRE&gt;
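&lt;P&gt;For a numeric ID, the same step might look like the sketch below. The variable name STUDENT_ID is hypothetical, and the sketch assumes RAND already holds the 10-digit integer you want to hand out rather than the normal variate used in the example.&lt;/P&gt;
&lt;PRE&gt;data ID_key_pair;
set random_values;          /* assumed: one row per student_id, with rand already drawn */
fmtname = 'anon_fmt';       /* numeric formats are named without the $ prefix */
type = 'N';                 /* START is a numeric value */
start = student_id;         /* hypothetical name of your existing numeric ID */
label = put(rand, z10.);    /* LABEL is always character text */
keep fmtname type start label;
run;
&lt;/PRE&gt;
&lt;P&gt;With a numeric format the later lookup would be written as put(student_id, anon_fmt.), without the dollar sign.&lt;/P&gt;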
&lt;P&gt;This code actually creates the format. CNTLIN tells the procedure that the data set contains the instructions to create a format.&lt;/P&gt;
&lt;P&gt;If you have a permanent library associated with this project you could add LIBRARY= lib to create the format in that library. You would use the SAS option FMTSEARCH to add that library to the search path to make it useable. The default shown below would create the format in the WORK library and be found by default. BUT the proc format code shown would need to be rerun every time you want to use the format. The THIRD step.&lt;/P&gt;
&lt;PRE&gt;proc format cntlin=id_key_pair;
run;
&lt;/PRE&gt;
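&lt;P&gt;A sketch of the permanent-library variant described above. The libref PROJ and its path are placeholders, not something from your project.&lt;/P&gt;
&lt;PRE&gt;libname proj "/path/to/project";     /* placeholder path */

proc format cntlin=id_key_pair library=proj;   /* store the format in PROJ.FORMATS */
run;

options fmtsearch=(proj);            /* add PROJ to the format search path */
&lt;/PRE&gt;
&lt;P&gt;In a later session only the LIBNAME and OPTIONS statements would need to be rerun to use the stored format.&lt;/P&gt;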
&lt;P&gt;And the following code uses the PUT function to create the random Id based on the value of the START values used above. In the example that is the value of the Name variable.&lt;/P&gt;
&lt;PRE&gt;data want;
set sashelp.class;
RandomID = put(name, $anon_fmt.);
*drop name;
run;
&lt;/PRE&gt;
&lt;P&gt;Your code would look very similar, something like the code below. This is the LAST step needed to add the NewId to your data set. Note: you really do want to create a new data set. If you make a mistake you do not want to take the chance of destroying or corrupting your original data unless it is very easy to recreate &lt;STRONG&gt;exactly&lt;/STRONG&gt; as it was at the start.&lt;/P&gt;
&lt;PRE&gt;data want;
set yourdatasetname;
NewId = put(youridvariable, $anon_fmt.);

run;
&lt;/PRE&gt;
&lt;P&gt;I can't be any more specific with data set names or variables because you have not actually shared what you have.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now, a critical piece of information that you may have forgotten to mention: &lt;STRONG&gt;will you have to add new student ids to this list?&lt;/STRONG&gt; Because you do not want to restart this process with a full list unless you actually intend to replace the previous "newid" values; otherwise any previous reports that used the "newid" would no longer match the current lookup created from the combined old and new lists of students.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Oct 2019 21:53:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593214#M170208</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-10-01T21:53:51Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593239#M170225</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&amp;nbsp;Exactly. Salting is a must, and there is never a reason not to.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Oct 2019 02:16:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593239#M170225</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-10-02T02:16:03Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593247#M170227</link>
      <description>&lt;P&gt;Hey guys&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;The scenario in which I see the topic of this thread being needed is:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;I have a file with ID and DATA&lt;/LI&gt;
&lt;LI&gt;I want to send the file to a client with the ID encrypted as AnonID, as I don't want the client to see the real IDs&lt;/LI&gt;
&lt;LI&gt;After the file, processed in some way by the client, is sent back to me, I want to be able to pair AnonIDs I got back with the real IDs on my original file&lt;/LI&gt;
&lt;LI&gt;I can do it by either (a) keeping the ID*AnonID xref or (b) not keeping it but instead regenerating AnonIDs from the IDs on the original file and matching them up with what I got back&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Of course, adding a password to the ID before MD5-ing it costs nothing ... but pray tell me, wise guys, why I should do it under the above scenario? I own the file with the original real IDs. What will adding the password protect, and from whom?&lt;/P&gt;
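&lt;P&gt;For reference, the password variant under discussion might look like the sketch below. The macro variable SALT and its value are made up for the example, and SASHELP.CLASS with NAME renamed to ID stands in for the real file.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let salt = ReplaceWithYourOwnSecret ;     /* hypothetical secret, kept by the file owner */

data want ;
  set sashelp.class (rename=(name=id)) ;   /* stand-in for the real file */
  length anonID $16 ;
  anonID = md5 (cats ("&amp;amp;salt.", id)) ;      /* secret is prepended before hashing */
  format anonID $hex32. ;
run ;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Regenerating the anonIDs later requires the same secret, so it would have to be stored along with the original file.&lt;/P&gt;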
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Oct 2019 04:07:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593247#M170227</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-10-02T04:07:11Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593252#M170231</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;A nice recap and explanation; thanks 1e6 for that.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But let's step back a bit and look at what this program does and at what cost.&lt;/P&gt;
&lt;P&gt;What it does is merely the following:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Creates contiguous integers from 1 to the number of unique IDs on the file&lt;/LI&gt;
&lt;LI&gt;Assigns them randomly to the corresponding IDs&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;What it costs:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;First pass through the input file to get the list of the unique IDs and create file Unique_List&lt;/LI&gt;
&lt;LI&gt;Read Unique_List, generate a random variate for each record and create file Random_Values&lt;/LI&gt;
&lt;LI&gt;Sort Random_Values by the random variate&lt;/LI&gt;
&lt;LI&gt;Read Random_Values and create a CNTLIN= file ID_Key_Pair&lt;/LI&gt;
&lt;LI&gt;Read ID_Key_Pair to create a format pairing each unique value of ID with _N_ from the sorted file&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Besides, the sole purpose of formatting the output RandomID as Z5. is to circumvent the fact that no informat or format can pair a numeric variable to a numeric variable. Basically, using Z5. is tantamount to hard coding that will get busted as soon as the number of unique ID values exceeds 99,999.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;None of which would matter if the exact same thing couldn't be easily done via two passes through the input file and writing only one file in the process - that is, the output file WANT itself. But it can be; for example (note that HAVE below is created to represent a file with duplicate IDs):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                                   
  set sashelp.class (rename=name=id) ;        
  do _n_ = 1 to ceil (ranuni (1) * 3) ;       
    output ;                                  
  end ;                                       
run ;                                         
                                              
proc sql noprint ;                            
  select count (unique id) into :n from have ;
quit ;                                        
                                              
data want (drop = _:) ;                       
  array rr [&amp;amp;n] _temporary_ (1:&amp;amp;n) ;          
  if _n_ = 1 then do ;                        
    call streaminit (30) ;                    
    dcl hash h () ;                           
    h.definekey ("id") ;                      
    h.definedata ("randomID") ;               
    h.definedone () ;                         
  end ;                                       
  set have ;                                  
  if h.find() = 0 then return ;               
  _count + 1 ;                                
  _index = rand ("integer", &amp;amp;n - _count + 1) ;
  RandomID = rr [_index] ;                    
  rr [_index] = rr [&amp;amp;n - _count + 1] ;        
  h.add() ;                                   
run ;                                         
&lt;/CODE&gt;&lt;/PRE&gt;
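&lt;P&gt;The draw-and-swap step in the DATA step above can be watched in isolation. This is a minimal sketch, assuming a pool of just five integers and a SAS release that supports RAND("integer"), as the code above already requires; the names _last and picked are illustrative and not part of the code above. Each pass picks a random slot among the values still unassigned, then overwrites that slot with the last unassigned value, so no integer can be handed out twice.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_ ;
  array rr [5] _temporary_ (1 2 3 4 5) ;   /* pool of unassigned integers */
  call streaminit (30) ;
  do _count = 1 to 5 ;
    _last  = 5 - _count + 1 ;              /* number of values still unassigned */
    _index = rand ("integer", _last) ;     /* random slot among those values */
    picked = rr [_index] ;
    rr [_index] = rr [_last] ;             /* move the last unassigned value into the used slot */
    put _count= picked= ;
  end ;
run ;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Each of the integers 1 through 5 shows up exactly once in the log; only the order depends on the seed.&lt;/P&gt;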
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Oct 2019 05:21:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593252#M170231</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-10-02T05:21:49Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593269#M170240</link>
      <description>&lt;P&gt;Using a random number generator to get the anonymous IDs is one possibility. This is one of the few times I would use the deprecated RANUNI function, or rather the corresponding CALL routine. One of the reasons it is deprecated is that if you use it to generate random integers between 0 and 2^31-1, no value repeats before you have gone through all the possible values. That is not good in a random number generator, but it is ideal for your purpose, unless you have more than 2^31 (about 2 billion) subjects, and there aren't that many students in the world yet.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  set sashelp.class;
  id=_N_;
run;

data anonymous(drop=name id) translate_table(keep=id NewId);
  retain NewID 33; /* any positive integer less than 2^31 will do here */
  set have;
  call ranuni(NewID,_N_);
run;
&lt;/CODE&gt;&lt;/PRE&gt;
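&lt;P&gt;One way to check the no-repeat property on your own data is to compare the row count with the number of distinct NewID values. This is a minimal sketch, assuming the TRANSLATE_TABLE data set created above; the column aliases n_rows and n_unique are just illustrative names.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql ;
  select count (*)              as n_rows,
         count (distinct NewID) as n_unique
  from translate_table ;
quit ;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the two numbers match, every ID received its own NewID.&lt;/P&gt;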
&lt;P&gt;I used the _N_ variable as the "output" in the CALL RANUNI routine, as we are not interested in the "real" output at all. It is the seed (NewID) we want, which is a pseudo-random integer, guaranteed not to repeat itself for a long time.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Oct 2019 08:18:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593269#M170240</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2019-10-02T08:18:49Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593460#M170306</link>
      <description>&lt;P&gt;Adding a password will make it difficult for your client to identify the real IDs when he/she has access to a superset of the IDs (i.e. your sampling frame). Take, for example, a sample of students from a university campus. Without a password, it wouldn't be difficult for your client to feed all university student IDs to MD5 and identify the sampled students.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Oct 2019 16:38:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593460#M170306</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2019-10-02T16:38:50Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593468#M170310</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;Understood. Sure, this little (and basically free) precaution wouldn't hurt in such a case.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Oct 2019 16:55:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/593468#M170310</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-10-02T16:55:15Z</dc:date>
    </item>
    <item>
      <title>Re: Random Integer Number Generator to pair with ID numbers</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/759529#M240031</link>
      <description>&lt;P&gt;Hi Hashman:&lt;/P&gt;
&lt;P&gt;I am using your code and find it very useful for assigning a unique ID to each customer_no. My data has no duplicates (n=322,391; I sorted it and removed them), but after running your code the WANT data set has doubled in size (644,457), with duplicates. Any suggestions on why this might be happening? I am a beginner and still learning to understand your code. Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 04 Aug 2021 21:33:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Random-Integer-Number-Generator-to-pair-with-ID-numbers/m-p/759529#M240031</guid>
      <dc:creator>sasuser_sk</dc:creator>
      <dc:date>2021-08-04T21:33:02Z</dc:date>
    </item>
  </channel>
</rss>

