building compact unique id using A-Z,0-9

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 11
Accepted Solution

building compact unique id using A-Z,0-9

Hello,

I want to create a compact unique id value?  For example if  I have a dataset with a billion rows - I could create a variable named id with a value from 1-1,000,000,000

(0000000001 to 1000000000) .  But I don't want my variable to be $10 - I want my variable to be less than 10.  I was thinking of using {A-Z, 0-9} to come up with unique id combinations.

Therefore I would have 36 unique characters to build the id variable with. I then figured out that I  can create an id value of $6 length (36^6 = 2,176,782,336 unique 6 char combinations). The $5 would not provide at least 1 billion unique values.

I played around with arrays and macro lists but didn't get anywhere.  I am stuck on how to start building this new id variable - does anyone have ideas?

Julie


Accepted Solutions
Solution
‎02-24-2015 09:51 PM
Respected Advisor
Posts: 3,156

Re: building compact unique id using A-Z,0-9

Seems to me all you need is bunch of (in this case, 6) do-loops:

data _null_;

     length str $ 200;

     do i=48 to 57, 65 to 90;

           bt=byte(i);

           str=catx(',', str, quote(bt));

     end;

     call symputx('str', str);

run;

data uniq_id;

     length ID  $ 6;

     do i1=&str.;

           do i2=&str.;

                do i3=&str.;

                     do i4=&str.;

                           do i5=&str.;

                                do i6=&str.;

                                     id=cats(of i1-i6);

                                     output;

                                end;

                           end;

                     end;

                end;

           end;

     end;

run;


On my 7-years old box, it shows:

NOTE: The data set WORK.UNIQ_ID has 2176782336 observations and 7 variables.

NOTE: DATA statement used (Total process time):

      real time           5:18.71

      cpu time            4:57.10



View solution in original post


All Replies
Community Manager
Posts: 2,891

Re: building compact unique id using A-Z,0-9

The UUIDGEN function meets some of your criteria.  See this blog post:

http://blogs.sas.com/content/sasdummy/2012/10/19/creating-a-somewhat-unique-id-using-the-uuidgen-fun...

data wastedGuids;
do x=1 to 10;
  guid = compress(uuidgen(),
'-');
  output;
end;
run;

/* OUTPUT of 10 records:

131f5c0e12344f8b9c3a02014663677a

afb7719250bd446b8c7d1285b9f773bb

18f3332aff7448a4bc2e3a41eb1653f8

223ce6024cd74038bfc9a5d2523524d1

efb82bb00d374181b36111e4ad7150db

9c412af5fb7140d4995ff25aec13969c

8dfd6b1b7d3045d887d7481a3faa4548

125250a10f814aa2b72ac5f727049e0b

5bd93cc573524a6589bead495f301f3d

7f3c6f4517f141e381d4f2c4a0c4bbb3

*/

The downside is that the length will be $32 -- that's longer than you asked for.  But the result is guaranteed unique and you've saved your own brainpower.

Another option is to use a HEX8. format, more concise than Decimal:

data last10_IDs;
do x=999999990 to 1000000000;
  id = put (x,
hex8.);
  output;
end;
run;

       x           id

   999999990    3B9AC9F6

   999999991    3B9AC9F7

   999999992    3B9AC9F8

   999999993    3B9AC9F9

   999999994    3B9AC9FA

   999999995    3B9AC9FB

   999999996    3B9AC9FC

   999999997    3B9AC9FD

   999999998    3B9AC9FE

   999999999    3B9AC9FF

  1000000000    3B9ACA00

Chris

Occasional Contributor
Posts: 11

Re: building compact unique id using A-Z,0-9

Thank you.  This really helpful to know - I have never come across the UUIDGEN function before.

Julie

Solution
‎02-24-2015 09:51 PM
Respected Advisor
Posts: 3,156

Re: building compact unique id using A-Z,0-9

Seems to me all you need is bunch of (in this case, 6) do-loops:

data _null_;

     length str $ 200;

     do i=48 to 57, 65 to 90;

           bt=byte(i);

           str=catx(',', str, quote(bt));

     end;

     call symputx('str', str);

run;

data uniq_id;

     length ID  $ 6;

     do i1=&str.;

           do i2=&str.;

                do i3=&str.;

                     do i4=&str.;

                           do i5=&str.;

                                do i6=&str.;

                                     id=cats(of i1-i6);

                                     output;

                                end;

                           end;

                     end;

                end;

           end;

     end;

run;


On my 7-years old box, it shows:

NOTE: The data set WORK.UNIQ_ID has 2176782336 observations and 7 variables.

NOTE: DATA statement used (Total process time):

      real time           5:18.71

      cpu time            4:57.10



Occasional Contributor
Posts: 11

Re: building compact unique id using A-Z,0-9

You are a SAS Rock Star - This is exactly what I wanted.

Now that I see the code - it is perfectly clear how to do this.

It took over 8 minutes on my 2 month old dell.

Thanks,  Julie

Respected Advisor
Posts: 3,156

Re: building compact unique id using A-Z,0-9

Glad it worked out for ya. And don't feel bad about your Dell, which I assume it is for office use. So it may not be fair comparison. My box was a gaming setup, it has 4 core cpu and 8G ram, that's a pretty big deal back to 7 years ago when mainstream market started promoting 2xcore CPU and 2G RAM machines.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 505 views
  • 5 likes
  • 3 in conversation