## building compact unique id using A-Z,0-9

Solved
Occasional Contributor
Posts: 15

# building compact unique id using A-Z,0-9

Hello,

I want to create a compact unique id value?  For example if  I have a dataset with a billion rows - I could create a variable named id with a value from 1-1,000,000,000

(0000000001 to 1000000000) .  But I don't want my variable to be \$10 - I want my variable to be less than 10.  I was thinking of using {A-Z, 0-9} to come up with unique id combinations.

Therefore I would have 36 unique characters to build the id variable with. I then figured out that I  can create an id value of \$6 length (36^6 = 2,176,782,336 unique 6 char combinations). The \$5 would not provide at least 1 billion unique values.

I played around with arrays and macro lists but didn't get anywhere.  I am stuck on how to start building this new id variable - does anyone have ideas?

Julie

Accepted Solutions
Solution
‎02-24-2015 09:51 PM
Posts: 3,167

## Re: building compact unique id using A-Z,0-9

Seems to me all you need is bunch of (in this case, 6) do-loops:

data _null_;

length str \$ 200;

do i=48 to 57, 65 to 90;

bt=byte(i);

str=catx(',', str, quote(bt));

end;

call symputx('str', str);

run;

data uniq_id;

length ID  \$ 6;

do i1=&str.;

do i2=&str.;

do i3=&str.;

do i4=&str.;

do i5=&str.;

do i6=&str.;

id=cats(of i1-i6);

output;

end;

end;

end;

end;

end;

end;

run;

On my 7-years old box, it shows:

NOTE: The data set WORK.UNIQ_ID has 2176782336 observations and 7 variables.

NOTE: DATA statement used (Total process time):

real time           5:18.71

cpu time            4:57.10

All Replies
Community Manager
Posts: 3,424

## Re: building compact unique id using A-Z,0-9

The UUIDGEN function meets some of your criteria.  See this blog post:

http://blogs.sas.com/content/sasdummy/2012/10/19/creating-a-somewhat-unique-id-using-the-uuidgen-fun...

data wastedGuids;
do x=1 to 10;
guid = compress(uuidgen(),
'-');
output;
end;
run;

/* OUTPUT of 10 records:

131f5c0e12344f8b9c3a02014663677a

afb7719250bd446b8c7d1285b9f773bb

18f3332aff7448a4bc2e3a41eb1653f8

223ce6024cd74038bfc9a5d2523524d1

9c412af5fb7140d4995ff25aec13969c

8dfd6b1b7d3045d887d7481a3faa4548

125250a10f814aa2b72ac5f727049e0b

7f3c6f4517f141e381d4f2c4a0c4bbb3

*/

The downside is that the length will be \$32 -- that's longer than you asked for.  But the result is guaranteed unique and you've saved your own brainpower.

Another option is to use a HEX8. format, more concise than Decimal:

data last10_IDs;
do x=999999990 to 1000000000;
id = put (x,
hex8.);
output;
end;
run;

x           id

999999990    3B9AC9F6

999999991    3B9AC9F7

999999992    3B9AC9F8

999999993    3B9AC9F9

999999994    3B9AC9FA

999999995    3B9AC9FB

999999996    3B9AC9FC

999999997    3B9AC9FD

999999998    3B9AC9FE

999999999    3B9AC9FF

1000000000    3B9ACA00

Chris

Occasional Contributor
Posts: 15

## Re: building compact unique id using A-Z,0-9

Thank you.  This really helpful to know - I have never come across the UUIDGEN function before.

Julie

Solution
‎02-24-2015 09:51 PM
Posts: 3,167

## Re: building compact unique id using A-Z,0-9

Seems to me all you need is bunch of (in this case, 6) do-loops:

data _null_;

length str \$ 200;

do i=48 to 57, 65 to 90;

bt=byte(i);

str=catx(',', str, quote(bt));

end;

call symputx('str', str);

run;

data uniq_id;

length ID  \$ 6;

do i1=&str.;

do i2=&str.;

do i3=&str.;

do i4=&str.;

do i5=&str.;

do i6=&str.;

id=cats(of i1-i6);

output;

end;

end;

end;

end;

end;

end;

run;

On my 7-years old box, it shows:

NOTE: The data set WORK.UNIQ_ID has 2176782336 observations and 7 variables.

NOTE: DATA statement used (Total process time):

real time           5:18.71

cpu time            4:57.10

Occasional Contributor
Posts: 15

## Re: building compact unique id using A-Z,0-9

You are a SAS Rock Star - This is exactly what I wanted.

Now that I see the code - it is perfectly clear how to do this.

It took over 8 minutes on my 2 month old dell.

Thanks,  Julie

Posts: 3,167