BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
EinarRoed
Pyrite | Level 9

We've got several large tables where the primary key consists of multiple char variables. These are used in multiple heavy joins. My example table uses these 5 chars as PK: CUSTOMER_ID, CUSTOMER_SYSTEM_ID, PRODUCT_ID, COVERAGE_ID, OBJECT_ID.

 

I'd like to create 1 numeric composite key instead. Primarily to make the joins run faster. But also because our reusable script (used to process multiple tables) can be simplified by always having 1 PK per table.

 

My plan is to append the char variables together, and then use MD5 & HEX to create the numeric composite key value. Like this:

 

data want;
	set have;
	attrib pk_char format=$200.;
	attrib pk_nr format=20.0;
	pk_char = CUSTOMER_ID || '_' || CUSTOMER_SYSTEM_ID || '_' || PRODUCT_ID || '_' || COVERAGE_ID || '_' || OBJECT_ID;
	pk_nr = input(put(md5(pk_char),$hex13.),hex13.);
run;

1) Is this a sensible approach, or do you have better suggestions?

2) I'm satisfied with format=$200 for the concatinated character string. This value should never be exceeded for any of the tables. An example value it might get today (from the example table) is "123684525218_562448_V48_26457_3". But I'm clueless about what hex format to go for. Would hex13 be fine for creating a uniqe numerical value for a string consisting of up to 200 symbols?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The MD5() function returns 128 bits.  So at 8 bits per byte that is 16 bytes.  That is waaaaaay too many to convert into a single number.  So leave it as character and use the $HEX32 format to display it.

data want;
  set have;
  attrib pk_char length=$200;
  attrib pk length=$16 format=$hex32.;
  pk_char = catx('_',of CUSTOMER_ID CUSTOMER_SYSTEM_ID PRODUCT_ID COVERAGE_ID OBJECT_ID);
  pk = md5(pk_char);
run;

Also there is no need to attach the $200 format to PK_CHAR. SAS does not need any special instructions for how to display display character strings.

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

The MD5() function returns 128 bits.  So at 8 bits per byte that is 16 bytes.  That is waaaaaay too many to convert into a single number.  So leave it as character and use the $HEX32 format to display it.

data want;
  set have;
  attrib pk_char length=$200;
  attrib pk length=$16 format=$hex32.;
  pk_char = catx('_',of CUSTOMER_ID CUSTOMER_SYSTEM_ID PRODUCT_ID COVERAGE_ID OBJECT_ID);
  pk = md5(pk_char);
run;

Also there is no need to attach the $200 format to PK_CHAR. SAS does not need any special instructions for how to display display character strings.

EinarRoed
Pyrite | Level 9
Very useful, I will go for your suggestion. Thanks!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1576 views
  • 1 like
  • 2 in conversation