BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
EinarRoed
Pyrite | Level 9

We've got several large tables where the primary key consists of multiple char variables. These are used in multiple heavy joins. My example table uses these 5 chars as PK: CUSTOMER_ID, CUSTOMER_SYSTEM_ID, PRODUCT_ID, COVERAGE_ID, OBJECT_ID.

 

I'd like to create 1 numeric composite key instead. Primarily to make the joins run faster. But also because our reusable script (used to process multiple tables) can be simplified by always having 1 PK per table.

 

My plan is to append the char variables together, and then use MD5 & HEX to create the numeric composite key value. Like this:

 

data want;
	set have;
	attrib pk_char format=$200.;
	attrib pk_nr format=20.0;
	pk_char = CUSTOMER_ID || '_' || CUSTOMER_SYSTEM_ID || '_' || PRODUCT_ID || '_' || COVERAGE_ID || '_' || OBJECT_ID;
	pk_nr = input(put(md5(pk_char),$hex13.),hex13.);
run;

1) Is this a sensible approach, or do you have better suggestions?

2) I'm satisfied with format=$200 for the concatinated character string. This value should never be exceeded for any of the tables. An example value it might get today (from the example table) is "123684525218_562448_V48_26457_3". But I'm clueless about what hex format to go for. Would hex13 be fine for creating a uniqe numerical value for a string consisting of up to 200 symbols?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The MD5() function returns 128 bits.  So at 8 bits per byte that is 16 bytes.  That is waaaaaay too many to convert into a single number.  So leave it as character and use the $HEX32 format to display it.

data want;
  set have;
  attrib pk_char length=$200;
  attrib pk length=$16 format=$hex32.;
  pk_char = catx('_',of CUSTOMER_ID CUSTOMER_SYSTEM_ID PRODUCT_ID COVERAGE_ID OBJECT_ID);
  pk = md5(pk_char);
run;

Also there is no need to attach the $200 format to PK_CHAR. SAS does not need any special instructions for how to display display character strings.

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

The MD5() function returns 128 bits.  So at 8 bits per byte that is 16 bytes.  That is waaaaaay too many to convert into a single number.  So leave it as character and use the $HEX32 format to display it.

data want;
  set have;
  attrib pk_char length=$200;
  attrib pk length=$16 format=$hex32.;
  pk_char = catx('_',of CUSTOMER_ID CUSTOMER_SYSTEM_ID PRODUCT_ID COVERAGE_ID OBJECT_ID);
  pk = md5(pk_char);
run;

Also there is no need to attach the $200 format to PK_CHAR. SAS does not need any special instructions for how to display display character strings.

EinarRoed
Pyrite | Level 9
Very useful, I will go for your suggestion. Thanks!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 804 views
  • 1 like
  • 2 in conversation