BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
EinarRoed
Pyrite | Level 9

We've got several large tables where the primary key consists of multiple char variables. These are used in multiple heavy joins. My example table uses these 5 chars as PK: CUSTOMER_ID, CUSTOMER_SYSTEM_ID, PRODUCT_ID, COVERAGE_ID, OBJECT_ID.

 

I'd like to create 1 numeric composite key instead. Primarily to make the joins run faster. But also because our reusable script (used to process multiple tables) can be simplified by always having 1 PK per table.

 

My plan is to append the char variables together, and then use MD5 & HEX to create the numeric composite key value. Like this:

 

data want;
	set have;
	attrib pk_char format=$200.;
	attrib pk_nr format=20.0;
	pk_char = CUSTOMER_ID || '_' || CUSTOMER_SYSTEM_ID || '_' || PRODUCT_ID || '_' || COVERAGE_ID || '_' || OBJECT_ID;
	pk_nr = input(put(md5(pk_char),$hex13.),hex13.);
run;

1) Is this a sensible approach, or do you have better suggestions?

2) I'm satisfied with format=$200 for the concatinated character string. This value should never be exceeded for any of the tables. An example value it might get today (from the example table) is "123684525218_562448_V48_26457_3". But I'm clueless about what hex format to go for. Would hex13 be fine for creating a uniqe numerical value for a string consisting of up to 200 symbols?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The MD5() function returns 128 bits.  So at 8 bits per byte that is 16 bytes.  That is waaaaaay too many to convert into a single number.  So leave it as character and use the $HEX32 format to display it.

data want;
  set have;
  attrib pk_char length=$200;
  attrib pk length=$16 format=$hex32.;
  pk_char = catx('_',of CUSTOMER_ID CUSTOMER_SYSTEM_ID PRODUCT_ID COVERAGE_ID OBJECT_ID);
  pk = md5(pk_char);
run;

Also there is no need to attach the $200 format to PK_CHAR. SAS does not need any special instructions for how to display display character strings.

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

The MD5() function returns 128 bits.  So at 8 bits per byte that is 16 bytes.  That is waaaaaay too many to convert into a single number.  So leave it as character and use the $HEX32 format to display it.

data want;
  set have;
  attrib pk_char length=$200;
  attrib pk length=$16 format=$hex32.;
  pk_char = catx('_',of CUSTOMER_ID CUSTOMER_SYSTEM_ID PRODUCT_ID COVERAGE_ID OBJECT_ID);
  pk = md5(pk_char);
run;

Also there is no need to attach the $200 format to PK_CHAR. SAS does not need any special instructions for how to display display character strings.

EinarRoed
Pyrite | Level 9
Very useful, I will go for your suggestion. Thanks!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 702 views
  • 1 like
  • 2 in conversation