wedens
Calcite | Level 5

I have a dataset of values stored as $hex32 character variables. Is there any way to copy the displayed HEX value to a string?

 

For example, the displayed HEX value is "FF938C281FEB8DC357B37032976326A3". I would like that exact string (not the underlying unprintable data) in a character variable that I can then manipulate using substr(). I have searched around and can't figure out a way to do this. Any help would be appreciated. Thanks.

Accepted Solutions
ballardw
Super User

Please consider:

data work.test;
   length hex $32.;
   format hex $hex32.;
/*   set book1;*/
   Year=2019;
   State='GA';
   id = '123456';
   hex = md5(cats(year,state,id));
   /* PUT with $HEX32. turns the raw digest into 32 hex digits; SUBSTR then keeps the first 6 */
   newid = cats(state,mod(year,100),substr(put(hex,$hex32.),1,6));
run;

You didn't provide the BOOK1 set, so I hard-coded the values.

Note that your SUBSTR call was likely wrong: SUBSTR(hex,6) starts at position 6 and returns everything through the end of the string, and it operates on the raw digest bytes rather than on the displayed hex digits.

You likely want to set a length for NEWID as well. As is, it will default to 200 characters.
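
To illustrate the SUBSTR point, here is a minimal sketch on a made-up printable string (the variable names s, tail, and first6 are hypothetical and not from the original post). With two arguments SUBSTR returns everything from the start position onward; the third argument limits how many characters come back.

```sas
data _null_;
  s = 'ABCDEFGHIJ';
  tail   = substr(s, 6);     /* position 6 through the end of the string */
  first6 = substr(s, 1, 6);  /* six characters starting at position 1 */
  put tail= first6=;
run;
```

This should write tail=FGHIJ first6=ABCDEF to the log.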


13 REPLIES
Reeza
Super User

I think you're asking how to create a new variable with the formatted value.

 

newVar = PUT(oldVar, $hex32.);

PUT() will apply the format and convert it to a character value.
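
As a rough sketch of that idea, assuming the variable holds raw bytes such as an MD5 digest with the $HEX32. format attached (the names raw, hexstr, and prefix and the input literal are made up for illustration):

```sas
data _null_;
  length raw $16 hexstr $32 prefix $6;
  raw    = md5('2019GA123456');    /* 16 raw bytes, mostly unprintable */
  hexstr = put(raw, $hex32.);      /* 32 printable hex digits */
  prefix = substr(hexstr, 1, 6);   /* ordinary string functions now work */
  put hexstr= prefix=;
run;
```

Once the formatted value is stored in its own character variable, SUBSTR and the CAT functions operate on the hex digits rather than on the underlying bytes.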

 




 

hashman
Ammonite | Level 13

@Reeza :

Nope, it will not work.

It would if the OP needed to convert (print, display) some characters to their hex representation. But what is needed here is to read a string in its radix-16 representation (whose digits are the characters 0 to 9 and A to F) and convert it to the radix-256 representation (whose digits are all 256 characters in the collating sequence). Since reading in SAS is done using informats, that's what the doctor ordered here; and the conversion function used with informats is INPUT, not PUT. See what you get if you try to use the format instead of the informat:

data _null_ ;                
  c = "AB" ;                 
  x = put (c, $hex4.) ;      
  right = input (x, $hex4.) ;
  wrong =   put (x, $hex4.) ;
  put right= wrong= ;        
run ;                        
-------------------
right=AB wrong=3431

What we are getting here with the format instead of the correct original value "AB", is the first two characters of X (themselves hex digits, but the format doesn't care) in their 16-radix representation. 

 

Moreover, the $HEXw. format does nothing to evaluate the input to see if it is in the correct hex digit form. But the $HEXw. informat does: If its argument should contain anything but hex digits, it will set _ERROR_=1, write a note "Invalid argument to function INPUT" in the log (provided, of course, that the ?? modifier is not used with it), and set the response to a missing value (i.e. space in this case).    
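
A small sketch of that validation behavior, assuming an ASCII session (the literals and the names good and bad are invented for this example). The ?? modifier is used on the second call so the invalid value neither writes a note nor sets _ERROR_:

```sas
data _null_;
  length good bad $2;
  good = input('4142', $hex4.);      /* valid hex digits: bytes 41 and 42 */
  bad  = input('41ZZ', ?? $hex4.);   /* ZZ is not hex; ?? suppresses the note and _ERROR_ */
  if missing(bad) then put 'Second value could not be read as hex.';
  put good=;
run;
```

On ASCII, bytes 41 and 42 are the letters A and B, so good should come back as AB, while bad is expected to be blank per the behavior described above.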

 

Kind regards

Paul D.

Reeza
Super User

For example, the displayed HEX value is "FF938C281FEB8DC357B37032976326A3". I would like that exact string (not the underlying unprintable data) in a character variable that I can then manipulate using substr()

@hashman I'm assuming this part is true and what the OP has is actually a variable with the HEX32 format. I could be wrong of course 🙂

hashman
Ammonite | Level 13

@Reeza :

I think you're actually right. Of course, that would mean that the "stored as" part is wrong. Go figure 🙂.

hashman
Ammonite | Level 13

@wedens :

You need the $HEXw. INFORMAT coupled with the INPUT function to do that. If your variable (let's call it HEX) has 32 hex digits (i.e. 0-9, A-F), you should use W=32. If the character string, to which the hex value is being converted (let's call it CHAR), hasn't been preassigned a length, reading HEX using the $HEX32. informat will result in CHAR with length $16, as 2 hex digits correspond to 1 byte. 

  char = input (hex, $hex32.) ;

Note that using the $HEX32. format with the PUT function will not work because the format interprets its argument merely as a collection of  characters to be converted to their hex representation, regardless of whether these characters also happen to be hex digits or not.   

 

Kind regards

Paul D.   

 

Tom
Super User

Whether you need to use PUT() or INPUT() depends on what type of data the original poster actually has. Their words are ambiguous and could be interpreted either way.

 

hashman
Ammonite | Level 13

@Tom :

Agreed; it's ambiguous, indeed.

 

Now that I've reread it a few times, it looks as though I might have been duped by "stored as", and it is most likely that what the OP sees is the variable with the $HEX32. format attached to it, so it appears displayed as hex in OP's UI. Which of course would mean that it's not "stored as" hex at all, but the OP needs a different variable where the content of the variable in question would indeed be stored in its hex representation. In which case @Reeza would be the one who's interpreted the specs correctly, and I would be the one who's goofed.    

 

Kind regards

Paul D.

wedens
Calcite | Level 5

Thanks to both of you for the comments and apologies for the initial ambiguities. I don't typically interact with this type of variable structure so my language may have been incorrect.

 

The HEX variable is generated using the MD5 function and the output is assigned the $hex32 format to make it readable. So I believe I was incorrect in my initial post. The underlying data is whatever the default output of the MD5 function is and I have applied the $hex32 format myself. My issue is that when I try to then truncate or combine the HEX variable with normal char or num variables the result is gibberish. My goal is to combine a 2-digit year, a 2-digit state code, and the first 6 digits of the HEX variable into a new combined ID. See below for what currently happens when I run the following code:

 

 

data test;
	length hex $32.;
	format hex $hex32.;
	set book1;
	hex = md5(cats(year,state,id));
	newid = cats(state,mod(year,100),substr(hex,6));
run;

hex                               Year  State  ID      newid
0FC31C920C2957877533B187C24FDB1D  2019  GA     123456  GA19)W‡u3±‡ÂOÛ

 

 

My actual goal is for the newid variable in this case to be GA190FC31C with a standard $char10 format. Does that make more sense? I'm having a tough time putting it into words. Thanks for the help.

hashman
Ammonite | Level 13

@wedens :

Now it's much clearer. You need to replace the SUBSTR function with the PUT function coupled with the $HEX6. format (it will grab the first 3 bytes; each byte is 2 hex digits, so you will have 3*2=6 hex digits). HEX as a variable name for the MD5 digest is confusing, so below it's changed to just MD5. You also need to give NEWID the needed length of 10; otherwise CATS will default it to 200:

data test ;
  set book1 ;
  format md5 $hex32. ;
  md5 = md5 (cats (year, state, id)) ;
  length newid $ 10 ;
  newid = cats (state, mod (year, 100), put (md5, $hex6.)) ;
run ;

Kind regards

Paul D.

Tom
Super User

The number in the $HEXn. format specification is the WIDTH of the output, not the LENGTH of the input.
So $HEX6. takes the first 3 characters of the input string and generates 6 hex characters that are the hexadecimal representation of those three bytes.
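
A quick way to see the width rule in action, as a sketch on a made-up six-character string in an ASCII session (s, h6, and h12 are hypothetical names):

```sas
data _null_;
  s = 'ABCDEF';
  h6  = put(s, $hex6.);    /* first 3 bytes become 6 hex digits */
  h12 = put(s, $hex12.);   /* all 6 bytes become 12 hex digits */
  put h6= h12=;
run;
```

On ASCII this should write h6=414243 h12=414243444546, since A through F are bytes 41 through 46.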
hashman
Ammonite | Level 13

@Tom :

Correct. Shame on my old goofy head. Thanks for catching it - need to edit my response.

 

Kind regards

Paul D.

wedens
Calcite | Level 5

Thanks to all. This worked perfectly! Looks like the suggested code from both @hashman / @Tom  and @ballardw did exactly what I needed. Sorry again for the initial confusion.
