I have a dataset of values stored as $hex32 character variables. Is there any way to copy the displayed HEX value to a string?
For example, the displayed HEX value is "FF938C281FEB8DC357B37032976326A3". I would like that exact string (not the underlying unprintable data) in a character variable that I can then manipulate using substr(). I have searched around and can't figure out a way to do this. Any help would be appreciated. Thanks.
Please consider:
data work.test;
length hex $32.;
format hex $hex32.;
/* set book1;*/
Year=2019; State='GA'; id = '123456';
hex = md5(cats(year,state,id));
newid = cats(state,mod(year,100),substr(put(hex,$hex32.),1,6));
run;
You didn't provide the BOOK1 set so I hard coded in the values.
Note that your SUBSTR call was likely wrong: with no length argument, substr(hex,6) starts at position 6 and returns everything through the end of the string.
You likely want to set a length for NEWID as well. As is, it will default to 200 characters.
I think you're asking how to create a new variable with the formatted value.
newVar = PUT(oldVar, $hex32.);
PUT() will apply the format and convert it to a character value.
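As a minimal sketch of that idea, assuming the variable holds a raw 16-byte MD5 digest: the step below hashes the literal 'abc' (an illustrative input, not the poster's data), writes the formatted value into an ordinary character variable, and then takes a substring of it. The variable names are hypothetical.

```sas
data _null_;
  length digest $16 hexstr $32 first6 $6;
  digest = md5('abc');              /* 16 raw bytes, mostly unprintable */
  hexstr = put(digest, $hex32.);    /* 32 printable hex digits */
  first6 = substr(hexstr, 1, 6);    /* ordinary string handling now works */
  put hexstr= first6=;
run;
/* log: hexstr=900150983CD24FB0D6963F7D28E17F72 first6=900150 */
```

HEXSTR is plain text, so SUBSTR, CATS and similar functions behave as expected on it.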
@Reeza :
Nope, it will not work.
It would if the OP needed to convert (print, display) some characters to (in) their hex representation. But what is needed here is to read a string in the radix-16 representation (whose digits are the characters from 0 to 9 and A to F) and convert it to the radix-256 representation (whose digits are all the 256 characters in the collating sequence). Since reading in SAS is done using informats, that's what the doctor ordered here; and the conversion function used with informats is INPUT, not PUT. See what you get if you try to use the format instead of the informat:
data _null_ ;
c = "AB" ;
x = put (c, $hex4.) ;
right = input (x, $hex4.) ;
wrong = put (x, $hex4.) ;
put right= wrong= ;
run ;
-------------------
right=AB wrong=3431
What we get here with the format, instead of the correct original value "AB", is the first two characters of X (themselves hex digits, but the format doesn't care) in their radix-16 representation.
Moreover, the $HEXw. format does nothing to evaluate the input to see if it is in the correct hex digit form. But the $HEXw. informat does: If its argument should contain anything but hex digits, it will set _ERROR_=1, write a note "Invalid argument to function INPUT" in the log (provided, of course, that the ?? modifier is not used with it), and set the response to a missing value (i.e. space in this case).
Kind regards
Paul D.
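The validation behavior described above could be checked with a small sketch like the following. The input strings are made up for illustration, and the result shown for GOOD assumes an ASCII system; run it without the ?? modifier to see the log note instead.

```sas
data _null_;
  length good bad $2;
  good = input('4142', $hex4.);      /* valid hex digits: 41 42 */
  bad  = input('41ZZ', ?? $hex4.);   /* 'ZZ' are not hex digits; ?? suppresses the note and the _ERROR_ flag */
  put good= bad= _error_=;
run;
/* good=AB on an ASCII system */
```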
For example, the displayed HEX value is "FF938C281FEB8DC357B37032976326A3". I would like that exact string (not the underlying unprintable data) in a character variable that I can then manipulate using substr()
@hashman I'm assuming this part is true and what the OP has is actually a variable with the HEX32 format. I could be wrong of course 🙂
@Reeza :
I think you're actually right. Of course, that would mean that the "stored as" part is wrong. Go figure.
@wedens :
You need the $HEXw. INFORMAT coupled with the INPUT function to do that. If your variable (let's call it HEX) has 32 hex digits (i.e. 0-9, A-F), you should use W=32. If the character string, to which the hex value is being converted (let's call it CHAR), hasn't been preassigned a length, reading HEX using the $HEX32. informat will result in CHAR with length $16, as 2 hex digits correspond to 1 byte.
char = input (hex, $hex32.) ;
Note that using the $HEX32. format with the PUT function will not work because the format interprets its argument merely as a collection of characters to be converted to their hex representation, regardless of whether these characters also happen to be hex digits or not.
Kind regards
Paul D.
Whether you need to use PUT() or INPUT() depends on what type of data the original poster actually has. Their words are ambiguous and could be interpreted either way.
@Tom :
Agreed; it's ambiguous, indeed.
Now that I've reread it a few times, it looks as though I might have been duped by "stored as", and it is most likely that what the OP sees is the variable with the $HEX32. format attached to it, so it appears displayed as hex in OP's UI. Which of course would mean that it's not "stored as" hex at all, but the OP needs a different variable where the content of the variable in question would indeed be stored in its hex representation. In which case @Reeza would be the one who's interpreted the specs correctly, and I would be the one who's goofed.
Kind regards
Paul D.
Thanks to both of you for the comments and apologies for the initial ambiguities. I don't typically interact with this type of variable structure so my language may have been incorrect.
The HEX variable is generated using the MD5 function and the output is assigned the $hex32 format to make it readable. So I believe I was incorrect in my initial post. The underlying data is whatever the default output of the MD5 function is and I have applied the $hex32 format myself. My issue is that when I try to then truncate or combine the HEX variable with normal char or num variables the result is gibberish. My goal is to combine a 2-digit year, a 2-digit state code, and the first 6 digits of the HEX variable into a new combined ID. See below for what currently happens when I run the following code:
data test; length hex $32.; format hex $hex32.; set book1;
hex = md5(cats(year,state,id));
newid = cats(state,mod(year,100),substr(hex,6));
run;
hex | Year | State | ID | newid |
0FC31C920C2957877533B187C24FDB1D | 2019 | GA | 123456 | GA19)W‡u3±‡ÂOÛ |
My actual goal is for the newid variable in this case to be GA190FC31C with a standard $char10 format. Does that make more sense? I'm having a tough time putting it into words. Thanks for the help.
@wedens :
Now it's much clearer. You need to replace the SUBSTR function with the PUT function coupled with the $HEX6. format (it will grab the first 3 bytes; each byte is 2 hex digits, so you will have 3*2=6 characters). HEX as a variable name for the MD5 digest is confusing, so below it's changed to just MD5. You also need to give NEWID the needed length of 10, otherwise CATS will default it to 200:
data test ;
set book1 ;
format md5 $hex32. ;
md5 = md5 (cats (year, state, id)) ;
length newid $ 10 ;
newid = cats (state, mod (year, 100), put (md5, $hex6.)) ;
run ;
Kind regards
Paul D.
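A self-contained sketch of the two approaches from this thread, with hard-coded values standing in for BOOK1 (which was not posted). It checks that PUT with $HEX6. gives the same result as SUBSTR applied to the full $HEX32. string. The Z2. format on the two-digit year is an addition not in the original posts: it assumes a leading zero is wanted for years such as 2005, where mod(year,100) alone would yield "5". The names DIGEST, NEWID1, NEWID2 and SAME are hypothetical.

```sas
data _null_;
  length digest $16 newid1 newid2 $10;
  year = 2019; state = 'GA'; id = '123456';   /* stand-ins for BOOK1 */
  digest = md5(cats(year, state, id));        /* raw 16-byte digest */

  /* $HEX6. formats only the first 3 bytes as 6 hex digits */
  newid1 = cats(state, put(mod(year, 100), z2.), put(digest, $hex6.));

  /* alternative: format all 16 bytes, then keep the first 6 hex digits */
  newid2 = cats(state, put(mod(year, 100), z2.),
                substr(put(digest, $hex32.), 1, 6));

  same = (newid1 = newid2);                   /* 1 when both agree */
  put newid1= newid2= same=;
run;
```

For these inputs the thread reports a digest beginning 0FC31C, so both variables should come out as the GA190FC31C the poster asked for.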
@Tom :
Correct. Shame on my old goofy head. Thanks for catching it - need to edit my response.
Kind regards
Paul D.