I am trying to encrypt a set of IDs using SHA256 in SAS 9.4.
However, when I finished, I found that the length of the hash is not always 64 characters.
My code is:
data B;
length hashID $64;
format hashID $hex64.;
set A;
catxID = catx("XXX", ID, "XXXX");
hashID = sha256(catxID);
drop catxID ID;
run;
XXX and XXXX are some random salts.
However, when I examined hashID, most of the values were 64 characters long, but some were only 22 characters, and some contained only 6 or 9 or some other number of characters.
Can anyone advise whether this is normal or not, and what would cause such variations?
Thanks.
SHA256() always returns a 256-bit hash, as per the documentation.
I've done some testing and it appears that what you observe has to do with the table viewer in EG.
I've used the code below:
data test;
length hashid1Test $32 l_hashid1Test 8 hashID1 hashID2 $64;
format hashid1Test $hex64.;
do id=21 to 21;
catxID = catx("XXX", ID, "XXXX");
hashid1Test=sha256(catxID);
l_hashid1Test=lengthn(hashid1Test);
hashID1 = put(sha256(catxID),$hex64.);
hashID2 = sha256hex(catxID);
output;
end;
drop catxID ID;
run;
proc print data=test;
run;
That's what you see in the EG table viewer:
Even though variable hashid1Test doesn't show the whole string, it still has a length of 32 characters. The length of 32 indicates that the values as such are still there internally; they just don't print in the data grid.
Having done a bit more analysis, it appears that this always happens when the hash contains a hex 00 byte.
If you print the values to an output destination, then everything shows up.
So... no serious issue, and your data is o.k. It's just a glitch in the EG data grid viewer.
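If the display glitch gets in the way, one option is to store the hash as hex text instead of raw bytes, as done for hashID1 in the test above. A minimal sketch, assuming the data set A and the variable ID from the question, with "XXX" and "XXXX" still standing in for the real salts:

```sas
data B;
  set A;
  length hashID $64;
  /* catx() treats its first argument as the separator, so the string
     being hashed is ID followed by "XXX" and then "XXXX" */
  /* put() with $hex64. turns the 32 raw bytes into 64 printable hex characters */
  hashID = put(sha256(catx("XXX", ID, "XXXX")), $hex64.);
  drop ID;
run;
```

Because hashID then only holds hex characters, no format is needed on it. Where it is available, sha256hex(), as used for hashID2 above, is an alternative to the put() call.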
Thanks for the detailed reply.
Yes, I am using EG too.
I will check tomorrow, as advised, whether the cases with only 6 or 9 characters tell the same story.
We have encountered similar difficulties when working with digests. Please check whether the shorter digests are related to 0x20 bytes (the space character) at the end of the digest. A similar situation arises when you have
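data md5_test;
length hashID $16;
/* length() ignores trailing blanks (0x20), so it can report less than 16 */
hashID = md5('a98');
l = length(hashID);
output;
hashID = md5('a99');
l = length(hashID);
output;
run;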
The first case generates a digest whose last byte is 0x20, and this causes issues with, e.g., the length function or with exporting the value later on.
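To check whether this affects the SHA256 values as well, you can compare length(), which ignores trailing blanks, with lengthc(), which returns the storage length. A rough sketch, assuming a made-up range of test IDs (the data set name, the variable names and the 1 to 1000 range are only for illustration):

```sas
data digest_check;
  length raw $32 hex $64;
  do id = 1 to 1000;                 /* arbitrary test IDs, not from the thread */
    raw = sha256(cats(id));          /* raw 32-byte digest */
    hex = put(raw, $hex64.);         /* 64 hex characters, no blanks inside */
    len_trimmed = length(raw);       /* does not count trailing 0x20 bytes */
    len_stored  = lengthc(raw);      /* storage length, always 32 */
    /* keep only digests that end in one or more 0x20 bytes */
    if len_trimmed < len_stored then output;
  end;
  keep id hex len_trimmed len_stored;
run;

proc print data=digest_check;
run;
```

Any rows that come out have a hex value ending in 20. With a small ID range there may be no such rows at all, so an empty result only means that none of these particular inputs produced a trailing 0x20.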
You should not use MD5 at all; see the 2014 blog post by @ChrisHemedinger.