I am trying to hash a set of IDs using SHA256 in SAS 9.4.
However, when I finished I found that the hash is not always 64 characters long.
My code is:
data B;
  length hashID $64;
  format hashID $hex64.;
  set A;
  catxID = catx("XXX", ID, "XXXX");
  hashID = sha256(catxID);
  drop catxID ID;
run;
The XXX values are random salts.
However, when I examined hashID, most values were 64 characters long, but some were only 22 characters and some contained only 6, 9, or some other number of characters.
Can anyone advise whether this is normal, and what would cause such variation?
Thanks.
Accepted Solutions
SHA256() always returns a 256-bit (32-byte) hash, as per the documentation.
I've done some testing and it appears that what you observe has to do with the table viewer in EG.
I used the code below:
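data test;
  /* the raw digest is 32 bytes; its hex representation needs 64 characters */
  length hashid1Test $32 l_hashid1Test 8 hashID1 hashID2 $64;
  format hashid1Test $hex64.;
  do id=21 to 21;
    catxID = catx("XXX", ID, "XXXX");
    hashid1Test=sha256(catxID);              /* raw binary digest */
    l_hashid1Test=lengthn(hashid1Test);      /* length excluding trailing blanks */
    hashID1 = put(sha256(catxID),$hex64.);   /* digest stored as hex text */
    hashID2 = sha256hex(catxID);             /* hex text via SHA256HEX */
    output;
  end;
  drop catxID ID;
run;
proc print data=test;
run;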
In the EG table viewer, hashid1Test appears cut short. Even though the viewer doesn't show the whole string, the variable still has a length of 32 characters, which indicates that the full value is stored internally; it just doesn't display in the data grid.
After a bit more analysis, it appears that this always happens when the digest contains a 0x00 (null) byte.
If you print the values to an output destination, everything shows up.
So there's no serious issue and your data is OK. It's just a glitch in the EG data grid viewer.
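To check this across more than one ID, something along the following lines might help. It is only a sketch: the ID range of 1 to 1000 and the names check, raw, hex, has_null and hex_len are made up for illustration, and the salts are the same placeholders as above. It flags raw digests that contain a null byte and confirms that the hex-encoded text is complete regardless.

```sas
/* Sketch only: the ID range and variable names are hypothetical. */
data check;
  length raw $32 hex $64;
  do id = 1 to 1000;
    raw = sha256(catx("XXX", id, "XXXX"));  /* raw 32-byte digest */
    hex = put(raw, $hex64.);                /* hex text, 2 characters per byte */
    has_null = (index(raw, '00'x) > 0);     /* 1 if the digest contains a null byte */
    hex_len = lengthn(hex);                 /* hex digits are never blank */
    output;
  end;
  drop raw;
run;

/* Every row should report hex_len = 64, whether or not has_null is 1. */
proc freq data=check;
  tables has_null hex_len;
run;
```

Storing the hex text (hashID1 or hashID2 in the code above) rather than the raw bytes sidesteps the viewer problem, because the stored value then contains only printable characters.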
Thanks for the detailed reply.
Yes, I am using EG too.
Tomorrow I will check, as advised, whether the cases with only 6 or 9 characters have the same explanation.
We have run into similar difficulties when working with digests. Please check whether the shorter digests are related to 0x20 bytes (the space character) at the end of the digest. A similar situation arises with:
data md5test;
  length hashID $16;
  hashID = md5('a98');
  l = length(hashID);
  output;
  hashID = md5('a99');
  l = length(hashID);
  output;
run;
The first case generates a digest whose last byte is 0x20. Because SAS treats trailing spaces as padding, this causes problems with, for example, the LENGTH function or subsequent exports.
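One way to make the trailing-space case visible is to compare LENGTH with LENGTHC and to keep a hex copy of the digest. This is a rough sketch, not a tested recipe: the names md5_demo, raw, hex, len_trimmed, len_stored and ends_in_space are invented here, and the claim about 'a98' is taken from the post above rather than verified.

```sas
/* Sketch only: dataset and variable names are hypothetical. */
data md5_demo;
  length raw $16 hex $32;
  do text = 'a98', 'a99';
    raw = md5(text);                                /* raw 16-byte digest */
    len_trimmed = length(raw);                      /* ignores trailing blanks */
    len_stored = lengthc(raw);                      /* declared length, always 16 */
    ends_in_space = (substr(raw, 16, 1) = '20'x);   /* 1 if the last byte is a space */
    hex = put(raw, $hex32.);                        /* hex text keeps every byte visible */
    output;
  end;
run;

proc print data=md5_demo;
  var text len_trimmed len_stored ends_in_space hex;
run;
```

Whenever ends_in_space is 1, len_trimmed comes out below 16 even though all 16 bytes are stored. The hex copy avoids the ambiguity because a trailing 0x20 byte appears as the characters 20 instead of a blank.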
You should not use md5 at all; see the 2014 blog post by @ChrisHemedinger.