BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jeff_DOC
Pyrite | Level 9

Good afternoon.

 

I wonder if someone might be able to answer a question on a hash values? I am very new to this concept, sorry.

 

We are having a discussion on the security of hash values and I came up with a question.

 

The question is if I have a hashed value (and I know the original value) and a table of many other values (that does contain the original unhashed value someplace in the many table) can I then hash all values in the "many" table and then compare the original hash value with the hashed values from the "many" table to find the matched value?

 

For some reason I cannot get SAS to match the values an pull out the orignal variable value.

 

Thank you so much for any help.

 

Jeff

 

 

Example Code:

 

data test_code;

format hash hash_copy hash_diff $hex64.;

hash = sha256(ss_number);

hash_copy = sha256("fox");

hash_diff = sha256("fox");

if(hash=hash_copy) then

test_copy = 'T';

else test_copy = 'F';

if(hash=hash_diff) then

test_diff = 'T';

else test_diff = 'F';

run;

 

Original value - fox

Hash Value - 05A8475FEEBBFEA685ECE06045BE63722EDC61BB2ADE8060A9F03AED6D49F454

Hash Copy - 776CB326AB0CD5F0A974C1B9606044D8485201F2DB19CF8E3749BDEE5F36E2

Hash Diff - 776CB326AB0CD5F0A974C1B9606044D8485201F2DB19CF8E3749BDEE5F36E2

 

Many table

hen -

Hash Value - 05A8475FEEBBFEA685ECE06045BE63722EDC61BB2ADE8060A9F03AED6D49F454

Hash Copy - D44A08DE3981889DAE0F7BA3980EA9FD24768279FC4EC2FBBDBCF3BF9FE65C22

Hash Diff -  D44A08DE3981889DAE0F7BA3980EA9FD24768279FC4EC2FBBDBCF3BF9FE65C22

 

pig -

Hash Value - 05A8475FEEBBFEA685ECE06045BE63722EDC61BB2ADE8060A9F03AED6D49F454

Hash Copy - F0B8C9D84DD2B877E0B952130B73E218106FEC04C23852271D390213A1FF96F4

Hash Diff -  F0B8C9D84DD2B877E0B952130B73E218106FEC04C23852271D390213A1FF96F4

 

fox -

Hash Value - 05A8475FEEBBFEA685ECE06045BE63722EDC61BB2ADE8060A9F03AED6D49F454

Hash Copy - 776CB326AB0CD5F0A974C1B9606044D8485201F2DB19CF8E3749BDEE5F36E2

Hash Diff -  776CB326AB0CD5F0A974C1B9606044D8485201F2DB19CF8E3749BDEE5F36E2

 

horse -

Hash Value - 05A8475FEEBBFEA685ECE06045BE63722EDC61BB2ADE8060A9F03AED6D49F454

Hash Copy - FD62862B6DC213BEE77C2BADD6311528253C6CB3107E03C16051AA15584ECA1C

Hash Diff -  FD62862B6DC213BEE77C2BADD6311528253C6CB3107E03C16051AA15584ECA1C

1 ACCEPTED SOLUTION

Accepted Solutions
Sven111
Pyrite | Level 9

That makes a lot of sense then, that general approach actually how passwords are validated in most cases. You should be able to match everything up, but like I mentioned previously you need to ensure everything is hashed identically.  Also if security is the driving concern, you're best off doing as @LinusH recommended and adding a salt to the plain text prior to hashing, otherwise you're much more vulnerable to brute-force attacks trying to determine the plaintext value from the hash value.  GPU's today can hash at a rate of billions of hashes a second on older algorthims like MD5 and SHA1, and there's a form of attack that uses sometihng called rainbow tables that have precomputed plaintext values for various hashed values, but using a salt makes those approaches much harder.  There are also alternative hashing algorithms out there specifically designed to minimize the effectiveness of GPU or other brute-force attacks, but I don't think they're implemented in Base SAS.  You can also just iterate the hash multiple times to somewhat increase security, or at least increase the cost to brute-force it, but you need to make sure you doing it exactly the same on both sides or they won't match up.

View solution in original post

10 REPLIES 10
Jeff_DOC
Pyrite | Level 9

Apologies.

 

Line 3 of the example code should read: hash = sha256("fox");

 

Jeff

ballardw
Super User

If you click on the "cog" or "gear" icon in you post there should be an option to "Edit Reply". You can click there to make corrections or additions needed. Then people won't have to find your post with the correction. Which can be fun with this forum as we have miminal control over the order posts appear to readers and that correction could end up below many posts. Some of which may ask about that value.

Jeff_DOC
Pyrite | Level 9

Thank you for the advice.

Sven111
Pyrite | Level 9

I'm not sure I'm 100% following your question, let me see if I got this.  You have a table (call it TABLE_A) that contains (at least) 2 columns, one of the columns being some text or numeric values (we'll call it column PLAIN_A) and another column which contains the hashed value of contents of column A (call it column HASH_A).  Is your question can you lookup values in that table using either PLAIN_A or HASH_A as a key?  If so, then yes if you know one of the values in the columns.

 

You need to be careful with hash functions though since they're very sensitive to how data is fed to them, make sure everything is (or isn't) quoted identically.  Any change in the input to the hash function will (by design) completely change the hash value.

Jeff_DOC
Pyrite | Level 9

You stated it better than I did. Yes, I have two tables. Table A has a value PLAIN_A, and hash value HASH_A. Table B has many vaues PLAIN_B. Can I use the same hash function to hash all vaues in Table B by creating HASH_B and then compare HASH_A and HASH_B to find the PLAIN_B value in Table B?

Sven111
Pyrite | Level 9

Yes that should work, but if you have the plain values for both columns in both tables, why not just match up PLAIN_A to PLAIN_B directly?  You wouldn't need to worry about hashing anything, you could just use a standard PROC SQL JOIN.  You could do a join on the hashed values too once you created HASH_B, but again that seems like it's just an extra step that isn't really providing any value. 

 

What problem are you actually trying to solve?  Hashing is a great tool to have handy, but I'm not sure it's doing you any favors in this case, unless you're just trying to learn how to use it for its own sake.

Jeff_DOC
Pyrite | Level 9
The question we were discussing was is it a valid strategy to hash a value prior to sending date and have the receiving end match the information to their own records. The hashed value may not be pubic information so it would be a way to obscure it prior to transport. My colleague stated it was a valid attempt at security. I know little about hashing but it seems as though if the information can be reconstructed this simply it's not the best option.
For some reason I cannot get SAS to match the hashed values in as being alike so I assumed I was doing something incorrect.
Sven111
Pyrite | Level 9

That makes a lot of sense then, that general approach actually how passwords are validated in most cases. You should be able to match everything up, but like I mentioned previously you need to ensure everything is hashed identically.  Also if security is the driving concern, you're best off doing as @LinusH recommended and adding a salt to the plain text prior to hashing, otherwise you're much more vulnerable to brute-force attacks trying to determine the plaintext value from the hash value.  GPU's today can hash at a rate of billions of hashes a second on older algorthims like MD5 and SHA1, and there's a form of attack that uses sometihng called rainbow tables that have precomputed plaintext values for various hashed values, but using a salt makes those approaches much harder.  There are also alternative hashing algorithms out there specifically designed to minimize the effectiveness of GPU or other brute-force attacks, but I don't think they're implemented in Base SAS.  You can also just iterate the hash multiple times to somewhat increase security, or at least increase the cost to brute-force it, but you need to make sure you doing it exactly the same on both sides or they won't match up.

LinusH
Tourmaline | Level 20
Your data step code isn't working since you refer ta a non existent variable.

Not sure if I follow your inquiry but I think the answer is yes.

Hashes isn't considered safe when the domain values are known. It could be better protected by using a salt (a secret string that is concatenated prior to the hash operation).
Data never sleeps
Jeff_DOC
Pyrite | Level 9
Thank you. I inadvertently left an artifact. I replied to my post that it should have been fox. My apologies.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 4627 views
  • 0 likes
  • 4 in conversation