[Cryptography] SHA-256 decrypted (8 rounds)

Mon Apr 8 10:13:18 EDT 2024

> Okay, let's cut a little slack from the terminology discussion.
> 
> Do people reuse code from encryption and decryption operations to build
> hash functions? Sure, sometimes. If somebody used a round function
> from a cipher to build a hash, and didn't update the comments in that
> code, then you might have found a comment referring to encryption and
> decryption. If someone used a full cipher implementation to build a
> hash, the same applies. Okay, fine, whatever.
> 
> For example, if you use a fixed key for a cipher and loop through a file
> twice, encrypting in CBC mode, then the final block of the
> doubly-encrypted ciphertext is a good hash of the file (assuming the
> cipher is a good cipher). And most of the program is going to be code
> that's also for the cipher.
> 
> We don't actually use that doubled-CBC construction, because it's slow
> and inefficient. The reason a cipher has to be applied to the doubled
> input stream when constructing a hash from a cipher that way is to
> prevent people from using cipher decryption to find a one-block preimage
> of the hash, because hash functions aren't supposed to have decryption
> operations. That means it takes twice as long as encrypting the file if
> you do it that way, so we don't. But it's a good thing to teach students
> because a lot of them need to learn the thinking habits needed to
> realize that someone could otherwise find a one-block preimage by
> decrypting the cipher.
> 
> Anyway, if I give you that final block of ciphertext you cannot "decrypt"
> the 90-megabyte file that it's a hash of. Even if you know what cipher
> I used, even if you know what key I used, even if you could simply
> decrypt individual cipher blocks. There simply isn't enough
> information in the hash value to specify the contents of a file longer
> than 256 bits, and the way a hash function is constructed should prevent
> even a one-block decryption, even if it was constructed from a cipher
> you could decrypt.
> 
> If it's a bad or broken hash, then you can possibly find another input
> that hashes to the same value. Whether you call it a preimage or a
> decryption, the ability to find it (with fewer than 2^255 guesses on
> average) would mean that it's a bad or broken hash.
> 
> From what I saw of your code it appears that you were trying to find a
> 256-bit input that hashes to the same value. That's legit. That would
> be a significant result. Hashes are supposed to be hard to reverse even
> when operating on small values, and most inputs have, at least in
> theory, preimages much smaller than the input.
> 
> While success in finding a one-block preimage would reveal a bad or
> broken hash function, it's not going to reveal any information I was
> trying to protect. It won't reveal any plaintext because what you get
> won't be the same 90-megabyte value that I hashed. It's not going to be
> a value that I've ever seen before. It's not going to be a value that I
> transmitted to you whether in encrypted form or as plaintext. Even if
> that 90-megabyte file is, in fact, already contained on your own hard
> drive, you should not be able to derive any information at all about the
> contents of the file without running that same hash function on all your
> files until you find the file that produces a matching hash.
> 
> _______________________________________________
> The cryptography mailing list
> cryptography at metzdowd.com
> https://www.metzdowd.com/mailman/listinfo/cryptography

Indeed, if the 256-bit output contained all the information needed to reconstruct the 90-megabyte file, the SHA-2 hash function would double as the best file compression function ever, which doesn't seem plausible in this universe.
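
To put rough numbers on that counting argument, here is a small sketch. It assumes "90 megabytes" means 90 * 10^6 bytes; the variable names are made up for illustration.

```python
# Counting argument: how many 90-megabyte files share each 256-bit digest?
FILE_BYTES = 90 * 10**6   # assumption: 1 megabyte = 10^6 bytes
DIGEST_BITS = 256

file_bits = FILE_BYTES * 8

# There are 2**file_bits possible files of that size but only
# 2**DIGEST_BITS digests, so on average 2**(file_bits - DIGEST_BITS)
# different files map to each digest value.
log2_files_per_digest = file_bits - DIGEST_BITS

# The "compression ratio" a lossless 256-bit encoding would have to achieve.
compression_ratio = file_bits / DIGEST_BITS

print(f"file size: {file_bits} bits")
print(f"average files per digest: 2^{log2_files_per_digest}")
print(f"implied compression ratio: {compression_ratio:.0f} : 1")
```

No lossless scheme can reach that ratio on arbitrary input, which is the pigeonhole reason a digest cannot be "decrypted" back into the file.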

But we're talking about a single block and, initially, 8 rounds of processing (which ignores any input beyond the first 256 bits, and also skips the additional mixing with derived words).
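
As a minimal sketch of what such a reduced variant could look like, the code below runs only the first 8 rounds of the standard SHA-256 compression function on one 64-byte block. It is an assumption about what "8 rounds" means here, not the code discussed in this thread; the function name `compress_8_rounds` is hypothetical. The constants are the first 8 round constants and the initial hash value from the SHA-256 specification.

```python
MASK = 0xFFFFFFFF

# First 8 of the 64 SHA-256 round constants, and the standard initial state.
K = [0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
     0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5]
IV = [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
      0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19]

def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def compress_8_rounds(block):
    """Hypothetical 8-round variant: one 64-byte block, standard IV."""
    if len(block) != 64:
        raise ValueError("block must be exactly 64 bytes")
    # Sixteen big-endian message words; rounds 0..7 read only w[0..7].
    w = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    a, b, c, d, e, f, g, h = IV
    for t in range(8):
        s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + s1 + ch + K[t] + w[t]) & MASK
        s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (s0 + maj) & MASK
        a, b, c, d, e, f, g, h = ((t1 + t2) & MASK, a, b, c,
                                  (d + t1) & MASK, e, f, g)
    # Feed-forward: add the initial state to the working variables.
    out = [(x + y) & MASK for x, y in zip(IV, (a, b, c, d, e, f, g, h))]
    return b"".join(x.to_bytes(4, "big") for x in out)

block = bytes(range(64))
same_prefix = block[:32] + bytes(32)   # change only bytes 32..63
print(compress_8_rounds(block) == compress_8_rounds(same_prefix))  # True
```

Because the loop stops before round 16, the second half of the block (including any padding) never enters the computation, and the derived words that the full function uses from round 16 onward are never computed at all.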

Please keep in mind that for password hashes, for instance (it is probably better to use other functions for those), the original input (password/key) is usually at most 256 bits long, even when it's rehashed a number of times. So there are real-world cases where being able to reverse such a hash is undesirable, to say the least.
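
To illustrate why short inputs are the sensitive case, here is a minimal sketch that recovers a short password from its full SHA-256 digest by simply hashing every candidate until one matches. It uses only Python's standard `hashlib`; the helper name `find_preimage`, the alphabet, and the three-letter example password are assumptions chosen to keep the search tiny.

```python
import hashlib
import string
from itertools import product

def find_preimage(target_hex, alphabet, max_len):
    """Hash every string over `alphabet` up to `max_len` characters and
    return the first one whose SHA-256 digest matches, else None."""
    for n in range(1, max_len + 1):
        for combo in product(alphabet, repeat=n):
            candidate = "".join(combo).encode()
            if hashlib.sha256(candidate).hexdigest() == target_hex:
                return candidate
    return None

# Hypothetical example: a three-letter lowercase "password".
target = hashlib.sha256(b"cab").hexdigest()
recovered = find_preimage(target, string.ascii_lowercase, 3)
print(recovered)
```

This does not reverse the hash in any structural sense. It works only because the input space is small enough to enumerate, which is the same reason a plain fast hash is a poor choice for password storage.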

McDair