[Cfrg] HMAC-MD5

Wed Mar 29 15:04:24 EST 2006

A couple of (rather uninformed) thoughts regarding HMAC-MD5:  First,
how could collision attacks be extended to preimage attacks?  And second,
how would preimage attacks affect HMAC-MD5?

For a preimage attack, consider the simplest case, a single input
block of 64 bytes.  Then Hash = IV + Compress(IV,Input).  We can try
to run this backwards: Decompress(Hash-IV,Input).  We need to choose
Input such that the result of this backwards run equals IV, the fixed
"magic number" that MD5 starts with.  This is the hard part.

One idea is to split the compression function into two halves:
Compress1 and Compress2, such that Compress() = Compress2(Compress1()).
Then Decompress, which is backwards, is Decompress1(Decompress2()).

We could aim for a meet-in-the-middle attack, where we would run
Compress1(IV,Input) and Decompress2(Hash-IV,Input) and try to get them to
match.  Then this value of Input would be a preimage of the desired Hash.

The problem is that Input affects both Compress1 and Decompress2 in
complicated ways.  The solution would perhaps be to aim to find a family
of Input values which caused only moderate changes to the outputs of
Compress1 and Decompress2.  This is similar to what happens now with the
hash collision attacks.  They find pairs of Inputs that have almost no
change through the various sub-parts of the compression functions.

If this could be extended so that there were not just a pair of Inputs,
but larger numbers of them that produced almost-collisions after halfway
through the compression function, then this could be a direction towards
making this MITM work.  At the most extreme case, if we could find 2^64
inputs which all collided through half the compression and half the
decompression functions, then we'd have success, we'd have a preimage
in 2^64 work.  In practice we would not reach this extreme perfection,
but perhaps we could approximate it enough that with much more work and
good ideas, a preimage could still be found with substantially less than
2^128 work.

As for the other question, the impact of preimages on HMAC-MD5: The goal
of breaking a MAC is, given a bunch of known or chosen MAC'd inputs,
but not knowing the MAC key, generate a valid MAC on a new input.
Using preimages we would aim to generate an input which matched an
output value we chose.

The structure of HMAC is to hash one block (64 bytes) of the secret
key xored a fixed repeated pad value, then the block(s) of the message.
We take the output of that hash and do it again, hashing one block of the
secret key xor a (different) fixed pad, then the output of the first hash.
This is the HMAC.

To reverse this, we would first need to invert the outer (second) hash.
The tricky part here is that the input block (after the key) has a
special form, consisting of the hash from the first step, padded per
the MD5 spec.  This padding will force fixed values (mostly zeros)
into most of the input block and only give us 16 bytes to manipulate.

So probably we would just fix the value from the input hash, fix the
IV that results from hashing the outer key block, and find the output
from this second block as the MAC value we will show an input for.
Then we will turn our attention to the first block, which is key xor pad.
We have its output value (the fixed intermediate IV we just chose) and
so we would apply the inversion algorithm to find the input.  This can
be xored with the pad to get the key.  Note that this is not the user's
key, this is just a key that works for the outer hash.

Now we do the inner hash.  We use the key we found, xor with the
appropriate fixed pad value, and hash to do the first block of the
inner MD5.  This gives us the IV for the second block, and we have
the output for that block - it is the fixed value we chose above.
We apply the inversion function again to get an Input value that
works.

Now we have succeeded: this Input value, along with the key we found in
the first step, will produce the MAC we also found in the first step.
It is not a MAC we have seen before so we have an official break.

Therefore the ability to invert single blocks of MD5 will likely lead
to an effective break of HMAC-MD5.  Whether the current attacks against
MD5 can be advanced to that point remains to be seen.  If it works it
will certainly be one of the premier cryptographic accomplishments of
recent years.

Hal Finney

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com