A collision in MD5'

Eric Rescorla ekr at rtfm.com
Mon Aug 16 21:02:24 EDT 2004


I've now successfully reproduced the MD5 collision result. Basically
there are some endianness problems.

The first problem is the input vectors. They're given as hex words, but
MD5 is defined in terms of bitstrings. Because MD5 is little-endian, you
need to reverse the written byte order to generate the input data. A
related problem is that some of the words are given as only 7 hex
digits. Assuming that they have a leading zero fixes that
problem. Unfortunately, this still doesn't give you the right hash
value.
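
A minimal sketch of the word-to-byte conversion described above. The
function name and the sample words are hypothetical, chosen only for
illustration; they are not the published collision vectors.

```python
import struct

def words_to_bytes(hex_words):
    """Convert written 32-bit hex words into the byte string MD5 consumes."""
    out = bytearray()
    for w in hex_words:
        # Assume a 7-digit word is simply missing its leading zero.
        w = w.zfill(8)
        # MD5 loads words little-endian, so emit the low-order byte first.
        out += struct.pack("<I", int(w, 16))
    return bytes(out)

# Hypothetical words, for illustration only.
sample = ["12345678", "abcdef0"]
print(words_to_bytes(sample).hex())  # 78563412f0debc0a
```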

The second problem, which was found by Steve Burnett from Voltage
Security, is that the authors aren't really computing MD5. The
algorithm is initialized with a certain internal state, called an
Initialization Vector (IV). This vector is given in the MD5 RFC as:

word A: 01 23 45 67
word B: 89 ab cd ef
word C: fe dc ba 98
word D: 76 54 32 10

but this is the little-endian byte order. So, the actual initialization
values should be 0x67452301, 0xefcdab89, 0x98badcfe, and 0x10325476.
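
The two readings of the RFC's byte listing can be compared directly. This
is only a sketch; the variable names are illustrative.

```python
import struct

# The 16 IV bytes exactly as listed in the MD5 RFC (words A, B, C, D).
rfc_bytes = bytes.fromhex("0123456789abcdeffedcba9876543210")

# Intended reading: each 4-byte group is a little-endian 32-bit word.
little = struct.unpack("<4I", rfc_bytes)
# Mistaken reading: taking the hex digits as written (big-endian).
big = struct.unpack(">4I", rfc_bytes)

print([f"0x{w:08x}" for w in little])
# ['0x67452301', '0xefcdab89', '0x98badcfe', '0x10325476']
print([f"0x{w:08x}" for w in big])
# ['0x01234567', '0x89abcdef', '0xfedcba98', '0x76543210']
```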

The authors use the values directly, so they use: 0x01234567,
etc... Obviously, this gives you the wrong hash value. If you use these
wrong IVs, you get a collision... though strangely with a different hash
value than the authors provide. Steve and I have independently gotten
the same result, though of course we could have made mistakes...
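
To experiment with this, one option is an MD5 routine that takes the IV as
a parameter. The sketch below follows the RFC 1321 structure; the names
md5_with_iv, STANDARD_IV and SWAPPED_IV are made up for illustration, and
this is not the code the authors, Steve, or I used. It does not include
the collision vectors themselves.

```python
import math
import struct

# Per-step left-rotation amounts from RFC 1321 (16 steps per round).
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# Sine-derived constants: floor(2^32 * abs(sin(i + 1))).
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]

MASK = 0xFFFFFFFF
STANDARD_IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)  # real MD5
SWAPPED_IV = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)   # "MD5'"

def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def md5_with_iv(message, iv=STANDARD_IV):
    """Return the hex digest of MD5 run from the given initial state."""
    # Pad: 0x80, zeros up to 56 mod 64, then the 64-bit little-endian bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (len(message) * 8) & 0xFFFFFFFFFFFFFFFF)

    a0, b0, c0, d0 = iv
    for off in range(0, len(padded), 64):
        m = struct.unpack("<16I", padded[off:off + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:
                f, g = (b & c) | (~b & MASK & d), i
            elif i < 32:
                f, g = (d & b) | (~d & MASK & c), (5 * i + 1) % 16
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, g = c ^ (b | (~d & MASK)), (7 * i) % 16
            f = (f + a + K[i] + m[g]) & MASK
            a, d, c = d, c, b
            b = (b + rotl(f, S[i])) & MASK
        # Add this block's result into the running state.
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK
    # The digest is the state words serialized little-endian.
    return struct.pack("<4I", a0, b0, c0, d0).hex()

# RFC 1321 lists MD5("abc") = 900150983cd24fb0d6963f7d28e17f72.
print(md5_with_iv(b"abc"))
# Same input from the byte-swapped IV: a different digest.
print(md5_with_iv(b"abc", SWAPPED_IV))
```

Feeding both collision messages (converted to bytes as described earlier)
through md5_with_iv with each IV is one way to check which variant
actually collides.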

So, this looks like it isn't actually a collision in MD5, but rather in
some other algorithm, MD5'. However, there's nothing special about the
MD5 IV, so I'd be surprised if the result couldn't be extended to real
MD5.

-Ekr

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com


