MD5 collisions?

Wed Aug 18 10:49:17 EDT 2004

In the light of day and less inebriated, I'd like to clarify some of what I 
wrote last night, and maybe expand a bit. My original account wasn't what 
I'd like to think of as a record for posterity.

Greg.

At 13:11 2004-08-18 +1000, Greg Rose wrote:

>Xiaoyun Wang was almost unintelligible.

This was not meant as a criticism. It just meant you had to concentrate 
really hard. Her English is much better than my Chinese.

>  But the attack works with "any initial values", which means that they 
> can take any prefix, and produce collisions between two different 
> suffixes. The can produce the first collision for a given initial value 
> in less than an hour, and then can crank them out at about one every 5 minutes.

As mentioned previously, the idea is to produce a good "partial collision" 
with the first blocks input to the hash, and then from two similar starting 
points, find subsequent blocks that correct them back to the same output. 
So, for a given input chaining vector, it takes about an hour to get the 
partial collisions in the first input block. From there, they can compute 
subsequent "second blocks" to produce the collisions in a few minutes.

Note that they did two entirely new collisions for MD5 overnight, 
communicating back to China, when they found out about the endianness 
problems. No-one can now argue that they can't find collisions at will.

>  It seems to be a straightforward differential cryptanalysis attack, so 
> one wonders why no-one else came up with it.

With further hindsight, and Phil Hawkes' help, I understand now. The 
technique needs to alternate between single bit (xor) differences and 
subtractive differences, with careful conditioning of the bits affected to 
make sure the right carries do, or don't, propagate. This is explained in 
Phil's upcoming paper about SHA-2 cancellations, which was presented later 
in the rump session. That should be on e-print in the next couple of days. 
The Chinese team is also writing a much more detailed paper, but it will 
take longer.

There has been criticism about the Wang et. al paper that "it doesn't 
explain how they get the collisions". That isn't right. Note that from the 
incorrect paper to the corrected one, the "delta" values didn't change. 
Basically, if you throw random numbers in as inputs, in pairs with the 
specified deltas, you should eventually be able to create your own MD5 
collisions for fun or profit. How they got this insight, we'll have to wait 
for... but the "method" is already there.

Greg.

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com