MD5 collisions?

Tim Dierks tim at
Wed Aug 18 14:01:20 EDT 2004

On Thu, 19 Aug 2004 00:49:17 +1000, Greg Rose <ggr at> wrote:
> >  It seems to be a straightforward differential cryptanalysis attack, so
> > one wonders why no-one else came up with it.
> With further hindsight, and Phil Hawkes' help, I understand now. The
> technique needs to alternate between single bit (xor) differences and
> subtractive differences, with careful conditioning of the bits affected to
> make sure the right carries do, or don't, propagate. This is explained in
> Phil's upcoming paper about SHA-2 cancellations, which was presented later
> in the rump session. That should be on e-print in the next couple of days.
> The Chinese team is also writing a much more detailed paper, but it will
> take longer.
> There has been criticism about the Wang et. al paper that "it doesn't
> explain how they get the collisions". That isn't right. Note that from the
> incorrect paper to the corrected one, the "delta" values didn't change.
> Basically, if you throw random numbers in as inputs, in pairs with the
> specified deltas, you should eventually be able to create your own MD5
> collisions for fun or profit. How they got this insight, we'll have to wait
> for... but the "method" is already there.

How likely is it that this attack could be extended to creating
common-prefix collisions? This is equivalent to specifying two IVs for
the two hashes. Looking at X.509 ASN.1, it seems like it would be
pretty easy to construct a certificate request such that, if you could
predict the certificate serial number you were to be issued, you could
create a hash collision in an X.509 certificate:

You would construct two strings, each the same length, an even
multiple of 512 bits long, which contained the prefix of the
certificate you were going to be issued and the prefix of the
certificate you would like to claim to have been issued (P1 =
authentic prefix, P2 = forged prefix). The trailing part of each
prefix would be the first few bits of the value of n in the RSA key.
These first few bits can be different in P1 and P2, and randomly
selected, as long as they begin with a 1. Call the first few bits of n
in P1 "p1", and the first few bits of n in P2 "p2".

You would take IV1 = internal state of MD5 after hashing P1 and IV2 =
internal state of MD5 after hashing P2. You can tweak the first few
bits of n if you want IV1 and IV2 to have a particular relation, as
long as it's not infeasible to find that relation. Then find two
1024-bit 2-block messages, M1, and M2, such that they create a
collision. Define v1 = p1 concatentated with M1, and similarly for v2.
Calculate D = abs(v1-v2). Factor D, finding a 1024-bit prime that
evenly divides D ("i"). If you can't easily factor D, or it doesn't
yield a 1024-bit prime factor, find another collision pair. Now
construct the last bits of n (call them t), which will be used in both
requests, such that n1 = v1 concatentated with k and n2 = v2
concatenated with k are both equal to i times j1 and j2 (each 1024-bit
primes). I think it may be possible to acclerate it, but even if you
just have to calculate collisions until it's true, you should be able
to find a collision D and a value k that gives primes i, j1, and j2
once in every ~30 million collisions. (I'm taking a stab in the dark
that values of D that have a single 1024-bit factor have 1/300
density, but this is demonstrably worst-case.) If you can find such a
collision with 5 minutes of CPU time, the attack will take, at worst,
275 CPU-years. This is university-feasible.

Of course, I don't have any idea how hard it is to extend the known
attack with common IVs to one in which the IVs are not common.

I'd recommend that anyone operating a CA choose serial numbers which are
long & unpredictable; if you want some structure, append a random 128 bit
value to your structure.

- Tim

The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at

More information about the cryptography mailing list