[Cryptography] let's kill md5sum!

Jerry Leichter leichter at lrw.com
Mon Jun 8 05:51:49 EDT 2015


On Jun 7, 2015, at 9:25 AM, Heinz Diehl <htd+ml at fritha.org> wrote:
> There are many use cases where its vulnerabilities are not a weakness, as e.g.
> in data mining, probabilistic string and pattern matching and many
> more. So why remove it (and breaking a lot of software)?
> 
> The point is that its use as a cryptographic hash should be abandoned,
> but not its use in general.

Recall Zooko's comment in the base post:  "I did a quick and dirty benchmark ... and was delighted that b2sum (in BLAKE2sp mode) was almost twice as fast as md5sum on my Intel Core-i5 laptop!"
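
For anyone who wants to repeat that kind of quick and dirty comparison, here is a rough sketch in Python.  It is not Zooko's benchmark: it times the hashlib implementations rather than the md5sum and b2sum binaries, and hashlib offers only BLAKE2b and BLAKE2s, not the parallel BLAKE2sp mode he used, so the numbers will not match his.  The function name and buffer size are my own choices.

```python
import hashlib
import time


def throughput_mb_per_s(algorithm: str, data: bytes, repeats: int = 3) -> float:
    """Return the best observed hashing throughput in MB/s for one algorithm."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(algorithm, data).digest()
        elapsed = time.perf_counter() - start
        # Guard against a zero reading on very small inputs or coarse timers.
        best = min(best, max(elapsed, 1e-9))
    return len(data) / best / 1e6


if __name__ == "__main__":
    # 16 MiB of zeros held in memory, so disk speed does not enter into it.
    payload = bytes(16 * 1024 * 1024)
    for name in ("md5", "blake2b", "blake2s"):
        print(f"{name:8s} {throughput_mb_per_s(name, payload):10.1f} MB/s")
```

Results will vary with the CPU and with how the local Python and OpenSSL were built, so treat any single run as an indication only.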

It's *possible* that one could find other cases where md5sum is actually faster, but it seems unlikely:  Even when it was first introduced, MD5 was the "slower but more conservative hash, compared to MD4".  (Not long after, MD4 was broken badly enough that its use in security-related applications never caught on - though in non-security use cases it was perhaps a better alternative!)

Given this, the only possible reason in *any* use case to choose MD5 is to compare to existing, historical records of checksums.  It's hard to judge how many of these are out there - and, more to the point, how many are out there *and use md5sum as a command line utility*, rather than using the MD5 algorithm internally.  Probably not many, would be my guess.

BTW, Zooko's goal, broad as it is, barely scratches the surface.  I personally never think of using md5sum - "openssl dgst" is probably more broadly available and by default it provides an MD5 checksum!  (Of course, openssl tries to cover the historical bases - it also supports MD2, among other long-dead algorithms.)  If we're talking about killing md5sum, we should simultaneously be planning to change the default for openssl dgst.

Then there's the flip side:  There are plenty of sites on the Internet that publish the MD5 checksums of things like software distributions to allow people to check that they have an untampered-with download.  This is *exactly* the use case most subject to significant attack!  I can't recall the last time I saw a site that provided *only* MD5 - most provide both MD5 and SHA1 - but as long as the checksums are there and the md5sum and openssl dgst command lines provide for quick checks ... people will rely on them.
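
To make the download-checking case concrete, here is a minimal sketch in Python of comparing a file against a published digest.  The choice of SHA-256 as the default is my assumption, not something any particular site publishes, and the function names are hypothetical.  The algorithm is always named explicitly, so nothing depends on a tool's default.

```python
import hashlib
import hmac


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks and return the hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large downloads need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if the file's digest equals the published hex string."""
    actual = file_digest(path, algorithm)
    # Normalize case and surrounding whitespace before comparing.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

A caller would pass the path of the downloaded file and the hex string copied from the site, and refuse to install on a False result.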

                                                        -- Jerry


