[Cryptography] let's kill md5sum!
Jerry Leichter
leichter at lrw.com
Sat Jun 6 06:00:31 EDT 2015
On Jun 5, 2015, at 10:22 PM, Zooko Wilcox-OHearn <zooko at leastauthority.com> wrote:
> [S]ome people tell me "Okay, we're
> going to switch from MD5 to BLAKE2, but our hash values have to fit
> into the fields where we used to store our MD5 hashes.". I tried my
> hardest to explain that no matter how good the hash function is,
> truncating the output to 128 bits is going to leave users potentially
> vulnerable to collision attacks at some point down the road. The
> response was "Well, we'll just take our chances, because we can't
> change the schema."....
While more bits for a hash function are certainly better, 2^64 (the birthday-bound work to find a collision in a 128-bit output) is a *big* number. You really need to run the economics here: Assuming technology keeps advancing at current rates, in (say) 25 years, how much will it cost to do 2^64 BLAKE2 computations? How does that compare to the value of one collision?
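To make the economics question concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration, not a measurement: the hash rate per machine, the price of a machine-hour, and the doubling period that stands in for "technology keeps advancing at current rates". It also ignores the memory and communication costs of a real collision search, so treat it as a lower bound on effort at best.

```python
def collision_cost_usd(output_bits, hashes_per_sec, usd_per_machine_hour,
                       years_from_now=0.0, doubling_period_years=2.0):
    """Estimate the cost of a brute-force collision search.

    All parameters are hypothetical inputs supplied by the caller:
      output_bits            size of the (truncated) hash output
      hashes_per_sec         assumed hash rate of one machine today
      usd_per_machine_hour   assumed price of one machine-hour
      years_from_now         how far in the future the attack happens
      doubling_period_years  assumed time for price/performance to double
    """
    # Birthday bound: roughly 2^(n/2) evaluations for an n-bit output.
    work = 2 ** (output_bits // 2)
    # Assumed exponential improvement in hashes per dollar over time.
    speedup = 2 ** (years_from_now / doubling_period_years)
    machine_seconds = work / (hashes_per_sec * speedup)
    return machine_seconds / 3600 * usd_per_machine_hour


if __name__ == "__main__":
    # Illustrative inputs only: 2^30 hashes/s per machine, $1 per machine-hour.
    for years in (0, 10, 25):
        cost = collision_cost_usd(128, 2 ** 30, 1.0, years_from_now=years)
        print(f"{years:2d} years out: about ${cost:,.0f}")
```

With the assumed rate of 2^30 hashes per second, 2^64 evaluations come to 2^34 machine-seconds, or roughly 4.8 million machine-hours today; the number to compare against is whatever one collision is actually worth to an attacker.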
Many applications that store a checksum also store a data length. To be useful, would a collision have to be for data of (close to) the same length? If so, an attack gets harder, as you can't simultaneously attack all protected items - only those with (close to) the same length; and each computation has to be over data of that length, which is more expensive. Since we're only talking about brute force here, defining a standard salting mechanism and choosing a per-site salt would force an attacker to pick a particular site to attack. Some applications could be finer-grained - e.g., a per-database salt.
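A minimal sketch of what a salted 128-bit checksum stored alongside the data length might look like, using the salt parameter that Python's hashlib.blake2b already provides. The function names and the choice of a random 16-byte per-site salt are assumptions for illustration; no standard salting mechanism is being described here. Note also that requesting digest_size=16 yields a 16-byte digest that is not the same bytes as the first 16 bytes of a full 64-byte BLAKE2b digest, since the output length is part of BLAKE2's parameter block.

```python
import hashlib
import os


def new_site_salt() -> bytes:
    # Hypothetical helper: one random salt per site (or per database).
    # BLAKE2b accepts a salt of at most 16 bytes.
    return os.urandom(hashlib.blake2b.SALT_SIZE)


def salted_checksum(data: bytes, site_salt: bytes) -> tuple[int, bytes]:
    """Return (length, 128-bit salted BLAKE2b digest) for storage.

    The 16-byte digest fits a field that previously held an MD5 hash.
    """
    digest = hashlib.blake2b(data, digest_size=16, salt=site_salt).digest()
    return len(data), digest


def matches(data: bytes, site_salt: bytes, stored: tuple[int, bytes]) -> bool:
    # A candidate must match both the stored length and the stored digest.
    return salted_checksum(data, site_salt) == stored
```

Because the salt enters the hash computation, work done searching for collisions under one site's salt does not carry over to a site with a different salt.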
So ... I wouldn't dismiss their decision completely. Schema changes at large scale can be very disruptive, and people do try to avoid them. (I'm actually not even sure how someone would transition from a large collection of existing MD5 checksums to shrunken BLAKE2 checksums. Would they recompute all the checksums at once? This could be an impractically long operation. Do they have a spare bit somewhere they can use as an algorithm flag? I suppose if records always have a creation date, you could define a cutover time: Any record created before T uses MD5, any record at T or later uses BLAKE2. You are forever subject to attacks against old records, but maybe they get less interesting over time - and if you're really worried you can do a long-term project to recompute checksums for old records and move T backwards.)
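The cutover idea can be sketched as follows, assuming each record carries a timezone-aware creation timestamp. The cutover value, the function names, and the use of 16-byte BLAKE2b output are all hypothetical choices for illustration, not a description of any particular system.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical cutover time T; records created before it keep MD5 checksums.
CUTOVER = datetime(2016, 1, 1, tzinfo=timezone.utc)


def record_checksum(data: bytes, created_at: datetime,
                    cutover: datetime = CUTOVER) -> bytes:
    """Compute the 16-byte checksum using the algorithm implied by the date."""
    if created_at < cutover:
        return hashlib.md5(data).digest()  # legacy records
    # Records at T or later: BLAKE2b sized to fit the old 128-bit field.
    return hashlib.blake2b(data, digest_size=16).digest()


def verify_record(data: bytes, created_at: datetime, stored: bytes,
                  cutover: datetime = CUTOVER) -> bool:
    expected = record_checksum(data, created_at, cutover)
    return hmac.compare_digest(expected, stored)
```

Moving T backwards then amounts to recomputing the checksums of records created in [new T, old T) with BLAKE2 and only afterwards lowering the stored cutover value, so no record is ever verified with the wrong algorithm.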
-- Jerry