A risk with using MD5 for software package fingerprinting
Arnold G. Reinhold
reinhold at world.std.com
Sun Jan 27 12:07:21 EST 2002
The cryptographic hash function MD5 is often used to authenticate
software packages, particularly in the Unix community. The MD5 hash
of the entire package is calculated and its value is transmitted
separately. A user who downloads the package computes the hash of the
copy received and matches the value against the original.
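As a minimal sketch of the verification step just described, assuming
the package is a local file and the published hash arrives as a hex
string. The names file_digest and matches_published are illustrative
only and do not come from any particular tool:

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path, published_hex, algorithm="md5"):
    """Compare the computed digest with the separately transmitted value."""
    # Normalize whitespace and case; hex digests are case-insensitive.
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

The algorithm parameter is there so the same check can be run with a
wider hash such as "sha256" without other changes.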
Putting aside the question of how the hash value can be safely
transmitted separately, there is a potential attack on this method
due to the 128-bit length of the MD5 hash output.
If all the individuals having input to the creation of the original
software package are trustworthy, then 128 bits would appear to
provide adequate security. A man-in-the-middle attacker would have to
solve a 128-bit problem to create a Trojan-horse-infected package
that passed the hash verification. That is considered computationally
infeasible, at least until the advent of practical quantum computing.
One might think the above argument proves MD5 is sufficient, since if
an attacker had an agent working inside the organization that
produced the package, the agent could simply insert the Trojan
software patch in the original package. However, such an insertion is
very risky. A sophisticated software company would likely have code
reviews that would make introduction of the Trojan code difficult. In
an open source model, anyone could detect the insertion. The
insertion would then be foiled, the agent would be uncovered and the
technical means that the Trojan employed would be compromised.
A safer attack would be for the agent to insert an apparently
innocent modification into the package, selected so that the MD5 hash
of the package with the Trojan code matches the hash of the original
package. Since the attacker controls the Trojan code and can vary it
freely, calculating the value of this modification is subject to the
birthday paradox and presents a 64-bit problem. Solving such a
problem is within the means of a well-funded attacker today.
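The birthday search can be illustrated at toy scale. The sketch below
is an assumption-laden demonstration, not an attack on MD5 itself: it
truncates MD5 to a small number of bits (truncated_md5 is a made-up
helper), and it stands in for the "innocent modification" by appending
an 8-byte counter to each package. It varies the clean and the Trojan
package together and stops when some clean variant and some Trojan
variant share a truncated hash:

```python
import hashlib

def truncated_md5(data, bits):
    """Top `bits` bits of MD5(data) as an int (toy stand-in for a short hash)."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)

def find_matching_pair(clean, trojan, bits=24, max_tries=1 << 20):
    """Search for a clean variant and a Trojan variant with equal truncated hash.

    Returns (clean_variant, trojan_variant, tries) or None if max_tries is hit.
    """
    seen_clean = {}   # truncated hash -> clean variant
    seen_trojan = {}  # truncated hash -> Trojan variant
    for i in range(max_tries):
        filler = i.to_bytes(8, "big")  # hypothetical tweakable bits
        c, t = clean + filler, trojan + filler
        hc, ht = truncated_md5(c, bits), truncated_md5(t, bits)
        seen_clean[hc] = c
        seen_trojan[ht] = t
        if hc in seen_trojan:
            return c, seen_trojan[hc], i + 1
        if ht in seen_clean:
            return seen_clean[ht], t, i + 1
    return None
```

For a b-bit hash the number of variants needed is expected to be on
the order of 2^(b/2) rather than 2^b, which is why the full 128-bit
MD5 output yields roughly a 64-bit problem for this kind of attacker.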
The modification could be designed to get past code reviews in a
number of ways. For example, 64 low-order bits in a JPEG icon might
be altered. The agent would have to be in a position to make the last
modification to the software package prior to release and to send a
final pre-release version of the package to the attacker, but those
are hardly insurmountable hurdles. In the open source model, where
new releases can be frequent, it may suffice to carry out this attack
only occasionally, say to recover private keys.
The obvious solution to this problem is to use a wider hash. For
example, SHA-256 would present a group using this attack with a
128-bit problem. Even SHA-1 would be preferable, making such an attack
an 80-bit problem. The cost of using a wider hash in this situation
is trivial. It would seem the prudent thing to do.
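The figures above follow from the birthday bound, which puts the work
for this attack at about half the hash output length in bits. A short
sketch of that arithmetic, reading the output sizes from Python's
hashlib (birthday_work_bits is an illustrative name):

```python
import hashlib

def birthday_work_bits(algorithm):
    """Approximate birthday-attack work factor, in bits, for a hash algorithm."""
    output_bits = hashlib.new(algorithm).digest_size * 8
    return output_bits // 2

for name in ("md5", "sha1", "sha256"):
    print(name, birthday_work_bits(name))
# prints:
# md5 64
# sha1 80
# sha256 128
```

This counts only the generic birthday bound and assumes no structural
weakness in the hash function itself.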
Arnold Reinhold