[Cryptography] let's kill md5sum!

Sun Jun 14 09:28:30 EDT 2015

On Sat, Jun 13, 2015 at 3:52 PM, Ray Dillinger <bear at sonic.net> wrote:

>
>
> On 06/06/2015 12:48 PM, Alexandre Anzala-Yamajako wrote:
> > Just a thought...
> > If we're going to kill off md5sum and break users' habits and scripts, we
> > might as well do it once and for all.
> > Why not build a tool called hashsum whose options are md5, sha2, sha3 and
> > blake2? This tool could be transparently updated without breaking
> > compatibility in the future, and the man page would explain the rationale
> > for each option (md5 would be marked as deprecated but kept for verifying
> > old file hashes, for example).
>
> But the problem with 'hiding' the hash algorithm behind a tool
> named hashsum is that if the algorithm behind it ever changes,
> then a bunch of big userland software archives, repositories,
> filesharing systems, and databases will immediately break.
>
> Breaking existing userland stuff isn't something you can fix
> by hiding the change behind a generic name suitable for scripts
> etc.  Any change means that all the existing checksums are no
> longer good, and all the data in those vital applications is
> suddenly useless.
>
> What's needed is a way for migration of userland applications
> from one hashing algorithm to another to happen.  That means
> additional functionality has to be added to all that database,
> archive, and repository software:  It needs to be able to take
> a one-time command to replace (all and _only_ the correct)
> checksums of the current algorithm, with checksums that are
> correct according to the new algorithm.
>

We had a long discussion of this on the OpenPGP list. This is the result:

https://tools.ietf.org/html/draft-hallambaker-udf-00

The basic idea is that the fingerprint consists of a version/algorithm
identifier, initially one byte, followed by an optionally truncated hash
value. The default encoding is Base32, which is the best compromise between
density and convenience.

The initial spec has code points for SHA-2-512 and SHA-3-512 (the latter
obviously not implemented yet).

   The binary encoding of a fingerprint is calculated using the formula:

   Fingerprint = Version-ID + H ( Content-ID  + ':' + H(Data))
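
A minimal Python sketch of that formula, assuming SHA-2-512 for H, a UTF-8
encoded Content-ID, and unpadded Base32 output. The version byte 0x60, the
14-byte truncation, and the function name are placeholders chosen for
illustration, not values taken from the draft:

```python
import base64
import hashlib

# Assumed version byte for the SHA-2-512 code point; a placeholder chosen for
# this sketch, so check the draft for the value it actually assigns.
VERSION_SHA2_512 = 0x60


def fingerprint(content_id: str, data: bytes,
                version: int = VERSION_SHA2_512, hash_bytes: int = 14) -> str:
    """Return Base32(Version-ID + truncated H(Content-ID + ':' + H(Data)))."""
    # Inner hash: H(Data)
    inner = hashlib.sha512(data).digest()
    # Outer hash: H(Content-ID + ':' + H(Data))
    outer = hashlib.sha512(content_id.encode("utf-8") + b":" + inner).digest()
    # Prepend the one-byte version identifier to the truncated outer hash
    binary = bytes([version]) + outer[:hash_bytes]
    # Base32-encode and drop any '=' padding
    return base64.b32encode(binary).decode("ascii").rstrip("=")


if __name__ == "__main__":
    print(fingerprint("application/pkix-keyinfo", b"example key bytes"))
```

With these assumed parameters the binary fingerprint is 15 bytes, which
encodes to 24 Base32 characters. The same data under a different Content-ID
yields a different fingerprint.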

The point of this scheme is that it allows one fingerprint format to be used
to declare a trust anchor of any type or format. In the crypto space these
might be:

* OpenPGP Key Binding     application/openpgp-key-v5
* PKIX KeyInfo element    application/pkix-keyinfo
* DNSSEC trust anchor     application/dnssec-key

If you don't know the content type of a data blob, you can efficiently
check against a list of known content types without having to reprocess the
content each time.
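
A rough sketch of why that check is cheap, assuming SHA-2-512 for H: the
inner hash H(Data) is computed once, and each candidate content type then
costs only one short outer hash. The function names and the candidate list
are illustrative, not part of the draft:

```python
import hashlib

# Illustrative candidate list; a real application would supply its own.
KNOWN_TYPES = [
    "application/openpgp-key-v5",
    "application/pkix-keyinfo",
]


def outer_hash(content_id: str, inner_digest: bytes) -> bytes:
    """H(Content-ID + ':' + H(Data)), given the precomputed H(Data)."""
    return hashlib.sha512(content_id.encode("utf-8") + b":" + inner_digest).digest()


def identify_content_type(data: bytes, expected: bytes, candidates=KNOWN_TYPES):
    """Return the first candidate whose (possibly truncated) hash matches."""
    inner = hashlib.sha512(data).digest()  # the only pass over the data
    for content_id in candidates:
        # Compare only as many bytes as the truncated fingerprint carries
        if outer_hash(content_id, inner)[:len(expected)] == expected:
            return content_id
    return None
```

Here `expected` is the hash portion of a fingerprint with the version byte
already stripped. However large the data blob is, it is read only once.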

A code distribution is a type of trust anchor. We could use the content
type application/executable but it is probably better to differentiate
according to processor architecture, binary format, etc.

One of the changes in the industry that more folk should be taking notice
of is that Microsoft has put their .NET code out under an MIT open source
license and will be supporting OS X and Linux platforms. Starting to use
managed code by default for crypto and network code would be a huge step
forward in security.