[Cryptography] let's do something intelligent about md5sum!

John Denker jsd at av8n.com
Sat Jun 13 17:18:36 EDT 2015


On 06/13/2015 11:00 AM, Russ Nelson wrote:

> Fix first problems first:
...
> b2sum: command not found
...
> E: Unable to locate package b2sum

Well, fixing "first things first" is nice, but
sometimes the first thing we see is a tree that 
obscures our view of the forest.

The fact is, introducing b2sum is not sufficient
to "kill md5sum".  It might not even be a step
in the right direction.

This gets back to our previous discussion of
life-cycle strategy:  Even a neophyte designer
realizes his pet project needs an introduction
strategy.
  In contrast it takes more sophistication to
  appreciate the importance of the corresponding
  outroduction strategy later in the cycle.

We have a problem because the md5sum command
never had an outroduction strategy.  Introducing
a new b2sum command will not solve this problem,
and indeed could easily make it worse, insofar
as b2sum doesn't have an outroduction strategy
either.

This is due partly to the nature of the commands
themselves, and partly to the way they are used
within larger systems ... but the point remains,
we need to take a step back and look at the whole
forest.  We need to /solve the whole problem/.

As numerous people have pointed out, it is not
practical to "kill md5sum", and will not be 
practical anytime soon.  A simplified view of
the life cycle should look something like this:
   first-generation thing
   transition / overlap phase
   second-generation thing
   transition / overlap phase
   third-generation thing
   et cetera.

There has been some talk of "agility".  That's
too vague to be useful.
  -- Agility is bad if it leads to promiscuity.
   Promiscuity leads to downgrade attacks, i.e. the
   worst of all worlds.
  ++ Agility is good if it allows us to construct
   a well-behaved upgrade path.

Therefore I suggest that rather than focusing on
b2sum, we should be thinking in terms of a
   chksum
package that takes a grown-up approach to the 
life-cycle issue:
  -- It should know what to do with a legacy md5sum.
  -- It should know how to generate something better
   if-and-when possible.
  -- It should function correctly during the
   transition periods between generations.
  -- That includes recognizing that it, too, will 
   have to be phased out at some point!
  -- It may have to be a package of things, not
   just a single program.
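
To make the list above a bit more concrete, here is a
minimal sketch in Python of what the core of such a package
might look like.  Everything in it is an assumption made for
illustration:  the names GENERATION, PREFERRED, generate()
and verify() are hypothetical, and the ordering of the
algorithms is just a stand-in for "older generation" versus
"newer generation", not a considered cryptographic ranking.

```python
import hashlib
import hmac

# Hypothetical generation table: a higher number means a later generation.
# The ordering is illustrative only.
GENERATION = {"md5": 1, "sha256": 2, "blake2b": 3}

# Hypothetical choice of what to generate when we are free to choose.
PREFERRED = "blake2b"


def generate(data):
    """Return (algorithm name, hex digest) using the preferred algorithm."""
    return PREFERRED, hashlib.new(PREFERRED, data).hexdigest()


def verify(data, algo, digest, min_algo="sha256"):
    """Check data against a digest, refusing algorithms older than min_algo.

    Legacy md5 checksums are understood, but are accepted only when the
    caller explicitly lowers min_algo.  That is the guard against
    silently downgrading.
    """
    if algo not in GENERATION:
        raise ValueError("unknown checksum algorithm: " + algo)
    if GENERATION[algo] < GENERATION[min_algo]:
        raise ValueError(algo + " is older than the minimum " + min_algo)
    actual = hashlib.new(algo, data).hexdigest()
    return hmac.compare_digest(actual, digest.lower())
```

With this sketch, verify(data, "md5", digest) raises an
error unless the caller passes min_algo="md5".  Accepting a
legacy checksum is then a visible decision, not a default.
Phasing out a generation amounts to raising the default
min_algo and, later, deleting the table entry.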

================

True story about quick-and-dirty solutions:  The first
program I ever wrote was a solution to a very specific 
problem.  It was written between dinner and midnight 
one evening.  It wasn't designed;  it was hatched.

The funny thing was, the program was still in heavy
use several years later.  People complained that it
had some nasty limitations.
  jsd:        If you don't like it, why don't you improve it?
  other guy:  It's too badly designed, and there's not a
              single comment in the whole source file.
  jsd:        So why don't you write something better,
              starting from scratch?
  other guy:  Not worth the trouble.  The thing sorta
              works well enough at the moment.....

The 500th time I had this conversation, I got really
tired of it.  I learned my lesson.  The cost of a program
has to include the cost of the whole life-cycle, including
support, evolution, and outroduction.  There's always a 
lot of pressure to get "something" out the door, but 
sometimes that leads to disastrously false economies.

As a tiny but specific example of the sort of thing I'm
talking about, consider a manifest containing filenames
and checksums.  I say the manifest ought to contain some
sort of version marker that indicates which checksum
algorithm was used (md5sum or whatever).  The counter-
argument is that the file is shorter and easier to parse
if it does not contain the version marker ... but still
I insist that omitting the marker is a false economy,
because it makes it harder to blaze a migration path
later in the life cycle.
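
Here is a rough sketch in Python of what such a marker could
look like.  The layout is hypothetical, invented for this
example:  a first line of the form "#chksum-algo: NAME",
followed by "DIGEST  FILENAME" lines.  The sketch also
assumes that a manifest with no marker is legacy md5sum
output in its two-space text-mode form.

```python
import hashlib

# Hypothetical marker line; not part of any existing tool's format.
MARKER = "#chksum-algo: "


def write_manifest(files, algo="blake2b"):
    """Build manifest text from a dict mapping filename -> file contents (bytes)."""
    lines = [MARKER + algo]
    for name in sorted(files):
        digest = hashlib.new(algo, files[name]).hexdigest()
        lines.append(digest + "  " + name)
    return "\n".join(lines) + "\n"


def read_manifest(text):
    """Parse manifest text into (algorithm name, dict filename -> hex digest).

    Assumption: a manifest without the marker line is treated as legacy md5.
    """
    lines = text.splitlines()
    if lines and lines[0].startswith(MARKER):
        algo = lines[0][len(MARKER):].strip()
        lines = lines[1:]
    else:
        algo = "md5"
    entries = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        digest, name = line.split("  ", 1)
        entries[name] = digest
    return algo, entries
```

The marker costs one extra line and a few lines of parsing.
In exchange, the reader never has to guess the algorithm
from the digest length, and old unmarked manifests remain
readable during the transition.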

  "Premature optimization is the root of all evil."
                -- Knuth

  "Solve the /whole problem/."

  "That includes support, evolution, and outroduction."


