[Cryptography] letter versus spirit of the law ... UB delenda est

Jerry Leichter leichter at lrw.com
Sat Oct 24 15:16:49 EDT 2015


Further on the matter of "undefined".  Early computers often left the results of some operations "undefined" - implementations did whatever they did.  The PDP-8 was famous for "undefined" operations that people tested out and found uses for.  Of course, this made life hard for designers of new implementations of old architectures, as they were often called on to make new designs duplicate the incidental behavior of old ones.

By the time DEC designed the VAX - intending to build a whole series of machines conforming to a single architecture - they explicitly specified as much as possible.  For example, unused input bits were specified as MBZ (Must Be Zero), and implementations were required to take a fault if any MBZ bit was passed as 1.
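To make the MBZ rule concrete, here's a minimal C sketch - the field layout and mask are hypothetical, not taken from any actual VAX operand encoding:

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical sketch of the MBZ rule.  Suppose an operand word
       reserves its top four bits as Must Be Zero.  A conforming
       implementation must fault on any set MBZ bit rather than
       silently ignore it - which is what keeps those bits free to be
       given a meaning in some future revision of the architecture. */
    #define MBZ_MASK 0xF0000000u   /* hypothetical reserved-bit mask */

    bool operand_valid(uint32_t word)
    {
        /* Reject (i.e., take a fault on) any word with an MBZ bit set. */
        return (word & MBZ_MASK) == 0;
    }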

Still, there were some situations in which it was undesirable to pin down the exact results.  The VAX architecture defined two terms:  Undefined and Unspecified.  Undefined was essentially as you think of it:  You got results written where you would get them in the normal case, but they could be anything.  Unspecified meant the machine could do anything at all.  Unspecified results were allowed only in kernel mode.

Sounds like they pinned everything down fully, right?  Well ... a friend of mine who liked to stir up trouble brought up the following question (which brings us back to security and crypto):  Could an "undefined" result include information that the current process was not entitled to - e.g., the value that had been in a register while the previous process was running?  He brought this up in the internal VAX Architecture discussion group (a Notesfile, for those who remember such things).  All the VAX designers hung out there.

DEC was an engineer's company, and people didn't ignore challenges like this.  They quickly determined that the *architecture* didn't forbid this.  But the hardware guys went off and checked every implementation for each of the "undefined" results.  As it happened, in all cases, the actual result was either whatever had been there before, or 0.  Big sigh of relief by all - followed by discussions of how to modify the architecture spec to make sure no such leakage was possible.  We all eventually decided that there was no effective way to add the appropriate language at that point, so we left it alone - with internal guidance warning future architects of the potential exposure.

The Alpha architecture, designed a few years later, inherited the notion of Undefined and Unspecified behavior.  However, the definition of Undefined was pinned down:  There's a defined "user state" of the machine at any point in time, all of it accessible to user-mode code; and the value produced by an Undefined operation must be a function of the user state and nothing else.
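A toy emulator sketch of that rule (the names and state split are mine, not from the Alpha architecture manual): the result of an Undefined operation may be computed only from user-visible state, so nothing privileged or left over from another process can leak through it:

    #include <stdint.h>

    /* Hypothetical toy machine state, split the way the Alpha rule
       splits it: user_state is everything user-mode code can read;
       kernel_state is everything else. */
    struct user_state   { uint64_t regs[32]; };
    struct kernel_state { uint64_t prev_process_regs[32]; };

    /* Legal under the Alpha definition: the "undefined" result is an
       arbitrary function of user state only. */
    uint64_t undefined_result_ok(const struct user_state *u)
    {
        return u->regs[1] ^ u->regs[2];   /* arbitrary, but leaks nothing */
    }

    /* Illegal under the Alpha definition: the result depends on state
       the current process isn't entitled to see - exactly the VAX
       leak scenario described above. */
    uint64_t undefined_result_bad(const struct kernel_state *k)
    {
        return k->prev_process_regs[1];   /* information leak */
    }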

The point of all this?  The problem with the C standard is that its use of "Undefined" is really like DEC's use of "Unspecified".  The reasoning is roughly similar:  For a VAX or Alpha, a kernel-mode programmer is assumed to know what he's doing, and the software he's writing is tightly integrated with the hardware.  The traditional C "contract" is "trust the programmer".  Same idea:  The hardware, the C compiler, and the programmer are assumed to be tightly coupled partners in getting the job done.

Unfortunately, that description covers only a tiny fraction of C users today.  Adopting DEC's notion of "Undefined" - results constrained to visible state, rather than a license for anything at all - would be much safer and more predictable.  The compiler jocks will claim the resulting code won't be as small or fast ... but that's a claim that needs to be defended on real code, not on artificial examples concocted to show off some new kind of optimization.
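As a concrete illustration of what C's "anything at all" Undefined licenses (my example, not from the original post): a programmer's own signed-overflow check can be deleted outright, because the compiler may assume overflow never happens:

    #include <limits.h>
    #include <stdio.h>

    /* The programmer intends to detect wraparound before it happens.
       But signed overflow is undefined behavior in C, so a compiler
       may assume x + 1 never overflows and fold this test to 0. */
    static int increment_would_overflow(int x)
    {
        return x + 1 < x;   /* undefined when x == INT_MAX */
    }

    int main(void)
    {
        /* With optimization enabled, gcc and clang typically print 0
           here: the safety check has been "optimized" away entirely. */
        printf("%d\n", increment_would_overflow(INT_MAX));
        return 0;
    }

Under DEC's Undefined, x + 1 would still be *some* value derived from visible state, and the comparison would proceed honestly; under C's Undefined, the test itself evaporates.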
                                                        -- Jerry


