[Cryptography] defending against common errors (was: secure erasure)

John Denker jsd at av8n.com
Sun Sep 11 16:43:43 EDT 2016


On 09/09/2016 03:22 PM, Ray Dillinger wrote:

> operating systems are written in C, so you can't get secure erasure 
> in anything else unless you can get it in C.

That is a false premise leading to an incorrect conclusion.

A typical operating system such as Linux is not entirely written
in C, and the non-C stuff is highly significant to this discussion.

By way of analogy: a typical operating system nowadays supports
multiple processes running on a single processor ... and a single
process running on multiple processors.  Implementing this on an
x86 machine requires using features like MFENCE and LOCK.
  http://x86.renejeschke.de/html/file_module_x86_id_159.html
  http://x86.renejeschke.de/html/file_module_x86_id_170.html

Such features do not exist in the C language spec.  To argue that
they therefore do not exist at all would be incorrect.

I would argue that any "secure erase" routine ought to do an MFENCE.
This concept is foreign to C, but it still needs to be done.
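Here is a minimal sketch of what I have in mind, assuming x86
with SSE2 and a gcc-style compiler; the function name is mine,
not anything standard.  The volatile stores discourage dead-store
elimination, and the fence comes from an intrinsic, i.e. from
outside the C language proper:

#include <stddef.h>
#include <emmintrin.h>          /* _mm_mfence(), x86 SSE2 intrinsic */

/* A sketch, not a guarantee:  zeroize a buffer, then fence. */
static void secure_erase(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;               /* volatile stores resist elision */
    _mm_mfence();               /* order the stores; foreign to C */
}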

I have yet again changed the Subject: line.  If you are still
asking whether genuinely secure erase can be implemented entirely
in C, the answer is still NO.

As for the more interesting question of how to defend against
common errors, such as we saw in connection with heartbleed, I
suggest that getting the compiler to enforce runtime bounds-
checking is more useful than zeroization.
  http://williambader.com/bounds/example.html
I reckon that turning off bounds-checking is the sort of thing
Knuth was talking about when he said "premature optimization is
the root of all evil."  If you want to optimize things properly,
find the 1% of the code where it actually matters, seriously audit
that code, and then turn off bounds checking for that code only.
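To make this concrete, here is a toy version of a heartbleed-style
bug (the example is mine).  An unchecked build quietly prints
whatever byte happens to sit past the end of the buffer; a
bounds-checked build -- the patched gcc at the URL above, or a
modern stand-in such as gcc/clang with -fsanitize=address -- halts
right at the bad read, pointing at the real bug:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[8];
    strcpy(buf, "secret");
    for (int i = 0; i <= 8; i++)   /* off-by-one: index 8 is out of bounds */
        printf("%02x ", (unsigned char)buf[i]);
    putchar('\n');
    return 0;
}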

Also, as previously mentioned, there are things like munmap()
and Electric Fence (efence).
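For instance (a sketch, assuming a POSIX system): put the
sensitive buffer in its own mapping, and unmap it when done.
Any stale pointer that later touches the region takes a SIGSEGV
you can catch and fix, rather than quietly reading leftover data:

#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 4096;
    unsigned char *key = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (key == MAP_FAILED)
        return 1;

    /* ... derive and use the key material ... */

    memset(key, 0, len);        /* zeroize, then take the page away */
    munmap(key, len);

    /* key[0];  -- any such stale reference now faults loudly */
    return 0;
}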

When the code makes an out-of-bounds reference, prior zeroization
of the buffer reduces the seriousness of the problem ... but it
does not detect the fundamental problem, i.e. the broken code.
Furthermore, compile-time "optimizations" often magnify the
seriousness of broken code, including by defeating the attempted
zeroization.  In contrast, munmap(), efence, and bounds-checking
have a better chance of /detecting/ the error so you can fix it
properly.  Zeroizing the buffers is not an acceptable substitute
for fixing the code.
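To spell out how the optimizer defeats zeroization, here is the
classic pattern (an illustration, not any particular incident):

#include <string.h>

void handle_secret(void)
{
    char key[32];
    /* ... derive and use the key ... */
    memset(key, 0, sizeof key);  /* dead store: key is never read
                                    again, so under the as-if rule
                                    the compiler may delete this,
                                    leaving the secret on the stack */
}

This is exactly why the volatile-plus-fence sketch earlier avoids
a plain memset().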

> you can't get it in assembly language unless you can get it in
> silicon.

I agree with that.

This gets back to the point I've been making for a while:  Real
security needs support at every level from the transistors on up.

As a small step in that general direction, one could argue for
the opposite of MFENCE, perhaps an ENCLAVE directive, i.e. a way
to declare that certain information must spend its entire life
inside some well-specified enclave.
 *) Familiar example:  If you assume that main memory is secure
  but the swap device is not, then it makes sense to disable
  swapping.  This requires support from the OS; see the mlock()
  sketch after this list.
 *) If you assume the L2 cache is secure but main memory is not,
  it would be nice to have a way to require that selected bits
  of cache never get spilled to main memory.  This requires
  support from the hardware at a rather low level.  This is
  not a completely crazy idea; note that there already exist
  "some" cache control instructions that do "almost" (but alas
  not quite) what you want:
    http://x86.renejeschke.de/html/file_module_x86_id_143.html
    https://en.wikipedia.org/wiki/Cache_control_instruction
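For the swap case the OS support already exists.  A minimal
sketch, assuming a POSIX system (the function name is mine; note
that mlock() can fail for lack of privilege or because of
RLIMIT_MEMLOCK):

#include <string.h>
#include <sys/mman.h>

static unsigned char key[32];

int pin_key(void)
{
    if (mlock(key, sizeof key) != 0)    /* keep these pages in RAM,
                                           never on the swap device */
        return -1;

    /* ... derive and use the key ... */

    memset(key, 0, sizeof key);         /* zeroize before unlocking */
    munlock(key, sizeof key);
    return 0;
}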

More generally:  As part of asking What's Your Threat Model, it
always pays to ask What's Your Security Perimeter.

Here's another tiny step in that general direction.  On 09/09/2016
09:10 AM, Henry Baker suggested "linear types" as a style of
programming where the hardware is not allowed to make copies of
certain information.  That's tricky to implement.  Tricky, but
not impossible.  Consider the assembly code shown below.  Alas,
the hypothetical "shy" and "unshy" instructions do not exist in
present-day hardware.  The rest is perfectly ordinary.

The idea is that we don't trust anybody, not even the interrupt
handler, with the information in variables a, b, c, d, e, and f.

The specification goes like this:  The instruction "shy %eax"
means that from now on (until the next unshy), whenever there is
an interrupt, hardware will clear the %eax register before the
interrupt handler gets control.  Also, the return-from-interrupt
must return to the most recent "restartable point".

For this to make sense, the code between restartable points
must be idempotent, so that re-executing a partially-completed
block after a rollback produces the same result as executing
it once.

mul:
        shy     %eax            /* hypothetical */

        /* shy restartable point */
        movl    a(%rip), %eax   /*   a *= b;     */
        imull   b(%rip), %eax
        movl    %eax, a(%rip)

        /* shy restartable point */
        movl    c(%rip), %eax   /*   c *= d;     */
        imull   d(%rip), %eax
        movl    %eax, c(%rip)

        /* shy restartable point */
        movl    e(%rip), %eax   /*   e *= f;     */
        imull   f(%rip), %eax
        movl    %eax, e(%rip)

        /* shy restartable point */
        xorl    %eax, %eax      /* for security */

        /* shy restartable point */
        unshy                   /* hypothetical */

        ret


The key idea here is that if you want to control the making
of extraneous copies, it's doable, but not easy.  As always,
security requires attention to detail, at every level, from
the fundamental physics on up.

Tangential technical points:

*) It should be obvious at runtime where the "restartable points"
 are.  They include the point just before loading the register,
 and the points just before and after clearing it.

*) All points outside the shy...unshy block are restartable, in
 the ordinary old-fashioned way.

*) In case you're wondering why we don't just turn off interrupts
 in the mul() routine (instead of backtracking over idempotent
 code):

 a) Some modes on the x86 have a non-maskable interrupt.
  Whether that is a good idea or not is debatable, but let's not
  go there.

 b) More to the point:  Suppose the user code has a low priority,
  and interrupts are rare but urgent.  That's a good reason to
  grant the interrupt immediately and backtrack over the user
  code when necessary.

 c) If you prefer an alternative implementation that disallows
  interrupts except at restartable points, that's OK with me.
  The key idea is the same either way.


