[Cryptography] Secure erasure

Henry Baker hbaker1 at pipeline.com
Tue Sep 13 13:43:33 EDT 2016


At 08:14 AM 9/13/2016, Ron Garret wrote:

>On Sep 12, 2016, at 8:44 PM, Henry Baker <hbaker1 at pipeline.com> wrote:
>
>> At 03:48 PM 9/12/2016, Watson Ladd wrote:
>>> If only there was a widely used embedded language designed by the DoD with built-in bounds checks, widespread compiler support on different architectures, and in GCC.
>>> 
>>> Maybe it's even named after a woman.
>> 
>> Having studied Ada at some length, I see no significant advantage
>> of Ada over C for any of the purposes we discuss on these lists.
>> 
>> My opinion isn't based on the characteristics of the language,
>> per se, but on the inability to control the fine details of the
>> implementation.
>> 
>> What's the point of having a garbage collector (in some versions
>> of Ada), if you can't control fine details enough to make sure
>> that it isn't going to ruin your real-time performance?
>
>Who are you, and what have you done with Henry Baker?  Because Henry Baker would surely know that:
>
>1.  Ada doesn’t have GC, not even in "some versions of Ada."  Ada was designed for writing embedded controllers, and everyone (you included, apparently, whoever you are) thinks that you can’t do that in a GC'd language.  (Everyone is wrong, and Henry Baker would surely know that too.)
>
>2.  Even if some versions of Ada had a GC you could easily avoid any problems that might cause by simply not using those versions.
>
>3.  Crypto code very rarely has real-time requirements.  Constant-time performance is desirable not for its own sake, but in order to avoid side-channel attacks.  Random timing variations produced by circumstances unrelated to key material (like a GC) can actually be beneficial.
>
>4.  Constant-time code can easily be written even in a GC'd language by writing code that doesn’t cons.
>
>5.  Hard-real-time GC's exist if for some reason you absolutely had to have real-time consing code.
>
>6.  The only thing standing between you and the ability to "control the fine details of the implementation" of a language is your willingness to hack a compiler, or write a vendor a check.
>
>#6 is the smoking gun that proves you aren't Henry Baker, because the real Henry Baker could write three Ada compilers before breakfast.  ;-)
>
>All that notwithstanding, I’ll go ahead and take the bait: What language would you advocate using for crypto if not Ada?  Surely not Scheme if your principal complaint about Ada is the GC?

There are Ada compilers targeting the Java Virtual Machine ... which itself has GC.

Also:

http://www.enyo.de/fw/software/gnat-gc/

Conservative Garbage Collection for GNAT

https://link.springer.com/chapter/10.1007/978-3-540-73230-3_18

Incorporating Precise Garbage Collection in an Ada Compiler

----

GC delays won't do very much to obfuscate crypto code (unfortunately); I suppose one could conceivably schedule the GC based on calls to /dev/(u)random, but normal (non-randomized) GC delays might themselves be even more of an information leak than other side channels.
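
(Concretely, the sort of thing I have in mind -- a rough sketch only, using the Boehm collector behind the GNAT conservative-GC link above as a stand-in for "an Ada runtime with GC", and an arbitrary ~1 ms jitter bound:)

/* Sketch only: force collections at randomly jittered moments, so GC
 * pauses are decorrelated from whatever the mutator just did with key
 * material.  Uses Boehm libgc (link with -lgc); not a real defense,
 * just an illustration of the idea.
 */
#include <gc.h>          /* Boehm GC; on some systems <gc/gc.h> */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static void randomized_collect(void)
{
    uint32_t r = 0;
    FILE *urandom = fopen("/dev/urandom", "rb");
    if (urandom) {
        if (fread(&r, sizeof r, 1, urandom) != 1)
            r = 0;
        fclose(urandom);
    }
    /* sleep 0..1023 microseconds, then collect */
    struct timespec delay = { 0, (long)(r % 1024) * 1000L };
    nanosleep(&delay, NULL);
    GC_gcollect();
}

int main(void)
{
    GC_INIT();
    void *p = GC_MALLOC(4096);   /* give the collector something to do */
    (void)p;
    randomized_collect();
    return 0;
}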

----

WWHBD (What Would Henry Baker Do)?

This is becoming very, very tough.

There are untrustworthy HW CPUs, where "instructions" in the benign form of "load r3,<magic-backdoor-constant#1>" are really "dog whistles" to hidden HW that starts doing bad things like storing registers in hidden memories, etc.  Let's ignore this class of attacks for this message.

There are HW CPUs that simply ignore standard semantics; programmers used to issue *instructions*, historically called "order codes".  Now, however, "instructions" and "orders" have been demoted to mere "suggestions" and "hints", so there's very little a programmer can 100% rely on actually being executed.  We started with "imprecise interrupts", graduated to "out-of-order execution" and "weak memory ordering", and now have nVidia's "dynamic code optimization" -- a kind of machine-language JIT compiler.  We'll try desperately to ignore this class of attacks for this message.

But even prior to nVidia, we have split instruction & data caches, so that obvious-looking instructions like "load r3,*" won't actually load the load instruction itself, but something completely arbitrary, depending upon what the TLBs and data caches happen to have had for breakfast that morning.  (This is Torrey's "split-TLB" hack.)  We'll again try desperately to ignore this class of attacks for this message.

There are HW CPUs that allow interrupts between any two instructions; this was, and always has been, a REALLY BAD IDEA, because it makes critical sections extremely hard to get right.  Anyone who has ever programmed in microcode will fully appreciate the flexibility to mask all interrupts for a few instructions.  Yes, such absolute power can corrupt absolutely, but the answer is code-stripping: the machine refuses to execute more than N instructions non-interruptibly before cancelling the whole program.  But once again, we'll try to ignore this class of attacks.
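
(For the record, where you still can mask interrupts -- e.g. on a bare-metal ARM Cortex-M in privileged mode -- the few-instruction critical section looks roughly like the sketch below; the N-instruction code-stripping limit is the part no hardware enforces for us, so it remains a matter of discipline:)

/* Sketch: masking interrupts around a short critical section on a
 * bare-metal ARM Cortex-M part (assumes privileged execution).
 */
static inline unsigned irq_save(void)
{
    unsigned primask;
    __asm__ volatile ("mrs %0, primask" : "=r" (primask));
    __asm__ volatile ("cpsid i" ::: "memory");   /* mask interrupts */
    return primask;
}

static inline void irq_restore(unsigned primask)
{
    __asm__ volatile ("msr primask, %0" :: "r" (primask) : "memory");
}

void zero_key_word(volatile unsigned *key_word)
{
    unsigned saved = irq_save();
    *key_word = 0;              /* the few instructions we care about */
    irq_restore(saved);
}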

Due to these *virtualizations* and constraint "relaxations", we programmers are now reduced to pleading on bended knee with the compiler, the operating system, and the hardware to please, please do as we ask.  (Cue diatribe re "volatile", etc., here.)  Or, more usually, to tricking them into doing what we want -- in *spite of* the best efforts of the compiler, the operating system, and the hardware to interfere.
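
(Here's the "volatile" diatribe in a few lines, and it's exactly the secure-erasure problem in the subject line.  A plain memset() of a buffer that is about to go dead is something the optimizer is allowed to delete; routing the call through a volatile function pointer is one widely used way of begging it not to, and explicit_bzero()/memset_s() exist, where available, for the same reason.  Note that this is begging, not ordering -- which is the whole complaint:)

#include <string.h>

/* The compiler may not assume anything about a call made through a
 * volatile function pointer, so it cannot prove the store is dead. */
static void *(*const volatile memset_v)(void *, int, size_t) = memset;

void wipe(void *secret, size_t len)
{
    memset_v(secret, 0, len);
}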

Given the continued pressure on CPU designers to optimize performance, I'm not certain that "constant time execution" is even possible anymore, so data-dependent information leaks will become ubiquitous.
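
(What "constant time" means at the source level is the usual no-secret-branches idiom sketched below; my point is that nothing obliges the compiler or the micro-architecture to keep it constant-time once they are done with it:)

#include <stddef.h>
#include <stdint.h>

/* Sketch: constant-time comparison -- no secret-dependent branches or
 * table indices at the C level; accumulate differences, never exit early. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= a[i] ^ b[i];
    return diff == 0;           /* 1 if equal, 0 otherwise */
}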

Part of my background was in *asynchronous* systems, where logic gates were free-running and took "as long as they needed", rather than being slaved to a crystal clock.  Clocked logic can dramatically reduce gate count, but it can also dramatically reduce performance.  While fully asynchronous designs have never gotten a lot of traction, clock speeds have increased substantially faster than instruction speeds, so instruction timings have become more and more variable.  I dare say that now that logic is so cheap, fully asynchronous designs should be re-evaluated.  Thus, I expect information leakage due to differential timing to go from a trickle to a torrent.

All this means that timing attacks need to be *blinded* by some (probably quite sophisticated) mechanisms, so that the information that does leak will be complete garbage.

Energy attacks will also become more common.  Unless every bit is also associated with an "anti-bit", the differential currents will produce voltage drops that can be measured.  In fact, with asynchronous logic, such voltage measurements will be easy, since different voltages will produce different timings for the same computation.
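
(The software cousin of the "anti-bit" is masking: split every secret into shares under a fresh random mask, so that whatever leaks about any one share -- timing, current, voltage -- is statistically independent of the secret.  A minimal first-order Boolean-masking sketch, deliberately ignoring mask quality and higher-order leakage:)

#include <stdint.h>

typedef struct { uint8_t share0, share1; } masked_u8;

/* Split a secret byte into (secret ^ mask, mask). */
masked_u8 mask_byte(uint8_t secret, uint8_t fresh_random)
{
    masked_u8 m = { (uint8_t)(secret ^ fresh_random), fresh_random };
    return m;
}

/* XOR of two masked values, computed without ever recombining the shares. */
masked_u8 masked_xor(masked_u8 a, masked_u8 b)
{
    masked_u8 r = { (uint8_t)(a.share0 ^ b.share0),
                    (uint8_t)(a.share1 ^ b.share1) };
    return r;
}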

I'm delighted to see all the interest in IoT.  Now that CPU's and flash memories are *dirt cheap*, it will be possible to throw out the baby with the bath water and design new architectures with totally new goals.  We no longer have to be so stingy with memory (cue plea for address-size-word-oriented architectures); we no longer have to be so stingy with CPU cycles.  We should be able to experiment with *clean sheet architectures* that can *run* C and Linux, but aren't saddled with all of the legacies and compromises of x86 and ARM.  And we can work on new stripped-down versions of C that re-enable *orders that must actually be obeyed*.

The new "container-oriented" programming (incl. "unikernels") allows new types of firewalls that can hinder hackers.  These models also offer the ordinary programmer access to HW features like virtual memory page tables that were previously only available to the OS priesthood.

The biggest problem will be finding students whose minds haven't been so polluted with current architectures and languages that they can't imagine anything else.  So these students will probably NOT be CS students, but more likely physicists, mechanical engineers, pure mathematicians and/or artists.

So WWHBD?  *Cut the Gordian Knot!*

The current morass ("mess"?) of compilers/operating systems/architectures is no longer worth trying to untie.

I would plead for such "clean sheet architectures", because the amount of money required to develop them will be far less than the amount required to deal with the insecurities of the current code base and architectures.

I take inspiration from the engineers at Tesla, who re-imagined an automobile that hadn't fundamentally changed in a century.  Revolution is possible, but only through the "creative destruction" of worn-out ideas.


