[Cryptography] "Flip Feng Shui: Hammering a Needle in the Software Stack"

Jerry Leichter leichter at lrw.com
Sat Sep 3 06:51:40 EDT 2016


> 
>> Yes, this attack does show that hardware that's vulnerable to this attack simply cannot be trusted to run the software you think it's supposed to be running.
> 
> Jerry hits the nail on the head here. The bug is unreliable hardware. Rowhammer raises the probability of this bug occurring, but it could occur without an attack. So the short answer is, "Fix the hardware." Any other fix is a bandaid.
Thanks, but this is only half of it.

Hardware has been unreliable since we started building it.  Nothing in the physical world can ever be fully reliable.  The challenge of building reliable computation on top of unreliable parts has been with us since the Industrial Revolution.  A big part of the reason that digital computation won out over analogue computation is that you can regenerate bits as they decay.  As long as you do so while random perturbations remain below a threshold - the point at which 0 and 1 have decayed toward each other enough that you can't be sure which you started with - you can get back the exact signal you started with.  The same principle applies in error-correcting codes:  As long as no more than a certain threshold number of bits has gone bad, you can unambiguously decode the original value.
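
To make the threshold idea concrete, here's a minimal Python sketch (my
own illustration, not anything from the papers under discussion) of a
3-way repetition code: each bit is stored as three copies and
regenerated by majority vote, so one decayed copy per group is
corrected exactly, while two or more push you past the threshold and
the bit is lost.

    # 3-way repetition code: the simplest illustration of an error threshold.
    # Each data bit is stored as three copies; majority vote regenerates it.
    def encode(bits):
        return [b for b in bits for _ in range(3)]

    def decode(copies):
        # Majority vote over each group of three copies.
        return [1 if sum(copies[i:i + 3]) >= 2 else 0
                for i in range(0, len(copies), 3)]

    data = [1, 0, 1, 1]
    stored = encode(data)
    stored[4] ^= 1                   # one copy of the second bit decays
    assert decode(stored) == data    # still recovered exactly
    stored[3] ^= 1                   # a second copy of the same bit decays
    assert decode(stored) != data    # past the threshold, the bit is gone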

In fact, looked at in the most general possible setting, a digital encoding is *ideally* the choice of two distinct points in some metric space.  But in *practice* the points get smeared into some probability distributions.  As long as the distributions don't overlap at all, you can with certainty recover the original points; and if they do overlap, the size of the overlap gives the probability of getting the decoding wrong.  In practice, we can make that probability low enough that it's unimportant with respect to other factors in the entire system.
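
As a toy instance of that picture - entirely made-up numbers, not
anything measured - take the two points to be signal levels -1 and +1
and the smearing to be Gaussian noise.  The overlap past the midpoint
is the per-bit error probability, and it falls off very fast as the
noise shrinks relative to the spacing, which is what lets us drive it
below the other failure rates in the system:

    import math

    def bit_error_prob(sigma, spacing=2.0):
        # Points at -spacing/2 and +spacing/2, smeared by Gaussian noise
        # of standard deviation sigma; the decoder thresholds at the
        # midpoint, so an error is the Gaussian tail beyond spacing/2.
        return 0.5 * math.erfc(spacing / (2 * sigma * math.sqrt(2)))

    for sigma in (1.0, 0.5, 0.25, 0.125):
        print(f"sigma={sigma:5.3f}  P(bit error) = {bit_error_prob(sigma):.3e}")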

Abstractly, the original Rowhammer attacks showed a technique by which the error rate could be artificially raised.  Other such attacks have existed over the years - exposing parts to radiation, modifying supply voltages, raising or lowering part temperatures, and so on.  What was really new in Rowhammer was that it was an error-inducing attack via physical means which could be carried out entirely through software, without physical access to the hardware.  (You could argue that fuzzing is a predecessor, though it operates on a different level of abstraction.)  Rowhammer is also much more targeted than most earlier techniques, which at best can only induce an error somewhere within a single chip, and sometimes only somewhere within an entire assembly.

Like all failures, induced or otherwise, Rowhammer is the result of a perturbation outside the range covered by the error protections designed into the hardware.  That it's *induced* doesn't change that fact - though it does confirm that when designing parts, you can't just consider statistical models.  Statistically, the access pattern a Rowhammer-style attack relies on will essentially never occur.  When your opponent is not statistics but an intelligent entity's manipulation, you need a different analysis - worst case, not average or expected case.  As it happens, not all hardware is actually vulnerable anyway, and now that the attack is known, hardware will be designed to avoid it.  Of course, changing out hardware is a much, much slower process than patching software....

Beyond showing that they could induce faults, the Rowhammer authors also showed actual exploitation by flipping "high value" bits, particularly things like privilege bits.  FFS showed a different exploit vector, one in which the targeting need not be as precise.  Indeed, the idea of breaking RSA by inducing faults in its implementation has been explored in the past.
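
The canonical result along those lines is the Boneh-DeMillo-Lipton
attack on RSA signatures computed with the CRT speedup: if a fault
corrupts the half of the computation done mod q while the mod-p half
stays intact, the faulty signature s' satisfies s'^e = m mod p but not
mod q, so gcd(s'^e - m, N) hands the attacker the secret factor p.  A
toy Python sketch - absurdly small made-up primes and textbook,
unpadded RSA, purely to show the arithmetic:

    from math import gcd

    # Toy RSA key (far too small to be real; illustration only).
    p, q = 10007, 10009
    N = p * q
    e = 65537
    d = pow(e, -1, (p - 1) * (q - 1))

    m = 123456789 % N

    # CRT signing: compute the signature mod p and mod q separately,
    # then recombine.  Simulate a single induced fault in the mod-q half.
    s_p = pow(m, d % (p - 1), p)
    s_q = pow(m, d % (q - 1), q) ^ 1      # one flipped bit in the q half
    h = (pow(q, -1, p) * (s_p - s_q)) % p
    s_faulty = (s_q + h * q) % N

    # The attacker needs only m, e, N, and the one faulty signature.
    recovered = gcd((pow(s_faulty, e, N) - m) % N, N)
    assert recovered == p
    print("recovered secret factor:", recovered)

A single induced bit flip, anywhere in one half of one signature
computation, gives up the entire key - exactly the kind of sensitivity
the next paragraph is about.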

Cryptographic implementations inherently have a much harder job than most code.  Most code doesn't face an intelligent opponent.  And most algorithms don't exhibit the exquisite sensitivity to small failures that cryptographic code - especially RSA - does.  We've seen plenty of examples at higher levels of abstraction.  "Trivial" errors in protocol design, protocol implementation, and encryption design and implementation have led to broken systems.  There are reasonable but unprovable arguments that intelligent opponents have deliberately steered the design of protocols and algorithms (particularly ECC) into areas where implementation errors are so hard to avoid as to be inevitable.

The broad question of how to design *the entire stack* to be resistant to failures, from the hardware layer all the way up to the operational layer, is a difficult one. "Eternal vigilance is the price of security".  :-)

                                                        -- Jerry


