[Cryptography] Does RISC-V solve Spectre?

Jerry Leichter leichter at lrw.com
Sat Mar 24 09:49:18 EDT 2018


> Shouldn't all of that stuff be in the compilers, where it can be done
> ahead of time and without occupying silicon nanoacres, and without
> converting watts of electricity into heat during runtime?
> 
> Speculative caching is strictly less efficient than a compiler-generated
> instruction that says "precache xxxxx to yyyyy now" getting executed
> well before needing the data....
You're forgetting the joke "back-derivation" of the acronym RISC:  Relegate Important Stuff to Compilers.

The idea pretty much failed in the market - and not just the commercial market, where the x86 won for extrinsic reasons.  If you look at the idea as expressed in the earliest RISC designs - the IBM 801, which started the whole "RISC revolution", and the MIPS, which stood for Microprocessor without Interlocked Pipeline Stages - they really did rely on the compiler to do "unnatural" things to accommodate the hardware, like dealing with the load delay slot on the MIPS.  But the designs that grew from them - the IBM POWER, which is not at all a RISC according to the original design criteria, and MIPS II, which eliminated the load delay slot - all trended toward putting more of that work back into hardware.  (SPARC could never get rid of its branch delay slot - the "branch shadow" - because it had visible semantic effects; whether it actually *helped* much in later incarnations, I have no idea.)  A kind of side evolution, mentioned by others, was the VLIW machines, which also left all sorts of work to the compiler.  They didn't do well either.

Exactly why it failed would make for an interesting dissertation.  I suspect a big part of it, however, was that if you expose all the innards and corners of the hardware to the compiler, you end up having to compile separately for each instantiation (or you have to control the details of instantiations so tightly that the hardware guys choke).  The VLIW guys explicitly acknowledged this from the beginning, and chose to attack an area (large-scale scientific software, then the realm of the traditional "supercomputers") where they thought it wouldn't matter.  But even there, as it turned out, it did.

If you look at late RISC designs, like the Alpha, the real "Relegate" part settled on areas like control of details of paging.  This put it squarely into the area of the OS, which always had to deal with variations among instantiations anyway.

Of course, with the rise of JIT compilation, perhaps this is no longer an issue.  It would make for an interesting experiment.  The JIT compilers I'm aware of do little to accommodate hardware variations - though perhaps their authors feel they don't need to, because those details don't much matter for the machines - and most of the code - we actually use today.

But, OK, enough history:  Suppose there were a "load speculative" instruction available to the compiler.  *Would it actually help* to block the attacks in question?

The answer, of course, depends on where the compiler issues them.  At one extreme, it *never* issues them, and the code is safe but slow.  At the other extreme, it issues them wherever it projects a *possible* need for the data - and then it's really doing nothing more "secure" against these attacks than the hardware itself.  In fact, it arguably makes the attacker's job easier: one can now look at the code and tell - completely deterministically, with no guessing about the internals of the hardware's prediction machinery - exactly where data of interest might get loaded into the caches.
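
To make that concrete, here's a rough sketch of what such compiler output might look like.  It's not any real compiler's behavior: GCC/Clang's __builtin_prefetch stands in for the hypothetical "load speculative" instruction, and array1/array2 and their sizes are invented props for illustration.

    #include <stddef.h>
    #include <stdint.h>

    extern uint8_t array1[16];          /* indexed by untrusted x */
    extern uint8_t array2[256 * 512];   /* 512-byte stride: one cache line
                                           per possible byte value */
    extern size_t  array1_size;

    uint8_t victim(size_t x)
    {
        /* The compiler hoists the "speculative" load to where it
         * projects the data *might* be needed - above the bounds
         * check.  Computing the address reads array1[x], and the
         * prefetch pulls the dependent line of array2 into the
         * cache, all of it plainly visible in the binary. */
        __builtin_prefetch(&array2[array1[x] * 512]);

        if (x < array1_size)
            return array2[array1[x] * 512];
        return 0;
    }

Branch-predictor state no longer enters into it: anyone disassembling the binary can see, ahead of the bounds check, exactly which line of array2 gets touched for a given x.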

So I guess the theory is:  The compiler is smart enough to know where "sensitive" data is present, and *not* to request speculative loading of such data.  Exactly how it determines that ... who knows.  But even if that problem is solved:  the only way you can hope to get significant performance out of speculative loading is if *the vast majority of data* can be safely speculatively loaded.  Otherwise, you're doing work that rarely gains you anything - so why do it at all?

But if *most* loads can be speculated safely, and we assume we can tell the difference at compile time, doesn't it make more sense to have an instruction that *disables* speculative loads in the rare locations where that's appropriate, rather than one you have to insert in every piece of code - significantly decreasing instruction density and hence increasing pressure on the instruction-fetch path, which is already a bottleneck?

And indeed, *that's exactly what the hardware designers are doing*:  Adding ways to force non-speculation where appropriate.
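
A rough sketch of that approach, assuming an x86 target where LFENCE acts as the "don't speculate past this point" barrier (as Intel's post-Spectre guidance recommends); secret[] and untrusted_index are invented names:

    #include <stddef.h>
    #include <stdint.h>
    #include <emmintrin.h>      /* _mm_lfence */

    extern uint8_t secret[64];
    extern size_t  secret_len;

    uint8_t read_secret(size_t untrusted_index)
    {
        if (untrusted_index >= secret_len)
            return 0;

        /* The barrier goes only at the rare sensitive access: later
         * loads can't issue until everything before the fence - the
         * bounds check included - has completed, so a mispredicted
         * branch can't pull secret[] into the cache.  Every other
         * load in the program stays freely speculative. */
        _mm_lfence();
        return secret[untrusted_index];
    }

The cost lands only on the handful of accesses that need protecting; everything else keeps speculating at full speed - which is the point of the argument above.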
                                                        -- Jerry