<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jan 22, 2018 at 2:34 PM, Bill Frantz <span dir="ltr">&lt;<a href="mailto:frantz@pwpconsult.com" target="_blank">frantz@pwpconsult.com</a>&gt;</span> wrote:<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The immediate question that occurs to me is, "How do we handle shared memory? Both R/O and R/W?".<br></blockquote><div><br></div><div>For an architecture like lowRISC (based on Berkeley's Rocket RISC-V core, I believe), the answer is to attach a set of tag bits to every word of memory (i.e. every 64 bits), and those bits can encode a rich set of attributes for that word. lowRISC also features complete physical separation of control and data, so it should be physically impossible for program logic to interfere with these out-of-band tags.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

If we have a separate cache for each protection domain, then that domain is the only thing that can affect what is in that cache. Can we afford to fetch a shared word from a sister cache entry, or does that signal too much information? What about memory which is shared between mutually suspicious actors? (Probably R/O, but there may be uses for R/W.) There are a whole lot of questions.<br></blockquote><div><br></div><div>Much as KPTI mitigated Meltdown at page granularity, an architecture that tags every word of memory could enforce attributes at word granularity. This gives you the benefits of shared memory along with the protections you might hope to gain from a coarse-grained partitioned cache.</div><div><br></div><div>In this sort of architecture, speculation units could proceed until they hit a memory access violation, at which point the various memory subsystems enforce the denial via a synchronous check and the memory stays inaccessible. This should prevent the side channel from arising in the first place (i.e. all memory subsystems synchronously deny the CPU access for requests with insufficient privilege), without placing an undue burden on speculation unit designers.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

One approach I like is a massive number of simple cores that don't speculate.</blockquote><div><br></div><div> This approach is where RISC-V soars today: <a href="https://fuse.wikichip.org/news/686/esperanto-exits-stealth-mode-aims-at-ai-with-a-4096-core-7nm-risc-v-monster/">https://fuse.wikichip.org/news/686/esperanto-exits-stealth-mode-aims-at-ai-with-a-4096-core-7nm-risc-v-monster/</a></div></div><div><br></div>-- <br><div class="gmail_signature">Tony Arcieri<br></div>

</div></div>