[Cryptography] Making sure memory erasure is not optimized away

Phillip Hallam-Baker phill at hallambaker.com
Sat Aug 27 12:32:55 EDT 2022


On Sat, Aug 27, 2022 at 11:54 AM Ron Garret <ron at flownet.com> wrote:

>
> On Aug 25, 2022, at 12:24 PM, Phillip Hallam-Baker <phill at hallambaker.com>
> wrote:
>
> [Before we start, use language X instead is not an answer here.]
>
> We all know array bounds checking etc. is a good thing to have when
> writing application code. But the cost of abstracting away memory
> management is that cryptographic code has the very particular property
> that we want to ensure data is cleared AFTER USE rather than BEFORE REUSE.
>
> A lot of applications written in high level languages are vulnerable to
> attacks in which someone uses portable assembly language, aka C, to write
> a program that allocates large chunks of memory and then greps through it
> looking for 'good stuff'.
>
> So the question is how to ensure this does not happen by implementing
> disposal mechanisms THAT DO NOT GET OPTIMIZED AWAY.
>
> See, here is the thing: I can check my code and check my code, but I can
> only check the current version of the compiler/optimizer. And some of the
> things I know the C# optimizer is now doing are pretty hard core. Yes, when
> generating assemblies, it can now optimize across assembly boundaries.
>
> I am pretty sure most other high level languages suffer from the same
> thing unless there is a mechanism to explicitly state 'do not optimize'.
> The same goes for things like the Montgomery ladder, which isn't as
> reliably constant time as some people might imagine.
>
> Has anyone got pointers to ways to make sure this is done right?
>
>
> This is a Really Hard Problem that goes beyond the language level.
> Nowadays these kinds of optimizations happen *in hardware*.  Even in
> assembly language there is often no way to guarantee that if you write a
> word to memory that anything actually happens beyond the L0 cache.  It is
> simply not possible to draw a reliable security perimeter around a
> *process* on modern hardware.  This is one reason secure enclaves have
> become a thing.
>

Worse may be better in this case.

I spent the past fortnight implementing Kyber and Dilithium in C#, hence
bringing this up. It looks to me like an on-chip hardware implementation is
actually pretty straightforward. The costliest is Dilithium, which requires
about 64 KB of memory plus a SHA3 core and a multiplier.

On-chip hardware is the only way we are going to get security against the
likes of ROWHAMMER and SPECTRE.

The big issue would be how to trust an on-chip crypto CPU.

Threshold makes that a lot easier. The application layer generates a key
pair {x.P, x}, and the HSM has a built-in key pair {y.P, y} that cannot be
extracted (though it could be randomized/cleared). The composite key
{x.P + y.P, x + y} is used to do all the crypto. The application can just
hand off its additive key share to the co-processor to compute on.

We don't have threshold PQC yet, but it may be possible for Kyber at least.
The private key shares become huge (over 8 KB) because you have to pass the
expanded matrix around. I haven't tried to make that work yet, but it would
be good if we could at least get threshold key generation working.


PHB

