[Cryptography] cheap sources of entropy

Tue Feb 4 07:33:36 EST 2014

On 2014-02-03, Tom Mitchell wrote:

>       (BTW, it's not even clear that those measurements are relevant to
>       today's disk drives and adapters.
> 
> Bingo... not relevant in the presence of modern SSD and also the built 
> in disk buffer  prefetch and more tricks of modern disks that minimize 
> some or all of the assumptions  for spinning media.

I wonder whether scrypt-like tricks can be used to force even a VM to 
deliver some of the underlying timing variability into the software. 
Okay, you're going to need some rather interesting assumptions, but...

Suppose you can somehow force your working set to be larger than any 
reasonable cache. At one point you're bound to spill onto disk, so that 
seek latency becomes an issue. If you know what's going to come back 
from the disk because you wrote it, but the adversary does not, you can 
now verify which writes actually came from disk and which ones didn't. 
Thus you just amplified any existing randomness you had a little bit if 
you measured the roundtrip time.

You could attack something like that at least by simulating the program 
or quantizing the times. Quantization you can combat at least by working 
with longer time scales, too: it only rounds out the lower order bits, 
but not longer term drift. If you collect that too, at some point you 
will have enough randomness to reseed. Simulation/prediction you can 
mitigate a little bit by issuing your own probes deterministically so 
that they don't leak (all the) state, but full simulation of course 
still works. So, can we make at least some simulation strategies 
relatively expensive?

A roundtrip even to external cache costs latency, as do context switches 
and the like. If you can somehow do something within the CPU which is so 
tightly locked into real time that it's below such latencies, yet has 
two different outcomes which are deterministic but based on state which 
is not immediately observable from the outside, maybe you could derive a 
deterministic execution path which is prohibitively expensive to 
emulate in real time, at least without custom hardware? My idea goes 
something like this:

Do a tight loop which times against the CPU's cycle counter, while 
exercising the L1 cache. Find the minimum over a longer time and store 
it; you can't go faster than no context switches, TLB misses &c. Now 
watch for a single non-minimum, store the cycle counter in a register, 
and if the next loop is again minimum time, exit. Now you know 
"something" happened in between besides straight execution, which will 
have affected the stored counter, and can be used to branch in a manner 
which can't be simulated with zero latency. Then compound the trick a 
dozen times; suddenly the internal state of the program ought to be such 
that it can't be replicated too easily. Now amplify that via the working 
set construction above, and repeat.

Nothing's absolutely sure in a VM environment, but as soon as you have 
those outside, linearly increasing clocks at your disposal, I'm betting 
you can make life pretty miserable for the adversary: what you need to 
emulate something like that isn't standard hardware, and when done 
right, will in any case lead to such extra latency in the inner, faster 
manipulation that when amplified, it'll show. If it doesn't, you can be 
reasonably sure the adversary doesn't know the entire state of your 
program in time, and then you can amplify that, one bit at a time, via 
real randomness like disk turbulence. Then in the future you can do a 
quantized rekey, and suddenly you have those 160 bits of real seed 
within your state.

Or, so goes the theory/sketch.
-- 
Sampo Syreeni, aka decoy - decoy at iki.fi, http://decoy.iki.fi/front
+358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2