[Cryptography] cheap sources of entropy

Sat Feb 1 23:27:08 EST 2014

On Feb 1, 2014, at 4:58 PM, James A. Donald wrote:
> On 2014-02-02 06:38, Bill Stewart wrote:
>> Definitely not.  If you're on a VM, you have 0..n virtual disk drives, which the hypervisor simulates from a datastore pool and maybe some cache.
> 
> Underneath all that are real material disk drives, which have turbulence.  The turbulence causes random and entirely unpredictable timing variations, which unpredictability and variation propagate all the way to the VM
No, Bill Stewart is right.  There are multiple layers of software with all kinds of buffering, queuing, operations that are kicked off by clocks at fairly long intervals (way longer than the timing variations seen in disk responses), in between.  It's highly unlikely that any low-level variation in disk response times will be visible by the time you reach the guest OS.

There *will* be variations, but exactly what produces them, what they are correlated with, and how predictable they are would be extremely difficult to determine.  If you go back to the original paper on disk drive timing variations, you'll see careful work to figure out exactly what kinds of variations disk drive timings will produce, and then actual measurements to show that the results really match the physical models.  No one, as far as I know, has done any work like that in a virtual environment - and frankly I doubt anyone could.  The pieces are just too complicated.

Now, you could, if you wanted, just say: well, if it's too complicated to analyze, it's too complicated to attack.  *You* could.  Personally, having read Knuth's introductory section on PRNGs in TAOCP, I would not.  (And also having had to debug and fix distributed systems that failed exactly because of completely unanticipated correlations in behavior, I would not.)

>>  You don't get any access to the real device, even though the hardware drivers look like they're talking to a disk.
> 
> You don't need direct access to the real device.  The real turbulence in the real device causes random variation in the time that your buffer gets filled.   So just hash the cpu clock into your stockpile of randomness every time that you read data that is likely to need to come from disk.  And then your VM is reading real randomness from real turbulence on the real disk.
Go back to the paper that proposed using turbulence and repeat some of their tests in a virtual environment.  Let us know what you *actually observe*.
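
For anyone who wants to try, here is a minimal sketch of what such a measurement might look like.  It is not the method from the original paper, and every name in it (sample_read_timings, mix_into_pool, lag1_correlation, summarize) is made up for illustration.  It times block reads with a high-resolution clock, hashes the timings into a pool the way the quoted proposal suggests, and reports a few crude statistics.  It assumes a regular file; reads will often be served from the guest's page cache or the hypervisor's cache rather than a physical disk, which is exactly the problem under discussion.  The statistics are *not* an entropy estimate.

```python
import hashlib
import os
import statistics
import tempfile
import time


def sample_read_timings(path, block_size=4096, max_reads=1000):
    """Time successive block reads; return per-read durations in nanoseconds."""
    durations = []
    # buffering=0 bypasses Python's own buffer only.  The OS page cache and
    # any hypervisor caches are still between us and the physical device.
    with open(path, "rb", buffering=0) as f:
        for _ in range(max_reads):
            start = time.perf_counter_ns()
            chunk = f.read(block_size)
            end = time.perf_counter_ns()
            if not chunk:
                break
            durations.append(end - start)
    return durations


def mix_into_pool(pool, durations):
    """Hash each timing into the pool (hypothetical stand-in for the
    'stockpile of randomness' in the quoted proposal)."""
    for d in durations:
        pool.update(d.to_bytes(8, "little"))
    return pool


def lag1_correlation(xs):
    """Lag-1 autocorrelation; values far from 0 suggest visible structure."""
    if len(xs) < 3:
        return 0.0
    mean = statistics.fmean(xs)
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return 0.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    return cov / var


def summarize(durations):
    """Crude descriptive statistics; NOT an entropy estimate."""
    return {
        "samples": len(durations),
        "distinct": len(set(durations)),
        "min_ns": min(durations),
        "median_ns": statistics.median(durations),
        "max_ns": max(durations),
        "lag1": lag1_correlation(durations),
    }


if __name__ == "__main__":
    # Demo on a scratch file; a real test would use a large file that is
    # unlikely to be cached, on both bare metal and a VM, and compare.
    fd, scratch = tempfile.mkstemp()
    try:
        os.write(fd, os.urandom(1 << 20))
        os.close(fd)
        timings = sample_read_timings(scratch)
        pool = mix_into_pool(hashlib.sha256(), timings)
        print(summarize(timings))
        print("pool digest:", pool.hexdigest())
    finally:
        os.remove(scratch)
```

Running the same script on bare metal and inside a guest, and comparing the distributions, would at least show whether anything resembling the physical timing variation survives the trip.  Showing that what survives is unpredictable to an attacker is a much harder job.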

(BTW, it's not even clear that those measurements are relevant to today's disk drives and adapters.  Technology changes.  The intelligence in every component is much higher today than it was even a few years ago.  That intelligence gets used to adjust behavior in ways that potentially wash out low-level variations.  We don't need to repeat the idiocy surrounding Peter Gutmann's "magic 35 disk erase patterns" - carefully chosen to cover all the relevant technologies of 1996, not one of which has been used in a decade or more.  If you want to take the old analyses and apply them to modern technologies, you might produce something worthwhile.  Hard work, though.  Just applying the old *results* without considering the context is nonsense.)

                                                        -- Jerry