[Cryptography] cheap sources of entropy

Sun Feb 2 12:27:53 EST 2014

   -----Original Message 1-----
From: James A. Donald
Sent: Saturday, February 1, 2014 22:26
Subject: Re: [Cryptography] cheap sources of entropy

[ ... ]
The only efficient way to organize the system is for process switches to 
be triggered by the arrival of data.  Fail to do that, you wind up 
reading one sector per platter rotation.  If you want to read sectors as 
the platter rotates, you have to do process switch on disk event, not 
timer event.

If you do that, switch process on disk event, rather than the timer 
event, process switches will occur at times dictated by disk drive 
turbulence when a process is reading data.
[ ... ]

   -----Original Message 2-----
From: James A. Donald
Sent: Saturday, February 1, 2014 22:31

[ ... ]
The hypervisor is going to switch a process out when it wants data that 
is not yet available, rather than switching it on a clock.

If it switches a process in when data is available, rather than 
switching on a clock, turbulence is going to show up, even if that disk 
is on a network on the other side of the data center.
[ ... ]

   -----Reply-----
I am baffled by these assertions.  They strike me as very brittle conditions that are contrary to how input-output latency is dealt with in production systems.  Perhaps I don't understand the context in which these assumptions apply.

The red flag for me is "the only efficient way."  Modern general-purpose systems minimize latency by not making success depend on the behavior of a single requester and by serving input-output requests more indirectly.  This kind of throughput optimization can, of course, extend the elapsed running time of individual processes on a busy system, while working to avoid an overall slow-down caused by input-output latency.

I also think it is important to be explicit about where the measurement/instrumentation is happening and how it is available for use where needed as an entropy source.  Provide more context about where this is assumed to happen, please.  For whom is turbulence going to show up, and how is it delivered as an entropy source?

CONTEXT CONSIDERATIONS

Minimizing the latency of disk access has long been an optimization at the system level and not at the individual requester level.  Jerry Leichter gives some examples of approaches that were developed as far back as the 1960s and have only become better as operations have been distributed down the storage hierarchy.

Some real-time response capability is required.  That is kept at a deep level where processor attention is seized for very brief periods.  In particular, the requests for sectors and certainly tracks can be on behalf of multiple running applications.  The activity optimizes across requests from multiple sources.

There is significant variability in when a file-system block read or written on behalf of a user process (or other tenant) is completed following the request.  The move toward asynchronous input-output requests in user-level programming makes this even more interesting, since input-output does not necessarily block the requester.  Database-management systems running on guests and certainly on hosts already have significant pooling and non-blocking mechanisms.
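To make the non-blocking point concrete, here is a minimal Python sketch.  It is only an illustration under my own assumptions: the names timed_read and observe_async_read are hypothetical, a thread pool stands in for whatever asynchronous mechanism an application really uses, and a small temporary file will almost certainly be served from cache rather than from a platter.  It shows that the moment the requester notices completion is set by what the requester was doing, not by when the data arrived.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def timed_read(path):
    """Read the whole file; return (byte count, time the read finished)."""
    with open(path, "rb") as f:
        data = f.read()
    return len(data), time.perf_counter_ns()


def observe_async_read(path, busy_ns=2_000_000):
    """Issue a read without blocking, do other work, then collect the result.

    Returns (bytes_read, lag_ns), where lag_ns is how long after the read
    finished the requester actually noticed it.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(timed_read, path)
        # The requester keeps computing instead of blocking on the I/O.
        deadline = time.perf_counter_ns() + busy_ns
        while time.perf_counter_ns() < deadline:
            pass
        nbytes, finished_ns = future.result()
        noticed_ns = time.perf_counter_ns()
    return nbytes, noticed_ns - finished_ns


if __name__ == "__main__":
    # Hypothetical demo file; any readable file would do.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4096))
    try:
        print(observe_async_read(tmp.name))
    finally:
        os.unlink(tmp.name)
```

The lag reported here is dominated by the requester's own busy period and by thread scheduling, which is the sense in which completion as seen by the requester is decoupled from the device.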

On a single guest, there are usually several applications with input-output in progress (not to mention the handling of virtual-memory swap files and any kind of striping being managed) and many processes being served in some manner.

Finally, time-quantum exhaustion can lead to process switching simply because an active process has failed to block/yield for other reasons and there are other processes ready and waiting to run on an available processor thread.

There are also layers of priority involved, and those determine as much as anything else about when an i/o action is observed to be completed by the original requester.  Of course, finer-level interruptions on behalf of brief interior activities are going on at a fairly high rate, whether or not any sort of larger-grain process switching happens as a result.

The question for me is, where can one actually capture a measurement of something?  That is, who is determining i/o times at a level where something like "turbulence" is measurable?  I suppose if we're designing the guest kernel, there might be places to capture something that will certainly show variation for all manner of reasons.  If we're attempting this in a guest's non-privileged application, and we can trust the fastest available clock, some sort of variation is also noticeable.  There are many sources of it on a busy system.
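For the non-privileged case, a measurement might look something like the following Python sketch.  The function names sample_read_deltas and summarize are hypothetical, time.perf_counter_ns is assumed to be the fastest clock available to the application, and the file is a small temporary one.  The sketch only shows that read durations vary from one request to the next.  It does not say why they vary, and it is in no way an estimate of entropy.

```python
import os
import tempfile
import time


def sample_read_deltas(path, samples=64, block=4096):
    """Time `samples` unbuffered reads of `block` bytes; return durations in ns."""
    deltas = []
    with open(path, "rb", buffering=0) as f:
        for _ in range(samples):
            f.seek(0)
            start = time.perf_counter_ns()
            f.read(block)
            deltas.append(time.perf_counter_ns() - start)
    return deltas


def summarize(deltas):
    """Return (min, max, number of distinct values) for a list of durations."""
    return min(deltas), max(deltas), len(set(deltas))


if __name__ == "__main__":
    # Hypothetical demo file; any readable file would do.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4096))
    try:
        print(summarize(sample_read_deltas(tmp.name)))
    finally:
        os.unlink(tmp.name)
```

Whatever spread this reports comes from caching, scheduling, interpreter overhead and everything else happening on the machine.  Nothing at this level isolates the behavior of a disk, let alone a disk on the far side of a hypervisor.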

 - Dennis