[Cryptography] real random numbers

John Denker jsd at av8n.com
Sat Sep 14 15:29:35 EDT 2013


This discussion will progress more smoothly and more rapidly
if we clarify some of the concepts and terminology.

There are at least four concepts on the table:
  1) At one extreme, there is 100% entropy density, for example
   32 bits of entropy in a 32-bit word.  I'm talking about real
   physics entropy, the kind of hard-core industrial-strength
   randomness that will never get cracked by anybody, ever.
  2) It is also possible to have lesser entropy density, such
   as 3 bits of entropy in a 32-bit word.  Such a word is
   partly predictable and partly not.  Remark: I emphatically
   recommend that we pay attention to cases where we have a
   provable lower bound on the entropy, even if the entropy
   density starts out less than 100%, because the density can
   be increased to virtually 100% by hashing (see the sketch
   after this list).  This is how turbid works.
  3) At the other extreme, there is complete determinism, such
   as the decimal digits of π, or even the decimal digits of
   1/3.
  4) Last but not least, there is the "squish" category.  This
   case is covered by Murphy's law.  That means:
   -- If you wanted the symbol to be predictable, it would not
    be reliably predictable.
   -- If you wanted the symbol to be unpredictable, it would
    not be reliably unpredictable.
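
To make case (2) concrete, here is a minimal sketch of entropy
concentration by hashing.  It shows the general idea only, not
turbid's actual code; the read_sample() stub, the 3-bits-per-word
figure, and the choice of SHA-256 are assumptions for illustration.
It uses OpenSSL's one-shot SHA256() (link with -lcrypto).

  #include <stdio.h>
  #include <stdint.h>
  #include <openssl/sha.h>

  /* Hypothetical stub standing in for the raw noise source;
     a real generator would read the soundcard here. */
  static uint32_t read_sample(void)
  {
      return 0;  /* replace with a real measurement */
  }

  int main(void)
  {
      /* At a provable 3 bits of entropy per 32-bit word, we need
         at least ceil(256/3) = 86 words to fill a 256-bit hash;
         take 100 words (300 bits) for some margin. */
      enum { NWORDS = 100 };
      uint32_t raw[NWORDS];
      unsigned char out[SHA256_DIGEST_LENGTH];

      for (int i = 0; i < NWORDS; i++)
          raw[i] = read_sample();

      /* Hashing concentrates the >= 300 bits of input entropy into
         256 output bits, i.e. virtually 100% entropy density. */
      SHA256((const unsigned char *)raw, sizeof raw, out);

      for (size_t i = 0; i < sizeof out; i++)
          printf("%02x", out[i]);
      printf("\n");
      return 0;
  }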

Illustration:  Back when I was 17 years old, I needed a random
number generator.  The pointy-haired boss directed me to use
the value of the instruction pointer at each interrupt.  I had
to explain to him that it was an event-driven system, so with
high likelihood, the IP pointed to the idle loop.  It was clearly
a squish.  You couldn't rely on it being predictable, but you
also couldn't rely on it being unpredictable.

===========

To say the same thing yet again:  For non-adversarial situations,
you can do pretty much whatever you like, and you might reasonably
care about the typical case ... but a great many applications 
of randomness, including gaming and cryptography, are highly 
adversarial.  That demands a minimax approach.  The best case 
doesn't much matter and the typical case doesn't much matter;
we need to focus attention on the worst case.

Absence of proof is not proof of absence.  For the important
applications, we need a provably-good RNG.  The second law of
thermodynamics provides the required guarantees.

============

Here's another way of looking at almost the same issues.  People
on this list have been talking about combining every possible 
source of "randomness" ... the more the merrier.  Again there 
is a crucial distinction to be made:

 a) In the Linux "random" device, /any/ user can mix stuff into
  the driver's pool.  This is a non-privileged operation.  The
  idea is that it can't hurt and it might help.  So far so good.
 b) Contributions of the type just mentioned do *not* increase
  the driver's estimate of the entropy in the pool.  If you want
  to increase the entropy-estimate, you need to issue a privileged
  ioctl, as in the sketch below.
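
Here is a minimal sketch of both steps against the Linux driver.
The 16-byte buffer and the claim of 32 bits are placeholders, not
recommendations; RNDADDENTROPY and struct rand_pool_info come from
<linux/random.h> (see random(4)), and the credit step fails without
CAP_SYS_ADMIN.

  #include <stdio.h>
  #include <string.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/random.h>

  int main(void)
  {
      unsigned char sample[16];
      memset(sample, 0, sizeof sample);   /* placeholder; use real data */

      int fd = open("/dev/random", O_WRONLY);
      if (fd < 0) { perror("open /dev/random"); return 1; }

      /* (a) Unprivileged: any user can mix bytes into the pool.
         This cannot hurt, but it does NOT bump the entropy estimate. */
      if (write(fd, sample, sizeof sample) < 0)
          perror("write");

      /* (b) Privileged: claim entropy credit via RNDADDENTROPY.
         Only claim bits you can back up with a hard lower bound. */
      struct {
          struct rand_pool_info info;
          unsigned char payload[16];      /* backs the flexible buf[] */
      } req;
      req.info.entropy_count = 32;        /* bits claimed -- a LOWER bound */
      req.info.buf_size = sizeof req.payload;
      memcpy(req.info.buf, sample, sizeof req.payload);
      if (ioctl(fd, RNDADDENTROPY, &req.info) < 0)
          perror("RNDADDENTROPY (needs CAP_SYS_ADMIN)");

      close(fd);
      return 0;
  }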

If we want to have a meaningful discussion, we must distinguish
between 
 a) mixing in all sorts of squish, and
 b) mixing in some real entropy /and taking credit for it/.

Again, step (a) cannot get anybody into trouble.  Step (b) gets
you into trouble if you claim credit for more entropy than was
actually contributed.

==============

Let's be clear:  There is a huge difference between contributing 
worthless squish and contributing real entropy.  IMHO the key 
distinction is whether or not you have a hard /lower bound/ on 
the amount of entropy, so you can claim credit for it, and be
confident of the claim.  I strongly recommend not bothering with
things that /might/ be unpredictable, and focusing instead on
things that are absolutely guaranteed to be unpredictable, as
guaranteed by the laws of physics.

On 09/12/2013 07:38 PM, Thor Lancelot Simon wrote:

> The audio subsystem actually posed *two* obvious opportunities: amplifier
> noise from channels with high final stage gain but connected by a mixer
> to muted inputs, and clock skew between system timers and audio sample
> clocks. 

I wouldn't have said that.  In a multi-stage amplifier, the noise 
figure is set by the /first/ stage, not the "final stage".  More 
importantly, it hardly matters which stage contributes which part 
of the gain, so long as there is "enough" gain before the signal is 
digitized.  Most computer audio systems have enough gain to make 
the Johnson noise observable.  This gives us a source of hard-core, 
industrial-strength, guaranteed physics entropy.  In typical cases,
increasing the gain helps a tiny bit, but doubling the gain does 
/not/ double the amount of entropy per sample;  it only increases 
it by one bit.
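
To put rough numbers on that claim, the sketch below computes the
RMS Johnson noise sqrt(4kTRB) and the entropy of an ideally
quantized Gaussian, H ~= log2(sigma/q) + 0.5*log2(2*pi*e).  The
resistance, full-scale voltage, and ADC width are assumed values,
not measurements; compile with -lm.

  #include <stdio.h>
  #include <math.h>

  int main(void)
  {
      const double k   = 1.380649e-23;  /* Boltzmann constant, J/K */
      const double T   = 300.0;         /* room temperature, K */
      const double R   = 1e3;           /* source resistance, ohms (assumed) */
      const double B   = 20e3;          /* audio bandwidth, Hz */
      const double vfs = 1.0;           /* ADC full scale, volts (assumed) */
      const int    nbits = 16;          /* ADC resolution (assumed) */
      const double q   = vfs / (1 << nbits);   /* quantization step */

      double vnoise = sqrt(4.0 * k * T * R * B);  /* RMS Johnson noise */

      for (double gain = 100.0; gain <= 1600.0; gain *= 2.0) {
          double sigma = gain * vnoise;
          /* Entropy of a Gaussian quantized with step q, valid
             when sigma >> q. */
          double h = log2(sigma / q) + 0.5 * log2(2.0 * M_PI * M_E);
          printf("gain %6.0f:  sigma = %.3g V,  ~%.2f bits/sample\n",
                 gain, sigma, h);
      }
      return 0;
  }

Each doubling of the gain doubles sigma and therefore adds exactly
log2(2) = 1 bit per sample, as stated above.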

Muting the inputs cannot help and might hurt.  It might well drop
the entropy production to zero.

Things like clock skew are usually nothing but squish ... not
reliably predictable, but also not reliably unpredictable.

I'm not interested in squish, and I'm not interested in speculation
about things that "might" be random.  I'm interested in things that
/guarantee/ a usable amount of entropy.  The second law of thermodynamics
provides such a guarantee.

============

The word "random" means different things to different people.  There
is nothing to be gained by arguing over the definition.  Feel free
to use this word however you like.

In contrast, please do not use the word "entropy" loosely.  There is
a very well-defined and well-understood notion of physics entropy.
Tacking on additional metaphorical or just plain sloppy usages
pollutes the discussion and serves no useful purpose.

There is a huuuuge difference between squishy randomness and real
physics entropy.


