[Cryptography] Final words on RNG design

Sun Dec 4 07:37:29 EST 2016

> * What an RNG is needed for; 
> 
> Anything that requires unpredictable elements.
This goes wrong right from the get-go.  A completely random source whose outputs are leaked to the opponent - even long after they are produced - is completely useless.

It's easy to set up a *formal* definition of the requirements by writing definitions in terms of inputs to functions:  The legitimate participants get the "random values" as inputs; the opponent does not; (ideally) the opponent gets no advantage in computing whatever it is that would constitute a break in the system over just guessing the missing input.

But turning this formalization into actual real-world specifications - pinning down the actual information leakages and biases and such - is very tricky.  Yes, if the opponent can predict the output of the random number generator, it's game over - but that's the simplest case.

> * How should we define entropy? 
Personally, I try to avoid using the word in this context.  It's a powerful theoretical/formal notion - well, *multiple* related but different theoretical/formal notions.  The vast majority of people who use the word in this context have a gut feel for what they *think* it is, but that gut feel doesn't align well with the formal definitions.  Talking about the entropy of your source or your generator makes it *sound* as if you have a theoretical basis for what you're doing - but most of the time, there's actually no such basis there.

> Given our sources of randomness and our pool of collected bits, and given all of the information and resources accessible to our adversaries, the computational entropy of the pool is defined as that which is the minimal amount of computational effort ("work factor", "bruteforce", etc...) that the adversaries would collectively need to recover the state of our pool, or to otherwise predict its outputs (expect the adversaries to collaborate, voluntarily or not!). 
So now not only are we talking about unpredictability - the wrong concept - but you're making assumptions about the kinds of attacks that can exist (recovering the state of the pool) *as part of your definition*.  That's like defining the security of a protocol that uses a block cipher in terms of attacks that recover the key schedule.

> The key is that the outputs are unpredictable = out of reach from being cracked for as long as deemed necessary.
Ah, you've finally crossed over from "prediction" to "remaining unknown".  That's at least a good thing - but a bit late in the game.

There's a great deal more.  Some of it is good; some of it is "motherhood and apple pie".  Some of it gets at the important points that knowledge of any collection of outputs of the RNG gives an opponent no advantage in computing any other outputs - but it does this in passing, late in the game, when it's actually pretty fundamental (but, again, had to pin down).

I'm not going to try to go through it line by line.  As I said in a previous message:  We have some good theoretical work now on randomness, on mixers - and on protocols that rely on randomness.  It's time to start with those and see if we can build on them, not keep playing around with gut feelings.
                                                        -- Jerry