[Cryptography] Final words on RNG design

Sun Dec 4 10:39:02 EST 2016

Den 4 dec 2016 13:37 skrev "Jerry Leichter" <leichter at lrw.com>:

> * What an RNG is needed for;
>
> Anything that requires unpredictable elements.
This goes wrong right from the get-go.  A completely random source whose
outputs are leaked to the opponent - even long after they are produced - is
completely useless.

Perhaps I should have explained from the start that I only categorized the
statements into types of properties necessary, trying to stick only to
those that are necessary (some of them with justifications), instead of
making a "linear" argument for one particular way to design an RNG for
cryptography newbies.
This was rather a checklist for what you can't forget, a collection of
critical facts. It was *not* sorted by importance. In other words - to miss
any one of these facts when you design an RNG puts you at tangible risk of
*some* real world attack.

As you later noted, I *did* address this particular fact later on (although
not in a separate point, I can see why you first missed it if you skimmed
it). I do think it was clear enough from the original message that it is
important to prevent leaks, but I guess you disagree?
As I mentioned, to minimize the risks of leakage you need PFS style
properties (backtracking resistance) in the pool to not compromise past
values if the system later is compromised, and you need strong resistance
against any kind of predictive attacks. And each RNG output is only to be
shared with one application / process, not duplicated. Reseeding also helps
you to recover from state leakage. Was this not enough?

> * How should we define entropy?
Personally, I try to avoid using the word in this context.

I've seen other suggested terminology, but none that seems meaningfully
simpler or more accurate. I'm happy to change it if somebody can argue for
what to use and why.

In any case, the kind of randomness / unpredictability / secret collection
of bits we need has to be defined in terms of the knowledge we can assume
the attacker to have about the system / sources and the work which they
need to perform using that knowledge to compromise the system (interactions
with the target system included). This knowledge includes observable data
leakage (perhaps another point that should have been clearer). This covers
the entire time span for which these secret bits remain sensitive.
The minimum effort required for compromise sets the bar for how much of
this metric we can claim to have. The metric should strongly correlate with
the cost of pulling off a successful attack.

> Given our sources of randomness and our pool of collected bits [...]
So now not only are we talking about unpredictability - the wrong concept -
but you're making assumptions about the kinds of attacks that can exist
(recovering the state of the pool) *as part of your definition*.  That's
like defining the security of a protocol that uses a block cipher in terms
of attacks that recover the key schedule.

I did also say "or otherwise predict the outputs". I'm aware some RNG
designs (particularly ones based on asymmetric primitives) can be
compromised without ever knowing the original raw pool data.

What's the right concept? How would you phrase it? Perhaps I should have
phrased it in terms of making it computationally infeasible to in any way
compromise the RNG or applications using it...? How would you define
compromise?

And now I remembered another important statement on the sources, which I
previously forgot to add;

It must be infeasible for an attacker who controls some but not all sources
to compromise the system RNG, through feeding it malicious inputs. An
attacker may attempt to use such outputs to influence the system RNG such
that it produces outputs that are biased in a way favorable to the attacker
(including but not limited to increasing predictability, making the outputs
distinguishable from random, controlling various bit values in the output,
influencing the results of the application using the output (including
attacking fragile crypto), data exfiltration, etc).

Defending against an attacker achieving control of most sources would
effectively require that the defender assumes all but one source is
compromised, and thus do not assume they have more entropy than their
slowest source produces alone. (Although at this point you should probably
just scrap the hardware...)

> The key is that the outputs are unpredictable = out of reach from being
cracked for as long as deemed necessary.
Ah, you've finally crossed over from "prediction" to "remaining unknown".
That's at least a good thing - but a bit late in the game.

There's a great deal more.  Some of it is good; some of it is "motherhood
and apple pie".  Some of it gets at the important points that knowledge of
any collection of outputs of the RNG gives an opponent no advantage in
computing any other outputs - but it does this in passing, late in the
game, when it's actually pretty fundamental (but, again, had to pin down).

As I said above - these statements are not ordered by importance or how
fundamental they might be. They're sorted by type. And they're meant to be
descriptive rather than academic precision statements, because otherwise I
would never even have gotten past writing the first sentence. I tried to
avoid ambiguity and "empty" words. I'll try to source most of it later, but
it will take take some time.

I'm not going to try to go through it line by line.  As I said in a
previous message:  We have some good theoretical work now on randomness, on
mixers - and on protocols that rely on randomness. It's time to start with
those and see if we can build on them, not keep playing around with gut
feelings.

I don't expect anybody to actually reply to every single line. A few
comments are just fine. I was hoping to get all the necessary facts
gathered in one place, ideally with accompanied sources (yes, I'm aware it
likely does not all fit in one email thread, but we need to start
somewhere).

If you have a list of such theoretical work to share, please do so. In
particular if any one of them (or collection of them) or other such
resources are complete enough to be sufficient for designing an RNG that
resists all known plausible attacks which it is reasonable to expect an RNG
to defend against.

Perhaps this list and/or discussion fits better on a wiki somewhere. Does
anybody know of a fitting place, in that case? If anybody is interested and
want to help collect sources I can provide edit access to the reddit
/r/crypto subreddit wiki (reply off-list in that case), unless some other
place is better.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.metzdowd.com/pipermail/cryptography/attachments/20161204/b7f3c759/attachment.html>