[Cryptography] randomness +- entropy

Jerry Leichter leichter at lrw.com
Tue Nov 5 12:38:20 EST 2013


On Nov 5, 2013, at 11:51 AM, John Denker <jsd at av8n.com> wrote:
>>> Entropy means something very special.
> 
>> Well, yes, and in its technical definitions - 
> 
> Are we not having a technical discussion?  If not, then what?
However *you* may be using the word, its use in most discussions *surrounding cryptography* is not particularly technical.

>> either in
>> thermodynamics where it arose or information theory where it was
>> imported because of a similarity in formalisms 
> 
> The physics entropy and the information-theory entropy are the
> same thing.  This is not a mere "similarity in formalisms".
An ahistorical view.  When Shannon introduced the term, it was by analogy.  In the 1930's and '40's, no one thought about physics in information terms.  Hell, no one thought about "information" as something formalizable.  That was Shannon's genius.  But even he didn't realize that his notion of information entropy would prove to have deep and profound connections to physical entropy.  That wasn't really recognized until 20 or more years later.

>> In cryptography, especially
>> when discussing random number generators, the word has a loosy-goosy
>> meaning
> Nevermind the word, ideas are what's important.  Terminology is
> important only insofar as it helps formulate and communicate the
> ideas.
Agreed.  But you're the one who started on what I think is a fool's errand of trying to get the word used "correctly" in the random number generator community.

Here's a simple example - which I alluded to earlier - that demonstrates why this is much more difficult than it seems.  Suppose I have a source of true random bits.

a)  I want to draw a number uniformly at random from the set {0, 1}.  "Entropy" tells me that 1 bit from the source is enough.

b)  I want to draw a number uniformly at random from the set {0, 1, 2, 3}.  Same argument - two bits "of entropy" are exactly what we need.

c)  I want to draw a number uniformly at random from the set {0, 1, 2}.  Clearly "I need less entropy" than in case (b) - I'm *delivering* less entropy.  But ... you can't actually do this with two random bits.  About the best you can do is:  Take two bits from the source.  If the result is 00, 01, or 10, you're done; otherwise draw another two bits.  Repeat until done.  The expectation value for the number of bits required is *at least* (3/4)*2 + (1/4)*4 = 2.5 bits; really, it's more.  (There are other correct ways - and many, many *incorrect* ways - to make this "random" choice, but none of them can use fewer bits than this.)
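
For concreteness, here is a minimal sketch of that rejection scheme in Python; secrets.randbits stands in for the true random bit source, and the function name is mine, not anything standard:

import secrets

def uniform_0_1_2() -> int:
    """Draw uniformly from {0, 1, 2} by rejection sampling on pairs of bits."""
    while True:
        x = secrets.randbits(2)   # two fresh random bits, i.e. a value in 0..3
        if x < 3:                 # 00, 01, 10 -> accept
            return x
        # 11 -> reject and draw two more bits

# Expected cost: 2 bits per round, with a 3/4 chance of stopping each round,
# so 2/(3/4) = 8/3, about 2.67 bits per draw - well above the log2(3),
# about 1.585 bits, of entropy actually delivered.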

Naive arguments about entropy will lead you astray.  Naive entropy estimates will lead you astray.  If you want to do the math, do the math - don't guess.
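
To make "naive entropy estimates will lead you astray" concrete, here is a toy illustration of my own (the source and its parameters are invented for the demo, not taken from any real generator): a strongly correlated bit stream whose marginal 0/1 frequencies look like a full bit of entropy per bit, while its actual entropy rate is far lower.

import math
import random

random.seed(1)  # deterministic demo

# A correlated source: each bit repeats the previous one with probability 0.95.
bits = [random.getrandbits(1)]
for _ in range(100_000):
    bits.append(bits[-1] if random.random() < 0.95 else 1 - bits[-1])

# Naive "entropy estimate": look only at the marginal 0/1 frequencies.
p1 = sum(bits) / len(bits)
naive = -sum(p * math.log2(p) for p in (p1, 1 - p1) if p > 0)

# The actual per-bit entropy rate of this source is the entropy of a single
# transition, H(0.05).
actual = -(0.95 * math.log2(0.95) + 0.05 * math.log2(0.05))

print(f"naive marginal estimate: {naive:.3f} bits/bit")   # close to 1.0
print(f"actual entropy rate:     {actual:.3f} bits/bit")  # about 0.286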

>> When people make it a regular habit to "guess" the
>> entropy of various processes and then go on to build systems based on
>> guesses, you know the word has lost any formal meaning it ever had.
> 
> Speak for yourself, Kemosabe.  As for me, I know of a systematic 
> method for estimating the entropy, based on a series of intelligent 
> guesses.  The method was described by some guy named Shannon, several 
> decades ago.  It is consistent with everything else we know about 
> entropy.
I know perfectly well how to do computations with entropy.  That's exactly why I'm disturbed by the way the term gets misused.

> [Long analysis of the distinction between a TRNG and a PRNG]

I would put it all differently.  The two are barely comparable.

- A TRNG draws from a random distribution, a notion that mathematics simply takes as axiomatic.  We believe (with very good reason) that this abstraction is an appropriate model for the operation of some physical processes; and, flipping this around, we believe that by building an appropriate physical system we can realize something with properties akin to the mathematical abstraction.

- A (cryptographically strong) PRNG is a product of computational complexity theory.  It is *defined* in terms of the complexity of particular models of computation.  The "connection" between the two notions shows up entirely in the way the PRNG and the attack models for it are defined:  The attacker is assumed to receive as input the outputs of the PRNG, but not its internal state.  If the internal state were somehow "sneaked into" the attacker's computation, the whole thing would break down.  Mathematically, we can simply say "the only information available to the machine is what is on its input tape"; but in a physical realization, the internal state has to come from *somewhere*, and that "somewhere" must not leak any information to the (physical) attacker, or the pretty mathematical model fails.  The only way of providing such internal state, in general, is with a TRNG.
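
As a rough illustration of that last point (a toy sketch only, not a vetted DRBG construction; all the names here are mine): the secret internal state comes from the operating system's entropy interface, standing in for a TRNG, and the attacker is assumed to see only the hashed outputs, never the state.

import os
import hashlib

class ToyPRNG:
    def __init__(self) -> None:
        # Secret internal state, drawn from the OS entropy source.
        self._state = os.urandom(32)

    def next_block(self) -> bytes:
        # Each output is a one-way function of the state; the state itself
        # never appears on the "output tape".
        out = hashlib.sha256(b"out" + self._state).digest()
        # Ratchet the state forward so that a later state compromise does not
        # retroactively reveal earlier outputs.
        self._state = hashlib.sha256(b"upd" + self._state).digest()
        return out

prng = ToyPRNG()
print(prng.next_block().hex())

If _state were ever written somewhere the attacker could read it, the computational-hardness argument no longer applies; that is exactly where the TRNG earns its keep.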

> Bear in mind that practically everybody in the world reserves the
> word "entropy" to denote the genuine physical / statistical entropy.
It depends on who you count as "in the world".  If you're talking about physicists and information theorists and communications engineers, I'll agree with you.  If you're talking about designers of nifty new "random number" sources, complete with "entropy estimates" "conservatively" pulled out of a hat; or those who talk about "how many bits of entropy" the RNG must gather before we attempt to choose a value from {0, 1, 2} - well, not so much.

> Also note that we have perfectly good words like "randomness" and
> "unpredictability" to cover the other cases.
I don't know what "other cases" you're comparing to.

Probability theorists and statisticians generally talk about "randomness", not entropy - and even then surprisingly little, since randomness is built into their axioms; what they actually talk about is "drawing from a random distribution".  Sure, you can think of a random distribution as something with a source entropy, and in information theory you do that all the time; but for many other purposes - e.g., if you want to compute moments or think about correlations - that's probably not the best representation to use.

"Unpredictability" is a nice informal catch-all for what we'd like to get, but if that's all you have, you're not ready to specify or build a system.

                                                        -- Jerry


