[Cryptography] randomness +- entropy
John Denker
jsd at av8n.com
Tue Nov 5 11:51:50 EST 2013
On 11/05/2013 05:18 AM, Jerry Leichter wrote:
>> Entropy means something very special.
> Well, yes, and in its technical definitions -
Are we not having a technical discussion? If not, then what?
> either in
> thermodynamics where it arose or information theory where it was
> imported because of a similarity in formalisms
The physics entropy and the information-theory entropy are the
same thing. This is not a mere "similarity in formalisms".
For example, it is a one-line calculation to find the entropy
of the nuclear spins in a sample of copper, starting from
statistical principles, namely
S = R ln 4
and you can also measure S using a physical thermometer.
Mirabile dictu, you get the same answer either way.
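The one-line calculation can be checked numerically. The sketch below (my illustration, not part of the original post) just evaluates S = R ln 4 for the 2I + 1 = 4 equally likely states of a spin-3/2 copper nucleus:

```python
import math

# Molar gas constant, J/(mol K); CODATA value.
R = 8.314462618

# Copper nuclei (63Cu and 65Cu alike) have spin I = 3/2, so at high
# temperature each nucleus has 2I + 1 = 4 equally likely spin states.
states = 4

# Statistical entropy per mole of nuclei: S = R ln 4
S = R * math.log(states)
print(f"S = R ln {states} = {S:.3f} J/(mol K)")
```

About 11.5 J/(mol K) per mole of nuclei, which is what a careful calorimetric measurement recovers.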
> - it plays virtually
> no role in cryptographic discussion.
Nonsense!
> In cryptography, especially
> when discussing random number generators, the word has a loosey-goosey
> meaning
Never mind the word; ideas are what's important. Terminology is
important only insofar as it helps formulate and communicate the
ideas.
The /idea/ of entropy has tremendous significance. If we didn't
call it "entropy" we would need to invent another name for it. So
please let's save everybody a lot of trouble and call it "entropy".
> that's somehow tied to lack of predictability - but in an
> unspecified way.
Yes, "tied to" ... but that's not the same as "same as". Simple
suggestion:
-- If you mean "unpredictability", say "unpredictability".
-- If you mean "entropy", say "entropy".
We have perfectly good words for each of these things.
> When people make it a regular habit to "guess" the
> entropy of various processes and then go on to build systems based on
> guesses, you know the word has lost any formal meaning it ever had.
Speak for yourself, Kemosabe. As for me, I know of a systematic
method for estimating the entropy, based on a series of intelligent
guesses. The method was described by some guy named Shannon, several
decades ago. It is consistent with everything else we know about
entropy.
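One crude version of such an estimate (my sketch, not necessarily the method Denker has in mind) is the plug-in estimate: tally the observed symbol frequencies and compute H = -sum p log2 p over them:

```python
import math
from collections import Counter

def entropy_estimate(samples):
    """Plug-in Shannon entropy estimate, in bits per symbol."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A balanced coin sample comes out at 1 bit/symbol; a biased one, lower.
print(entropy_estimate("HTHTHTHT"))   # 1.0
print(entropy_estimate("HHHHHHHT"))   # noticeably less than 1
```

Real entropy estimation for a physical source takes far more care than this (correlations, small-sample bias), but the principle is Shannon's.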
================
Successful cryptography often depends on depth of understanding and
attention to detail. The loosey-goosey approach is very likely to
get you into trouble.
The same idea applies to various other fields of endeavor
http://www.check-six.com/images/Jenny38057-poster.jpg
There *is* a difference between a TRNG and a cryptographically-strong
PRNG.
A) In /some/ situations, the difference doesn't matter very much.
B) In other situations, it matters a great deal.
For starters,
A) It is possible to imagine a TRNG independent of any PRNG.
B) It is not possible to imagine a PRNG completely independent of
any TRNG, because the PRNG needs a seed, and the seed has to
come from somewhere.
Here's another contrast:
A) It may be that under /normal/ operating conditions, the /user/
does not care about TRNG versus PRNG.
B) Anybody who is /designing/ any such thing needs to understand
the distinction. In particular, the /recovery from compromise/
is wildly different in the two cases.
>> TRNG entropy density = 1 - epsilon
>> PRNG entropy density = epsilon
> I don't know what this means.
Here's a highly-condensed tutorial:
Suppose we have N-bit codewords, and various statistical distributions
over codewords.
Entropy is a property of the ensemble (not of any particular codeword).
1) If the ensemble consists of all 2^N possible codewords, evenly
distributed, the distribution has N bits of entropy.
2) Now consider a different distribution. If half the codewords are
missing, and the rest are evenly distributed, the distribution has
N-1 bits of entropy.
2a) In the sub-case where you know exactly which codewords are missing,
this distribution is noticeably less random, i.e. more predictable,
than distribution (1).
2b) In the sub-case where it is computationally infeasible to figure
out which codewords are missing, this distribution may be -- for a
wide range of practical purposes -- just as unpredictable as
distribution (1). However, it still has less entropy.
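Points (1) and (2) can be checked directly at small scale. The sketch below (illustrative; N = 8 is my arbitrary choice) computes the entropy of both ensembles:

```python
import math

def ensemble_entropy(probs):
    """Shannon entropy, in bits, of a distribution over codewords."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

N = 8  # illustrative codeword length

# Case 1: all 2^N codewords equally likely -> N bits of entropy.
full = [1 / 2**N] * 2**N

# Case 2: half the codewords missing, the rest equally likely -> N-1 bits.
half = [1 / 2**(N - 1)] * 2**(N - 1)

print(ensemble_entropy(full))   # 8.0
print(ensemble_entropy(half))   # 7.0
```

Note that the entropy is a property of the whole distribution; the calculation never looks at any particular codeword.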
3) For a typical real-world PRNG, we are not talking about one bit of
entropy going missing. Almost all of the bits have gone missing!
If we seed the PRNG with 100 bits and then use it to generate a
billion bits, then at most 2^100 of the 2^1000000000 possible codes
can ever occur: all but 1 in 2^999999900 of the codes are missing.
That's a lot of missing codes. Well over 99.99% of the codes are
missing.
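The same point can be made at toy scale (tiny numbers of my choosing, and a deliberately weak LCG standing in for any PRNG): an 8-bit seed can yield at most 2^8 distinct outputs, so nearly all 16-bit codewords are missing:

```python
def toy_prng(seed, nbits=16):
    """Tiny LCG expanded to nbits of output; purely illustrative."""
    out = 0
    state = seed
    for _ in range(nbits):
        state = (state * 37 + 11) % 256   # 8-bit internal state
        out = (out << 1) | (state & 1)    # emit low bit of the state
    return out

# Enumerate every possible seed: at most 2^8 = 256 distinct codewords
# can ever appear, out of 2^16 = 65536 possible 16-bit codes.
outputs = {toy_prng(seed) for seed in range(256)}
print(len(outputs))                       # at most 256
print((2**16 - len(outputs)) / 2**16)     # well over 99% missing
```

The seed size, not the output length, bounds the entropy of the ensemble; stretching the output only hides the missing codes behind computation.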
With a TRNG, the situation is reversed. Ideally there would be no
missing codes whatsoever. However, for reasons of computational
efficiency, a practical TRNG will typically allow a few -- a very
few -- codes to be missing or under-represented in the ensemble.
The contrast is extreme:
A) The cryptographic strength required to hide the missing codes
when almost all codes are missing, versus
B) The cryptographic strength required to hide the under-represented
codes, when there are very few of them.
And on top of that, there is the issue of recovery from compromise.
/dev/urandom is different from /dev/random ... for a reason
There is an important idea here. If you are not going to call this
idea "entropy", you need to come up with a different name for it.
Bear in mind that practically everybody in the world reserves the
word "entropy" to denote the genuine physical / statistical entropy.
Also note that we have perfectly good words like "randomness" and
"unpredictability" to cover the other cases.