Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

John Kelsey kelsey.j at ix.netcom.com
Thu Mar 23 17:53:52 EST 2006


>From: John Denker <jsd at av8n.com>
>Sent: Mar 23, 2006 1:44 PM
>To: John Kelsey <kelsey.j at ix.netcom.com>, cryptography at metzdowd.com
>Subject: Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

...
>With some slight fiddling to get the normalization right, 1/2
>raised to the power of (program length) defines a probability
>measure.  This may not be "the" probability you want, but it
>is "a" probability, and you can plug it into the entropy definition.

No, this isn't right for the algorithmic information theory meaning at
all.  For that measure, we can intelligently discuss the entropy of a
specific random string, without reference to a probability model.
Indeed, what's the probability distribution of the sequence of bits
defined by Chaitin's Omega?  

You can certainly complain that they should have used a different term
than entropy here, but you're not going to get these to be the same.  
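
To make the distinction concrete, here's a rough Python sketch (my own
illustration, nothing more).  True Kolmogorov complexity is uncomputable,
so zlib's output length stands in as a crude upper bound on description
length, but it's enough to show that a single, specific string can be
judged simple or complex with no probability model anywhere in sight:

    import os
    import zlib

    # One specific, highly regular string: it has a very short description
    # ("4096 zero bytes"), and zlib duly finds a short encoding for it.
    simple = b"\x00" * 4096

    # One specific string drawn once from os.urandom: almost certainly it
    # has no description much shorter than itself, and zlib can't shrink it.
    messy = os.urandom(4096)

    print(len(zlib.compress(simple)))   # a few dozen bytes
    print(len(zlib.compress(messy)))    # roughly 4096 bytes, maybe a bit more

Both numbers are statements about those particular strings, not about any
distribution they were drawn from, and that's the sense in which the
algorithmic information theory people use the word entropy.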

>The problem is almost never understanding the definition of
>entropy.  The problem is almost always ascertaining what is
>the correct probability measure applicable to the problem
>at hand.

Well, you need to decide which of the probability-distribution-based
entropy measures to use, and the right choice differs from application
to application.  If you use min-entropy or guessing entropy to estimate
the limits on how well some sequence of symbols will compress, you'll
get a pretty lousy answer.  The same goes for using Shannon entropy to
decide whether you have collected enough entropy in your pool to
generate a 128-bit AES key.
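
As a concrete (and entirely made-up) example, here's a little Python
that computes Shannon entropy, min-entropy, and the log of the guessing
entropy for one skewed distribution.  The numbers are mine, chosen only
so the three measures come out wildly different:

    import math

    # Toy distribution: one heavy outcome with probability 1/2, plus 2**20
    # light outcomes sharing the remaining mass uniformly.
    heavy = 0.5
    n_light = 2 ** 20
    light = (1.0 - heavy) / n_light

    # Shannon entropy: -sum p*log2(p); the right tool for compression limits.
    shannon = -(heavy * math.log2(heavy) + n_light * light * math.log2(light))

    # Min-entropy: -log2(max p); the right tool against a one-shot guesser.
    min_ent = -math.log2(heavy)

    # Guessing entropy: expected number of guesses when outcomes are tried
    # in order of decreasing probability (heavy first, then the light ones).
    guesses = heavy * 1 + light * ((n_light + 1) * (n_light + 2) // 2 - 1)

    print(shannon)              # about 11 bits
    print(min_ent)              # exactly 1 bit
    print(math.log2(guesses))   # about 18, i.e. ~2**18 expected guesses

Pick the wrong measure for your application and you're off by a huge
factor, in whichever direction hurts you most.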

...
>> 0 occurs with probability 1/2
>> each other number from 1 to 2^{160}+1 happens with probability
>> 2^{-161}.
>> 
>> The Shannon entropy on this distribution is 81.5 bits. 
>
>I think it is very close to 81 bits, not 81.5, but that is a minor
>point that doesn't change the conclusion:

>>  But if you
>> tried to sample it once to generate an 80-bit Skipjack key, half the
>> time, I'd guess your key on my first try.  
>
>Yeah, but if I sample it N times, with high probability I can generate
>a large number of very good keys.  This problem is faced by (and solved
>by) any decent TRNG.

Hmmm.  I've seen plenty of people get this wrong.  If you use Shannon
entropy as your measure, and then say "when I have collected 128 bits
of Shannon entropy in my pool, I can generate a 128-bit AES key," you
will generate keys that aren't as secure as you think they are.  Now,
most TRNGs seem to sidestep this and many other problems by designing
the hardware to give a relatively simple probability model.  If your
probability model is close to uniform, then all these
probability-distribution-based entropy measures converge (differing by
at most a small constant).
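
Plugging the example from earlier in the thread into Python makes the
gap explicit.  This is closed form, since enumerating 2**160 outcomes is
out of the question, and I'm taking exactly 2**160 non-zero values so
the probabilities sum to 1:

    import math

    # P(0) = 1/2; each of the 2**160 other values has probability 2**-161.
    p0 = 0.5
    n_rest = 2 ** 160
    p_rest = 2.0 ** -161

    # Shannon entropy in closed form:
    #   H = (1/2)*log2(2) + 2**160 * 2**-161 * 161 = 0.5 + 80.5 = 81 bits
    shannon = p0 * 1 + (n_rest * p_rest) * 161

    # Min-entropy: -log2(max p) = -log2(1/2) = 1 bit.
    min_ent = -math.log2(p0)

    print(shannon)   # 81.0 -- plenty for an 80-bit key, by the Shannon measure
    print(min_ent)   # 1.0  -- a single sample gets guessed half the time

(Which, incidentally, comes out at 81 bits exactly, matching the
correction above.)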

In the case above, we had better specify N.  If you sample it 16
times, then you have a 2^{-16} chance of still being in a weak state.
Once you get enough samples that the probability of being in the
pathological worst case is negligible (whatever that means in your
application), then you can start generating output bits.  That's
probably somewhere between N=32 and N=64.  
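
For that same worst case, the arithmetic is just this (Python again, my
own back-of-the-envelope sketch):

    # After N independent samples of the distribution above, the single most
    # likely pool content is "every sample was 0", which happens with
    # probability 2**-N.  That is the residual chance of the pathological
    # worst case -- and also why the pool's min-entropy grows by only 1 bit
    # per sample, even though each sample carries 81 bits of Shannon entropy.
    for n in (16, 32, 64):
        print(n, 2.0 ** -n)   # 16: ~1.5e-5,  32: ~2.3e-10,  64: ~5.4e-20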

--John

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com


