Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

John Denker jsd at av8n.com
Fri Mar 24 11:57:47 EST 2006


Erik Zenner wrote:

>>0 occurs with probability 1/2
>>each other number from 1 to 2^{160} happens with 
>>probability 2^{-161}.

>  Is anyone aware of whether (and where) this was 
> discussed in the literature, or what other approaches are taken?

This particular problem is contrived or at least exaggerated.  The
solution in this case is trivial:  Just run the raw data through
a compressor.  Huffman coding works fine.
     raw data                   cooked data
     zero word                  0             (one bit)
     other word                 1 || word     (161 bits, since 160 bits
                                               suffice to index the 2^{160}
                                               nonzero values)
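
To check that this code exactly matches the distribution: the source
entropy per raw word is

   H = (1/2)*log2(2) + 2^{160} * 2^{-161} * log2(2^{161})
     = 1/2 + 161/2
     = 81 bits

and the expected codeword length is

   E[length] = (1/2)*1 + (1/2)*161 = 81 bits

so the expected output length equals the source entropy, which is the
best any lossless code can do.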

The cooked data has 100% entropy density.  Not only does it carry, _on
average_, one bit of entropy per bit of string length (81 bits of
entropy in 81 bits of cooked data per raw word), every N-bits-long
substring has N bits of entropy, for all values of N.
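
For concreteness, here is a minimal Python sketch of the encoder and
decoder described above.  The names and the bit-string representation
are mine, chosen for clarity rather than speed:

    INDEX_BITS = 160   # 160 bits suffice to index the 2**160 nonzero values

    def encode(word: int) -> str:
        """Huffman-encode one raw word (0 .. 2**160) as a bit string."""
        if word == 0:
            return "0"                          # probability 1/2 -> one bit
        # nonzero values 1 .. 2**160 map to indices 0 .. 2**160 - 1
        return "1" + format(word - 1, "0%db" % INDEX_BITS)   # 161 bits

    def decode(bits: str) -> tuple:
        """Strip one codeword off the front; return (word, remaining bits)."""
        if bits[0] == "0":
            return 0, bits[1:]
        return int(bits[1:1 + INDEX_BITS], 2) + 1, bits[1 + INDEX_BITS:]

For example, encode(0) is "0" and encode(1) is "1" followed by 160 zero
bits; decode() inverts both.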

This version serves to illustrate, in an exaggerated way, the danger of
taking the distribution of the raw data words for granted.  Forsooth,
in real life the raw data words are never exactly IID (independent and
identically distributed), but with suitable engineering you can arrange
that they are not terribly far from IID, and in particular you can
ascertain a useful _lower bound_ on the entropy per raw data word.
You can then proceed to concentrate the entropy, so as to achieve something
approaching 100% entropy density at the output.  A method for doing this is
discussed at
   http://www.av8n.com/turbid/paper/turbid.htm
in particular
   http://www.av8n.com/turbid/paper/turbid.htm#sec-saturation
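
As a generic illustration of the concentrate-by-hashing idea (a sketch
of the general technique, not the turbid algorithm; the names and the
H_MIN figure are mine, and in practice the entropy bound must come from
analysis of the physical source):

    import hashlib

    H_MIN = 0.5        # assumed lower bound on entropy per raw sample,
                       # in bits; purely illustrative
    OUT_BITS = 256     # SHA-256 output size

    def concentrate(samples):
        """Distill many low-entropy samples into one high-density block.

        Refuses to produce output unless the accumulated entropy lower
        bound covers the output size, so that the result can approach
        100% entropy density.  In practice you would demand a
        comfortable margin above OUT_BITS.
        """
        if len(samples) * H_MIN < OUT_BITS:
            raise ValueError("not enough accumulated entropy")
        h = hashlib.sha256()
        for s in samples:
            h.update(s)          # samples are byte strings
        return h.digest()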

