Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

John Denker jsd at av8n.com
Thu Mar 23 23:21:47 EST 2006


Hal Finney wrote:
...
> This is true, in fact it is sometimes called the universal distribution
> or universal measure.  In more detail, it is a distribution over all
> finite-length strings.  The measure for a particular string X is defined
> as the sum over all programs that output X of 1/2^L_i, where L_i is the
> length of each such program.
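
For concreteness, that sum can be sketched in toy form.  The "machine"
below is a made-up stand-in, nothing like a real universal machine;
it merely shows the bookkeeping of summing 1/2^L_i over the programs
that produce a given output.

    from itertools import product

    def toy_machine(program):
        # Hypothetical stand-in for a universal machine: it just
        # doubles every bit of its program.  Purely for illustration.
        return "".join(bit * 2 for bit in program)

    def approx_universal_measure(target, max_len=12):
        # Sum 2^-len(p) over all bitstring programs p, up to max_len
        # bits, whose output equals the target string.
        total = 0.0
        for length in range(1, max_len + 1):
            for bits in product("01", repeat=length):
                if toy_machine("".join(bits)) == target:
                    total += 2.0 ** (-length)
        return total

    print(approx_universal_measure("0011"))  # only "01" matches: 0.25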

There is no such thing as "the" universal measure;  rather
there are lots of "universal" measures.  Universality is a
rather arcane property of the measure, and the term doesn't
mean what most people think it means.
   -- Universal does NOT mean all-purpose.
   -- Universal does NOT mean general-purpose.
   -- There are many inequivalent "universal" distributions, most
    of which are not what you want for any given application.
   -- Certain things that are true asymptotically are not true
    for _any_ practical application.
   -- Ratio converging to 1 does not imply difference converging
    to 0 (see the numeric sketch below).
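
To illustrate that last point numerically, take f(n) = n + sqrt(n)
and g(n) = n (made-up example functions):  the ratio f/g tends to 1
while the difference f - g grows without bound.

    import math

    # f(n) = n + sqrt(n) vs g(n) = n: ratio -> 1, difference -> infinity.
    for n in (10**2, 10**4, 10**6, 10**8):
        f, g = n + math.sqrt(n), n
        print("n=%-9d ratio=%.6f difference=%.1f" % (n, f / g, f - g))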

This is probably not the right forum for cleaning out this Augean stable.

...
> The universal measure for a string can also be thought of as the
> probability that, given a fixed Universal Turing Machine (UTM), a randomly
> chosen input program will output that string.
> 
> So this is clearly a probability distribution (with some technicalities
> regarding issues of program lengths being glossed over here) as John
> Denker says.  However to go from this to a notion of entropy is more
> questionable.

Not really questionable.  If you have a probability, you have an
entropy.
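
To make that concrete, here is a minimal Python sketch;  the
three-outcome distribution is made up for illustration.

    import math

    def shannon_entropy(probs):
        # H = -sum p_i * log2(p_i), skipping zero-probability outcomes.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(shannon_entropy([0.5, 0.25, 0.25]))  # 1.5 bits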

> Shannon entropy is defined over a probability distribution.  That is,
> given a probability distribution we can find its Shannon entropy by
> summing -p_i * log2(p_i) over all members of the distribution.  If we
> approximate the universal distribution by 1/2^n for strings of length
> n, this sum is clearly divergent.  So there is not really a meaningful
> Shannon entropy for this probability distribution, or in other words
> the entropy is infinite.

The entropy _per string_ is unbounded, as it jolly well should be for
random strings of unbounded length.  That's true but uninteresting.
A more interesting quantity is the _entropy density_, aka the entropy
_per symbol_, which for typical long random bit-strings is one bit of
entropy per bit of string-length.  Similarly, if you have a string of
symbols over a 32-symbol alphabet, you would expect to see five bits
of entropy per symbol for a typical long random string.
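
A quick empirical check of that figure (the sample size and the plain
plug-in frequency estimate below are arbitrary choices):

    import math, random
    from collections import Counter

    # Draw a long uniform random string over a 32-symbol alphabet and
    # estimate the entropy per symbol from the empirical frequencies;
    # expect about log2(32) = 5 bits per symbol.
    n = 100000
    counts = Counter(random.randrange(32) for _ in range(n))
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    print("estimated entropy density: %.3f bits/symbol" % h)  # ~ 5.0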


---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com


