Entropy Definition (was Re: passphrases with more than 160 bits of entropy)
John Denker
jsd at av8n.com
Thu Mar 23 23:21:47 EST 2006
Hal Finney wrote:
...
> This is true; in fact, it is sometimes called the universal distribution
> or universal measure. In more detail, it is a distribution over all
> finite-length strings. The measure for a particular string X is defined
> as the sum over all programs that output X of 1/2^L_i, where L_i is the
> length of each such program.
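In symbols (my notation; the quoted text says the same thing in
words): for a fixed machine U,

    m(X) = \sum_{p : U(p) = X} 2^{-L(p)}

where L(p) is the length of program p, and the sum runs over
every program that makes U output X.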
There is no such thing as "the" universal measure; rather,
there are lots of "universal" measures. Universality is a
rather arcane property of a measure, and the term doesn't
mean what most people think it means.
-- Universal does NOT mean all-purpose.
-- Universal does NOT mean general-purpose.
-- There are many inequivalent "universal" distributions, most
of which are not what you want for any given application.
-- Certain things that are true asymptotically are not true
for _any_ practical application.
-- Ratio converging to 1 does not imply difference converging to 0
(see the example just below).
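A textbook illustration of that last point (my example, not part
of the original exchange): take a_n = n + \log_2 n and b_n = n.
Then

    a_n / b_n \to 1   while   a_n - b_n = \log_2 n \to \infty

so two quantities can agree asymptotically in ratio and still
drift arbitrarily far apart.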
This is probably not the right forum for cleaning out this Augean stable.
...
> The universal measure for a string can also be thought of as the
> probability that, given a fixed Universal Turing Machine (UTM), a randomly
> chosen input program will output that string.
>
> So this is clearly a probability distribution (with some technicalities
> regarding issues of program lengths being glossed over here) as John
> Denker says. However, to go from this to a notion of entropy is more
> questionable.
Not really questionable. If you have a probability, you have an
entropy.
> Shannon entropy is defined over a probability distribution. That is,
> given a probability distribution we can find its Shannon entropy by
> summing -p_i log2(p_i) over all members of the distribution. If we
> approximate the universal distribution by 1/2^n for strings of length
> n, this sum is clearly divergent. So there is not really a meaningful
> Shannon entropy for this probability distribution, or in other words
> the entropy is infinite.
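Spelling out the divergence (my arithmetic; it glosses over
normalization exactly as the quoted text does): under the 1/2^n
approximation each string of length n contributes
-p \log_2 p = n \, 2^{-n} to the sum, so the 2^n strings of
length n together contribute

    2^n \cdot n \, 2^{-n} = n

and summing n over all lengths diverges.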
The entropy _per string_ is unbounded, as it jolly well should be for
random strings of unbounded length. That's true but uninteresting.
A more interesting quantity is the _entropy density_, aka the
entropy _per symbol_, which for a typical long random bit-string
is one bit of entropy per bit of string-length. Similarly, if
you have a string of symbols over a 32-symbol alphabet, you
would expect to see five bits of entropy per symbol (log2 of 32
being 5) for a typical long random string.
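To make that concrete, here is a small Python sketch (mine, not
part of the original post). It estimates the entropy per symbol
of uniform random strings from their single-symbol frequencies,
which suffices here because each symbol is drawn independently:

    import math
    import random
    from collections import Counter

    def entropy_per_symbol(s):
        # Plug-in estimate: -sum p*log2(p) over observed symbol frequencies.
        n = len(s)
        return -sum((c / n) * math.log2(c / n)
                    for c in Counter(s).values())

    # 32-symbol alphabet: expect about log2(32) = 5 bits per symbol.
    alphabet = "abcdefghijklmnopqrstuvwxyz012345"
    s = "".join(random.choice(alphabet) for _ in range(100_000))
    print(entropy_per_symbol(s))      # ~5.0

    # Bit-string: expect about 1 bit of entropy per bit of length.
    bits = "".join(random.choice("01") for _ in range(100_000))
    print(entropy_per_symbol(bits))   # ~1.0

The entropy per string is then just this density times the
length, which is why it grows without bound as the strings get
longer.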