[Cryptography] Entropy is forever ...

Fri Apr 17 21:47:35 EDT 2015

On 04/17/2015 02:19 PM, R. Hirschfeld wrote:

> On the other hand, there is a notion of algorithmic complexity of an
> individual string  [1]

Sorry, no, there is not any notion of complexity or entropy 
that is inherent to an individual string.

This is fundamental to cryptography, although oddly enough
your average cryptographer doesn't need to worry about it on
a day-to-day basis.  By way of analogy, atomic physics is
fundamental to chemistry, but your average chemist doesn't
need to worry about it on a day-to-day basis.

>  i.e. the length of the shortest program that
> produces it in some universal programming language.  [2]

You can define complexity that way, but statement [1] does 
not follow from [2], not even close.

First of all, calling it a "universal" compressor in the 
narrow ultra-technical sense required by statement [2]
does not make it a good compressor, or an all-purpose 
compressor ... and it certainly doesn't make it unique.
The complexity is as much a function of the compressor [C] 
as it is of the string (str):

     K = K[C](str)

For any two compressors [C1] and [C2] there is a uniform
bound on the difference between K[C1](...) and K[C2](...).
Sometimes the difference doesn't matter, but sometimes it
matters a great deal.

Also, there is not the slightest reason to expect that
the complexity is additive.  That is, in all probability:

    K[C](str1 + str2) != K[C](str1) + K[C](str2)

which is another reason why naïve notions of complexity
"inherent" in the string fail miserably, even for some
fixed compressor [C].

> There are formal relationships between K-randomness and entropy.

The formal relationship is simple and easily stated.
Entropy is defined in terms of probability.  If you
have a probability distribution over programs, it
induces a probability distribution over strings ...
or it would, if it didn't run afoul of the halting
problem.

If you ignore the halting problem you can get lower
bounds on the probability of each string.

Whether the probability constructed in this way bears
any relationship to the real-world probability of the
relevant strings is another question entirely.  There
are lots of probability measures in this world.  Just
because you have found "some" probability measure does
not mean it is the right one.  Calling it a "universal" 
probability measure doesn't make it an all-purpose 
probability measure, or even a decent approximation 
to the correct probability measure.
  http://www.math.uah.edu/stat/prob/Probability.html