[Cryptography] Entropy is forever ...

John Denker jsd at av8n.com
Sun Apr 19 03:18:14 EDT 2015


In the context of:

>> Sorry, no, there is not any notion of complexity or entropy
>> that is inherent to an individual string.

On 04/18/2015 07:06 PM, R. Hirschfeld wrote:

> [...] I think that if you look at
> the work of Solomonoff, Kolmogorov, Chaitin, et. al. you'll find that
> this is what they describe.  You might not find it a useful notion,
> but the notion does exist.

Well, as it turns out, I have read those guys, et al., and I find
that they recognize that the complexity of a string is not inherent; 
it depends on how it is measured.  They don't always make a big fuss 
about the dependence, but they know it exists.

> Interestingly, when the notion of Kolmogorov randomness is extended
> to infinite strings, the particular universal description language
> doesn't matter.

That's a common misconception ... but it remains a misconception
nonetheless.  As I said before, for *some* purposes it doesn't 
matter, but for other purposes it matters a great deal.  In
more detail:
 -- If you are a common carrier selling bits in bulk, a bounded
  number of "extra" bits does not affect the price per bit that
  you offer to your largest customers.  Alas, common-carrier
  quantity discount issues are off-topic for this list.
 ++ Probability is exponential in complexity.  An additive
  shift in some of the complexities makes an exponentially
  large /multiplicative/ shift in some of the probabilities.
  This multiplicative factor does *not* wash out as the
  strings become longer.  For crypto, this is crucial.

We agree that the ratio

         K[C1](str)
       --------------                       [1]
         K[C2](str) 

converges to 1 as the length of str becomes large.  However, it 
is a gross fallacy to imagine that the difference

     K[C1](str) -  K[C2](str)               [2]

converges to zero.  Convergence of the ratio does not imply 
convergence of the difference.  In all probability, the sequence 
of differences [2] doesn't converge at all, although it remains 
bounded.  If the emulator program that allows C2 to emulate C1 
is only ten bytes (80 bits) long, your probability estimates are 
off by "only" a factor of 2^80.

In the cryptography business, factors of 2^80 are often
significant in practice.  Also, most emulator programs are
more than ten bytes long.
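Here is the arithmetic behind [1] and [2], spelled out with made-up numbers (the constant c is hypothetical; the point is the shape of the behavior, not the particular values):

```python
# Numeric sketch of the ratio-vs-difference point.  Suppose machine C2
# needs a fixed emulator of c extra bits to run C1's programs, so
# K2(str) = K1(str) + c.  The numbers below are hypothetical.
c = 80  # a ten-byte emulator = 80 bits

for K1 in (100, 10_000, 1_000_000):
    K2 = K1 + c
    print(K2 / K1,        # ratio [1]: tends to 1 as strings grow
          K2 - K1,        # difference [2]: stays pinned at c
          2 ** (K2 - K1)) # probability factor: 2**80, never washes out
```

The ratio column crawls toward 1, but the difference column, and hence the 2^80 multiplicative factor on the probabilities, never budges.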


> I guess in some cases the "right" or "correct" probability measure is
> a philosophical question.

Well, on the philosophy list or even the "pure mathematics"
list you are free to hypothesize any probability measure you
like, no problem there.  However, crypto depends on real-world 
engineering.  It depends on math also, but not math alone.

By way of analogy:  In pure mathematics, the axioms of Euclidean 
geometry assume a flat space.  If you want to assume that, go
ahead ... but remember that's just a mathematical fantasy. 
Don't turn around and pretend that you have proved that the 
earth is flat, or that spacetime is flat.

In the world where I live, the metric (i.e. the curvature) is 
not determined by pulling axioms out of some bodily orifice;
it has to be measured.  By the same token, you cannot assign
entropy to this-or-that string by cranking up the first Turing
machine you find lying around.  The entropy is a property of 
the ensemble, not a property of the string.  In fact it is 
an ensemble average, the expectation value of the surprisal.
It has to be based on measurements, not on philosophy, not 
on arbitrary hypotheses.
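To make the last point concrete, here is a minimal sketch of entropy as an ensemble average, the expectation of the surprisal -log2(p). The distributions are made-up examples; the point is that the quantity is computed over the ensemble, not read off any individual string.

```python
# Entropy as an ensemble average: the expectation value of the
# surprisal -log2(p), taken over a (measured) probability distribution.
# The distributions below are made-up examples.
import math

def entropy_bits(probs):
    """Shannon entropy H = sum over p of p * (-log2 p), i.e. the mean surprisal."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# A fair coin and a biased coin.  Any particular string of outcomes,
# e.g. "HHTH...", has no entropy by itself; the ensemble does.
print(entropy_bits([0.5, 0.5]))  # fair coin: 1 bit per symbol
print(entropy_bits([0.9, 0.1]))  # biased coin: roughly 0.47 bits
```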
