[Cryptography] information gain; was: coding for compression or secrecy or both or neither

Sun Jan 18 17:21:29 EST 2015

On 01/16/2015 12:02 PM, I wrote:

>> It may be that as of today, every state-of-the-art electronic
>> crypto system produces incompressible output, but there is no
>> deep reason why that has to be so.

On 01/17/2015 03:03 AM, Kristian Gjøsteen wrote:

> Obviously there is a reason why this is so.
> 
> 	Why are ciphertexts incompressible?
> 
> Note that the modern standard of security is equivalent to what is
> called real-or-random: the adversary cannot tell if we encrypted a
> message of his choice or just a random bit string of the same
> length.

That's not the correct criterion.  It would be better to
say the cryptotext does not convey any /information/ about
the plaintext or the keys.  Here's an example:

00010100 10110000 11111110 00110110 10000100 01010111 10001100 10111110
01011111 00101101 10010110 00000101 10100101 00000001 11011001 00111011
01010101 11110111 11100010 11011001 11101000 10100001 00011100 11100010
00001001 10111010 11001101 11111101 10101000 01100000 00011010 11011001
11001010 10110110 10001001 10001100 10001010 10001010 11111101 01101011
00101010 10101111 00110010 01110010 10011101 11001001 10101010 10001000
11110110 10110110 10010101 11000001 11011011 00001100 10101111 10011111
00100101 10011011 01010100 01101010 10000101 11100110 01011110 10001111
....

That's the beginning of a file that contains my Cayman
Islands bank account number, all of my passwords, et cetera.
However, it has been encrypted using a one-time pad, so I
don't mind publishing the cryptotext.

However, this cryptotext is highly compressible.  You can gain:
 -- a factor of 4 by using hex ascii instead of base-2 ascii.
 -- a factor of 6 using printable ascii chars.
 -- a factor of 5 or so (depending on file size) by using gzip.
 -- a factor of 9 by using raw format, 8 bits per byte.

My point is that -- compressible or not -- the example cryptotext 
provides zero information gain.  It doesn't tell you anything you 
didn't already know.(*)

The concept of /information gain/(**) can be formulated very precisely.
Executive summary:  Calculate the entropy of all possible plaintexts
(before seeing the cryptotext) and subtract off the row entropy or 
the conditional entropy (conditioned on the cryptotext).  The math 
is spelled out here:
  https://courses.cs.washington.edu/courses/cse455/10au/notes/InfoGain.pdf

I say again that the randomness of the cryptotext is absolutely
not a good criterion.  Suppose I send three copies of the example
cryptotext.  The receiver attributes a randomness of one bit per
byte to the first copy.  The receiver attributes no randomness
whatsoever to the second and third copies.  More importantly,
none of that matters.  Information gain is what matters, and
that is zero for all three copies.

========
Notes:

(*) We still need to account for traffic analysis.  This a 
super-important issue, but it affects all cryptosystems, whether 
compressible or not, so it does not affect the issues discussed 
here.

(**) The information gain I am talking about here is not quite
the same as the Kullback-Leibler divergence ... although the 
name "information gain" is sometimes applied to both.  They
are related to each other, and to the conditional entropy, but
they are not quite the same.