# [Cryptography] coding for compression or secrecy or both or neither

Kristian Gjøsteen kristian.gjosteen at math.ntnu.no
Sat Jan 17 05:03:42 EST 2015

```16. jan. 2015 kl. 20.02 skrev John Denker <jsd at av8n.com>:
> It may be that as of today, every state-of-the-art
> electronic crypto system produces incompressible output,
> but there is no deep reason why that has to be so.

Obviously there is a reason why this is so.

Why are ciphertexts incompressible?

Note that the modern standard of security is equivalent to what is called real-or-random: the adversary cannot tell if we encrypted a message of his choice or just a random bit string of the same length.

We may assume there’s not much ciphertext expansion going on. (I can’t think of a single system with significant ciphertext expansion that I’d call state-of-the-art.)

Obviously, almost all random strings are incompressible, so almost all encryptions of random strings will also be essentially incompressible.

If we have a system where we can usefully compress many or all of the ciphertexts we produce, there is an efficient algorithm that outputs messages that when encrypted gives ciphertexts that tend to compress.

We now have an adversary against the system. Run the algorithm to produce a message. Ask for an encryption of that message or a random string. Try to compress the resulting ciphertext. If it compresses, it’s an encryption of our message. If it does not compress, it’s an encryption of a random string.

"Counterexample"

Note that, per Katz’ message from a few days ago, there are situations where this argument does not hold. One very stupid example. Suppose I only encrypt blocks of 0s or 1s. (In this setting, ECB mode is catastrophically insecure, but CBC mode works well.) I want to compress CBC-mode encryptions.

Recall the CBC mode encryption operation. To encrypt blocks m1, m2, …, mn, choose a random iv c0, then compute c1, c2, …, cn as ci = E(mi + c(i-1)). To decrypt, we compute

(*)	mi = D(ci) - c(i-1).

Now observe that we don’t need all the bits of c(i-1) to recover our mi, we only need the first bit, since mi is either all zeros or all ones. So from ci and one bit of c(i-1), we can recover mi.

The final trick is to observe that once we have mi and ci, we can recover all of c(i-1) by rewriting (*) to

(**)	c(i-1) = D(ci) - mi.

So we can decrypt the message given cn and one bit from each of c0, c1, …, c(n-1). (Note that we decrypt back-to-front.) Ciphertexts compress nicely.

The constructions Katz referred to are much more sophisticated and quite clever, but I think they are essentially just as useless.

Why quotes around «counterexample»?

Why does the above incompressibility argument fail? The argument implicitly assumes that compressed ciphertexts can be correctly decrypted. After applying the above compression to an encryption of a random message, we will get a decryption failure.

--
Kristian Gjøsteen

```