[Cryptography] Compression before encryption?

Fri Jan 9 13:23:12 EST 2015

On 9 Jan 2015 18:34, "Stephan Neuhaus" <stephan.neuhaus at zhaw.ch> wrote:
>
> Dear list,
>
> I have come across the recommendation to "compress before you encrypt",
> on the grounds that this makes plaintext recognition through frequency
> analysis much harder.
>
> However, compression algorithms surely have easily recognisable headers,
> right?

This is really a question about entropy density. Known-plaintext attacks
usually rely on having multiple blocks whose plaintext you know with high
certainty. RC4 suffers from this: breaking it goes so fast for WEP because
you know so much of the plaintext (most of the content of the WiFi, TCP/IP
and HTTP headers), and because you can make statistical guesses about the
rest (a whole lot of English).

This is because knowing that plaintext lets you recover the keystream and
extract its biases, which can reveal the encryption key. Compressing the
plaintext makes this harder, since the individual bit values of compressed
data are harder to guess than those of raw data.
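
A rough sketch of both points in Python, under loud assumptions: the
"cipher" is just XOR with random bytes standing in for a stream cipher
such as RC4 (it is not RC4), the sample text is made up, and zlib stands
in for whatever compressor you would actually use. It shows that known
plaintext hands you the keystream directly, and that compressed data has
a much flatter byte distribution than English text.

```python
import math
import os
import zlib
from collections import Counter


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of the byte-value distribution."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Part 1: known plaintext reveals the keystream of a stream cipher.
# The keystream is random bytes here, a stand-in for real cipher output.
known_plaintext = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
keystream = os.urandom(len(known_plaintext))
ciphertext = xor_bytes(known_plaintext, keystream)
recovered_keystream = xor_bytes(ciphertext, known_plaintext)

# Part 2: compressed data is harder to guess byte by byte than raw text.
sample = (
    b"I have come across the recommendation to compress before you encrypt, "
    b"on the grounds that this makes plaintext recognition through frequency "
    b"analysis much harder. However, compression algorithms surely have "
    b"easily recognisable headers, right? Is there any hard evidence that "
    b"compression before encryption does or does not help?"
)
compressed = zlib.compress(sample, 9)
raw_entropy = entropy_bits_per_byte(sample)
compressed_entropy = entropy_bits_per_byte(compressed)

print("keystream recovered:", recovered_keystream == keystream)
print(f"raw text:   {raw_entropy:.2f} bits/byte over {len(sample)} bytes")
print(f"compressed: {compressed_entropy:.2f} bits/byte over {len(compressed)} bytes")
```

Note that the first few bytes of the zlib output are still a predictable
header, so this only raises the cost of guessing the plaintext; it does
not remove known plaintext entirely.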

> Also, I seem to recall a paper that did interesting things with encrypted
> compressed plaintext, but I can't recall any details.
>
> So, does any one know what paper I might be referring to?  Or is there
> any other hard evidence (not personal opinion, however well-informed,
> please) that compression before encryption does or does not help?
>
> Thanks in advance,
>
> Stephan

That's probably CRIME (and the related BREACH attack) for HTTPS, which turns
the browser into an encryption oracle. The problem here is that the attacker
controls a large fraction of the plaintext and can watch the size of the
compressed ciphertext. Because attacker-controlled strings that share
substrings with the secret compress to shorter output, and thus shorter
ciphertext, the attacker can do Hollywood-style brute force: guess AAAAAAAA,
then AAAAAAAB, and watch the ciphertext get shorter as he gets more and more
of the secret plaintext right.
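
The length side channel described above can be sketched in a few lines of
Python. Everything here is an assumption made for illustration: the cookie
value, the request layout, and the encrypt() stand-in (XOR with random
bytes, chosen only because it preserves length like a stream cipher does).
It is not how any particular TLS stack works, and a real attack also has to
deal with Huffman coding noise and block cipher padding.

```python
import os
import zlib

SECRET_COOKIE = b"d8f3a1c07b92e64f"   # hypothetical session token
PREFIX = b"Cookie: session="          # header name the attacker already knows
FILLER = b"zyxwvutsrqponmlk"          # characters that never match the token


def encrypt(data: bytes) -> bytes:
    """Stand-in for a length-preserving stream cipher (not a real cipher)."""
    keystream = os.urandom(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


def observed_length(attacker_part: bytes) -> int:
    """Ciphertext length seen on the wire for one attacker-triggered request."""
    request = (
        b"GET /?q=" + attacker_part + b" HTTP/1.1\r\n"
        b"Host: example.com\r\n"
        + PREFIX + SECRET_COOKIE + b"\r\n\r\n"
    )
    return len(encrypt(zlib.compress(request, 9)))


def guess_with_correct_chars(k: int) -> bytes:
    """A fixed-length guess whose first k token characters are correct."""
    return PREFIX + SECRET_COOKIE[:k] + FILLER[k:]


wrong_len = observed_length(guess_with_correct_chars(0))
right_len = observed_length(guess_with_correct_chars(len(SECRET_COOKIE)))

# Print how the observed length changes as more of the guess is correct.
for k in range(0, len(SECRET_COOKIE) + 1, 4):
    print(k, "correct chars ->", observed_length(guess_with_correct_chars(k)), "bytes")
```

The encryption never enters into it: the attacker only compares lengths,
and the fully correct guess comes out shorter than the fully wrong one
because the compressor can replace the second copy of the token with a
back-reference.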

This would typically target session cookies, as they are sent with every
HTTP request. Any secret plaintext that the attacker can cause to be
resent is a potential target. JavaScript from malicious ads would be the
most likely attack vector.