[Cryptography] Compression before encryption?
John Denker
jsd at av8n.com
Fri Jan 9 15:09:47 EST 2015
On 01/09/2015 05:22 AM, Stephan Neuhaus wrote:
> I have come across the recommendation to "compress before you
> encrypt",
OK. Sometimes it helps, sometimes it doesn't.
Let's assume the plaintext /is compressible/. Otherwise there's
nothing to discuss. If the data is already random, or already
well compressed, further compression is obviously not going
to help.
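
To make the "already random" case concrete, here is a minimal
sketch. It assumes Python's zlib as a stand-in for whatever
compressor is actually in use; the seed and the sample text are
arbitrary choices of mine, not anything from the original question.

```python
import random
import zlib

# Deterministic stand-in for "already random" data (seed is arbitrary).
rng = random.Random(2015)
random_data = bytes(rng.getrandbits(8) for _ in range(4096))

# Highly redundant text as the compressible case (hypothetical sample).
text = b"attack at dawn; retreat at dusk. " * 128

packed_random = zlib.compress(random_data, 9)
packed_text = zlib.compress(text, 9)

# Random input should come out no smaller; redundant input shrinks a lot.
print(len(random_data), "->", len(packed_random))
print(len(text), "->", len(packed_text))
```

The random input gains a few bytes of framing overhead instead of
shrinking, which is all "not going to help" means here.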
> on the grounds that this makes plaintext recognition
> through frequency analysis much harder.
It does make frequency analysis much harder, although
that's not necessarily the only way to think about it.
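
One rough way to see the effect on frequency analysis is to compare
the byte histogram before and after compression. The sketch below
assumes zlib and a made-up plaintext built from a small vocabulary;
byte_entropy is a helper defined here for illustration. A flat
histogram is only one measure, so treat this as suggestive, not as
proof that no structure remains.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (max 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Hypothetical plaintext: words drawn from a small vocabulary, fixed seed.
words = ["the", "enemy", "fleet", "sails", "north",
         "at", "dawn", "weather", "report", "nothing"]
rng = random.Random(1)
plaintext = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")
compressed = zlib.compress(plaintext, 9)

plain_bits = byte_entropy(plaintext)
packed_bits = byte_entropy(compressed)
print(f"plaintext:  {plain_bits:.2f} bits/byte")
print(f"compressed: {packed_bits:.2f} bits/byte")
```

The plaintext uses only a handful of letters with very uneven
frequencies, while the compressed bytes are spread much more evenly
over all 256 values.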
> However, compression algorithms surely have easily recognisable
> headers, right?
a) That's true only if it is a lousy compression algorithm.
If the headers are predictable, the headers themselves can
be compressed.
b) A /small/ amount of predictability is more survivable than
a large amount.
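
As an illustration of how much fixed framing common formats add,
here is a sketch using Python's gzip and zlib modules; the message
is made up. Raw DEFLATE (negative wbits) drops the header and the
checksum trailer, though the first few bits of a DEFLATE block are
still a block-type marker, so a /small/ amount of predictability
remains, per (b).

```python
import gzip
import zlib

message = b"meet me at the usual place " * 8  # hypothetical plaintext

gz = gzip.compress(message, compresslevel=9)
zl = zlib.compress(message, 9)

# Raw DEFLATE: negative wbits tells zlib to omit its header and trailer.
raw_obj = zlib.compressobj(level=9, wbits=-15)
raw = raw_obj.compress(message) + raw_obj.flush()

print("gzip starts with:", gz[:3].hex())  # magic number plus method byte
print("zlib starts with:", zl[:1].hex())  # CMF byte of the zlib header
print("sizes (gzip, zlib, raw):", len(gz), len(zl), len(raw))
```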
> [...] is
> there any other hard evidence (not personal opinion, however
> well-informed, please) that compression before encryption does or
> does not help?
Sometimes it increases security, and sometimes it doesn't.
-- If you are breaking RSA by factoring, you derive no
advantage from a known plaintext, so compression makes no
difference.
-- For a fully-known plaintext, the compressed version is
also fully known, just smaller. This may or may not be
significant to the attacker. For super-long *or* super-
short messages it's probably not going to matter. There
is a Goldilocks zone in between.
-- At one extreme, if you are breaking Enigma, or
breaking WEP, /partially/ known i.e. partially predictable
plaintexts are a big deal. Compression ideally removes
the predictable stuff, making cryptanalysis very much
harder.
-- As an intermediate case, compression might leave a single
message unbreakable by itself, yet still breakable if the
same message is transmitted to multiple stations under
different keys. That attack exploits /same/ plaintext, not
"known" plaintext, and compression doesn't change the
sameness. This is why you need session keys.
-- At the other extreme, if chosen plaintext can
be concatenated to the unknown plaintext, compression
might make things very much worse:
http://en.wikipedia.org/wiki/CRIME
Note that CRIME works by observing message lengths, which
magnifies and highlights the problem of traffic analysis, an
oft-underappreciated problem whether or not there is
compression involved.
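
The chosen-plaintext case can be sketched in a few lines. This is
not the actual CRIME exploit, just the underlying length leak, and
it rests on assumptions: zlib as the compressor, a made-up SECRET,
a hypothetical observed_length helper, and a cipher that hides
content but not length (so compressed length stands in for
ciphertext length).

```python
import zlib

# Hypothetical secret that the victim's software puts in every message.
SECRET = b"Cookie: session=7f3a9c2e1b"

def observed_length(attacker_text: bytes) -> int:
    """Length an eavesdropper would see, assuming the cipher hides
    content but not length."""
    return len(zlib.compress(SECRET + b"\r\n" + attacker_text, 9))

# The attacker controls the appended text and only watches the length.
right = observed_length(b"Cookie: session=7f3a9c2e1b")
wrong = observed_length(b"Cookie: session=q8w5z0k4mx")
print("correct guess:", right, "bytes; wrong guess:", wrong, "bytes")
```

A guess that matches the secret compresses better, so the message
gets shorter. A practical attack would extend the guess a character
at a time, keeping whichever candidate yields the shortest output.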