[Cryptography] Is there a good algorithm providing both compression and encryption at the same time?

Ray Dillinger bear at sonic.net
Thu May 7 19:50:44 EDT 2015



In principle, what you want to do is compress, then encrypt,
before you transmit or store.  Output from a sound cipher should
look random, so if you encrypt first, any compression achieved
afterward is an indication that your encryption algorithm is
broken.
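
A minimal sketch of the ordering argument, in Python.  The
"cipher" here is a SHA-256 counter-mode keystream XOR that I am
assuming purely as a stand-in so the example needs only the
standard library; it is not a vetted cipher and the key and
sample text are made up.  The point is only the size comparison.

```python
import hashlib
import zlib

def toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream.

    Illustration only: a stand-in for a real cipher, not a vetted one.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)

key = b"demo key, not a real secret"          # hypothetical key
plaintext = b"the quick brown fox jumps over the lazy dog. " * 200

# Right order: redundancy is removed before the cipher hides it.
compress_then_encrypt = toy_keystream_xor(key, zlib.compress(plaintext))

# Wrong order: the compressor sees random-looking bytes and gains nothing.
encrypt_then_compress = zlib.compress(toy_keystream_xor(key, plaintext))

print(len(plaintext), len(compress_then_encrypt), len(encrypt_then_compress))
```

With a real cipher the second figure should likewise come out
no smaller than the plaintext; if it does shrink noticeably,
that is the warning sign described above.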

I deployed a compress/encrypt function like this, built
specifically for anti-auth compression of English text, in an
encrypted messaging system used by a small organization.
As an anti-auth system, it's somewhat lossy by design to
limit authorship information leakage: other than seeking
double carriage returns as an indication of paragraph ends,
for example, it compresses any whitespace to a single space,
strips Oxford commas, smashes CamelCaps, coalesces/compresses
common alternate US/UK spellings and common misspellings,
etc.
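
A rough sketch of that kind of lossy normalization pass, in
Python.  This is not the deployed code: the spelling table is a
tiny hypothetical one, "smashing" CamelCaps is assumed to mean
lowercasing, and the Oxford-comma rule is deliberately crude.

```python
import re

# Hypothetical variant table; a real one would be far larger.
SPELLING = {"colour": "color", "organise": "organize", "recieve": "receive"}

def normalize(text: str) -> str:
    """Lossy cleanup that keeps paragraph breaks and little else."""
    # A blank line (double carriage return) marks a paragraph end.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    cleaned = []
    for para in paragraphs:
        # Any run of whitespace becomes a single space.
        para = re.sub(r"\s+", " ", para).strip()
        # Crude Oxford-comma removal: drop a comma right before and/or.
        para = re.sub(r",(?= (?:and|or) )", "", para)
        # Smash CamelCaps: lowercase words with an interior capital.
        para = re.sub(r"\b[A-Za-z]*[a-z][A-Z][A-Za-z]*\b",
                      lambda m: m.group(0).lower(), para)
        # Coalesce alternate spellings and common misspellings.
        para = re.sub(r"[A-Za-z]+",
                      lambda m: SPELLING.get(m.group(0).lower(), m.group(0)),
                      para)
        cleaned.append(para)
    return "\n\n".join(cleaned)

sample = "The  colour,\tsize, and  shape of  CamelCaps\nvary.\n\n\nSecond   paragraph."
print(normalize(sample))
```

Every rule here throws information away on purpose, which is
why the output cannot be restored bit-for-bit.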

It's VERY good compression for English Text; most words
and many common whole phrases become single code numbers.
But that's partly because of the anti-auth features.  And
of course it makes a complete hash of ASCII art signatures,
expands UUEncoded gif images or Base58 keys by a factor
of three or so, etc.
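
To illustrate words and whole phrases becoming single code
numbers, here is a small Python sketch of a greedy longest-match
dictionary coder.  The codebook and the literal-escape handling
are assumptions for the example, not the deployed format.

```python
# Hypothetical codebook: each entry's list index is its code number.
CODEBOOK = ["as soon as possible", "thank you", "the", "meeting",
            "is", "for", "at", "noon"]
CODE_OF = {entry: i for i, entry in enumerate(CODEBOOK)}
MAX_PHRASE_WORDS = max(len(entry.split()) for entry in CODEBOOK)

def encode_phrases(text):
    """Replace the longest known phrase at each position with its code."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_PHRASE_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in CODE_OF:
                out.append(CODE_OF[candidate])
                i += n
                break
        else:
            # Unknown word: emit it as a literal (this is where
            # non-English input expands instead of shrinking).
            out.append(words[i])
            i += 1
    return out

def decode_phrases(tokens):
    """Map code numbers back to text; literals pass through unchanged."""
    return " ".join(CODEBOOK[t] if isinstance(t, int) else t for t in tokens)

print(encode_phrases("thank you for the meeting at noon"))
print(encode_phrases("reply as soon as possible"))
```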

It's not much of a win IME for performance purposes unless
bandwidth is precious; the design constraints are such that
you have to do compression and encryption as separate
computational steps anyway so you're not saving CPU cycles.
What you're saving is I/O steps and bandwidth.  You can
implement them in the same module mainly to enhance security
somewhat by NOT storing the compressed-but-not-yet-encrypted
form while writing/sending or the decrypted-but-not-yet-
decompressed form while reading/receiving.

There are a couple of different approaches you can take
to make a more full-featured/bit-preserving function:

First, the decompression dictionary(ies) may be a shared secret
between you and your recipient, making it a very long low-entropy
part of the key.  Possibly the message itself, in this case, has
occasional "control codes" inserted that suspend/enact different
parts of the decompression dictionaries/algorithms, so that for
example PNG image bytes, rich-text bytes, and plain-text bytes
are each compressed differently.
This is functionally equivalent to bundling MIME types with a
compression method for each type, and then encrypting the
resulting byte stream.  Compression happens before encryption,
or else Eve can read your control codes in the transmitted
stream and deduce a lot of information about what you're
transmitting.
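
A sketch of the first approach in Python, using zlib's preset
dictionary support (the zdict argument) as the shared
dictionary.  The type codes, the 5-byte segment header, and the
dictionary contents are all hypothetical.  The whole resulting
byte stream, headers included, is what would then be encrypted;
that step is left out here.

```python
import struct
import zlib

# Hypothetical preset dictionary, known to both ends in advance.
TEXT_DICT = b"the and that have for not with you this but from they meeting tomorrow "

TYPE_TEXT = 0x01   # hypothetical control code: dictionary-compressed text
TYPE_RAW = 0x02    # hypothetical control code: stored as-is (e.g. PNG bytes)

def pack_segment(kind: int, payload: bytes) -> bytes:
    """Emit [type byte][4-byte length][body] for one segment."""
    if kind == TYPE_TEXT:
        comp = zlib.compressobj(zdict=TEXT_DICT)
        body = comp.compress(payload) + comp.flush()
    else:
        body = payload  # already-compressed formats are passed through
    return struct.pack(">BI", kind, len(body)) + body

def unpack_stream(stream: bytes):
    """Yield (type, payload) pairs, switching method per control code."""
    offset = 0
    while offset < len(stream):
        kind, length = struct.unpack_from(">BI", stream, offset)
        offset += 5
        body = stream[offset:offset + length]
        offset += length
        if kind == TYPE_TEXT:
            decomp = zlib.decompressobj(zdict=TEXT_DICT)
            yield kind, decomp.decompress(body) + decomp.flush()
        else:
            yield kind, body

stream = (pack_segment(TYPE_TEXT, b"the meeting tomorrow")
          + pack_segment(TYPE_RAW, b"\x89PNG\r\n"))
print(list(unpack_stream(stream)))
```

Sent in the clear, the type bytes and lengths would tell Eve
exactly what mix of content is in the message, which is the
reason for encrypting after this step rather than before.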

Second, the decompression dictionary may be "contextual" - either
transmitted in encrypted form as a sort of extended IV, and/or
transmitted along with the encrypted/compressed content via the
use of occasional "control codes" in the decrypted stream that
modify, add, and replace as well as suspend/enact various aspects
of the compression dictionary on the fly. Decryption also needs
to happen before decompression here, for the same reason: If
the control codes or IV are not also encrypted, then Eve can pick
them out of the stream and deduce a lot of information about
what you're transmitting by the kind of decompression functionality
it needs.  Assuming you start with a reasonably complete shared
"dictionary," this is really essentially the same as the first
scenario except that the dictionary itself is mutable.  It can
be worth it if you need to be able to add compression methods
for new MIME types on the fly and they correspond to highly
structured compressible formats which aren't in your starting
dictionary.  The drawback is that it is very hard to look
ahead in your input stream to determine whether transmitting
dictionary updates is or is not a waste of bandwidth, without
creating side channels which a savvy opponent can exploit.
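
A toy Python sketch of the mutable-dictionary idea: both ends
start from the same word list, and an in-band control code adds
entries as the stream goes by.  The ADD marker and the token
format are assumptions made for the example; in a real stream
these tokens would be serialized and encrypted with everything
else.

```python
ADD = "ADD"  # hypothetical control code: "append this word to the dictionary"

def encode_adaptive(words, dictionary):
    """Encode words as indexes, emitting an ADD update for each new word.

    `dictionary` is the shared starting list; it is extended in place.
    """
    index = {w: i for i, w in enumerate(dictionary)}
    out = []
    for w in words:
        if w in index:
            out.append(index[w])
        else:
            # First sighting: send the literal once and define its code.
            out.append((ADD, w))
            index[w] = len(dictionary)
            dictionary.append(w)
    return out

def decode_adaptive(tokens, dictionary):
    """Mirror the encoder: apply ADD updates, look up everything else."""
    words = []
    for t in tokens:
        if isinstance(t, tuple) and t[0] == ADD:
            dictionary.append(t[1])
            words.append(t[1])
        else:
            words.append(dictionary[t])
    return words

start = ["the", "key"]                      # shared starting dictionary
message = "the new key the new key".split()
tokens = encode_adaptive(message, list(start))
print(tokens)
print(decode_adaptive(tokens, list(start)))
```

This version never looks ahead, so it pays for an update even
when the new word is never used again; that is the bandwidth
question raised above.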

I have not seen this done with a "secret compression dictionary"
 - Kerckhoffs's Principle usually inspires designers to limit
the secrecy to a single relatively short key so the starting or
default dictionary is known to the opponent.

