[Cryptography] Summary: compression before encryption

Viktor Dukhovni cryptography at dukhovni.org
Wed Jan 14 02:01:51 EST 2015


On Tue, Jan 13, 2015 at 11:22:35PM -0500, Phillip Hallam-Baker wrote:

> Its a little off topic but not much, but Rob Stradling and I have a way to
> compress CRLs that are essentially just collections of hashes.

The key point being "collections".  One can compress sets of
substantially random short strings much more efficiently than
sequences of same.

A set of 16 distinct 5 bit strings can be compressed to a 32-bit
bitmap in which 16 bits are 1 (this is not claimed to be the optimal
encoding).

A 16 element sequence of uniformly randomly selected 5-bit strings
is an 80 bit uniformly random string, and takes 80 bits to represent.

So the set requires at most 32 bits, while the sequence 80 bits.

> But don't be too sure that random data can't compress.
> Sometimes it can - we get down to 3-4 bits per revoked cert.

A more precise claim would be that n-bit strings (sequences) with
n bits of entropy are not amenable to lossless compression.  Sets,
as you note, lead to a very nice design for compressed CRLs.

When considering compression of encrypted data, one is generally 
dealing with a stream (sequence) not a set.

-- 
	Viktor.


More information about the cryptography mailing list