[Cryptography] Summary: compression before encryption
Viktor Dukhovni
cryptography at dukhovni.org
Wed Jan 14 02:01:51 EST 2015
On Tue, Jan 13, 2015 at 11:22:35PM -0500, Phillip Hallam-Baker wrote:
> Its a little off topic but not much, but Rob Stradling and I have a way to
> compress CRLs that are essentially just collections of hashes.
The key point being "collections". One can compress sets of
substantially random short strings much more efficiently than
sequences of same.
A set of 16 distinct 5 bit strings can be compressed to a 32-bit
bitmap in which 16 bits are 1 (this is not claimed to be the optimal
encoding).
A 16 element sequence of uniformly randomly selected 5-bit strings
is an 80 bit uniformly random string, and takes 80 bits to represent.
So the set requires at most 32 bits, while the sequence 80 bits.
> But don't be too sure that random data can't compress.
> Sometimes it can - we get down to 3-4 bits per revoked cert.
A more precise claim would be that n-bit strings (sequences) with
n bits of entropy are not amenable to lossless compression. Sets,
as you note, lead to a very nice design for compressed CRLs.
When considering compression of encrypted data, one is generally
dealing with a stream (sequence) not a set.
--
Viktor.
More information about the cryptography
mailing list