Comments/summary on unicity discussion

Joshua Hill josh-crypto at untruth.org
Fri Mar 7 02:12:23 EST 2003


I'll hop in here, and see if I can give this a swing.  IANAC (I am not
a cryptographer), but I do have access to one at work, and I did make
use of him in gaining an understanding of this area.

Some of these points are minutiae, but they are important, both because
they are common errors and because they really help bring everything together.

In general, it appears that you mixed up your labels in the reference list.
[Sha49] is, indeed, the reference that you want, but that paper should
be listed as "Communication Theory of Secrecy Systems" (published in 1949)

On Mon, Mar 03, 2003 at 01:28:54PM -0800, Ed Gerck wrote:
> 1. WHAT IS UNICITY?
> There are three different contexts to answer to this question!
> 
> 1.a. Unicity Definition: Shannon [Sha49, page 693] defined "unicity
> distance" (hereafter, "n") as the least amount of plaintext which can be
> uniquely deciphered from the corresponding ciphertext, allowing one to
> determine, without doubt, the key that was used for encryption. 

It doesn't deal with plaintext, just ciphertext.  In fact, unicity
distance is only valid for a ciphertext-only attack.  Once you get a
known plaintext/ciphertext pair, a high unicity distance works against
you (more on this later).  In addition, it isn't certain that after
observing the requisite unicity distance's worth of ciphertext units
you can uniquely determine the key; it is merely very likely.

So, the definition should be something more like:
1.a. Unicity Definition: Shannon [Sha49, page 693] defined "unicity
distance" (hereafter, "n") as the least amount of ciphertext which would
reduce the likely number of spurious keys ("equivocations") in a random
cipher to zero.

> 1.b. Unicity Model: As first given by Shannon [Sha49] under some restrictive
> assumptions, specially the "random cipher" assumption, the mathematical
> expression for unicity can be cast in the following unfolded expression
> (his original expression was  n = H(K)/D, where D is the redundancy):
> 
>     n = H(K)/[|M| - H(M)]

I don't see this.  I do see 
  D_N = log(G) - H(M)
Where D_N is the redundancy of the language, G is the total number of
messages in the language, and H(M) is the entropy of a message of the
language.  Shannon uses log base 10, but we like to talk about 'bits'
of entropy in crypto, so we tend to talk about log base 2 (often written
lg x)

  D = D_N / N

Where D is the redundancy the per unit (character/bit/whatever), D_N
is the redundancy of the language, and N is the average number of units
per message.

And finally, the unicity distance is:
  n = H(K) / D

Where n is the unicity distance (expressed in units), H(K) is the
amount of entropy of a key (also in units) for the system, and D is the
redundancy per unit.

So, pulling it together
 n = H(K) * N / (lg(G) - H(M))

(so, it would appear that your equation was off by a factor of the
average length of the message)

But truly, I think that it makes the most sense as just
 n = H(K) / D
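To make n = H(K)/D concrete, here is a small sketch using rough, commonly
cited figures (these numbers are my assumptions, not from the post): a
uniformly random 56-bit DES key, and English redundancy of about 3.2 bits
per character (lg(26) ~ 4.7 bits maximum minus roughly 1.5 bits of actual
per-character entropy).

```python
# Worked example of n = H(K) / D with assumed, illustrative figures.
import math

H_K = 56.0                   # key entropy in bits (uniform 56-bit DES key)
D = math.log2(26) - 1.5      # assumed redundancy per English character, ~3.2 bits

n = H_K / D                  # unicity distance, in characters of ciphertext
print(f"unicity distance ~ {n:.1f} characters")
```

With these assumptions an attacker with unlimited computing power would, on
average, need only about 17 or 18 characters of ciphertext before the key is
very likely pinned down.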


As an aside, you define:
> H(K) = entropy of keys used in encryption
[...]
> H(M) = entropy of actual message, the plaintext

This seems to imply that a particular message has entropy.  This is
incorrect.  A random variable has entropy ('variable' in mathspeak, not
'variable' in the sense of programming, which has an actual value as
the program is executing; this is a "random variable" as in X, where X
assumes the values x_1, x_2, ..., x_n with probabilities p_1,
p_2, ..., p_n), based on the statistical properties of the variable.
A particular value doesn't really have 'entropy' outside the context
of the system that created the value.
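A minimal sketch of this point: the same outcome carries a different amount
of entropy depending on the distribution of the source that produced it.
The distributions below are made-up examples.

```python
# Shannon entropy belongs to the distribution, not to any single value.
import math

def entropy(probs):
    """Shannon entropy in bits of a distribution given as probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_coin   = [0.5, 0.5]     # X = heads/tails, each with p = 0.5
biased_coin = [0.99, 0.01]   # same two outcomes, very different statistics

print(entropy(fair_coin))    # 1.0 bit per flip
print(entropy(biased_coin))  # ~0.08 bits per flip
```

The string "heads" is identical in both cases; only the system behind it
determines how much uncertainty it resolves.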

Now, having said that, strings (values, messages, etc.) do have
something called "Kolmogorov Complexity".  Kolmogorov Complexity is
the size of the smallest program that can produce a particular string.
For a string, the highest Kolmogorov Complexity it can have is the size
of the string.  The lowest would be a very small program that produces
a very large string.

As you might expect, entropy and Kolmogorov Complexity are related.
You would expect a message from a high entropy system to have a high
Kolmogorov Complexity (in fact, the Kolmogorov Complexity should
be comparable to the entropy of the system).  Further, you get some
intuitively nice results where the Kolmogorov Complexity of any output
of a PRNG is at most the size of the PRNG state, plus a bit for the PRNG
algorithm, which agrees nicely with the entropy of the system, which is
(at best) size of the PRNG state.
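Kolmogorov Complexity itself is uncomputable, but compressed size gives a
crude upper bound on it, which makes the idea easy to see in practice.  A
sketch, with zlib standing in for "smallest program":

```python
# Compressed size as a rough upper bound on Kolmogorov Complexity:
# a highly regular string compresses far below its length, while
# output of a good entropy source stays close to incompressible.
import os
import zlib

regular = b"ab" * 5000        # low complexity: a tiny "program" generates it
random_ = os.urandom(10000)   # high complexity: essentially incompressible

print(len(zlib.compress(regular)))   # far smaller than 10000
print(len(zlib.compress(random_)))   # close to (or even above) 10000
```

This matches the PRNG observation above: however long the PRNG's output, a
short description (state plus algorithm) suffices to reproduce it.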


>     NOTE 1: The model for unicity has no probability error with a tail
>     to infinity because only entropy values are used in the formula of n
>     and by *definition* of  entropy the entropy is already a limit to
>     infinity.

I don't understand what this note means.

>     NOTE 2: It does not matter how the attacker may try to decipher
>     the message. The attacker can of course use brute-force and try
>     out all keys or he can use short-cuts, it is his choice and he is entirely
>     free to use any method he desires.  The work involved may be small,
>     quite large or even unbounded -- the amount of work is actually
>     unspecified.

Hmm... This is true, but it seems to miss the point that the entire
idea behind unicity distance (and, indeed, all of information theory)
is that the attacker has unbounded resources.  "Brute force" isn't so
much an issue, when you get to assume that your attacker has an infinite
amount of computational power at his disposal...

>     NOTE 3: Shannon's definition of "random cipher" was that "all
>     decipherments must produce a random flat distribution over all
>     bits in the plaintext space."

Shannon's "Random Cipher" is a simple substitution cipher.

> 1.c. Unicity Value:  The numerical value of n. It is important not to
> confuse a model with a measurement. 

This is a strange distinction to make, given that one of the assumptions
that information theory makes is that both sides of an exchange have
infinite computational resources, which isn't something that is ever
going to occur in practice.  I guess I'm just pointing out that really,
unicity distance, information entropy, and all that other great stuff is
all just a neat conceptual framework that establishes the "worst case"
in cryptography.

> First, note that the model works for any ciphertext, any plaintext.
> And for any such pairs

Hmm... Once again, unicity distance is only directly applicable to
a ciphertext only attack.

In an information theoretic model there is a trade off between resistance
to ciphertext only attacks and resistance to known plaintext attacks.
If a system had a high redundancy (i.e., the system's messages were low
entropy as compared to the entropy of the key), you would likely have
many keys that produced a given ciphertext/plaintext pair, so you would
need more plaintext/ciphertext message pairs before you could uniquely
specify a key.  (You should, on average, be able to uniquely specify
the key after you have gathered the entropy of the key in message
entropy.)

So, if your redundancy is low, your system is more secure against
ciphertext-only attacks (the attacker must gather a great deal of
ciphertext before they can uniquely determine the key), but insecure
against known plaintext attacks (the attacker doesn't need very much known
plaintext before they can uniquely determine the key).  Conversely, if
the redundancy is high, your system is insecure against ciphertext-only
attacks, but more secure against known plaintext attacks.

> Since all good
> ciphers should be a random cipher

Well, good ciphers should have many of the same properties as a random
cipher, but that doesn't make them one.

>     NOTE 2: The benefit of compression is to increase unicity even
>     if the compression algorithm is fully known to the attacker. If the
>     plaintext is compressed before encipherment, then we rightly
>     expect its entropy per compressed character to increase -- even
>     though its entropy per English character does not increase. This
>     is often confusing and may provide the wrong impressions that
>     nothing is gained by compression or that we may need to "hide"
>     the compression algorithm from the attacker.

Yes, compression increases unicity distance for a cipher (but makes it
more susceptible to known plaintext attack).  From a security perspective,
it also makes the system more complex, which is a security problem in
and of itself.  Further, as pointed out by John Kelsey in a Crypto rump
session (2001, perhaps?), the compression state is, itself, a critical
security parameter, and can be used by an attacker to divine quite a
lot of information about the previously sent plaintext...
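The unicity benefit of compression falls straight out of n = H(K)/D: the
content entropy per transmitted character goes up, so the redundancy D goes
down, so n grows.  A sketch with assumed round numbers (the residual
redundancy figure for compressed output is illustrative, not measured):

```python
# How lowering redundancy raises unicity distance, n = H(K) / D.
import math

H_K = 56.0                          # key entropy in bits

D_plain = math.log2(26) - 1.5       # assumed ~3.2 bits/char for raw English
D_compressed = 0.5                  # assumed residual redundancy after compression

print(H_K / D_plain)                # ~17.5 characters of ciphertext needed
print(H_K / D_compressed)           # 112.0 characters: far more ciphertext needed
```

The attacker can know the compression algorithm perfectly; what matters is
only that the transmitted characters carry less redundancy.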

> 2. READING THE FINE PRINT
> Of further importance and often ignored or even contradicted by
> some statements in the literature such as "any cipher can be attacked by
> exhaustively trying all possible keys", I usually like to call attention to
> the fact that any cipher (including 56-bit-key DES) can be theoretically
> secure against any attacker -- even an attacker with unbounded
> resources -- when the cipher is used within its unicity. Not only the
> One-Time Pad is theoretically secure, but any cipher can be theoretically
> secure if used within the unicity distance. Thus, indeed there is a
> theoretically secure defense even against brute-force attacks, which is to
> work within the unicity limit of the cipher. And, it works for any cipher
> that is a good random cipher -- irrespective of key-length or encryption
> method used.

This ignores the idea that you have to assume that the attacker has
only ciphertext, which is an abysmally non-paranoid way of modeling
the security of the system.

> It is also important to note, as the literature has also not been very neat
> in this regard, that unicity is always referred to the plaintext. 

I believe that this is incorrect.

Comments and discussions are, of course, welcome...

				Josh

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com
