passphrases with more than 160 bits of entropy

Perry E. Metzger perry at piermont.com
Wed Mar 22 13:45:01 EST 2006


Aram Perez <aramperez at mac.com> writes:
> On Mar 22, 2006, at 9:04 AM, Perry E. Metzger wrote:
>
>>
>> Aram Perez <aramperez at mac.com> writes:
>>>> Entropy is a highly discussed unit of measure.
>>>
>>> And very often confused.
>>
>> Apparently.
>>
>>> While you do want maximum entropy, maximum
>>> entropy is not sufficient. The sequence of the consecutive numbers 0
>>> - 255 have maximum entropy but have no randomness (although there is
>>> finite probability that a RNG will produce the sequence).
>>
>> One person might claim that the sequence of numbers 0 to 255 has 256
>> bytes of entropy.
>
> It could be, but Shannon would not.

No, he wouldn't. You did, however. The maximum entropy a string of 256
bytes could have would be 256*8 bits. Since we're talking about a
sequence with far less entropy than 256*8 bits, it is not a sequence
of maximum entropy. There are, of course, trivially produced sequences
of maximum entropy.
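
For illustration, a rough Python sketch (assuming the OS RNG is sound;
a general-purpose compressor only gives a crude upper bound on
entropy, but it makes the contrast visible):

    import os, zlib

    s = os.urandom(256)                    # 256 bytes from the OS RNG
    print(len(s) * 8)                      # 2048 bits: the most 256 bytes can carry
    print(len(zlib.compress(s, 9)) * 8)    # no shrinkage, just format overhead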

>> Another person will note "the sequence of numbers 0-255" completely
>> describes that sequence and is only 30 bytes long.
>
> I'm not sure I see how you get 30 bytes long.

$ echo "the sequence of numbers 0-255" | wc -c
      30

Now, of course, there are probably not more than 1.5 bits of entropy
per letter in that sentence fragment, so really we're probably talking
about ~6 bytes of information. Doubtless, though, cleverer people
could do better.
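
Back of the envelope: 30 characters at roughly 1.5 bits each is about
45 bits, and 45/8 is a bit under 6 bytes.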

>> Indeed, more
>> compact ways yet of describing that sequence probably
>> exist. Therefore, we know that the sequence 0-255 does not, in fact,
>> have "maximum entropy" in the sense that the entropy of the sequence
>> is far lower than 256 bytes and probably far lower than even 30 bytes.
>
> Let me rephrase my sequence. Create a sequence of 256 consecutive
> bytes, with the first byte having the value of 0, the second byte the
> value of 1, ... and the last byte the value of 255.

We heard you the first time.

The Shannon information of a message is the negative of the log (base
2) of the probability of the message. Of course, that definition is
only really useful if you're talking about a sequence of messages. The
Kolmogorov-Chaitin information of a text is (roughly) the length of
the smallest program that can generate the text.

Both of these definitions are getting at what can be informally
described as the smallest "representation" of a piece of information.
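
To make the Shannon side concrete, a minimal Python sketch (the
probabilities are toy values chosen purely for illustration):

    import math

    # Self-information of a message drawn from a known distribution:
    print(-math.log2(1/2))     # a fair coin flip carries 1 bit
    print(-math.log2(1/8))     # a fair 8-sided die roll carries 3 bits
    print(-math.log2(1/256))   # 8 bits, if all 256 byte values are equally likely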

If Alice asks Bob "what color are your eyes", Bob could send a 10M
high-resolution image of his eye, a precise measurement of the
spectrum reflected by his eyes, the word "blue", or perhaps even
something shorter, like a couple of bits that represent an eye color
according to a pre-agreed table of eye colors. The smallest possible
representation, the couple of bits indexing that table, is likely
closest to the information content of someone's eye color, though a
precise figure is impossible to give since it is highly context
dependent.
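
A toy sketch of the table approach (the table and its colors are made
up for illustration; with four entries, two bits suffice):

    # Hypothetical pre-agreed table shared by Alice and Bob.
    colors = ["blue", "brown", "green", "grey"]

    code = colors.index("blue")      # 0..3, i.e. two bits on the wire
    print(format(code, "02b"))       # -> '00'
    print(colors[int("00", 2)])      # Alice decodes it back to "blue"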

> If you measure the entropy (according to Shannon) of that sequence
> of 256 bytes, you have maximum entropy.

Clearly you don't, since the sequence can be described with far less
information than 256 bytes. A completely random and incompressible
sequence of 256 bytes would have maximum entropy, since it is
impossible to compress to less than 256*8 bits, but the sequence
0..255 has very little entropy because it is easily compressed to a
smaller size. This should be obvious. If I start reciting to you
"zero. one. two..." then for the first few values the probability of
the next byte is 1/256, or close to it. Shortly, though, anyone who
isn't an idiot will guess, with probability very nearly 1, what the
next value in the sequence is, and the information content of
subsequent bytes falls to far less than one bit per byte. This is just
another way of saying that the smallest program that generates 0..255
is quite small, or that you can easily compress 0..255 into a
description that fits in far less than 256 bytes.
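
A rough Python sketch of both halves of that point (the per-symbol
figure is just the empirical frequency count, which is exactly the
sense in which the sequence looks like "maximum entropy"):

    import math

    seq = bytes(range(256))    # a program far shorter than 2048 bits regenerates it
    print(len(seq) * 8)        # 2048 bits of raw length

    # Naive per-symbol frequency "entropy": every byte value occurs exactly
    # once, so this reports 8 bits/byte even though the sequence is trivially
    # predictable after the first few values.
    probs = [1/256] * 256
    print(sum(-p * math.log2(p) for p in probs))   # 8.0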

>> Entropy is indeed often confusing. Perhaps that is because both the
>> Shannon and the Kolmogorov-Chaitin definitions do not provide a good
>> way of determining the lower bound of the entropy of a datum, and
>> indeed no such method can exist.
>
> No argument from me.

I thought you were, indeed, still arguing.


Perry

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com


