In all the talk of super computers there is not...

Leichter, Jerry leichter_jerrold at
Thu Sep 6 09:28:40 EDT 2007

| Hi Martin,
| I did forget to say that it would be salted so that throws it off by
| 2^12
| A couple of questions. How did you come up with the ~2.5 bits per
| word? Would a longer word have more bits?
He misapplied a correct estimate!  :-) The usual estimate - going
back to Shannon's original papers on information theory, actually - is
that natural English text has about 2.5 (I think it's usually given as
2.4) bits of entropy per *character*.  There are two problems here:

	- The major one is that the estimate should be for *characters*,
		not *words*.  So the number of bits of entropy in
		a 55-character phrase is about 137 (132, if you use
		2.4 bits/character), not 30.

	- The minor one is that the English entropy estimate looks just
		at letters and spaces, not punctuation and capitalization.
		So it's probably low anyway.  However, this is a much
		smaller effect.
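A quick sketch of the arithmetic (the 55-character phrase length and the
2.4/2.5 bits-per-character figures are the ones from the discussion above;
the helper name is just for illustration):

```python
def phrase_entropy_bits(length: int, bits_per_char: float) -> float:
    # Shannon-style estimate: entropy of natural English text scales
    # with the *character* count, not the word count.
    return length * bits_per_char

# A 55-character English phrase at the usual per-character estimates:
print(f"{phrase_entropy_bits(55, 2.5):.1f}")  # 137.5 -> "about 137"
print(f"{phrase_entropy_bits(55, 2.4):.1f}")  # 132.0
```

Either way, the result is on the order of 130+ bits, not the ~30 bits you
would get by counting words instead of characters.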

							-- Jerry

The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at
