Entropy of other languages
Steven M. Bellovin
smb at cs.columbia.edu
Mon Feb 5 16:48:16 EST 2007
On Sun, 04 Feb 2007 15:46:41 -0800
Allen <netsecurity at sound-by-design.com> wrote:
> Hi gang,
>
> An idle question. English has a relatively low entropy as a language.
> Don't recall the exact figure, but if you look at words that start
> with "q" it is very low indeed.
>
> What about other languages? Does anyone know the relative entropy of
> other alphabetic languages? What about the entropy of ideographic
> languages? Pictographic? Hieroglyphic?
>
It should be pretty easy to do at least some experiments today --
there's a lot of online text in many different languages. Have a look
at http://www.gutenberg.org/catalog/ for freely available books that
one could mine for statistics.
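
As a rough illustration of the kind of experiment meant here, the sketch
below estimates per-character entropy from letter frequencies in a block
of text. It is a minimal sketch under stated assumptions, not a definitive
measurement: the function names are made up for this example, the text is
assumed to have been downloaded and read into a string already, and only
single-letter and letter-pair statistics are used, so the results are
upper bounds on the true entropy of the language.

```python
import math
from collections import Counter


def unigram_entropy(text):
    """Shannon entropy in bits per letter, from single-letter frequencies.

    Hypothetical helper for illustration; non-alphabetic characters
    (spaces, digits, punctuation) are ignored and case is folded.
    """
    letters = [c for c in text.lower() if c.isalpha()]
    total = len(letters)
    if total == 0:
        return 0.0
    counts = Counter(letters)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conditional_entropy(text):
    """Entropy in bits of the next letter given the previous letter.

    Computes H(Y|X) = -sum p(x, y) * log2 p(y | x) from letter-pair
    counts. Pairs are formed after dropping non-letters, so they can
    span word boundaries (an assumption made to keep the sketch short).
    """
    letters = [c for c in text.lower() if c.isalpha()]
    pairs = Counter(zip(letters, letters[1:]))
    firsts = Counter(letters[:-1])  # how often each letter starts a pair
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    return -sum(
        (n / total) * math.log2(n / firsts[a]) for (a, _b), n in pairs.items()
    )
```

To try it on a book, read the file into a string first, for example
open("book.txt", encoding="utf-8").read(), where book.txt is a placeholder
name for a locally saved plain-text file. Removing the licence text at the
top and bottom of the file first keeps it from skewing the counts for
non-English books. The gap between the two figures shows how much knowing
the previous letter helps, which is the effect behind the "q" observation
in the original question.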
--Steve Bellovin, http://www.cs.columbia.edu/~smb