FW: Entropy of other languages

Trei, Peter ptrei at rsasecurity.com
Tue Feb 6 09:30:11 EST 2007

Steven M. Bellovin wrote:

> On Sun, 04 Feb 2007 15:46:41 -0800
> Allen <netsecurity at sound-by-design.com> wrote:
> > Hi gang,
> > 
> > An idle question. English has a relatively low entropy as a
> > language. Don't recall the exact figure, but if you look at words
> > that start with "q" it is very low indeed.
> > 
> > What about other languages? Does anyone know the relative entropy of
> > other alphabetic languages? What about the entropy of ideographic
> > languages? Pictographic? Hieroglyphic?
> > 
> It should be pretty easy to do at least some experiments today --
> there's a lot of online text in many different languages.  Have a look
> at http://www.gutenberg.org/catalog/ for freely-available books that
> one could mine for statistics.
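
The kind of experiment suggested above could start with something like the
following. This is only a minimal sketch, not anything posted in the thread:
the function name unigram_entropy is made up for illustration, and it assumes
the text has already been downloaded and decoded to a Python string. It
measures single-character frequencies only, so it ignores context (such as
"u" nearly always following "q") and therefore overestimates the true
per-character entropy of a language.

```python
import math
from collections import Counter


def unigram_entropy(text: str) -> float:
    """Shannon entropy in bits per letter, from single-letter frequencies.

    Hypothetical helper for illustration. Only alphabetic characters are
    counted, and case is folded, so punctuation and spacing are ignored.
    """
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return 0.0
    counts = Counter(letters)
    total = len(letters)
    # H = -sum(p * log2(p)) over the observed letter probabilities
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Running this over the same book in several translations gives comparable
bits-per-letter figures. Multiplying each figure by the letter count of that
translation gives a crude total that also accounts for differences in text
length.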

As a very rough proxy, look at the length of the same text in different
languages.

My father was in advertising in Europe. When they laid out a print ad,
they always did so using the German text. If the German fit, any other
language they were interested in would do so as well.

Now that I work (among other things) on cellphone applications, I'm
running into similar issues in internationalizing text on tiny screens.

Peter Trei

Disclaimer: This is a personal opinion. It may or may not jibe with my
employer's opinion.

The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com