Entropy of other languages

Sandy Harris sandyinchina at gmail.com
Wed Feb 7 08:42:49 EST 2007

Allen <netsecurity at sound-by-design.com> wrote:

> An idle question. English has a relatively low entropy as a
> language. Don't recall the exact figure, but if you look at words
> that start with "q" it is very low indeed.
> What about other languages? Does anyone know the relative entropy
> of other alphabetic languages? What about the entropy of
> ideographic languages? Pictographic? Hieroglyphic?
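[The "relative entropy of a language" in the question can be made concrete with a short sketch. This is not from the original post; it is a minimal Python illustration that measures the per-letter Shannon entropy of a text and compares it to the log2(26) ≈ 4.70 bits/letter a uniform 26-letter alphabet would give — real English text comes in well under that bound.]

```python
from collections import Counter
from math import log2

def letter_entropy(text):
    """Shannon entropy, in bits per letter, of the a-z distribution in text."""
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    n = len(letters)
    return -sum((k / n) * log2(k / n) for k in counts.values())

# Illustrative sample only; a serious estimate needs a large corpus
# (and modeling letter *sequences*, not single letters, lowers the
# figure further -- Shannon's estimates for English are well under 2
# bits/letter once context is taken into account).
sample = ("the quick brown fox jumps over the lazy dog and then "
          "the dog chases the fox across the quiet english meadow")
print(round(letter_entropy(sample), 2))   # single-letter entropy of the sample
print(round(log2(26), 2))                 # uniform-alphabet upper bound
```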

The most general answer is in a very old paper of Mandelbrot's.
Sorry, I don't recall the exact reference or have it to hand.

He starts from information theory plus a single assumption:
that there must be some constant upper bound on the
receiver's per-symbol processing time. From that alone,
he proves that the optimal frequency distribution of
symbols is always a member of a parameterized family
of curves.

Pick the right parameters and Mandelbrot's equation
simplifies to Zipf's Law, the well-known rule about
word, letter, and sound frequencies in linguistics.
I'm not sure whether you can also get Pareto's Law,
which covers income and wealth distributions in
economics.
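[The family Mandelbrot derived is usually written f(k) ∝ 1/(k+q)^s for rank k, with exponent s and offset q as the free parameters. A minimal sketch of how the parameters specialize it — with q = 0 and s = 1 it collapses to plain Zipf's Law, frequency ∝ 1/rank:]

```python
def zipf_mandelbrot(n, s=1.0, q=0.0):
    """Normalized Zipf-Mandelbrot frequencies f(k) ~ 1/(k+q)^s for ranks 1..n."""
    weights = [1.0 / (k + q) ** s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# q = 0, s = 1 is plain Zipf's Law: the rank-1 symbol is twice as
# frequent as rank 2, three times as frequent as rank 3, and so on.
freqs = zipf_mandelbrot(5)
print(round(freqs[0] / freqs[1], 1))  # -> 2.0
print(round(freqs[0] / freqs[2], 1))  # -> 3.0
```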

Sandy Harris
Quanzhou, Fujian, China

The Cryptography Mailing List