Compression theory reference?

Tue Aug 31 16:56:25 EDT 2004

Hadmut Danisch wrote:

> It can be easily shown that there is no lossless 
> compression method which can effectively compress every possible
> input.

OK.

> ... I need a book about computer science or encoding theory,
> which explicitely says that this is impossible, in a way that a person
> unexperienced in computer science (judges at a court) can read and 
> at least see that this is not just my personal phantasy and at least
> somehow reasonable. 

There are two conficting requirements there.
  -- authoritative and correct, and
  -- understandable to a judge with no technical expertise.

As you know, it is easy to satisfy either requirement separately.

My suggestions:

  1) Get a few "expert witnesses" to certify that your position is
certainly not a personal fantasy.  (I assume German jurisprudence
has something akin to the US notion of expert witnesses, right?)

  2) Try to get the burden-of-proof reversed.  The opposition
are claiming that known algorithms do the job.  Get them to be
specific about which algorithms they are referring to.  Then
file with the court some disks, each containing 1.44 megabytes
of data.  If the opposition are telling the truth, they should
be able to compress these using one of their chosen algorithms.
Offer to concede the entire case if they can shorten the text
by more than, say, 0.1 percent (i.e. 1.44 kilobytes shorter).
Of course you fill your disks using a good hardware random
number generator, e.g.
   http://www.av8n.com/turbid/
Be sure to arrange suitable precautions so that the opposition
doesn't get a chance to peek at your disks and build a
special-purpose "rumplestiltskin" encoder/decoder, i.e. one
that contains quotations of your data.  One precaution would
be to require that they use an open-source implementation, or
at least an impartial implementation, of a published algorithm.

  3) Diagram the pigeon-hole argument for the judge.  See
diagrams below.

  4) Don't forget the _recursion_ argument.  Take their favorite
algorithm (call it XX).  If their claims are correct, XX should
be able to compress _anything_.   That is, the output of XX
should _always_ be at least one bit shorter than the input.
Then the compound operation XX(XX(...)) should produce something
two bits shorter than the original input.  If you start with a
N-bit message and apply the XX function N-1 times, you should be
able to compress each and every message down to a single bit.

============================================================
Here's how to diagram the pigeon-hole argument:

Write down all the three-bit numbers, and demonstrate a lossless
non-compressive encoding:

plaintext     codetext

   000 ---------- 000

   001 ---------- 001

   010 ---------- 010

   011 ---\  /=== 011
           \/
           /\
   100 ===/  \--- 100

   101 ---------- 101

   110 ---------- 110

   111 ---------- 111

===========================================

Then give an example of a lossy compressive code.  Make
the point that the codetext words are shorter than the
plaintext words.  However, there are necessarily fewer
of them, so the coding scheme is necessarily lossy and
non-invertible.

plaintext     codetext
   000 ---------- 00  "zero"

   001 ---------- 01  "one"

   010 ---------- 10  "two"

   011 -------=== 11  "many"
             /
            |
   100 ----/|
            |
            |
   101 ----/|
            |
            |
   110 ----/|
            |
            |
   111 ----/

=========================================

Then give an example of a code that might or might not
be compressive, depending on statistical assumptions.
The following code is compressive if zero, one, and two
are considerably more probable than the other three-bit
numbers, but does is anti-compressive if all eight
three-bit numbers are equally likely, or nearly equally
likely.

   000 ---------- 00  "zero"

   001 ---------- 01  "one"

   010 ---------- 10  "two"

   011 ---------- 11000

   100 ---------- 11001

   101 ---------- 11010

   110 ---------- 11011

   111 ---------- 11100

Average length, for equiprobable plaintext words:
    (2+2+2+5+5+5+5+5) / 8 = 3.875
which is an expansion, not a compression.

Judges like fairness.  Certainly an equiprobable distribution
of input words is fair.  It reflects the bit-strings that
would be generated by tossing fair coins (three at a time),
or tossing a fair eight-sided die.  This fairest of distributions
is incompressible in the usual sense.

================================================

Finally, offer a challenge.  Write down all eight three-bit
plaintext words, and all four two-bit codetext words.  Ask
the judge to conclude for himself that it is obviously
impossible to draw lines connecting corresponding words in
a one-to-one fashion.

   000           00

   001           01

   010           10

   011           11

   100

   101

   110

   111

=================================

Note that in the last example, the codetext words were only one bit
shorter than the plaintext words.  If you want to pound on the point,
present the samething with four-bit plaintext and two-bit codetext,
i.e. where the codetext is _two_ bits shorter than the plaintext,
and demonstrate that the problem is even more extreme.

It may be better to leave off the diagram for this one, so as to
not burden the judge ... i.e. just make the assertion and dare
the opposition to refute it.  Remember that the typical judge chose
to study law because he didn't *want* to learn about logarithms and
entropy and all that.

=======================================================

Important note:

The following may sound like something from the Scholasticism
of the dark ages (angels dancing on pins and all that) but it's
true.

I personally have been shouted down at conferences for making
the points discussed above.  The shouters say that Lempel-Ziv
is "universal".  Indeed they point out the the title of the
seminal paper by Lempel & Ziv is
   "A Universal Algorithm for Sequential Data Compression"

And the funny thing is, in some narrow Scholastic sense they
are right ... LZ is "universal".  The fun comes from the fact
that they have _defined_ the word "universal" in a funny way.
According to their definitions, a "universal" compressor is
one that is no worse than any other by more than some
constant.  (The constant depends on which compressors are
being compared, but is independent of the choice of inputs.)

Notice that I am careful to write the term "universal" in
scare-quotes whenever this twisted technical meaning is
intended.

A "universal" compressor is not always compressive.

To summarize:  An algorithm that is "universal" in the
narrow technical Scholastic sense is not universal, or
all-purpose, or general-purpose in the ordinary vernacular
sense.

Therefore I recommend you choose your words very carefully,
so the bad guys don't get a chance to trip you up on a narrow
technicality.

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com