Compression theory reference?

Victor Duchovni Victor.Duchovni at MorganStanley.com
Tue Aug 31 15:50:55 EDT 2004


On Tue, Aug 31, 2004 at 02:48:00PM +0200, Hadmut Danisch wrote:

> It can easily be shown that there is no lossless
> compression method which can effectively compress every possible
> input. The proof is easy: in a first step, consider all
> possible messages of length n bits, n > 0. There are 2^n different
> ones, but there are only (2^n)-1 shorter messages (counting the
> empty message), so there is no injective encoding that maps every
> message to a shorter one.
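
A toy enumeration of that count, for concreteness (Python, with purely
illustrative sizes of my own choosing):

    # Counting step above: there are 2**n messages of exactly n bits, but
    # only 2**n - 1 messages of length 0..n-1 (counting the empty message),
    # so no injection can map every n-bit message to a shorter one.
    n = 4                                    # any n > 0 works
    inputs  = 2 ** n                         # messages of exactly n bits
    shorter = sum(2 ** k for k in range(n))  # messages of length 0 .. n-1
    print(inputs, shorter)                   # 16 15
    assert inputs > shorter                  # pigeonhole: some pair collides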

More simply:

Choose any injection (e.g. a lossless compression scheme) from the set of
finite-length messages to the set of finite-length messages. Suppose that
for *some* finite number N the injection has the property that no message
of length *at most* N bits encodes in more than N bits (some may get longer
while staying within N bits, some may get shorter; we don't care). There
are only finitely many messages of N bits or less, and under this
assumption the injection maps that finite set into itself; an injective map
from a finite set into itself is onto, so no output of length at most N is
missed. From this weaker assumption we conclude that every output of N bits
or less already corresponds to an input of N bits or less, and therefore
that the injection, for this particular N, never encodes a message of
length N+1 or greater in N bits or less.
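
For a very small N this can even be checked exhaustively by machine; the
following brute-force Python sketch (toy sizes only, not part of the
argument) confirms that every injection on the strings of length at most N
is a bijection:

    # For N = 2, S has 1 + 2 + 4 = 7 strings; enumerate all 7**7 maps
    # S -> S, keep the injective ones, and verify each is onto S.
    import itertools

    N = 2
    S = [''.join(b) for L in range(N + 1)
                    for b in itertools.product('01', repeat=L)]

    for outputs in itertools.product(S, repeat=len(S)):  # every map S -> S
        if len(set(outputs)) == len(S):                  # injective maps only
            assert set(outputs) == set(S)                # ... each is onto S
    print("every injection on the", len(S),
          "strings of length <=", N, "is a bijection")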

From the above we can conclude that any injection which has this property
for *all* N (a weaker requirement, for any *given* N, than always
compressing) never encodes a message of length N+1 in N bits or less.
Combined with the property itself, this means every message encodes in
exactly as many bits as it started with: the only candidate injections are
simple length-preserving permutations.
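
The practical consequence shows up with any real compressor: if it
shortens some inputs, it must lengthen others. A quick Python illustration
(zlib is used only because it is ready to hand; random bytes stand in for
input with no exploitable structure):

    # Sketch: random 64-byte inputs have no structure for zlib to exploit,
    # so header and stored-block overhead make every one of them grow.
    import os, zlib

    grew = sum(len(zlib.compress(os.urandom(64))) > 64 for _ in range(1000))
    print(grew, "of 1000 random 64-byte inputs got longer under zlib")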

> Therefore, I need a book about computer science or encoding theory
> which explicitly says that this is impossible, in a way that a person
> inexperienced in computer science (judges at a court) can read and
> at least see that this is not just my personal fantasy but at least
> somewhat reasonable.
> 
> I have checked several books about information and coding theory,
> but didn't find any suitable for readers without a computer science
> background. The statement is always hidden in formulas about
> entropy, average code lengths, code-efficiency inequalities, etc.
> If you had such an encoding scheme, you could easily show that
> the average code length is below the entropy or that the efficiency
> is >100%. But a non-computer-science person does not understand that.
> 
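
For what it's worth, the entropy statement is also easy to illustrate with
a toy source (a Python sketch, with an arbitrarily chosen small n): a
uniform source over all n-bit messages has entropy exactly n bits, so a
code whose average length were below n would violate the source-coding
bound.

    # Entropy of a uniform distribution over 2**n equally likely messages.
    import math

    n = 4
    p = 1 / 2 ** n                                    # probability of each message
    H = -sum(p * math.log2(p) for _ in range(2 ** n))
    print(H)                                          # 4.0 bits = n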

This is a question in elementary finite set theory, not computer science
(whatever that is :-). Any proof will involve some mathematics. The above
is, I think, simpler than your original argument, and it is the simplest I
can come up with...

