Compression theory reference?
Victor Duchovni
Victor.Duchovni at MorganStanley.com
Tue Aug 31 15:50:55 EDT 2004
On Tue, Aug 31, 2004 at 02:48:00PM +0200, Hadmut Danisch wrote:
> It can be easily shown that there is no lossless
> compression method which can effectively compress every possible
> input. Proof is easy: In a first step, consider all
> possible messages of length n bit, n>0. There are 2^n different
> ones. But there are only (2^n)-1 shorter messages, so
> there is no injective encoding to encode all messages into
> a shorter one.
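The counting in the quoted proof can be checked mechanically. A minimal
sketch (in Python, my choice for illustration): for each n, enumerate the
2^n messages of exactly n bits and count all strictly shorter messages,
lengths 0 through n-1, including the empty message; the shorter ones
always number 2^n - 1, one fewer than needed for an injection.

```python
from itertools import product

def count_messages(n):
    """Count n-bit messages vs. all strictly shorter ones (lengths 0..n-1)."""
    n_bit = sum(1 for _ in product("01", repeat=n))  # 2**n messages of length n
    shorter = sum(2**k for k in range(n))            # 2**n - 1, empty message included
    return n_bit, shorter

for n in range(1, 8):
    n_bit, shorter = count_messages(n)
    # There is always exactly one more n-bit message than there are
    # shorter messages, so no injection can shorten them all.
    assert shorter == n_bit - 1
```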
More simply:
Choose any injection (e.g. lossless compression) from the set of
finite-length messages to the set of finite-length messages. Suppose
that for *some* finite number N the injection has the property that no
message of length *at most* N bits encodes in more than N bits (some
might get longer while staying under N+1 bits, some shorter; we don't
care). Since there is a finite number of messages of N bits or less,
and the injection maps a finite set into itself, the injection is a
bijection on that set (no missed outputs). So from the weaker
assumption we conclude that every output of N bits or less corresponds
to an input of N bits or less, and therefore that the injection we are
looking at, for the given number N, never encodes a message of length
N+1 or greater in N bits or less.
From the above we can conclude that any injection that has the above
property (weaker, for any *given* N, than always compressing) for *all*
N never encodes a message of length N+1 in N or fewer bits. The
only candidate injections are then simple length-preserving permutations.
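The pigeonhole step — an injection from a finite set into itself is a
bijection — can also be demonstrated concretely. A small sketch (Python;
the bit-reversal encoder is my own toy example, not a real compressor):
build the set S of all bitstrings of length at most N, pick a
length-preserving injection on S, and verify that it misses no output,
so no slot of N bits or fewer is ever freed up for a longer message.

```python
from itertools import product

def all_strings_up_to(n):
    """All bitstrings of length 0..n, empty string included: 2**(n+1) - 1 of them."""
    return [''.join(bits) for k in range(n + 1)
            for bits in product('01', repeat=k)]

N = 3
S = all_strings_up_to(N)
assert len(S) == 2 ** (N + 1) - 1

# A toy injection that never exceeds N bits: reverse each string.
encode = {s: s[::-1] for s in S}

assert len(set(encode.values())) == len(S)   # injective
assert set(encode.values()) == set(S)        # hence surjective: a bijection
assert all(len(encode[s]) <= N for s in S)   # never outputs more than N bits
```

Since every string of N bits or fewer is already the image of some input
in S, no message of N+1 or more bits can map into that range without
breaking injectivity.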
> Therefore, I need a book about computer science or coding theory
> which explicitly says that this is impossible, in a way that a person
> inexperienced in computer science (judges at a court) can read and
> at least see that this is not just my personal fantasy and is at
> least somehow reasonable.
>
> I have checked several books about information and coding theory,
> but didn't find any suitable for readers without computer science
> background. The statement is always hidden in formulas about
> entropy, average code lengths, code efficiency inequations etc.
> If you had such an encoding scheme, you could easily show that
> the average length is below the entropy or that the efficiency is >100%.
> But a non-computer-science person does not understand that.
>
This is a question in elementary finite set theory, not computer science
(whatever that is :-). All proofs will involve some mathematics. The
above is, I think, simpler than your original argument, and it is the
simplest I can come up with...
---------------------------------------------------------------------
The Cryptography Mailing List