Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

Ondrej Mikle ondrej.mikle at
Mon Aug 28 12:12:12 EDT 2006

We are both talking about the same thing :-)

I am not saying there is a finite deterministic algorithm to compress
every string into "small space"; there isn't. BTW, thanks for "There
is ***NO*** way round the counting theory." :-)

All I wanted to say is:
for a specific kind of structure (e.g. a movie, picture, or sound) there
is some "good" compression algorithm.

E.g.: if you take a 65536x65536 GIF, all white with just one black pixel,
it can be compressed into 35 bytes; a toy sketch of this idea is below.
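
A minimal Python sketch of such a structure-aware encoding, with a made-up
byte format (illustration only, not the actual GIF format): the whole
four-gigapixel image reduces to its dimensions plus the coordinates of its
black pixels.

    import struct

    def encode(width, height, black_pixels):
        """Header: width, height (4 bytes each), pixel count (2 bytes),
        then one big-endian (x, y) pair per black pixel."""
        out = struct.pack(">IIH", width, height, len(black_pixels))
        for x, y in black_pixels:
            out += struct.pack(">II", x, y)
        return out

    def decode(data):
        width, height, n = struct.unpack_from(">IIH", data, 0)
        pixels = [struct.unpack_from(">II", data, 10 + 8 * i)
                  for i in range(n)]
        return width, height, pixels

    blob = encode(65536, 65536, [(12345, 54321)])
    print(len(blob))   # 18 bytes for the whole 4-gigapixel image

(The real GIF's 35 bytes are its own headers plus an LZW-coded run; the
point is the same. And any input that is *not* almost all white blows up
under a format like this, which is exactly the counting argument below.)
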
If you wanted to compress the same picture using JPEG (i.e. the discrete
cosine transform), two things would happen: the compressed JPEG file would
be a) much bigger, and b) the decompressed image would have "artifacts",
because the Fourier transform of a pulse is a sinc (infinitely many
frequencies). Sure, JPEG is a lossy compression, but it is good enough
for photos and images that don't have a lot of high frequencies.
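
The "infinitely many frequencies" claim is easy to check numerically; a
quick numpy sketch, using the DFT as a discrete stand-in for the continuous
Fourier argument:

    import numpy as np

    n = 4096
    x = np.zeros(n)
    x[n // 2] = 1.0   # a single impulse: one black pixel in a white row

    # The DFT of an impulse has magnitude 1 at *every* frequency, so a
    # transform coder finds no small coefficients to throw away.
    spectrum = np.fft.fft(x)
    print(np.allclose(np.abs(spectrum), 1.0))   # True

    # Discarding the high frequencies (roughly what lossy coding does)
    # smears the impulse into ringing -- the "artifacts".
    spectrum[n // 8 : -n // 8] = 0
    ringing = np.fft.ifft(spectrum).real
    print(ringing[n // 2 - 4 : n // 2 + 5].round(3))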


On 8/28/06, Dave Korn <dave.korn at> wrote:
> On 28 August 2006 15:30, Ondrej Mikle wrote:
> > Ad. compression algorithm: I conjecture there exists an algorithm (not
> > necessarily *finite*) that can compress large numbers
> > (strings/files/...) into "small" space; more precisely, it can
> > compress a number that is N bytes long into O(P(log N)) bytes, for some
> > polynomial P.
> [  My maths isn't quite up to this.  Is it *necessarily* the case that /any/
> polynomial of log N grows slower than N?  For a genuine polynomial it is: if
> P has degree k, then P(log N) = O((log N)^k), and (log N)^k grows more
> slowly than N for every fixed k.  The restriction to polynomials matters,
> though: if P were allowed to be, say, e^(2x), then P(log N) = e^(2 log N) =
> N^2, which grows *faster* than N, and there'd be nothing to discuss, because
> that is then a conventional compression scheme in which some inputs lead to
> larger, not smaller, outputs.  (I'm using log to mean natural log here; the
> base only changes constant factors.)  However, assuming we're talking about
> P for which P(log N) *does* grow more slowly than N, I'll discuss the
> problem with this theory anyway.  ]
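
A quick numeric sanity check of that growth claim, in Python:

    import math

    # P(x) = x**3, a genuine polynomial: P(log2 N) is soon dwarfed by N.
    for e in (10, 20, 40, 80):
        N = 2 ** e
        print(e, N, math.log2(N) ** 3)
    # At e = 10 the two are neck and neck (1024 vs 1000); by e = 40,
    # N is about 1.1e12 while (log2 N)**3 is only 64000.
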
>   There are many, but there are no algorithms that can compress *all* large
> numbers into small space; for all compression algorithms, some sets of input
> data must result in *larger* output than input.
>   There is *no* way round the sheer counting theory aspect of this.  There are
> only 2^N unique files of N bits.  If you compress large files of M bits into
> small files of N bits, and your decompression algorithm produces deterministic
> output, then you can only possibly generate 2^N files from the compressed
> ones.
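
For tiny sizes the pigeonhole argument can even be brute-forced; a minimal
Python check that 4-bit inputs cannot all be given shorter encodings:

    from itertools import product

    # Every possible "compressed" output shorter than 4 bits...
    short = [bits for k in range(4) for bits in product("01", repeat=k)]
    # ...versus every possible 4-bit input.
    inputs = list(product("01", repeat=4))

    # 1 + 2 + 4 + 8 = 15 outputs for 16 inputs: two inputs must share an
    # output, so no deterministic decompressor can recover both of them.
    print(len(short), len(inputs))   # 15 16
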
> > Take as an example the group Z_p* with p prime (in other words: DLP).
> > The triplet (Z_p*, p, generator g) is a compression of a string of p-1
> > numbers, each number about log2(p) bits.
> >
> > (legend: DTM - deterministic Turing machine, NTM - nondeterministic
> > Turing machine)
> >
> > There exists a way (not necessarily fast/polynomial on a DTM) in which
> > a lot of strings can be compressed into the mentioned triplets. There
> > are \phi(p-1) different strings that can be compressed with these
> > triplets. The exact number of course depends on the factorization of p-1.
> >
> > Decompression is simple: take generator g and compute g, g^2, g^3,
> > g^4, ... in Z_p*
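
The decompression step really is trivial to write down; a Python sketch
with a toy prime (p = 23, for which g = 5 is a generator):

    def decompress(p, g):
        """Expand (p, g) into the sequence g, g^2, ..., g^(p-1) in Z_p*."""
        out, x = [], 1
        for _ in range(p - 1):
            x = (x * g) % p
            out.append(x)
        return out

    print(decompress(23, 5))   # 22 numbers out of ~9 bits of input
    # But note: (p, g) is only ~2*log2(p) bits, so at most about p^2
    # distinct sequences can ever come out, however p and g are chosen.
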
>   This theory has been advanced many times before.  The "Oh, if I can just
> find the right parameters for a PRNG, maybe I can get it to reconstruct my
> file as if by magic".  (Or substitute FFT, or wavelet transform, or
> key-expansion algorithm, or ... etc.)
>   However, there are only as many unique generators as (2 to the power of the
> number of bits it takes to specify your generator) in this scheme.  And that
> is the maximum number of unique output files you can generate.
>   There is ***NO*** way round the counting theory.
> > I conjecture that for every permutation on 1..N there exists a
> > function that compresses the permutation into a "short"
> > representation.
>   I'm afraid to tell you that, as always, you will find the compression stage
> easy.... and the decompression impossible.  There are many many many more
> possible permutations of 1..N than the number of unique "short
> representations" in the scheme.  There is no way that the smaller number of
> "unique representations" can possibly produce any more than the same (smaller)
> number of permutations once expanded.  There is no way to represent the other
> (vast majority) of permutations.
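
The arithmetic here is stark: there are N! permutations of 1..N, which is
about N*log2(N) bits' worth, far more than any polylogarithmic "short
representation" can index.  A quick Python check:

    import math

    # log2(N!): the number of bits needed to pick out one arbitrary
    # permutation of 1..N.
    for N in (16, 256, 4096):
        bits = math.lgamma(N + 1) / math.log(2)
        print(N, round(bits))
    # 16 -> 44 bits, 256 -> 1684 bits, 4096 -> 43250 bits: growing like
    # N*log2(N), nowhere near "short" in N.
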
> >  It is possible that only an NTM, possibly with an infinite
> > algorithm (e.g. a human), can do it in some "short finite time".
>   Then it's not really an algorithm, it's an ad-hoc collection of different
> schemes.  If you're allowed to use a completely different compression scheme
> for every file, I can do better than that: for every file, define a
> compression scheme where the bit '1' stands for the content of that file, and
> everything else is represented by a '0' bit followed by the file itself.
> Sure, most files grow 1 bit bigger, but the only file we care about is
> compressed to just a single bit!  Great!
>   Unfortunately, all you've done is moved information around.  The amount of
> information you'd have to have in the decompressor to know which file to
> expand any particular '1' bit into is equal to .... the amount you've saved by
> compressing the original to a single bit.
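
That scheme is two lines of code, and the catch is visible right in it; a
Python sketch (the favourite file is, of necessity, baked into the
decompressor):

    # The entire "saving" lives here, inside the decompressor:
    FAVOURITE = b"the one file we care about"

    def compress(data):
        return b"1" if data == FAVOURITE else b"0" + data

    def decompress(blob):
        return FAVOURITE if blob == b"1" else blob[1:]

    print(len(compress(FAVOURITE)))           # 1 (a byte standing in for the bit)
    print(len(compress(b"anything else")))    # original length + 1

    # No information was created or destroyed: FAVOURITE is stored
    # verbatim inside the decompressor, which is where the saving went.
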
>   There is ***NO*** way round the counting theory.
> > Again,
> > I could've/should've proven or disproven the conjecture, I just don't
> > want to do it - yet ;-)
>   I seriously advise you not to waste much time on it.  Because ...
>      There is ***NO*** way round the counting theory.
>     cheers,
>       DaveK
> --
> Can't think of a witty .sigline today....
