FC: Politech challenge: Decode Al Qaeda stego-communications!

Thu Jul 11 18:08:25 EDT 2002

On Wed, 10 Jul 2002, Kevin E. Fu wrote:

>Of course terrorists are communicating via Web sites. Lots of people
>communicate via the Web.

More likely "everybody who's anybody communicates via the Web".

>The only person publicly searching for hidden terrorist messages hasn't
>found any. And he's using sound analytical techniques.

Indeed. Sound technique is no guarantee of detection where steganography
is concerned, but it is far more likely that this very uncertainty has
caught the mainstream press in a cycle of unwarranted paranoia and
delusion. The absence of a guarantee that nothing is hidden is certainly
no evidence that there is in fact stego embedded in some online images.

I'd really like to know who started this stream of
web-stego-osama-terrorism articles -- I wouldn't be in the least surprised
to know that the original source was someone closely affiliated with some
War on Terrorism people.

More seriously, most people have little idea of how industrial-strength
steganography works. They consider the race against "terrorists using
stego" a heroic battle of horsepower which the side better equipped
with intellectual or economic assets will eventually win.

In practice, steganography isn't like that. Instead, there are sound
information-theoretic bounds which limit the performance of any observer
whatsoever (set mostly by the best source model that can be created). It
is quite possible that even the best equipped counter-terrorist agency
will remain ignorant of terrorist communications taking place, provided
that the above-mentioned information-theoretic bounds are adhered to.
That's simply because when they are, the presence of stego is unknowable.
Period.
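
One common way to write such a bound down is the relative-entropy
formulation below. This is a sketch of a formalization in the style of
Cachin's information-theoretic model, which the post itself does not
name; the symbols P_C, P_S, alpha, beta and epsilon are introduced here
for illustration only.

```latex
% P_C: distribution of innocent cover objects, as seen by the observer.
% P_S: distribution of objects that carry a hidden message.
D(P_C \,\|\, P_S) \;=\; \sum_{x} P_C(x)\,\log_2 \frac{P_C(x)}{P_S(x)} \;\le\; \varepsilon

% For any detector with false-alarm rate \alpha and miss rate \beta:
\alpha \log_2 \frac{\alpha}{1-\beta} + (1-\alpha)\,\log_2 \frac{1-\alpha}{\beta}
  \;\le\; D(P_C \,\|\, P_S)

% Hence a detector that never raises a false alarm (\alpha = 0) must miss
% with probability \beta \ge 2^{-\varepsilon}.
```

When epsilon is zero the two distributions coincide, and no amount of
computing power lets the observer do better than guessing.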

The larger public does not (and apparently cannot) understand this. In
consequence, it will be ready to pour formidable financing, worry and
resources into combating a threat which might not even be there. It's
also difficult in the extreme to see whether the worry has been incited
on purpose, or whether it is just a consequence of an initial reporter
failing to see the point at a crucial time. But whatever the reason for
the current hysteria, the full picture of stego is a clear-cut one.

In the ideal world of classical, Shannonian information theory,
steganography basically becomes a race in acquiring source statistics.
Whoever knows those statistics the best can best mimic and find the
redundancy in them. When such redundancy exists, it can be exploited to
transfer information unseen by anyone "less in the know", by choosing
among alternatives permitted by the redundancy. This precise idea also
applies to different levels of redundancy detectable by statistical source
models of different acuity. Those who have less accurate statistics will
be fooled by a stego transmission because they cannot discriminate between
entropy contributed by the source and that driven by the stegoist. To
those who have more accurate source models at hand, the task of telling
stego from non-stego becomes a simple Bayesian discrimination task, with a
certainty level dictated by how much information was sent by the stegoist
and the acuity of the model. In our finite world the latter is bounded by
the amount of information in all communications known to the observer.
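
The Bayesian discrimination described above can be sketched in a few
lines of Python. This is a toy illustration under loud assumptions, not
anyone's actual detector: the source is taken to emit independent bits
with P(1) = 0.6, a naive stegoist is taken to replace them with uniform
payload bits, and the function names and the 0.5 prior are made up for
the example.

```python
import math
import random


def log_likelihood_ratio(bits, p_cover, p_stego):
    """Sum over independent bits of log P(bit | stego) - log P(bit | cover)."""
    llr = 0.0
    for b in bits:
        ps = p_stego if b else 1.0 - p_stego
        pc = p_cover if b else 1.0 - p_cover
        llr += math.log(ps / pc)
    return llr


def posterior_stego(bits, p_cover, p_stego, prior=0.5):
    """Posterior probability of the stego hypothesis under the observer's model."""
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += log_likelihood_ratio(bits, p_cover, p_stego)
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    rng = random.Random(1)
    n = 5000
    # Assumed true source statistics: P(1) = 0.6.
    cover = [1 if rng.random() < 0.6 else 0 for _ in range(n)]
    # Assumed naive stegoist: uniform payload bits.
    stego = [rng.getrandbits(1) for _ in range(n)]

    # Observer with an accurate source model (knows P(1) = 0.6).
    print("sharp model, cover:", posterior_stego(cover, 0.6, 0.5))
    print("sharp model, stego:", posterior_stego(stego, 0.6, 0.5))
    # Observer with a coarse model (believes the source is uniform).
    print("coarse model, stego:", posterior_stego(stego, 0.5, 0.5))
```

The observer whose model matches the true source statistics separates
the two cases; the observer with the coarse model sees a likelihood
ratio of exactly one and is left with nothing but the prior.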

In the case of widely distributed stego methods (say, LSB switching in
images) it is fairly easy to discriminate between stego and non-stego.
But that's only when we know that this particular method has been used. In
theory, we get a whole array of increasingly sophisticated source models
and steganographic codings utilizing them. Whoever has the best source
model can utilize the smallest redundancies and so can encode information
at the most unnoticeable level. The tradeoff is between noticeable
modifications to the source and steganographic bandwidth. Only when one
has a better model can one discriminate between stego and non-stego
transmissions, and only the best detection method and best source model
set the bounds for total stego bandwidth.
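
For the LSB case, the following Python sketch shows both sides: naive
sequential LSB replacement, and a statistic in the spirit of the
well-known pairs-of-values chi-square attack. Everything here is an
assumption made for illustration: the toy cover generator (which
deliberately makes odd values rarer than even ones), the function names,
and the choice to report a raw statistic rather than a calibrated
p-value. It is not a description of any particular tool.

```python
import random
from collections import Counter


def toy_cover(n, rng, p_odd=0.3):
    """Synthetic 8-bit samples whose LSB is 1 with probability p_odd (assumed)."""
    samples = []
    for _ in range(n):
        base = max(0, min(254, int(rng.gauss(128, 30)))) & ~1  # even value
        samples.append(base + (1 if rng.random() < p_odd else 0))
    return samples


def embed_lsb(samples, message_bits):
    """Overwrite the LSB of the leading samples with the message bits."""
    if len(message_bits) > len(samples):
        raise ValueError("message does not fit into the cover")
    out = list(samples)
    for i, bit in enumerate(message_bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_lsb(samples, n_bits):
    """Read the first n_bits least significant bits back out."""
    return [s & 1 for s in samples[:n_bits]]


def pov_statistic(samples):
    """Chi-square-style statistic over value pairs (2k, 2k+1).

    Replacing LSBs with random bits tends to equalize the two counts in
    each pair, so a small value is what full embedding would produce.
    """
    counts = Counter(samples)
    stat = 0.0
    for even in range(0, 256, 2):
        n_even, n_odd = counts[even], counts[even + 1]
        expected = (n_even + n_odd) / 2.0
        if expected > 0:
            stat += (n_even - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    rng = random.Random(7)
    cover = toy_cover(20000, rng)
    message = [rng.getrandbits(1) for _ in range(len(cover))]
    stego = embed_lsb(cover, message)
    print("cover statistic:", pov_statistic(cover))
    print("stego statistic:", pov_statistic(stego))
```

The test only works because the observer knows which method to look for
and because the cover statistics differ from what embedding produces.
Embedding in fewer samples leaves the statistic closer to that of the
cover, which is the bandwidth versus detectability tradeoff in miniature.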

In practice, models aren't very exact. E.g. pictures can have all kinds of
content not adherent to what one might call the norm. In consequence,
images always contain a lot of redundancy. Especially so if we factor in
all the real life concerns, like compression/download time tradeoffs in
image transmission -- even if we know the perfect statistics for the
source image, it is impossible to tell whether statistical anomalies are
due to excessive compression or steganography being present. And so on.
It's quite possible to construct low bandwidth steganographic encodings
which will, with high probability, remain undetectable for the
foreseeable future.

So, in the end, there is no way to know whether a given communication has
embedded stego. It's usually fairly easy to tell whether a specific method
has been used if one has a good source model, true, but in the general
setting, neither does such a model exist nor do we have a guarantee that a
specific method has indeed been used. We're left out in the cold, not
knowing whether a communication has taken place.

That, then, becomes both the beauty of steganography and the beast of
inciting countless clueless reporters to suspect stego being present in
places it isn't. If one cannot know, one cannot know. A number of
reporters ought to be taught the difference between what is knowable and
what is true or false if we're ever to get rid of this
web-stego-osama-terrorism nonsense.

Sampo Syreeni, aka decoy - mailto:decoy at iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
