[Cryptography] crypto goals and criteria
John Denker
jsd at av8n.com
Tue May 12 16:35:56 EDT 2015
On 05/12/2015 04:27 AM, Salz, Rich wrote:
>> The argument goes that
>> encryption will thwart the censors. Except of course that the encrypted
>> traffic still reveals page lengths, compressed or not...
>
> Which is why HTTP/2 has padding, and TLS 1.3 will probably have it.
>
> The IETF isn't "encrypt everything" but rather "pervasive monitoring
> is an attack," with the knowledge that protection of meta-data (DNS,
> padding, timing) is important. There's no guarantee we'll get it
> right, or even if it's possible, but they're trying.
That makes sense.
Here's another way of saying the same thing:
*) Metadata is data.
*) A cryptosystem that leaks metadata
is a cryptosystem that leaks.
*) A cryptosystem that leaks when compression is applied
is a cryptosystem that leaks.
*) A cryptosystem that leaks when the attacker can
inject some known plaintext
is a cryptosystem that leaks.
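The compression and known-plaintext bullets combine into one well-known attack (the mechanism behind CRIME/BREACH): if attacker-supplied data is compressed together with a secret before encryption, the ciphertext length alone confirms a correct guess. A minimal sketch, assuming DEFLATE-style compression; the secret and field names here are made up for illustration:

```python
import zlib

SECRET = b"sessionid=7f3a9c2e51b04d88"

def ciphertext_length(attacker_guess: bytes) -> int:
    # Compression runs over attacker-controlled data and the secret
    # together; encryption preserves the compressed length, so the
    # attacker can observe it directly on the wire.
    return len(zlib.compress(attacker_guess + b";" + SECRET))

# A guess that matches the secret compresses better: the repeated
# substring becomes a short back-reference, so the ciphertext is
# shorter.  The length leak confirms the guess.
right = ciphertext_length(b"sessionid=7f3a9c2e51b04d88")
wrong = ciphertext_length(b"sessionid=0123456789abcdef")
```

Here `right < wrong`, even though the attacker never sees any plaintext.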
Traffic analysis is a Big Deal. Cryptanalysts have been
using traffic analysis for as long as there's been crypto.
I reckon we will never be able to stop "all" leakage, but
still we have to recognize it for what it is: leakage.
I see the distinction between metadata and data as (at
best) a legal fiction, created in the US as a way to
get around the 4th amendment (not to mention the 3rd,
9th, and 10th).
By way of contrast:
On 05/12/2015 08:54 AM, John Levine wrote:
>> It would be quite a feat to figure out which Wikipedia page someone
>> was reading just from the page length, compressed or otherwise.
>> There's over 4,800,000 articles each of which can be rendered in many
>> different ways (talk, history, diffs, etc.), they change all the time,
>> and the size of a page depends on whether you're logged in and
>> probably on other stuff. For example, I just retrieved a Wikipedia
>> page on a topic related to a river in the United States. The
>> uncompressed length of the page was 47,068 bytes. Free beer to the
>> first person who figures out what page it was.
That strikes me as naïve.
On a onesie-twosie basis, it would cost more than the price
of a beer to figure that out. However, the thought police
in even a smallish police state are surveilling millions of
people, and they get to /amortize/ the cost of indexing the
wikipedia. The scaling behavior is similar to that of a
dictionary attack. The cost per victim is negligible.
Sure, /some/ of the articles have changed since yesterday,
but most of them haven't ... and the thought police do
not need to read all of your communications; a sample
suffices.
Since there are more articles than there are plausible
length values, there will be some collisions ... but the
ambiguities can be resolved by looking at additional
information not provided in the example above, e.g. the
pattern of included images, incoming links, outgoing links,
et cetera ... and/or by statistical inference. The NSA
is reeeeally good at statistics.
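The amortization argument can be made concrete: the one-time cost is building an index from page length to candidate titles; after that, each observed transfer is a constant-time lookup, exactly as in a dictionary attack on hashed passwords. A sketch, where the corpus is a made-up stand-in for a real crawl:

```python
from collections import defaultdict

# Hypothetical crawl results: (title, uncompressed length).  In real
# life the thought police crawl millions of pages once, then reuse
# the index against every victim.
corpus = [
    ("Some_River",      47068),
    ("Another_Article", 13337),
    ("Yet_Another",     47068),   # collision: same length as Some_River
]

index = defaultdict(list)
for title, length in corpus:
    index[length].append(title)          # one-time, amortized cost

# Per-victim cost: one lookup per observed transfer.
candidates = index[47068]                # -> ['Some_River', 'Yet_Another']
# Collisions like this get resolved with side information (image
# sizes, link patterns) or statistical inference, as noted above.
```

The per-victim cost really is negligible; only the index-building cost scales with the size of the corpus, and it is paid once.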
A dissident could buy a measure of protection by downloading
the entire wikipedia and then referring to the local copy.
This is an example of what we call /cover traffic/. The
English-language part is on the order of 10 gigabytes, so
this is not even particularly expensive:
lynx -source -head https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
Content-Length: 11820881800
OTOH it remains a cat-and-mouse game; cover traffic does
not defeat all avenues of attack.
Ed Snowden said:
"Encryption works. Properly implemented strong crypto
systems are one of the few things that you can rely on.
Unfortunately, endpoint security is so terrifically weak
that NSA can frequently find ways around it."
I suggest we need to pay more attention to the last part.
Just to give you some idea how hard it will be to fix the
problem, consider the following use-case:
I google for "babe". The query and the reply are secured
by https. So far so good.
a) If, however, I click on one of the hits, google will know
whether I am interested in
-- mythical oxen
-- mythical pigs
-- legendary ballplayers
-- damsels
-- or whatever
And (!) if google has it, the government will grab it, without
even a warrant. According to the 2nd circuit court of appeals,
this is illegal. I say even if it were legal it would be
unconstitutional, and even if it were constitutional it would
be bad policy ... but none of that stops them from doing it.
Here's how google knows: Even though the text and the
tooltip tell you that the link points to
en.wikipedia.org/wiki/Babe_(film)
it doesn't. Instead it points to something at google.com
that will record your click and then redirect you to the
nominal destination.
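One workaround for this is to unwrap the tracking link before following it. Google's click-tracking URLs have historically carried the real destination in a query parameter (commonly "q" or "url"); the sketch below assumes that format, which is not guaranteed to be stable:

```python
from urllib.parse import urlparse, parse_qs

def unwrap_redirect(link: str) -> str:
    # Assumption: the tracking URL carries the true destination in a
    # "q" or "url" query parameter.  If neither is present, return
    # the link unchanged.
    qs = parse_qs(urlparse(link).query)
    for key in ("q", "url"):
        if key in qs:
            return qs[key][0]
    return link

unwrap_redirect(
    "https://www.google.com/url?q=https://en.wikipedia.org/wiki/Babe_(film)"
)   # -> 'https://en.wikipedia.org/wiki/Babe_(film)'
```

Browser extensions that do this rewriting exist; rolling your own is a few lines, as above, but it only removes the click-tracking hop, not the fact that the search query itself was seen.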
b) Furthermore, after redirection the link points to an
unencrypted http page, even though the corresponding https
page also exists. Many months ago google announced that
they would fix this, i.e. that search results would favor
the encrypted version when available ... but it hasn't
actually happened.
On my system, I have workarounds for (a) and (b), but even
so, I don't imagine that my system is secure. I assume my
machines (including phones) are compromised at every level
from the firmware on up. I assume the "Root CA" clown car
is compromised several times over.
Bottom line:
*) Security requires a lot more than cryptography.
*) Metadata is data.
*) A cryptosystem that leaks metadata
is a cryptosystem that leaks.