[Cryptography] traffic analysis -> let's write an RFC?

Thu Jan 29 14:35:19 EST 2015

On 01/29/2015 04:03 AM, Ben Laurie wrote:
> Clearly the idea was you design your network so that you do own the link.
> Which brings me back to my question (even Google cannot afford that much
> network, I suspect).

1) Owning the physical link is unnecessary and irrelevant.
 In the question, let's replace the concept of physical
 "link" with /virtual circuit/.  Assume the carrier who
 owns the physical link is untrustworthy.  You can still
 buy a virtual circuit with X amount of bandwidth for a 
 finite price.  You can fill that bandwidth with traffic 
 -- cover traffic or otherwise -- and secure the traffic
 using crypto.

2) It may help to distinguish "traffic analysis" from 
 "traffic engineering".  Imagine a cat-and-mouse game,
 where the mice are trying to communicate, and the cats
 are attacking by means of traffic analysis.  The mice
 need to do traffic engineering, so as to maximize the
 amount of traffic they can move, maximize the security,
 and minimize the cost.

 Traffic engineering starts from the observation that
 peak demand is enormously larger than average demand.
 Therefore there is /traffic gain/ to be had by sharing.
 Under "some" favorable conditions, this can be done
 while still defending against traffic analysis.

 Here is a scenario.  This does not solve all the world's
 problems, but it has some value.  If nothing else, it
 proves we should not give up.

 Suppose there exist cheap links and expensive links,
 e.g. local versus transatlantic.  Suppose you have
 a group of N friends whom you trust.  The group is
 connected by cheap links.  All traffic is secured
 by crypto.  Each member of the group puts X amount
 of steady traffic on the expensive link, for a total 
 of N*X.  When you have a burst of traffic that
 exceeds X, you distribute it (cheaply!) to your
 friends, and ask them to forward it.

        _______
       |       |
       |   A   |                _________
       |_______|---------------|         |
          +   +           -----| carrier |---(ocean)---
          +   +_______   |   --|_________|
          +   |       |  |  |
          +   |   B   |--   |
          +   |_______|     |
          +    +            |
        __+____+            |
       |       |            |
       |   C   |------------
       |_______|

  "+" = cheap link
  "-" = expensive link

 In this scenario, you can achieve traffic gain in
 proportion to the number of people you can trust.
 Increasing N increases your peak bandwidth (up to
 N*X when your friends are otherwise idle), at the
 cost of a proportional increase in attack surface.
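
 To put rough numbers on the traffic gain, here is a
 minimal Python sketch.  It assumes time is divided into
 ticks, that every site always sends exactly X units per
 tick on its expensive link (real payload padded with
 cover traffic), and that the friends have no traffic of
 their own during the burst.  The function names are made
 up for illustration.

```python
import math

def ticks_to_drain(burst, x, n_idle_friends=0):
    """Ticks needed to move `burst` units across the expensive links.

    Assumption: you and each idle friend contribute exactly x units
    of capacity per tick, so capacity is x * (1 + n_idle_friends).
    """
    capacity = x * (1 + n_idle_friends)
    return math.ceil(burst / capacity)

def link_schedule(burst, x, n):
    """Per-tick (real, cover) split for each of the n expensive links.

    Every link carries exactly x units per tick, so an observer on
    the expensive side sees the same volume with or without a burst.
    """
    remaining = burst
    schedule = []
    while remaining > 0:
        tick = []
        for _ in range(n):
            real = min(x, remaining)      # real payload placed on this link
            remaining -= real
            tick.append((real, x - real))  # pad the rest with cover traffic
        schedule.append(tick)
    return schedule
```

 With X = 10 and a burst of 100 units, a lone site needs
 ticks_to_drain(100, 10) = 10 ticks, while a site with four
 idle friends needs ticks_to_drain(100, 10, 4) = 2.  In
 both cases each expensive link carries exactly X per tick.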

 One must also analyze this from the cats' point of
 view.  Assume the cats' goal is to spy on /everybody/.
 Suppose M of the N sites can be compromised, meaning
 that the cat can now tell the difference between null
 and non-null traffic at that site.  It then becomes 
 a statistics problem to figure out where the non-null
 traffic originated.  As a function of M:
  -- how good are the statistics?
  -- what is the direct cost?
  -- what are the indirect costs, such as the chance
   that some mouse will notice that he's been pwned?
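
 As one way to start on the first question, here is a toy
 model in Python.  Everything in it is an assumption, not
 a result: a single burst originates at a site chosen
 uniformly from the N, the cat has compromised M sites
 chosen uniformly at random, the cat identifies the
 originator with certainty if that site is compromised,
 and otherwise guesses uniformly among the N-M clean
 sites.  The function names are hypothetical.

```python
from fractions import Fraction
import random

def p_identify(n, m):
    """Toy-model chance that the cat names the true originator."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    if m == n:
        return Fraction(1)
    # Originator compromised (probability m/n): certain hit.
    # Otherwise: uniform guess among the n - m clean sites.
    return Fraction(m, n) + Fraction(n - m, n) * Fraction(1, n - m)

def simulate(n, m, trials=20000, seed=0):
    """Monte Carlo check of p_identify under the same assumptions."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        compromised = set(rng.sample(range(n), m))
        origin = rng.randrange(n)
        if origin in compromised:
            hits += 1
        else:
            clean = [s for s in range(n) if s not in compromised]
            hits += rng.choice(clean) == origin
    return hits / trials
```

 Under these assumptions the closed form simplifies to
 (M+1)/N for M < N, so the cat's odds grow only linearly
 in M.  A real analysis would need a model of what a
 compromised site actually reveals, plus the direct and
 indirect costs listed above.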

This is obviously just the sketch of an outline, but
it shows there are some things that can be done,
without requiring us to redesign all protocols from
layer 1 upward.