cryptographic ergodic sequence generators?

Sun Sep 7 09:32:23 EDT 2003

On 09/06/2003 02:09 PM, Perry E. Metzger wrote:
 > For making things like IP fragmentation ids and other similar
 > protocol elements unpredictable,

OK, that more-or-less defines an objective.

 > it would be useful to have what I'll call a cryptographic ergodic
 > sequence generator

I'm not at all sure that is the best way to approach
the stated objective.

As usual, we need a clear idea of what threats we
are facing; we need more detail than is provided by
the objective mentioned above.  Also, as always, we
want to impose large costs on the bad guys and
small costs on the good guys, so we need to have
some sort of cost model.  In this case, the main
issues appear to be:
  a) some cost if try to re-use an ID prematurely.
  b) some cost if the bad guy guesses what ID we are
     going to use in the near future.  (Presumably he
     can replay IDs from the recent past as much as
     he wants;  we can't stop that.)
  c) constraints on n, the number of bits in the ID.
  d) computation costs and communication costs.
  e) possibly it is important for our peer at the
     receiving end to be able to treat our IDs as
     being _sequential_ and well-ordered, as the
     Subject: of this thread suggests, as opposed
     to being merely distinct.

There's no point in designing something that works
well only when it is not under attack... so let's
assume it is under heavy attack.  The bad guy is
trying like crazy to guess IDs.  This means that
cost (b) must be taken very seriously.

In particular, suppose we have an ergodic design
with n=24.  Then the bad guy can just sit and watch
the first 16 million packets, and can then eat us
for lunch, predicting future IDs with 100% success.
In the long run (large numbers of packets), this is
outcome is as bad as anything could possibly be.

We can do better.

I recommend we consider schemes that are not
strictly ergodic, but instead incorporate some
degree of randomness.

To understand the effectiveness of my schemes, we
must understand item (c), the constraints on n.
Let us suppose that there can be at most 2^k IDs
"alive" in the system at any one time.
  -- Obviously we must have n>=k;  otherwise it
would be impossible to use the IDs reliably, even
in the absence of an attack.
  -- More interestingly, we need to assume n is
appreciably bigger than k, or the whole game is
not worth playing.  That's because the bad guy
has one chance in 2^(n-k) of guessing an ID that
corresponds to an "alive" ID, no matter how
cleverly we choose the IDs.

Therefore, without further ado:

Scheme #1:  Construct a n-bit-long plain-ID using
a k-bit counter concatenated with (n-k) bits of
randomness.  Then form the final crypto-ID by
encrypting the plain-ID.  (The encryption key is
randomly chosen at system boot time, and remains
secret.)

This guarantees that no ID will collide with any
of the 2^k most-recently-used IDs.  It also
guarantees that the bad guy's chance of guessing
the next ID is only 2^(n-k).  [After all, *we*
don't have any better chance of guessing our next
ID, because of the randomness.]

In the long run, this is better than the strictly
ergodic scheme by a factor of 2^(n-k).  This is
about as well as we can do, in a minimax sense,
based on the maximum success the bad guy can have.

NOTE:  If the IDs need to be sequential, as
mentioned in desideratum (e) above, then we need to
give the decryption key to our peer at the receiving
end.  The peer can recover the plain-ID, discard
the randomness, and use the counter for sequencing.

Scheme #2:  We could go to the extreme of having
no counter at all, just n bits of randomness.  To
prevent collisions, we use a content-addressable
memory sufficient to remember the last 2^k IDs we
sent.  We choose candidate IDs at random; if a candidate
would cause a collision, we discard the candidate and
try again.  This anti-collision step increases
computational costs by roughly one part in 2^(n-k)
and increases communications costs not at all.

About the only advantage this has over scheme #1 has
to do with whether you think k is an upper bound or
a typical value.  If the typical number of "alive"
IDs in the system is less than the maximum number,
scheme #2 introduces more randomness and therefore
makes life harder for the bad guy.

This of course totally sacrifices any notion of
sequencing.

Scheme #3:  This is a hybrid of scheme #1 and scheme
#2.  Keep a c-bit counter, where we allow c to be
less than k.  This absolutely guarantees no collisions
with the last 2^c IDs, and makes it highly likely
that there will be no collisions stretching back much
farther than that.  We take our chances, without
bothering to have a content-addressable memory.

This scheme exploits the tradeoff between cost (a)
and cost (b).  In the likely case that cost (b) is
much larger, it might pay to reduce the chance of
incurring cost (b) even if this means a slight risk
of incurring cost (a).

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com