Possibly questionable security decisions in DNS root management

John Gilmore gnu at toad.com
Tue Oct 20 08:48:51 EDT 2009

> designed 25 years ago would not scale to today's load.  There was a  
> crucial design mistake: DNS packets were limited to 512 bytes.  As a  
> result, there are 10s or 100s of millions of machines that read *only*  
> 512 bytes.

Yes, that was stupid, but it was done very early in the evolution of
the Internet (when there were only a hundred machines or so).

Another bizarre twist was that the Berkeley "socket" interface to UDP
packets would truncate incoming packets without telling the user
program.  If a user tried to read 512 bytes and a 600-byte packet came
in, you'd get the first 512 bytes and no error!  The other 88 bytes
were just thrown away.  When this incredible 1980-era design decision
was revised for Linux, they didn't fix it!  Instead, they return the
512 bytes, throw away the 88 bytes, and also return an error flag
(MSG_TRUNC).  There's no way to either receive the whole datagram, or
get an error and try again with a bigger read; if you get an error,
it's thrown away some of the data.

When I looked into this in December '96, the BIND code (the only major
implementation of a name server for the first 20 years) was doing
512-byte reads (which the kernel would truncate without error).  Ugh!
Sometimes the string and baling wire holding the Internet together
becomes a little too obvious.

> It is possible to have larger packets, but only if there is prior  
> negotiation via something called EDNS0.

There's no prior negotiation.  The very first packet sent to a root
name server -- a query, about either the root zone or about a TLD --
now indicates how large a packet can be usefully returned from the
query.  See RFC 2671.  (If there's no "OPT" field in the query, then
the reply packet size is 512.  If there is, then the reply size is
specified by a 16-bit field in the packet.)

In 2007, about 45% of DNS clients (who sent a query on a given day to
some of the root servers) specified a reply size.  Almost half of
those specified 4096 bytes; more than 80% of those specified 2048 or
4096 bytes.  The other ~55% of DNS clients didn't specify, so are
limited to 512 bytes.

For a few years, there was a foolish notion from the above RFC that
clients should specify arbitrarily low numbers like 1280, even if they
could actually process much larger packets.  4096 (one page) is, for
example, the size Linux allows client programs to reassemble even in
the presence of significant memory pressure in the IP stack.  See:


> That in turn means that there can be at most 13 root  
> servers.  More precisely, there can be at most 13 root names and IP  
> addresses.

Any client who sets the bit for "send me the DNSSEC signatures along
with the records" is by definition using RFC 2671 to tell the server
that they can handle a larger packet size (because the DNSSEC bit is
in the OPT record, which was defined by that RFC).

"dig . ns @f.root-servers.net" doesn't use an OPT record.  It returns
a 496 byte packet with 13 server names, 13 "glue" IPv4 addresses, and
2 IPv6 "glue" addresses.

"dig +nsid . ns @f.root-servers.net" uses OPT to tell the name server
that you can handle up to 4096 bytes of reply.  The reply is 643 bytes
and also includes five more IPv6 "glue" addresses.

Older devices can bootstrap fine from a limited set of root servers;
almost half the net no longer has that restriction.

> The DNS is working today because of anycasting;  
> many -- most?  all? -- of the 13 IP addresses exist at many points in  
> the Internet, and depend on routing system magic to avoid problems.   

Anycast is a simple, beautiful idea, and I'm glad it can be made to
work in IPv4 (it's standard in IPv6).

> At that, you still *really* want to stay below 1500 bytes, the Ethernet MTU.

That's an interesting assumption, but is it true?  Most IP-based
devices with a processor greater than 8 bits wide are able to
reassemble two Ethernet-sized packets into a single UDP datagram,
giving them a limit of ~3000 bytes.  Yes, if either of those datagrams
is dropped en route, then the datagram won't reassemble, so you've
doubled the likely failure rate.  But that's still much lower overhead
than immediately falling back to an 8-to-10-packet TCP connection,
particularly in the presence of high packet drop rates that would
also cause TCP to use extra packets.

> > As it is today, if NSA (or any major country, organized crime
> > group, or civil rights nonprofit) built an RSA key cracker, more
> > than 50% of the RSA keys in use would fall prey to a cracker that
> > ONLY handled 1024-bit keys.  It's probably more like 80-90%,
> > actually.  Failing to use 1056, 1120, 1168-bit, etc, keys is just
> > plain stupid on our (the defenders') part; it's easy to automate
> > the fix.
> That's an interesting assumption, but is it true?

I've seen papers on the prevalence of 1024-bit keys, but don't have a 
ready URL.  It's a theory.  Any comments, NSA?

> In particular, is it really that useful to tune an cracking engine
> to exactly one modulus size?  A maximum size, sure, but do the
> cracking engines really have trouble with slightly smaller moduli?

I don't think most crackers would have trouble solving moduli smaller than
their maximum moduli; that's not the issue.  There are two answers:

  *  Security is economics.  As targets of crackers, we should spread
     out, rather than clumping.  Because it would hand you the keys to
     80-90% of everything, the economic incentive to build a 1024-bit
     cracker is irresistable, as soon as it becomes economically
     feasible at all.

     In the absence of a major mathematical breakthrough, we can
     assume that an 1120-bit cracker would be more expensive than a
     1024-bit one.  It would have wider datapaths and burn more power.
     To actually crack an 1120-bit key would take much, much longer
     than to crack a 1024-bit one on the same hardware.  Anyone who
     uses an 1120-bit key is safe for some number of years after all
     the interesting 1024-bit keys have fallen -- but they've only
     paid a small performance and storage penalty for that safefy.
     Similarly, a site using 1168-bit keys would remain safe, even if
     an 1120-bit cracker was built.

     If the user population spread out such that no more than 5% of us
     were using any given key size, the economic incentive to build a
     cracker would shift toward the more expensive bigger crackers,
     making all of us safer -- even the people whose keys were
     shortest.  It isn't worth building a 1024-bit cracker if that
     only breaks you into 5% of the infrastructure (and if the really
     important infrastructure is smart enough not to be in that 5%).
     It isn't worth building the 1120 if that only gets you 15%.
     Maybe you have to be able to afford to crack 1400 or 1500 bits
     before you can crack 50% of the infrastructure, and your
     management won't spend the money on the cracker until it's *that*
     useful.  That's likely to give the public a significant safety
     margin, compared to what happens if everybody's keys sitting at
     1024 bits.

  *  Some kinds of crackers may only work, or may work best, when their
     maximum number of bits is a power of two.  Thus perhaps the
     cracker-jack has to either build a 1024-bit cracker or a *much*
     slower and more expensive 2048-bit cracker.  (The 2048-bit
     cracker could crack 1168-bit keys, but it's more expensive to
     build it, so it might not be built at all, or at least your
     1168-bit keys will be safe until the cracker-jack can afford
     2048-bit hardware.)


The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com

More information about the cryptography mailing list