Development of the Domain Name System
Paul V. Mockapetris and Kevin J. Dunlap

One-line summary: DNS provides distributed, replicated, and cached bindings of names to addresses (and other such resource records); this paper examines the design, implementation, surprises, successes, and failures of the DNS system.

Overview/Main Points

* HOSTS.TXT was a centrally managed file containing name-to-address mappings. It obviously wouldn't scale. Its replacement needed to:
  o carry the same information as HOSTS.TXT
  o allow the database to be maintained in a distributed manner
  o have no obvious size limits (names, components, data)
  o offer extreme interoperability (across networks/internetworks)
  o offer tolerable performance
  o support extensible services
* DNS architecture is simple: name servers are repositories of information; resolvers are interfaces to client programs that embody the algorithm for finding a name server that has the information the client wants.
  o Name space: a variable-depth tree in which each node has a label. The domain name of a node is the concatenation of all labels on the path from that node to the root of the tree. The structure of the tree is decoupled from any implicit semantics.
  o Data: associated with each name is a set of resource records (RRs); each record is a (type, data) tuple where the set of types is well known (but extensible). Example types are host addresses, MX records, etc.
  o Database distribution:
    + Zones: sections of the system-wide database controlled by a specific organization. Zones are contiguous pieces of the tree (i.e., a connected subgraph). A zone is created by convincing a parent organization to delegate a subzone consisting of a node; the owner of that node can then create names within the zone.
    + Caching: resolvers and name servers cache responses for use by later queries. A time-to-live (TTL) field is the mechanism for invalidating cache entries.
  o Resolvers search downwards from domains they can already access. Resolvers are configured with hints pointing to servers for the root node of the DNS and for the top node of the local domain. Thus, if a name isn't found locally, you hit one of the root servers of the entire DNS. (See the resolution/caching sketch after this outline.)
  o Surprises:
    + Performance: underlying network performance was worse than expected, but the DNS hierarchy still performed well, to the point where people were using DNS lookups even for queries that did not need network access.
    + Negative caching: there are two kinds of negative responses (the name in question does not exist; the name exists but the requested data does not). A high percentage of responses are negative, some from misspellings, some from programmers using DNS lookups to check whether an address is valid in the DARPA Internet.
  o Successes:
    + Variable-depth hierarchy: matches the variable sizes of administrative domains and makes it possible to encapsulate other name spaces.
    + Organizational structuring of names: names are independent of network, topology, etc.
    + Datagram underlying protocol: "datagrams were successful and probably essential, given the unexpectedly bad performance of the DARPA Internet."
    + Additional section processing: lets the responding server anticipate the next logical request and answer it before it is asked, avoiding significant added cost. Cuts query traffic in half. (Prefetch! See the sketch at the end of these notes.)
    + Caching: very successful, but raises security issues. (One administrator reversed the TTL and data fields in a file and ended up distributing bad data with a TTL of years.)
  o Shortcomings:
    + Type and class growth: software needs to be recompiled, there is a political hurdle to gaining acceptance of a new type, and new types are useless until software adopts them.
    + Easy upgrading: incorporating DNS into an application is hard, especially since semantics change due to the possibility of transient failure of the DNS system. (A data lookup may now fail.)
    + Distribution of control vs. distribution of expertise: DNS administrators/maintainers work exactly hard enough to get their system working, not to get it working well. A human problem, not a DNS problem.
  o Conclusions:
    + Caching can work in a heterogeneous environment, but should include features for negative caching as well.
    + It is more difficult to remove functions from a system than to get a new function added; as new features accumulate, all functions become more complex.
    + Implementors lose interest once a new system delivers the level of performance they expect; they are not motivated to optimize their use of others' resources.
    + Allowing variations in the implementation structure is good; allowing variations in the provided service is bad.
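The resolution and caching story above can be condensed into a small sketch. This is a hypothetical Python illustration, not the paper's implementation: the names (ResolverCache, resolve, query_server, root_hints) and the return convention of query_server are assumptions, and real resolvers also handle retransmission, multiple servers per zone, and wire formats. It shows the three ideas the notes emphasize: iterative descent starting from root hints, TTL expiry as the only cache-invalidation mechanism, and caching negative answers alongside positive ones.

```python
import time

class ResolverCache:
    """Toy in-memory cache keyed by (name, rrtype).

    Each entry stores (expiry, records). An empty record list is a cached
    *negative* answer -- the paper argues caches should remember "does not
    exist" results too, since a large fraction of responses are negative."""

    def __init__(self):
        self._entries = {}

    def get(self, name, rrtype):
        entry = self._entries.get((name.lower(), rrtype))
        if entry is None:
            return None                       # nothing cached: must ask a server
        expiry, records = entry
        if time.time() > expiry:              # TTL elapsed -- the only invalidation
            del self._entries[(name.lower(), rrtype)]
            return None
        return records                        # may be [] meaning "known not to exist"

    def put(self, name, rrtype, records, ttl):
        self._entries[(name.lower(), rrtype)] = (time.time() + ttl, records)


def resolve(name, rrtype, cache, root_hints, query_server):
    """Iterative lookup: start at the root hints (or the top of the local
    domain) and follow referrals downward until some server answers.

    query_server(addr, name, rrtype) stands in for one UDP query; it is
    assumed to return (records, referral_addrs, ttl), where records is None
    for a pure referral and a (possibly empty) list for an answer."""
    cached = cache.get(name, rrtype)
    if cached is not None:
        return cached

    servers = list(root_hints)
    while servers:
        records, referrals, ttl = query_server(servers[0], name, rrtype)
        if records is not None:               # answer, positive or negative
            cache.put(name, rrtype, records, ttl)
            return records
        servers = referrals                   # delegation: descend one zone
    raise RuntimeError(f"no server answered for {name}/{rrtype}")
```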
Relevance

One of the most successful systems ever built and deployed. Still works today, under staggering load.

Flaws

  o Hitting the root servers whenever a local lookup fails seems like a poor choice from the perspective of scalability. I'd be interested in knowing how much load the root servers get nowadays.
  o The semantic issues of blocking and potentially failing DNS lookups are thorny and need some work.
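As a footnote to the "additional section processing" success noted above, here is a toy Python illustration of the idea. The zone table, the example.edu names, and the record layout are made up for illustration and are not the paper's data structures: the point is only that a server answering an MX query can piggyback the exchange host's address in the same response, saving the follow-up query a mailer would otherwise send.

```python
# Hypothetical zone data: name -> {rrtype: [records]}. Real servers load this
# from zone files; these names and this layout are placeholders.
ZONE = {
    "example.edu.":      {"MX": [(10, "mail.example.edu.")]},
    "mail.example.edu.": {"A":  ["192.0.2.25"]},
}

def answer_query(name, rrtype, zone=ZONE):
    """Return (answer, additional): the records asked for, plus records the
    client will almost certainly need next, piggybacked in the same response."""
    answer = zone.get(name, {}).get(rrtype, [])
    additional = []
    if rrtype == "MX":
        # Anticipate the follow-up: a mailer that learns an exchange host's
        # name will immediately ask for its address, so include it now and
        # save a round trip (roughly halving query traffic, per the paper).
        for _preference, exchange in answer:
            additional.extend(zone.get(exchange, {}).get("A", []))
    return answer, additional

# Example: one query returns both the MX record and the exchange's address.
# answer_query("example.edu.", "MX")
#   -> ([(10, "mail.example.edu.")], ["192.0.2.25"])
```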