Development of the Domain Name System
Paul V. Mockapetris and Kevin J. Dunlap

One-line summary: DNS provides distributed, replicated, and cached bindings of names to addresses (and other such resource records); this paper examines the design, implementation, surprises, successes, and failures of the DNS system.

Overview/Main Points

* HOSTS.TXT was a centrally managed file containing name-to-address mappings. It obviously wouldn't scale. Its replacement needed to:
  o carry the same information as HOSTS.TXT
  o allow the database to be maintained in a distributed manner
  o have no obvious size limits (names, components, data)
  o offer extreme interoperability (across networks/internetworks)
  o offer tolerable performance
  o support extensible services
* DNS architecture is simple: name servers are repositories of information; resolvers are interfaces to client programs that embody the algorithm for finding a name server that has the information the client wants.
  o Name space: a variable-depth tree in which each node has a label. The domain name of a node is the concatenation of all labels on the path from that node to the root of the tree. The structure of the tree is decoupled from any implicit semantics.
  o Data: associated with each name is a set of resource records (RRs); each record is a (type, data) tuple where the set of types is well known (but extensible). Example types are host addresses, MX records, etc.
  o Database distribution:
    + Zones: sections of the system-wide database controlled by a specific organization. Zones are contiguous pieces of the tree (i.e., a connected subgraph). A zone is created by convincing a parent organization to delegate a subzone consisting of a node; the owner of that node can then create names within the zone.
    + Caching: resolvers and name servers cache responses for use by later queries. A time-to-live (TTL) field is the mechanism for invalidating cache entries.
  o Resolvers search downwards from domains they can already access. Resolvers are configured with hints pointing to servers for the root node of the DNS and for the top node of the local domain. Thus, if a name isn't found locally, you hit one of the root servers of the entire DNS. (See the resolution/caching sketch after this outline.)
  o Surprises:
    + Performance: underlying network performance was worse than expected, but the DNS hierarchy still performed well, to the point where people were using DNS lookups even for queries that did not need network access.
    + Negative caching: there are two kinds of negative responses (the name in question does not exist; the name exists but the requested data does not). A high percentage of responses are negative, some from misspellings, some from programmers using DNS lookups to check whether an address is valid in the DARPA Internet.
  o Successes:
    + Variable-depth hierarchy: matches the variable sizes of administrative domains and makes it possible to encapsulate other name spaces.
    + Organizational structuring of names: names are independent of network, topology, etc.
    + Datagram underlying protocol: "datagrams were successful and probably essential, given the unexpectedly bad performance of the DARPA Internet."
    + Additional section processing: lets the responding server anticipate the next logical request and answer it before it is asked, avoiding significant added cost. Cuts query traffic in half. (Prefetch! See the sketch at the end of these notes.)
    + Caching: very successful, but raises security issues. (One administrator reversed the TTL and data fields in a file and ended up distributing bad data with a TTL of years.)
  o Shortcomings:
    + Type and class growth: software needs to be recompiled, there is a political hurdle to gaining acceptance of a new type, and new types are useless until software adopts them.
    + Easy upgrading: incorporating DNS into an application is hard, especially since semantics change due to the possibility of transient failure of the DNS system. (A data lookup may now fail.)
    + Distribution of control vs. distribution of expertise: DNS administrators/maintainers work exactly hard enough to get their system working, not to get it working well. A human problem, not a DNS problem.
  o Conclusions:
    + Caching can work in a heterogeneous environment, but should include features for negative caching as well.
    + It is more difficult to remove functions from a system than to get a new function added; as new features accumulate, all functions become more complex.
    + Implementors lose interest once a new system delivers the level of performance they expect; they are not motivated to optimize their use of others' resources.
    + Allowing variations in the implementation structure is good; allowing variations in the provided service is bad.
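The resolution and caching story above can be condensed into a small sketch. This is a hypothetical Python illustration, not the paper's implementation: the names (ResolverCache, resolve, query_server, root_hints) and the return convention of query_server are assumptions, and real resolvers also handle retransmission, multiple servers per zone, and wire formats. It shows the three ideas the notes emphasize: iterative descent starting from root hints, TTL expiry as the only cache-invalidation mechanism, and caching negative answers alongside positive ones.

```python
import time

class ResolverCache:
    """Toy in-memory cache keyed by (name, rrtype).

    Each entry stores (expiry, records). An empty record list is a cached
    *negative* answer -- the paper argues caches should remember "does not
    exist" results too, since a large fraction of responses are negative."""

    def __init__(self):
        self._entries = {}

    def get(self, name, rrtype):
        entry = self._entries.get((name.lower(), rrtype))
        if entry is None:
            return None                       # nothing cached: must ask a server
        expiry, records = entry
        if time.time() > expiry:              # TTL elapsed -- the only invalidation
            del self._entries[(name.lower(), rrtype)]
            return None
        return records                        # may be [] meaning "known not to exist"

    def put(self, name, rrtype, records, ttl):
        self._entries[(name.lower(), rrtype)] = (time.time() + ttl, records)


def resolve(name, rrtype, cache, root_hints, query_server):
    """Iterative lookup: start at the root hints (or the top of the local
    domain) and follow referrals downward until some server answers.

    query_server(addr, name, rrtype) stands in for one UDP query; it is
    assumed to return (records, referral_addrs, ttl), where records is None
    for a pure referral and a (possibly empty) list for an answer."""
    cached = cache.get(name, rrtype)
    if cached is not None:
        return cached

    servers = list(root_hints)
    while servers:
        records, referrals, ttl = query_server(servers[0], name, rrtype)
        if records is not None:               # answer, positive or negative
            cache.put(name, rrtype, records, ttl)
            return records
        servers = referrals                   # delegation: descend one zone
    raise RuntimeError(f"no server answered for {name}/{rrtype}")
```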
Relevance

One of the most successful systems ever built and deployed. Still works today, under staggering load.

Flaws

  o Hitting the root servers whenever a local lookup fails seems like a poor choice from the perspective of scalability. I'd be interested in knowing how much load the root servers get nowadays.
  o The semantic issues of blocking and potentially failing DNS lookups are thorny and need some work.
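As a footnote to the "additional section processing" success noted above, here is a toy Python illustration of the idea. The zone table, the example.edu names, and the record layout are made up for illustration and are not the paper's data structures: the point is only that a server answering an MX query can piggyback the exchange host's address in the same response, saving the follow-up query a mailer would otherwise send.

```python
# Hypothetical zone data: name -> {rrtype: [records]}. Real servers load this
# from zone files; these names and this layout are placeholders.
ZONE = {
    "example.edu.":      {"MX": [(10, "mail.example.edu.")]},
    "mail.example.edu.": {"A":  ["192.0.2.25"]},
}

def answer_query(name, rrtype, zone=ZONE):
    """Return (answer, additional): the records asked for, plus records the
    client will almost certainly need next, piggybacked in the same response."""
    answer = zone.get(name, {}).get(rrtype, [])
    additional = []
    if rrtype == "MX":
        # Anticipate the follow-up: a mailer that learns an exchange host's
        # name will immediately ask for its address, so include it now and
        # save a round trip (roughly halving query traffic, per the paper).
        for _preference, exchange in answer:
            additional.extend(zone.get(exchange, {}).get("A", []))
    return answer, additional

# Example: one query returns both the MX record and the exchange's address.
# answer_query("example.edu.", "MX")
#   -> ([(10, "mail.example.edu.")], ["192.0.2.25"])
```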