Designing a Global Name Service --- Butler Lampson, 1986. Abstract: A name service maps a name of an individual, organization or facility into a set of labeled properties, each of which is a string. It is the basis for resource location, mail addressing, and authentication in a distributed computing system. The global name service described here is meant to do this for billions of names distributed throughout the world. It addresses the problems of high availability, large size, continuing evolution, fault isolation and lack of global trust. The non-deterministic behavior of the service is specified rather precisely to allow a wide range of client and server implementations. - design of a name service for billions of names; - basis for resource allocation, mail addresssing, and authentication in a distributed computing system; - a name service maps a name for an entity into a set of labeled properties, each of which is a string; - Grapevine and Clearinghouse are examples of such a name service; - a name service is not a general database: - the set of names changes slowly; - the properties of a given name changes slowly; - integrity constraints of a useful name service are much weaker than those of a database; - a name service is not a filesystem: - a FS must create names and look them up much faster; - a FS does not need to be that large and available; - requirements for the name service: - large size; - long life; - high availability; - fault isolation; - tolerance of mistrust; - the name service supports six major abstractions, divided into two levels: the client level and the administrative level; - at the client level, there are hierarchical names and their values, with operations for reading and updating them, and facilities for protection and authentication. The fact that the database is distributed and replicated is invisible at this level; - at the administrative level, the copies of the database are visible, together with the mechanisms for locating copies and keeping them synchronized; - Client level: - tree structure, just like a Unix file system; - tree of directories, each with a unique directory identifier (DI) and a name by which it can be reached from its parent; - the arcs of the tree are called directory references (DRs); - a directory is not simply a mapping from simple names to values, but contains a tree of values. An arc of the tree carries a label (L), which is just a string. A node carries a timestamp (TS) and a mark which is either "present" or "absent". A path through the tree is defined by a sequence of labels; - an update to a directory makes the node at the end of a given path present or absent; - each directory has an access control function which maps a principal and a path into a set of rights. Each of the operations provided by the name service requires the principal that invokes it to have certain rights to the nodes involved in the operation; - authentication is based on the use of encryption to provide a secure channel between the caller of an operation and its implementor. A directory has an authentication function af(), which is a mapping from keys to principals. Each directory has a few values for which af() is defined by some external means. In particular, there is a secure channel for each parent-child link; the parent's af() maps this channel's key to the child's name, and vice-versa. - certificates can also be used. A certificate links a key to a name relative to the issuing directory; - Administrative level: - sees a set of directory copies (DCs), each one stored on a different server machine (S); - a directory reference (DR) now includes not just the DI of the directory, but also a list of the servers that store its DCs; - a lookup can try one or more of the servers to find a copy from which to read; * an update originates at one DC, and is initially recorded * there. The basic method for spreading updates to all the * copies is a sweep operation, which visits every DC, collects * a complete set of updates, and then writes this set bach to * every DC. The sweep has a timestamp sweepTS. Before it * reads from a DC it increases that DC's nextTS to sweetTS; * this ensures that the sweep collects all updates earlier * than sweepTs. After it writes back to a DC, it sets that * DC's lastSweep to sweetTS. - a sweep's major problem is to obtain the set of DCs reliably; - all the DCs are linked as a ring, and updates propagate following the ring arrows; - a name has two parts: the full name (FN) of a directory and the name of an entity registered in that directory; - the basic mechanism for growth of the name space is its hierarchical structure. Each directory is a context within which names can be generated independently of what is going on in any other directory; - as name lookup is not likely to be specially cheap, it is very desirable for a client to be able to cache the result of a lookup; * each entry has an expiration time (TX); an arc or link may not be * changed until its TX has expired; with this restriction, the result * of a lookup can be safely cached until the minimum TX of any arc or * link that was followed. - the update function defined by Lampson is: - total: it always makes sense to apply an update function; - commutative: the order in which two updates are applied does not affect the result; - idempotent: applying the same update twice has the same effect as applying it once; - commutativity is enforced by the use of timestamps in the updates; ========= Designing a Global Name Service Butler W. Lampson Name service maps a name to a set of properties. It is the basis for resource location (such as mail addressing, authentication, etc.). Name service is not a general database (slower changes, looser integrity constraints), not a file system (much larger, more available). Name Service Requirements: * Large size * Long life * High Availability * Fault isolation * Tolerance of mistrust This implies hierarchical system. At the Client Level: * Hierarchical Names and their values. Facilities for updating, and reading. Directory structure. Edges in the tree are named with the directory name. Nodes in the tree are labeled with a timestamp. * Protection and Authentication of system. Access control is based on an "entity" named in the system. Entities password, etc., is stored as a property. You can use your initial access to get access to a directory with a second identity, a second password, and also more access. Can chain this authentications together to extend your abilities Administrative Level: * distributed, replicated copies There are a number of distributed copies of each directory. * locating copies * keeping copies synchronized Copies are kept approximately the same. Updates are spread via "sweeps" or via unstructured message passing between copies. Sweep obtains a circular linked list of copies. Traverses copies, collecting a set of all updates. Then applies these updates to each copy. "Epochs" are used to create new rings (if a server fails permanently, for example). Epoch is identified by its starting time stamp. Name Space: * names are divided into two parts: 1) Full name of a directory, 2) name of entity within the directory. Entity name is expected to be short. Directory name is long. * hierarchical structure of name space is the basic mechanism for growth. * directory can be restructured, using forwarding links to redirect old names to their new names. Caching: Values have expiration times. They can be cached until then by clients. Values and Updates: * value is a tree with labeled arcs and timestamped nodes * updates are operations with name a node (via the path to the node and the node's timestamp) and say whether to create the node or remove it. * Updates are total (you can always apply them). * Updates are commutative: order of two updates does not affect result * Updates are idempotent: can apply the same update twice. Directory Copies: * there is always at least one complete ring in set of DCs. * there are never two complete non-intersecting rings. Comments: I didn't like this paper. I thought it was poorly organized and hard to read. For example, in the overview section, the first sentence says there are "6 major abstractions," and then Questions: Why exactly is a hierarchical system required? It seems like Lampson thinks its obvious, but he doesn't justify his statement. Admin does static partitioning of data. In a large name service, is this enough? Is the assumption that names change slowly valid? (e.g., Akamai DNS tricks). Or, the "location" property of a person?