Development of the Domain Name System (DNS)

Metadata

  • Author: Mockapetris and Dunlap
  • Full Title: Development of the Domain Name System (DNS)
  • Category: articles
  • Summary: The document describes the development and evolution of the Domain Name System (DNS), a critical component of the DARPA Internet, focusing on its design, structure, and implementation. It explains the initial design concepts from 1983, the transition from HOSTS.TXT to DNS, the role of name servers and resolvers, the hierarchical name space structure, and the challenges faced during implementation at organizations like the University of California, Berkeley. The DNS has become a fundamental part of internet infrastructure, supporting diverse applications, and evolving to meet the demands of a growing network.
  • URL: http://nms.lcs.mit.edu/6829-papers/dns.pdf

Highlights

  • The number of hosts changed from the number of timesharing systems (roughly organizations) to the number of workstations (roughly users). This increase was directly reflected in the size of HOSTS.TXT, the rate of change in HOSTS.TXT, and the number of transfers of the file, leading to a much larger than linear increase in total resource use for distributing the file. (View Highlight)
  • HOSTS.TXT is the name of a simple text file, which is centrally maintained on a host at the SRI Network Information Center (SRI-NIC) and distributed to all hosts in the Internet via direct and indirect file transfers. (View Highlight)
  • we wanted to avoid any constraints on the system due to outside influences and permit as many different implementation structures as possible (View Highlight)
  • A hierarchical name space seemed the obvious and minimal solution for the distribution and size requirements. (View Highlight)
  • The initial design of the DNS was specified in [RFC 882, RFC 883]. The outward appearance is a hierarchical name space with typed data at the nodes. Control of the database is also delegated in a hierarchical fashion. The intent was that the data types be extensible, with the addition of new data types continuing indefinitely as new applications were added. (View Highlight)
  • The “leanness” criterion led to a conscious decision to omit many of the functions one might expect in a state-of-the-art database. In particular, dynamic update of the database with the related atomicity, voting, and backup considerations was omitted. (View Highlight)
  • Allow the database to be maintained in a distributed manner. (View Highlight)
  • Have no obvious size limits for names, name components, data associated with a name, etc. (View Highlight)
  • The active components of the DNS are of two major types: name servers and resolvers. Name servers are repositories of information, and answer queries using whatever information they possess. Resolvers interface to client programs, and embody the algorithms necessary to find a name server that has the information sought by the client. (View Highlight)
  • the system should be independent of network topology, and capable of encapsulating other name spaces. (View Highlight)
  • the default assumption is that the only way to tell definitely what a name represents is to look at the data associated with the name. (View Highlight)
  • The recommended name space structure for hosts, users, and other typical applications is one that mirrors the structure of the organization controlling the local domain. (View Highlight)
  • The DNS internal name space is a variable-depth tree where each node in the tree has an associated label. The domain name of a node is the concatenation of all labels on the path from the node to the root of the tree. (View Highlight)
  • Data for each name in the DNS is organized as a set of resource records (RRs); each RR carries a well-known type and class field, followed by applications data. Multiple values of the same type are represented as separate RRs. (View Highlight)
  • Configuration files in the domain system represent names as character strings separated by dots, but applications are free to do otherwise. (View Highlight)
  • The class field is meant to divide the database orthogonally from type, and specifies the protocol family or instance. The DARPA Internet has a class, and we imagined that classes might be allocated to CHAOS, ISO, XNS or similar protocol families. (View Highlight)
    • Note: This concept of “class” became increasingly irrelevant as the Internet became the only wide network in ordinary use. Although the class field still exists, in today’s DNS, only the “IN” class is used.
  • Thus the name of a host might have more or fewer labels than the name of a user, and the tree is not organized by network or other grouping. (View Highlight)
  • the multiple RR option cut down the maximum RR size. This appeared to promise simpler dynamic update protocols, and also seemed suited to use in a limited-size datagram environment (View Highlight)
  • This scheme allows almost arbitrary distribution, but is most efficient when the database is distributed in parallel with the name hierarchy. (View Highlight)
  • The DNS provides two major mechanisms for transferring data from its ultimate source to ultimate destination: zones and caching. (View Highlight)
  • Note that the intent is that both of these mechanisms be invisible to the user who should see a single database without obvious boundaries. (View Highlight)
  • A zone is a complete description of a contiguous section of the total tree name space, together with some “pointer” information to other contiguous zones. (View Highlight)
  • In addition to the planned distribution of data via zone transfers, the DNS resolvers and combined name server/resolver programs also cache responses for use by later queries. The mechanism for controlling caching is a time-to-live (TTL) field attached to each RR. (View Highlight)
  • The parent organization does this by inserting RRs in its zone which mark a zone division. (View Highlight)
    • Note: This is done by creating NS records in the parent zone. For example, if I own mydomain.com, I could create an NS record for sub.mydomain.com that points to another name server.
  • From an organization’s point of view, it gets control of a zone of the name space by persuading a parent organization to delegate a subzone consisting of a single node. The parent organization does this by inserting RRs in its zone which mark a zone division. The new zone can then be grown to arbitrary size and further delegated without involving the parent, although the parent always retains control of the initial delegation. For example, the ISI.EDU zone was created by persuading the owner of the EDU domain to mark a zone boundary between EDU and ISI.EDU. (View Highlight)
  • A particular name server can support any number of zones which may or may not be contiguous (View Highlight)
  • The basic search algorithm for the DNS allows a resolver to search “downward” from domains that it can access already. Resolvers are typically configured with “hints” pointing at servers for the root node and the top of the local domain. (View Highlight)
  • The value of a ubiquitous name service and consistent name space at all levels of the protocol suite and operating system seems obvious, but it is equally obvious that tradeoffs between performance, generality, and distribution require at least different styles of use at different levels. (View Highlight)
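
The highlights above on the name tree, resource records, zone delegation, and the “downward” search can be made concrete with a small sketch. This is a toy model, not the paper's implementation: the helper names (`normalize`, `is_within`, `Zone.cut_for`, `resolve`), the host `host.isi.edu`, the server names, and the address 192.0.2.10 (a documentation address) are all assumptions made for illustration. Only the EDU to ISI.EDU delegation comes from the text. Real resolvers talk to name servers over the network; here each zone is an in-memory object.

```python
from dataclasses import dataclass, field


def normalize(name):
    """Lower-case a dotted name and drop any trailing dot; the root is ''."""
    return name.rstrip(".").lower()


def domain_name(labels):
    """Join the labels on the path from a node up to the root."""
    return ".".join(labels)


def is_within(name, origin):
    """True if `name` equals `origin` or lies below it in the tree."""
    name, origin = normalize(name), normalize(origin)
    return origin == "" or name == origin or name.endswith("." + origin)


@dataclass(frozen=True)
class RR:
    """One resource record: owner name, type, class, TTL in seconds, data."""
    name: str
    rtype: str
    rclass: str
    ttl: int
    data: str


@dataclass
class Zone:
    """A contiguous part of the tree, named after its top node."""
    origin: str
    records: list = field(default_factory=list)

    def lookup(self, name, rtype):
        name = normalize(name)
        return [r for r in self.records
                if normalize(r.name) == name and r.rtype == rtype]

    def cut_for(self, name):
        """Owner of an NS record below the origin that covers `name`, or None."""
        for r in self.records:
            owner = normalize(r.name)
            if (r.rtype == "NS" and owner != normalize(self.origin)
                    and is_within(name, owner)):
                return owner
        return None


def resolve(zones, name, rtype):
    """Search downward from the root zone; return (answers, zones visited)."""
    origin, path = "", []
    while True:
        zone = zones[origin]          # hypothetical: zones keyed by origin
        path.append(origin)
        cut = zone.cut_for(name)
        if cut is None:               # no deeper delegation: answer here
            return zone.lookup(name, rtype), path
        origin = cut                  # follow the delegation one step down


# Hypothetical data: root delegates EDU, EDU delegates ISI.EDU.
ZONES = {
    "": Zone("", [RR("edu", "NS", "IN", 86400, "ns.edu-servers.example")]),
    "edu": Zone("edu", [RR("isi.edu", "NS", "IN", 86400, "ns.isi.edu")]),
    "isi.edu": Zone("isi.edu", [RR("host.isi.edu", "A", "IN", 3600, "192.0.2.10")]),
}
```

Calling `resolve(ZONES, "HOST.ISI.EDU.", "A")` walks the root, `edu`, and `isi.edu` zones in turn, following one NS delegation per step, which is why the scheme is most efficient when the database is distributed along the name hierarchy.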

New highlights added March 22, 2024 at 4:38 PM

  • name servers and resolvers (View Highlight)
    • Note: By “resolver,” the authors mean roughly what is now called a recursive resolver (often loosely a “DNS server”), such as 8.8.8.8, which performs the recursive search on behalf of clients. By contrast, name servers publish the authoritative resource records for a zone.
  • Our intent is that cached answers be as good as answers from an authoritative server, excepting changes made within the TTL period. (View Highlight)
    • Note: In practice, modern public DNS servers usually pick up changes much sooner than the TTL would suggest. Large DNS providers also use anycast, though anycast routes queries to a nearby replica rather than announcing changes.
  • Since access to the root and other top level zones is so important, the root domain, together with other top-level domains managed by the SRI-NIC, is supported by seven redundant name servers. (View Highlight)
    • Note: There are now 13 named root servers, each of which has many replicas.
  • The performance of the underlying network was much worse than the original design expected. (View Highlight)
  • Even though the TOPS-20 root servers take less than 100 milliseconds to process the vast majority of queries, clients typically see response times of 500 milliseconds to 5 seconds, even for the closest root server, depending on their location in the Internet. (View Highlight)
    • Note: The principal cause of the rapid improvement in these numbers was the better connectivity and redundancy of the network.
  • any naming system that relies on caching for performance may need caching for negative results as well. Such a mechanism has been added to the DNS as an optional feature, (View Highlight)
  • The use of datagrams as the preferred method for accessing name servers was successful and probably was essential, given the unexpectedly bad performance of the DARPA Internet. (View Highlight)
    • Note: “Datagrams” here refers to UDP. DNS prefers UDP because queries and responses are usually small, so setting up a TCP connection for each exchange would be wasteful. However, a response that does not fit in a 512-byte UDP message is sent with the truncation (TC) flag set, which prompts the client to retry over TCP. This can happen with DNSSEC, whose responses are larger.
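  • While various measures have reduced the vulnerability to error, the security of the present system does depend on the integrity of the network addressing mechanism, and this is questionable in an era of local networks and PCs. (View Highlight)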
    • Note: DNSSEC has been the solution.
  • When the draft DNS specifications were made available in 1983, the one nearly unanimous criticism was that the type and class data specifiers, which were 8 bits in the draft, should be expanded to 16, or even 32 bits, to allow for new definitions. (View Highlight)
    • Note: The final specification widened these fields to 16 bits, and that size is still in place: each record type (MX, NS, A, etc.) maps to an integer code. RFC 2671 introduced extension mechanisms for DNS (EDNS0), later revised in RFC 6891; DNSSEC depends on them.
  • The only problems with caching relate to databases and query strategies that make it less reliable or useful. For example, RRs of the same type at a particular node should have the same TTL so that they will time out simultaneously, but administrators sometimes assign TTLs in the mistaken idea that they are assigning some sort of priority. (View Highlight)
  • Several existing resolvers cache all information in responses without regard to its reasonableness. This has resulted in numerous instances where bad information has circulated and caused problems. (View Highlight)
  • While various measures have reduced the vulnerability to error, the security of the present system does depend on the (View Highlight)
  • Distributing authority for a database does not distribute a corresponding amount of expertise. Maintainers fix things until they work, rather than until they work well, and want to use, not understand, the systems they are provided. (View Highlight)
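
The caching highlights above (TTL-controlled caching, negative caching, and the advice that RRs of the same type at a node share one TTL) can be sketched as a small cache. This is a minimal illustration under stated assumptions, not how any particular resolver works: the class `TTLCache`, its method names, and the injectable `clock` parameter are hypothetical, and storing each record set under a single expiry time is a simplification that matches the shared-TTL advice.

```python
import time


class TTLCache:
    """Toy cache of record sets keyed by (name, type), hypothetical API."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # (name, rtype) -> (expires_at, records or None)

    def put(self, name, rtype, records, ttl):
        """Cache a positive answer; the whole set expires together."""
        key = (name.lower(), rtype)
        self._entries[key] = (self._clock() + ttl, tuple(records))

    def put_negative(self, name, rtype, ttl):
        """Cache the fact that no such data exists (negative caching)."""
        key = (name.lower(), rtype)
        self._entries[key] = (self._clock() + ttl, None)

    def get(self, name, rtype):
        """Return ('hit', records), ('negative', ()) or ('miss', ())."""
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return "miss", ()
        expires_at, records = entry
        if self._clock() >= expires_at:   # TTL elapsed: drop the entry
            del self._entries[key]
            return "miss", ()
        if records is None:
            return "negative", ()
        return "hit", records
```

A resolver built on this would consult `get` before querying a name server and treat a `negative` result as an answer in its own right, which avoids repeating queries for names that do not exist.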