Considerations for Large Authoritative DNS Server Operators

SIDN Labs/TU Delft, Meander 501, 6825 MD Arnhem, Netherlands
+31 26 352 5500, giovane.moura@sidn.nl

USC/Information Sciences Institute, PO Box 382, Davis, CA 95617-0382, United States of America
+1 (530) 404-0099, ietf@hardakers.net

USC/Information Sciences Institute, 4676 Admiralty Way, Marina Del Rey, CA 90292-6695, United States of America
+1 (310) 448-8708, johnh@isi.edu

SIDN Labs, Meander 501, 6825 MD Arnhem, Netherlands
+31 26 352 5500, marco.davids@sidn.nl
Keywords: Routing, DNS, Anycast, Domain Name System, BGP

Recent research work has explored the deployment characteristics and
configuration of the Domain Name System (DNS). This document
summarizes the conclusions from these research efforts and offers
specific, tangible considerations or advice to authoritative DNS
server operators. Authoritative server operators may wish to follow
these considerations to improve their DNS services.

It is possible that the results presented in this document could be
applicable in a wider context than just the DNS protocol,
as some of the results may generically apply to
any stateless/short-duration anycasted service.

This document is not an IETF consensus document: it is published for
informational purposes.

Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This is a contribution to the RFC Series, independently of any
other RFC stream. The RFC Editor has chosen to publish this
document at its discretion and makes no statement about its value
for implementation or deployment. Documents approved for
publication by the RFC Editor are not candidates for any level of
Internet Standard; see Section 2 of RFC 7841.
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
.
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
() in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.
Table of Contents
Introduction
Background
Considerations
  C1: Deploy Anycast in Every Authoritative Server to Enhance Distribution and Latency
    Research Background
    Resulting Considerations
  C2: Optimizing Routing is More Important than Location Count and Diversity
    Research Background
    Resulting Considerations
  C3: Collect Anycast Catchment Maps to Improve Design
    Research Background
    Resulting Considerations
  C4: Employ Two Strategies When under Stress
    Research Background
    Resulting Considerations
  C5: Consider Longer Time-to-Live Values Whenever Possible
    Research Background
    Resulting Considerations
  C6: Consider the Difference in Parent and Children's TTL Values
    Research Background
    Resulting Considerations
Security Considerations
Privacy Considerations
IANA Considerations
References
  Normative References
  Informative References
Acknowledgements
Contributors
Authors' Addresses
Introduction

This document summarizes recent research that explored the
deployed DNS configurations and offers derived, specific, tangible
advice to DNS authoritative server operators (referred to as "DNS operators"
hereafter). The considerations (C1-C6) presented in this document are
backed by peer-reviewed research, which used wide-scale Internet
measurements to draw their conclusions. This document summarizes the
research results and describes the resulting key engineering options.
In each section, readers are pointed to the pertinent publications where
additional details are presented.

These considerations are designed for operators of "large"
authoritative DNS servers, which, in this context, are servers with a significant global user population, like top-level domain (TLD) operators, run by either a single operator or
multiple operators. Typically, these networks are deployed on wide
anycast networks .
These considerations may not be
appropriate for smaller domains, such as those used by an organization
with users in one unicast network or in a single city or region, where
operational goals such as uniform, global low latency are less
required.

It is possible that the results presented in this document could be
applicable in a wider context than just the DNS protocol, as some of
the results may generically apply to any stateless/short-duration
anycasted service. Because the reviewed studies did not measure
smaller networks, however, the wording in this document concentrates
on large-scale DNS authoritative services.

This document is not an IETF consensus document: it is published for
informational purposes.

Background

The DNS has two main types of DNS servers: authoritative servers and
recursive resolvers, shown by a representational deployment model in
. An authoritative server (shown as AT1-AT4 in
) knows the content of a DNS zone and is responsible for
answering queries about that zone. It runs using local (possibly
automatically updated) copies of the zone and does not need to query
other servers in order to answer requests. A recursive
resolver (Re1-Re3) is a server that iteratively queries authoritative
and other servers to answer queries received from client requests
. A client typically employs a software library called a "stub
resolver" ("stub" in ) to issue its query to the upstream
recursive resolvers.

DNS queries issued by a client contribute to a user's perceived
latency and affect the user experience, depending
on how long it takes for responses to be returned. The DNS system has
been subject to repeated Denial-of-Service (DoS) attacks (for example,
in November 2015 ) in order to specifically degrade the user
experience.

To reduce latency and improve resiliency against DoS attacks, the DNS
uses several types of service replication. Replication at the
authoritative server level can be achieved with the following:
the deployment of
multiple servers for the same zone (AT1-AT4 in );
the use of IP anycast
that allows the same IP address to
be announced from multiple locations (each of which is referred to as an
"anycast instance" ); and
the use of load balancers to
support multiple servers inside a single (potentially anycasted)
instance. As a consequence, there are many possible ways an
authoritative DNS provider can engineer its production authoritative
server network with multiple viable choices, and there is not necessarily a single
optimal design.
Considerations

In the next sections, we cover the specific considerations (C1-C6) for
conclusions drawn within academic papers about large authoritative
DNS server operators. These considerations are conclusions reached
from academic work that authoritative server operators may wish to
consider in order to improve their DNS service. Each consideration
offers different improvements that may impact service latency,
routing, anycast deployment, and defensive strategies, for example.

C1: Deploy Anycast in Every Authoritative Server to Enhance Distribution and Latency

Research Background

Authoritative DNS server operators announce their service using NS
records . Different authoritative servers for a given zone
should return the same content; typically, they stay synchronized using
DNS zone transfers (authoritative transfer (AXFR) and incremental zone transfer (IXFR) ), coordinating
the zone data they all return to their clients.

As discussed above, the DNS heavily relies upon replication to support
high reliability, ensure capacity, and reduce latency .
The DNS has two complementary mechanisms for service replication:
name server replication (multiple NS records) and anycast (multiple
physical locations). Name server replication is strongly recommended
for all zones, and IP anycast is used by many
larger zones such as the DNS root , most top-level
domains , and many large commercial enterprises, governments,
and other organizations.

Most DNS operators strive to reduce service latency for users, which
is greatly affected by both of these replication techniques. However,
because operators only have control over their authoritative servers
and not over the client's recursive resolvers, it is difficult to
ensure that recursives will be served by the closest authoritative
server. Server selection is ultimately up to the recursive resolver's
software implementation, and different vendors and even different
releases employ different criteria to choose the authoritative servers
with which to communicate.

Understanding how recursive resolvers choose authoritative servers is
a key step in improving the effectiveness of authoritative server
deployments. To measure and evaluate server deployments,
describes the deployment of seven unicast authoritative name servers in
different global locations and then queried them from more than 9000
Reseaux IP Europeens (RIPE) Atlas probes and their respective
recursive resolvers.

It was found that recursive resolvers in the wild query all
available authoritative servers, regardless of the observed
latency. But the distribution of queries tends to be skewed towards
authoritatives with lower latency: the lower the latency between a
recursive resolver and an authoritative server, the more often the
recursive will send queries to that server. These results were
obtained by aggregating results from all of the vantage points, and
they were not specific to any vendor or version.

The authors believe this behavior is a consequence of combining the
two main criteria employed by resolvers when selecting authoritative
servers: resolvers regularly check all listed authoritative servers in
an NS set to determine which is closest (the least latent), and when one
is not available, they select one of the alternatives.

Resulting Considerations

For an authoritative DNS operator, this result means that the latency
of all authoritative servers (NS records) matters, so they all must be
similarly capable -- all available authoritatives will be queried by
most recursive resolvers. Unicasted services, unfortunately, cannot
deliver good latency worldwide (a unicast authoritative server in
Europe will always have high latency to resolvers in California and
Australia, for example, given its geographical
distance). It is recommended that DNS operators deploy equally
strong IP anycast instances for every authoritative server (i.e., for
each NS record). Each large authoritative DNS server provider should
phase out its usage of unicast and deploy a number of well-engineered
anycast instances with good peering strategies so that it can provide
good latency to its global clients.
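The latency-skewed query distribution described above can be illustrated with a toy simulation. Everything here is hypothetical (server names, RTTs, and the inverse-RTT weighting are illustrative assumptions, not the selection algorithm of any real resolver); the sketch only shows why every NS in a set receives traffic, skewed toward the lowest-latency one:

```python
import random

def select_server(rtts, rng):
    """Pick an authoritative server with probability inversely
    proportional to its RTT -- a simplified model of the
    latency-skewed selection reported for resolvers in the wild."""
    weights = [1.0 / rtt for rtt in rtts.values()]
    return rng.choices(list(rtts), weights=weights, k=1)[0]

# Hypothetical RTTs (ms) from one resolver to four NS records.
rtts = {"ns1": 10, "ns2": 40, "ns3": 80, "ns4": 160}

rng = random.Random(42)
counts = {ns: 0 for ns in rtts}
for _ in range(10_000):
    counts[select_server(rtts, rng)] += 1

# Every server receives some queries, but the share is heavily
# skewed toward the lowest-latency one -- so a single slow unicast
# NS still degrades a fraction of all lookups.
for ns, n in counts.items():
    print(ns, n)
```

Under this model the slowest server still answers a non-trivial share of queries, which is the argument for making every NS an equally strong anycast service.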
As a case study, the ".nl" TLD zone was originally served on seven
authoritative servers with a mixed unicast/anycast setup. In early
2018, .nl moved to a setup with 4 anycast authoritative
servers.
The contribution of this research to DNS service engineering shows that
because unicast cannot deliver good latency worldwide, anycast needs
to be used to provide a low-latency service worldwide.

C2: Optimizing Routing is More Important than Location Count and Diversity

Research Background

When selecting an anycast DNS provider or setting up an anycast
service, choosing the best number of anycast instances to
deploy is a challenging problem. Selecting the right quantity and set of global locations that should send BGP announcements is tricky. Intuitively, one
could naively think that more instances are better and that simply
"more" will always lead to shorter response times.

This is not necessarily true, however. In fact, proper route
engineering can matter more than the total number of locations. To
study the relationship between the number of
anycast instances and the associated service performance, the authors measured the round-trip time (RTT) latency of four DNS root servers. The root DNS servers are implemented by 12 separate
organizations serving the DNS root zone at 13 different IPv4/IPv6
address pairs.

The study measured the performance of
the {c,f,k,l}.root-servers.net (referred to as "C", "F", "K", and "L" hereafter)
servers from more than 7,900 RIPE Atlas probes. RIPE Atlas is an
Internet measurement platform with more than 12,000 global vantage
points called "Atlas probes", and it is used regularly by both
researchers and operators.

The authors found that the C server, a smaller anycast deployment
consisting of only 8 instances, provided very similar overall
performance in comparison to the much larger deployments of K and L,
with 33 and 144 instances, respectively. The median RTTs for the C, K, and L
root servers were all between 30 and 32 ms.

Because RIPE Atlas is known to have better coverage in Europe than
other regions, the authors specifically analyzed the results per
region and per country (Figure 5 in ) and show that
known Atlas bias toward Europe does not change the conclusion that
properly selected anycast locations are more important to latency than
the number of sites.

Resulting Considerations

The important conclusion from this research is that when engineering
anycast services for performance, factors other than just the number
of instances (such as local routing connectivity) must be considered.
Specifically, optimizing routing policies is more important than
simply adding new instances. The authors showed that 12 instances can
provide reasonable latency, assuming they are globally distributed and
have good local interconnectivity. However, additional instances can
still be useful for other reasons, such as when handling
DoS attacks.

C3: Collect Anycast Catchment Maps to Improve Design

Research Background

An anycast DNS service may be deployed from anywhere, from several
locations to hundreds of locations (for example, l.root-servers.net
has over 150 anycast instances at the time this was written). Anycast
leverages Internet routing to distribute incoming queries to a
service's nearest anycast locations, as measured by the number of
routing hops. However, queries are usually not evenly distributed
across all anycast locations, as was found in the case of L-Root when
analyzed using Hedgehog.

Adding locations to or removing locations from a deployed anycast
network changes the load distribution across all of its
locations. When a new location is announced by BGP, each location may
receive more or less traffic than it was engineered for, leading to
suboptimal service performance or even stressing some locations while
leaving others underutilized. Operators constantly face this scenario
when expanding an anycast service. Operators cannot easily or
directly estimate future query distributions based on proposed
anycast network engineering decisions.

To address this need and estimate the query loads of an anycast
service undergoing changes (in particular, expansion), a new
technique was developed enabling operators to carry out active
measurements using an open-source tool called Verfploeter (available
at ). The results allow the creation of detailed anycast
maps and catchment estimates. By running Verfploeter combined with a
published IPv4 "hit list", an operator can precisely calculate which remote
prefixes will be matched to each anycast instance in a network. At
the time of this writing, Verfploeter still does not support IPv6 as
the IPv4 hit lists used are generated via frequent large-scale ICMP
echo scans, which is not possible using IPv6.

As proof of concept, Verfploeter was used to predict both the
catchment and query load distribution for a new anycast instance
deployed for b.root-servers.net. Using two
anycast test instances in Miami (MIA) and Los Angeles (LAX), an ICMP
echo query was sent from an IP anycast address to each IPv4 /24
network routing block on the Internet.The ICMP echo responses were recorded at both sites and analyzed and
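The per-site tally that turns raw probe responses into a catchment estimate can be sketched as follows. The prefixes and site names below are made up, and this is only the aggregation step, not Verfploeter's actual implementation:

```python
from collections import Counter

def catchment_shares(responses):
    """Given (probed_prefix, receiving_site) pairs -- i.e., which
    anycast site captured each ICMP echo response -- return the
    fraction of responding prefixes in each site's catchment."""
    per_site = Counter(site for _prefix, site in responses)
    total = sum(per_site.values())
    return {site: n / total for site, n in per_site.items()}

# Hypothetical responses: /24 prefixes mapped to the anycast site
# that received their echo reply.
responses = [
    ("198.18.0.0/24", "LAX"),
    ("198.18.1.0/24", "LAX"),
    ("198.18.2.0/24", "LAX"),
    ("198.18.3.0/24", "LAX"),
    ("198.18.4.0/24", "MIA"),
]

shares = catchment_shares(responses)
print(shares)
```

Weighting each prefix by its observed query volume (as done with the DITL data in the text) instead of counting prefixes equally turns this catchment map into a load estimate.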
overlaid onto a graphical world map, resulting in an Internet-scale
catchment map. To calculate expected load once the production network
was enabled, the quantity of traffic received by b.root-servers.net's
single site at LAX was recorded based on a single day's traffic
(2017-04-12, "day in the life" (DITL) datasets). It was predicted that
81.6% of the traffic load would remain at the LAX site. This Verfploeter estimate
turned out to be very accurate; the actual measured
traffic volume when production service at MIA was enabled was 81.4%.

Verfploeter can also be used to estimate traffic shifts based on other
BGP route engineering techniques (for example, Autonomous System (AS) path prepending or
BGP community use) in advance of operational deployment. This was
studied using prepending with 1-3 hops at each instance, and
the results were compared against real operational changes to validate the
accuracy of the techniques.

Resulting Considerations

An important operational takeaway from this work is that DNS operators
can make informed engineering choices when changing DNS anycast
network deployments by using Verfploeter in advance. Operators can
identify suboptimal routing situations beforehand, with significantly
better coverage than other active measurement platforms such as RIPE
Atlas can provide. To date, Verfploeter has been deployed
on an operational testbed (anycast testbed) on a large
unnamed operator and is run daily at b.root-servers.net.

Operators should use active measurement techniques like Verfploeter in
advance of potential anycast network changes to accurately measure the
benefits and potential issues ahead of time.

C4: Employ Two Strategies When under Stress

Research Background

DDoS attacks are becoming bigger, cheaper, and more frequent
. The most powerful recorded DDoS attack against DNS
servers to date reached 1.2 Tbps by using Internet of Things (IoT) devices
.
How should a DNS operator engineer its anycast
authoritative DNS server to react to such a DDoS attack?
One study investigates this question using empirical observations grounded with
theoretical option evaluations.

An authoritative DNS server deployed using anycast will have many
server instances distributed over many networks. Ultimately, the
relationship between the DNS provider's network and a client's ISP
will determine which anycast instance will answer queries for a given
client, given that the BGP protocol maps clients to specific
anycast instances using routing information. As a
consequence, when an anycast authoritative server is under attack, the
load that each anycast instance receives is likely to be unevenly
distributed (a function of the source of the attacks); thus, some
instances may be more overloaded than others, which is what was
observed when analyzing the root DNS events of November 2015
. Given the fact that different instances may have
different capacities (bandwidth, CPU, etc.), making a decision about how
to react to stress becomes even more difficult.

In practice, when an anycast instance is overloaded with incoming traffic,
operators have two options:
They can withdraw the instance's routes, prepend its AS path in
announcements to some or
all of its neighbors, perform other traffic-shifting tricks (such as
reducing route announcement propagation using BGP
communities ), or communicate with its upstream
network providers to apply filtering (potentially using FlowSpec or the DDoS Open Threat Signaling (DOTS) protocol ). These techniques shift both legitimate and attack traffic to other anycast instances (with hopefully greater capacity) or block traffic
entirely.
Alternatively, operators can become degraded absorbers by
continuing to operate, knowing that they are dropping incoming legitimate requests
due to queue overflow. However, this approach will also absorb
attack traffic directed toward its catchment, hopefully protecting
the other anycast instances.
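A playbook-style choice between the two options above might be encoded as a simple per-site rule. The thresholds, site numbers, and the decision criterion are illustrative assumptions, not taken from the research:

```python
def choose_strategy(site_load_qps, site_capacity_qps, spare_capacity_qps):
    """Pick a per-site reaction to overload: 'withdraw' shifts the
    site's catchment onto the rest of the service, while 'absorb'
    keeps answering (and dropping) locally to shield other sites.
    spare_capacity_qps is the headroom available at all other sites."""
    overload = site_load_qps - site_capacity_qps
    if overload <= 0:
        return "operate-normally"
    # Withdrawing only helps if the remaining sites can take the
    # displaced catchment; otherwise absorb the attack locally.
    if spare_capacity_qps >= site_load_qps:
        return "withdraw"
    return "absorb"

print(choose_strategy(50_000, 100_000, 300_000))
print(choose_strategy(250_000, 100_000, 300_000))
print(choose_strategy(900_000, 100_000, 300_000))
```

Real playbooks would add hysteresis, per-neighbor route actions, and operator confirmation, but the core trade-off (can the rest of the service carry the displaced load?) is the one described in the text.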
Both of these behaviors were observed in practice in studies of
instance reachability and RTTs during the DNS root events. When
withdrawal strategies were deployed, the stress of increased query
loads was displaced from one instance to multiple
other sites. In other observed events, one site was left to absorb
the brunt of an attack, leaving the other sites to remain relatively
less affected.

Resulting Considerations

Operators should consider having both an anycast site withdrawal strategy
and an absorption strategy ready to be used before a network overload
occurs. Operators should be able to deploy one or both of these
strategies rapidly. Ideally, these should be encoded into operating
playbooks with defined site measurement guidelines for which strategy
to employ based on measured data from past events.

The research speculates that careful, explicit, and automated
management policies may provide stronger defenses to overload
events. DNS operators should be ready to employ both common
filtering approaches and other routing load-balancing techniques
(such as withdrawing routes, prepending Autonomous Systems (ASes), adding communities, or isolating instances),
where the best choice depends on the specifics of the attack.

Note that this consideration refers to the operation of just one
anycast service point, i.e., just one anycasted IP address block
covering one NS record. However, DNS zones with multiple authoritative
anycast servers may also expect loads to shift from one anycasted
server to another, as resolvers switch from one authoritative service
point to another when attempting to resolve a name.

C5: Consider Longer Time-to-Live Values Whenever Possible

Research Background

Caching is the cornerstone of good DNS performance and reliability. A
50 ms response to a new DNS query may be considered fast, but a response of less
than 1 ms to a cached entry is far faster. It has also been shown
that caching protects users from short outages and even
significant DDoS attacks.

Time-to-live (TTL) values for DNS records directly
control cache durations and affect latency, resilience, and the role
of DNS in Content Delivery Network (CDN) server selection. Some early work modeled caches as a
function of their TTLs , and recent work has examined cache
interactions with DNS, but until recently, no research
had provided considerations about the benefits of various TTL value
choices. To study this, Moura et al. carried out a
measurement study investigating TTL choices and their impact on user
experiences in the wild. They performed this study independent of
specific resolvers (and their caching architectures), vendors, or
setups.

First, they identified several reasons why operators and zone owners may
want to choose longer or shorter TTLs:
Longer TTLs, as discussed, lead to a longer cache life, resulting
in faster responses. This was measured in the wild, and it
showed that by increasing the TTL for the .uy TLD from 5 minutes
(300 s) to 1 day (86,400 s), the latency measured from 15,000 Atlas
vantage points changed significantly: the median RTT decreased
from 28.7 ms to 8 ms, and the 75th percentile decreased from 183 ms to 21 ms.
Longer caching times also result in lower DNS traffic:
authoritative servers will experience less traffic with extended
TTLs, as repeated queries are answered by resolver caches.
Longer caching consequently results in a lower overall cost if
the DNS is metered: some providers that offer DNS as a Service charge a per-query
(metered) cost (often in addition to a fixed monthly cost).
Longer caching is more robust to DDoS attacks on DNS
infrastructure. DNS caching was also measured, and it was shown
that the effects of a DDoS on DNS can be greatly reduced, provided
that the caches last longer than the attack.
Shorter caching, however, supports deployments that may require
rapid operational changes: an easy way to transition from an old
server to a new one is to simply change the DNS records. Since
there is no method to remotely remove cached DNS records, the TTL
duration represents a necessary transition delay to fully shift
from one server to another. Thus, low TTLs allow for more rapid
transitions. However, when deployments are planned in advance
(that is, longer than the TTL), it is possible to lower the TTLs
just before a major operational change and raise them again
afterward.
Shorter caching can also help with a DNS-based response to DDoS
attacks. Specifically, some DDoS-scrubbing services use the DNS to
redirect traffic during an attack. Since DDoS attacks arrive
unannounced, DNS-based traffic redirection requires that the TTL be
kept quite low at all times to allow operators to suddenly have
their zone served by a DDoS-scrubbing service.
Shorter caching helps DNS-based load balancing. Many large
services are known to rotate traffic among their servers using
DNS-based load balancing. Each arriving DNS request provides an
opportunity to adjust the service load by rotating IP address records
(A and AAAA) to the lowest unused server. Shorter TTLs may be
desired in these architectures to react more quickly to traffic
dynamics. Many recursive resolvers, however, have minimum caching
times of tens of seconds, placing a limit on this form of agility.
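The traffic-reduction effect of longer TTLs in the list above can be approximated with a very simple model: one resolver cache sends the authoritative server at most one query per TTL window for a given name. This is a deliberate simplification (it ignores cache evictions, resolver-imposed TTL caps, and multiple cached names); the numbers below are illustrative only:

```python
def authoritative_qps(client_qps, ttl_seconds):
    """Approximate queries/second reaching the authoritative server
    from one resolver cache for one name: at most one cache miss per
    TTL window, and never more than the client rate itself."""
    return min(client_qps, 1.0 / ttl_seconds)

# One resolver forwarding 10 client queries/second for a single name,
# under TTLs of 5 minutes, 1 hour, and 1 day:
for ttl in (300, 3_600, 86_400):
    print(ttl, authoritative_qps(10.0, ttl))
```

Even this crude model shows the shape of the trade-off: moving from a 5-minute to a 1-day TTL cuts this resolver's authoritative traffic for the name by two orders of magnitude.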
Resulting Considerations

Given these considerations, the proper choice for a TTL depends in
part on multiple external factors -- no single recommendation is
appropriate for all scenarios. Organizations must weigh these
trade-offs and find a good balance for their situation. Still, some
guidelines can be reached when choosing TTLs:
For general DNS zone owners, a longer TTL of at least one hour (and
ideally 4, 8, or 24 hours) is recommended. Assuming planned
maintenance can be scheduled at least a day in advance, long TTLs
have little cost and may even provide cost savings.
For TLD and other public registration
operators (for example, most ccTLDs and .com, .net, and .org) that host
many delegations (NS records, DS records, and "glue" records),
it has been demonstrated that most resolvers will use the TTL
values provided by the child delegations while some others
will choose the TTL provided by the parent's copy of the
record. As such, longer TTLs (at least an hour) are recommended
for registry operators as well, for child NS and
other records.
Users of DNS-based load balancing or DDoS-prevention services may
require shorter TTLs: TTLs may even need to be as short as 5
minutes, although 15 minutes may provide sufficient agility for
many operators. There is always a tussle between using shorter TTLs
that provide more agility and using longer TTLs that include all the benefits listed above.
Regarding the use of A/AAAA and NS records, the TTLs for A/AAAA records should
be shorter than or equal to the TTL for the corresponding NS records
for in-bailiwick authoritative DNS servers, since it was found
that once an NS record expires, its associated A/AAAA records will
also be requeried when glue is required to be sent by the
parents. For out-of-bailiwick servers, A, AAAA, and NS records are
usually all cached independently, so different TTLs can be used
effectively if desired. In either case, short A and AAAA records
may still be desired if DDoS mitigation services are required.
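One piece of the guidance above (lower the TTL ahead of a planned change, then raise it afterward) reduces to simple schedule arithmetic: the long TTL must have expired from all caches before the cutover. A sketch of that calculation, with all times and the function name being illustrative assumptions:

```python
from datetime import datetime, timedelta

def migration_schedule(cutover, old_ttl_s, low_ttl_s):
    """Return when to lower the TTL and when records can safely
    change: the TTL must be lowered at least old_ttl_s before the
    cutover so no cache still holds the long-TTL copy."""
    lower_at = cutover - timedelta(seconds=old_ttl_s)
    return {
        "lower_ttl_at": lower_at,
        # All caches hold the low-TTL copy once the old TTL elapses.
        "safe_to_cut_over": lower_at + timedelta(seconds=old_ttl_s),
        # Raise the TTL again once the new records have propagated.
        "raise_ttl_after": cutover + timedelta(seconds=low_ttl_s),
    }

cutover = datetime(2024, 6, 1, 12, 0)
plan = migration_schedule(cutover, old_ttl_s=86_400, low_ttl_s=300)
for step, when in plan.items():
    print(step, when)
```

With a one-day TTL, the lowering step must happen a full day before the cutover, which is why planned maintenance windows make long TTLs cheap to operate.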
C6: Consider the Difference in Parent and Children's TTL Values

Research Background

Multiple record types exist in, or are related between, the parent of a
zone and the child. At a minimum, NS records are supposed to be
identical in the parent (but often are not), as are corresponding IP
addresses in "glue" A/AAAA records that must exist for in-bailiwick
authoritative servers. Additionally, if DNSSEC is deployed for a zone, the
parent's DS record must cryptographically refer to a child's DNSKEY
record.

Because some information exists in both the parent and a child, it is
possible for the TTL values to differ between the parent's copy and
the child's. One study examines resolver behaviors when these
values differed in the wild, as they frequently do -- often, parent zones
have de facto TTL values that a child has no control over. For
example, NS records for TLDs in the root zone are all set to 2 days
(48 hours), but some TLDs have lower values within their published
records (the TTL for .cl's NS records from their authoritative
servers is 1 hour). The study also examines the differences in the
TTLs between the NS records and the corresponding A/AAAA records for
the addresses of a name server. RIPE Atlas nodes are used to determine
what resolvers in the wild do with different information and whether
the parent's TTL is used for cache lifetimes ("parent-centric") or
the child's ("child-centric").

It was found that roughly 90% of resolvers follow the child's
view of the TTL, while 10% appear parent-centric. Additionally, it
found that resolvers behave differently for cache lifetimes for
in-bailiwick vs. out-of-bailiwick NS/A/AAAA TTL combinations.
Specifically, when NS TTLs are shorter than the corresponding address
records, most resolvers will requery for A/AAAA records for the
in-bailiwick resolvers and switch to new address records even if the
cache indicates the original A/AAAA records could be kept longer. On
the other hand, the inverse is true for out-of-bailiwick resolvers: if
the NS record expires first, resolvers will honor the original cache
time of the name server's address.

Resulting Considerations

The important conclusion from this study is that operators cannot
depend on their published TTL values alone -- the parent's values are
also used for timing cache entries in the wild. Operators that are
planning on infrastructure changes should assume that an older
infrastructure must be left on and operational for at least the
maximum of both the parent's and child's TTLs.

Security Considerations

This document discusses applying measured research results to
operational deployments. Most of the considerations affect
operational practice, though a few do have security-related impacts.

Specifically, C4 discusses a couple of strategies to employ when a
service is under stress from DDoS attacks and offers operators
additional guidance when handling excess traffic.

Similarly, C5 identifies the trade-offs with respect to the
operational and security benefits of using longer TTL values.

Privacy Considerations

This document does not add any new, practical privacy issues, aside
from possible benefits in deploying longer TTLs as suggested in C5.
Longer TTLs may help preserve a user's privacy by reducing the number
of requests that get transmitted in both client-to-resolver and
resolver-to-authoritative cases.

IANA Considerations

This document has no IANA actions.
References

Normative References

   "Domain names - concepts and facilities"
   "Domain names - implementation and specification"
   "Host Anycasting Service"
   "Incremental Zone Transfer in DNS"
   "BGP Communities Attribute"
   "Clarifications to the DNS Specification"
   "Operation of Anycast Services"
   "DNS Zone Transfer Protocol (AXFR)"
   "Architectural Considerations of IP Anycast"
   "DNS Terminology"
   "Distributed Denial-of-Service Open Threat Signaling (DOTS) Data
   Channel Specification"
   "Dissemination of Flow Specification Rules"
   "Distributed Denial-of-Service Open Threat Signaling (DOTS) Signal
   Channel Specification"

Informative References

   "Best Practices in DNS Service-Provision Architecture"
   "Anycasting f.root-servers.net"
   "Tangled Anycast Testbed" (Tangled)
   "2017 DITL Data", DNS-OARC
   "hedgehog", commit b136eb0
   "Modeling TTL-based Internet Caches", 2003 IEEE INFOCOM
   "Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event",
   ACM 2016 Internet Measurement Conference
   "When the Dike Breaks: Dissecting DNS Defenses During DDoS",
   ACM 2018 Internet Measurement Conference
   "Cache Me If You Can: Effects of DNS Time-to-Live",
   ACM 2019 Internet Measurement Conference
   "Recursives in the Wild: Engineering Authoritative DNS Servers",
   ACM 2017 Internet Measurement Conference
   "Hackers Used New Weapons to Disrupt Major Websites Across U.S."
   "DNS Security Introduction and Requirements"
   "Resource Records for the DNS Security Extensions"
   "Protocol Modifications for the DNS Security Extensions"
   "Use of SHA-256 in DNSSEC Delegation Signer (DS) Resource Records
   (RRs)"
   "DDoS Open Threat Signaling (DOTS) Architecture"
   "RIPE Atlas: A Global Internet Measurement Network",
   RIPE Network Coordination Centre (RIPE NCC)
   "RIPE Atlas", RIPE Network Coordination Centre (RIPE NCC)
   "Anycast Latency: How Many Sites Are Enough?",
   PAM 2017 Passive and Active Measurement Conference
   "The Internet at the Speed of Light",
   13th ACM Workshop on Hot Topics in Networks
   "Verfploeter Source Code", commit f4792dc
   "Broad and Load-Aware Anycast Mapping with Verfploeter",
   ACM 2017 Internet Measurement Conference

Acknowledgements

We would like to thank the reviewers of this document who offered
valuable suggestions as well as comments at the IETF DNSOP
session (IETF 104): , , , , , , , , and .

Additionally, we would like to thank those acknowledged in the papers
this document summarizes for helping produce the results: RIPE NCC and
DNS OARC for their tools and datasets used in this research, as well
as the funding agencies sponsoring the individual research.

Contributors

This document is a summary of the main considerations of six research
papers written by the authors and the following people, who contributed
substantially to the content and should be considered coauthors; this
document would not have been possible without their hard work:
Authors' Addresses

   SIDN Labs/TU Delft
   Meander 501
   6825 MD Arnhem
   Netherlands
   Phone: +31 26 352 5500
   Email: giovane.moura@sidn.nl

   USC/Information Sciences Institute
   PO Box 382
   Davis, CA 95617-0382
   United States of America
   Phone: +1 (530) 404-0099
   Email: ietf@hardakers.net

   USC/Information Sciences Institute
   4676 Admiralty Way
   Marina Del Rey, CA 90292-6695
   United States of America
   Phone: +1 (310) 448-8708
   Email: johnh@isi.edu

   SIDN Labs
   Meander 501
   6825 MD Arnhem
   Netherlands
   Phone: +31 26 352 5500
   Email: marco.davids@sidn.nl