2.4 The Domain Name System (DNS)

Captions
In this section we'll cover the DNS, the Domain Name System: the part of the Internet that's responsible for translating hostnames like gaia.cs.umass.edu to IP addresses like 128.119.40.186. We'll see that the DNS is this amazing distributed application that's able to perform its services at massive scale and with pretty amazing performance.

Now, we're at the application layer, and so you may be asking yourself, "Well, hey, why are we studying a core Internet function like name translation up at the application layer?" The answer is that the DNS is an application-layer protocol and service. It's built on top of, and uses the services of, TCP and UDP, so it really is an application-level service.

Well, we've got a lot to cover here. We're going to start off by talking about the structure and the functioning of the DNS. We'll look at the process by which queries are resolved, and then we'll take a look at DNS records and also the message format of the DNS protocol. So let's get started.

Well, as the name Domain Name System would imply, the DNS is all about names and identifiers. As a person, I have many identifiers associated with me: I have my name, my Social Security number, my passport ID, and my UMass employee ID. And we've already seen that Internet hosts also have at least two identifiers associated with them: they've got a name, like gaia.cs.umass.edu, and they've got an IP address, like 128.119.40.186. As we'll see, the role of the DNS is to translate between the names of hosts and services and IP addresses.

The Domain Name System, the DNS, is a distributed database. The contents of this database are the records containing information about the translation among hostnames, services, and IP addresses. The DNS itself has a hierarchy of servers spread around the Internet, servers that communicate with each other to provide this name translation service. And it's important to note
that the DNS is implemented as an application-layer service. It's implemented by servers that sit at the network edge, rather than at routers and switches inside the network. This reflects an Internet design philosophy of keeping the network core simple and putting complexity at the network's edge. We'll see this design philosophy coming up again and again as we dive down into the transport layer and then the network layer.

The DNS provides a number of different functions. The one we've heard about the most is the hostname-to-IP-address translation service, but it provides a number of other important services as well. It provides an aliasing function, that is, translating from externally facing names like mail.cs.umass.edu to some internal hostname that's much more complicated than that. It also provides service resolution, for example returning the IP address of a mail server associated with a domain. And finally, it performs load balancing: there may be a number of IP addresses that are able to perform a requested service, a web server for example, and the DNS will rotate among those possible IP addresses, returning one of them as the primary and thus performing a kind of load-balancing function.

Given that the DNS takes a distributed, decentralized approach, you might ask yourself why the designers of the DNS didn't take a more centralized approach. There are several considerations here. A centralized approach represents a single point of failure, and we've seen that the DNS is critical infrastructure. Given the loads on the DNS, a centralized approach would create a tremendous concentration of traffic. And given how important performance is (remember, milliseconds count when resolving a DNS query), placing it at one location would necessarily mean long RTT delays to some places on the planet. This is all to say that a centralized approach simply doesn't scale. With trillions of queries a day (Akamai alone handles more than a trillion DNS requests a day), a single
centralized service doesn't have the computational capabilities, the resiliency, or the performance that one can get with a decentralized approach.

Well, before diving into the technical details of the DNS, let's summarize how to even think about the DNS. You can think about it as a highly distributed, high-scale, high-performance database. That's a tough problem, but at least, as we'll see, the records are relatively simple. You want to think about it in terms of performance and scale: it needs to be able to handle literally trillions of requests, mostly reads, that come in every day, and it has to do so with really high performance. Milliseconds are going to count. And then organizationally it's also highly decentralized: there are literally hundreds of thousands of organizations that are responsible for their pieces, their records, within this distributed database. Is this an easy problem? Not really.

Well, we said that the DNS is a distributed, hierarchical database, so let's take a high-level look at this hierarchy. At the root of this tree we have the root DNS servers. At the next layer we have the DNS servers that are responsible for all of the .com, .edu, or .net domain names; these are known as the top-level domain, or TLD, servers. And then we have the authoritative name servers. These are the servers that have the ultimate responsibility for resolving names within their domain, for example for all of umass.edu, all of nyu.edu, or all of pbs.org.

Now, if a client wants to resolve a name, say www.amazon.com, here's the basic approach. The client could first contact a root DNS server to get the name of the TLD server for all of the .coms. The client then contacts the TLD server to get the name of the authoritative name server for amazon.com. And then finally the client contacts the authoritative name server for amazon.com to get the IP address of www.amazon.com.

Well, since we like to study things top down, let's start at the top of the DNS
hierarchy, and that's with the root servers. The root servers are a place to go when a server is not actually able to resolve a name. You can think of it almost like a contact of last resort, although in fact it's not really a contact of last resort, because it's not going to actually provide that translation service; it's a place to go to get translation started. Well, this is obviously an incredibly important function for the Internet, almost like the central nervous system of the Internet, and as such security is going to be very important. Now, the root servers and much of the infrastructure associated with them are the responsibility of ICANN, the Internet Corporation for Assigned Names and Numbers. There are 13 logical root servers around the world, but each of these logical root servers is itself actually replicated, so corresponding to these 13 logical servers are actually close to a thousand physical servers around the world. There are more than 200 physical root servers in the US.

Moving down a level from the root, we find the TLD, the top-level domain, servers. Each of the servers in the top-level domain is responsible for resolving names that have one of the endings like .com, .edu, .net, or .org. The associations that are responsible for managing these TLD domains are known as Internet registries. These Internet registries are also the place that you'd go if you want to register a new .com, .edu, or .net name.

The authoritative name servers are responsible for resolving names within an organization. Such servers are authoritative in the sense that, as the saying goes, the buck stops here: this is the DNS server that has authority over the organization's names, and what this server says goes.

And lastly, there are the local DNS servers. Every host on the Internet has an associated local DNS server, and this is the name server that a host is going to contact when it wants to resolve a name. The local DNS name server is then going to respond immediately to the requesting host if it
has that name-to-address translation pair cached locally; otherwise, it's going to start the resolution process. If you're interested in finding out what your local DNS server is, you can type one of these two commands into your computer: under macOS, type scutil --dns, or under Windows, ipconfig /all. Each of these commands displays your local DNS server.

Now let's take a look at an example of DNS name resolution in action. Suppose the requesting host is at engineering.nyu.edu, and it's going to make a request to resolve the name gaia.cs.umass.edu. Here's how this unfolds. The host at engineering.nyu.edu first sends a DNS query message to the local NYU DNS server, let's say dns.nyu.edu. The query message contains the hostname to be translated, gaia.cs.umass.edu. Now it's the job of this local NYU DNS server to resolve this name. It begins by forwarding a query message to a root DNS server. The root DNS server takes note of the edu suffix and returns to the local DNS server a list of IP addresses of TLD servers, top-level domain servers, responsible for .edu. The local NYU DNS server then resends the query message to one of these TLD servers. The TLD server takes note of the umass.edu suffix and responds with the IP address of the authoritative DNS server for the University of Massachusetts, dns.umass.edu. Finally, NYU's local DNS server resends the query message again, to dns.umass.edu, and UMass's authoritative name server responds with the IP address of gaia.cs.umass.edu.

In order to get the mapping for one hostname, eight DNS messages were sent: four query messages and four reply messages, counting the host's original query and the local DNS server's final reply to the host. We'll see soon how DNS caching can reduce this query traffic.

The type of querying that we've seen in this example is known as an iterated query, because the local DNS server at NYU is iteratively querying a sequence of servers until the gaia.cs.umass.edu name is finally resolved. A second form of query resolution is known as recursive query resolution. In recursive query
resolution, rather than responding to a request with an "I don't know, but here's who to try next" type of response that we saw with iterative queries, the name server takes it upon itself to resolve the query and return a definitive reply. In this example here, which shows a cascade of recursive queries, the local DNS server at NYU again queries the root server. In this recursive case, however, the root server queries the TLD server, which queries the UMass authoritative name server, which replies to the TLD server, which replies to the root server, which replies to the local DNS server at NYU, which replies to the querying host. Because this form of recursive querying puts the burden on the servers at the upper levels of the hierarchy, it's not often used in practice, and instead iterative querying is adopted by the local DNS server.

Now, we've seen that a lot of work can be involved in obtaining the DNS record for a name-to-address translation pair, so it would be great to somehow leverage that work and remember, that is to say cache, that record locally for some time in case another request comes in for this same record. So once a DNS server learns a mapping, it's going to cache that mapping for some amount of time. If a future request comes in for that mapping, it can immediately return the cached reply in response to the query. So we see that caching improves response time and it takes load off the DNS infrastructure: a double win.

Cached entries will eventually time out and disappear from the cache after some amount of time, the time to live (TTL). Note, however, that if a DNS record changes, the cached entries will then be out of date. The DNS doesn't worry about stale, out-of-date cached entries; they'll time out eventually, even if in the meantime there's a bit of inaccurate information floating around. For example, if a named host changes its IP address, that change won't be known Internet-wide until all of the TTLs expire. What's gained here, however, in this, well, best-effort approach to name
to address translation is that there's no need for costly and complicated mechanisms to locate and purge out-of-date information from caches.

Well, that's all we want to say about the structure and the function of the DNS, but there are still two more things we want to take a look at: the resource records that are inside the DNS, and what DNS protocol messages look like.

DNS database records are four-tuples with name, value, type, and TTL (time to live) fields. There are a number of different types of DNS records, but here are four popular ones. When the type is A, that is, an address record, the record contains a hostname and its IP address, and this record is used for name-to-address translation. When the type is NS, a name server record, the name is a domain name like umass.edu and the value is the hostname of the authoritative name server for that domain. A CNAME record is used for name aliasing, and an MX record is used to give the name of a mail server associated with the domain.

Let's next take a look at the DNS protocol message formats. Both the query and the reply message have the same format, as shown here on the right. Remember that the DNS is a query-response protocol. The ID field here is a 16-bit number chosen by the querier. When a response is sent in reply to a query, that response takes its ID value to be the same as that of the query, to indicate that this is a response to that particular query. The flags field is used to indicate whether this is a query message or a reply message, whether recursion is being requested if it's a query, and, if it's a reply message, whether the reply is authoritative. The next four fields are used to indicate the number of questions and responses in the remainder of the protocol message. In the case of a query, a question, say, to resolve a hostname to an IP address, the hostname would go in this field here. In the case of a reply to such a query, a resource record of type A, remember, containing the name
and the IP address of a host, would be inserted in this field here. RFC 1035 defines all of these fields, and resource records as well.

To help put together the pieces of some of what we've learned about the DNS, suppose now that you create a company. It's called Network Utopia, it's got a network, and you want an Internet presence: you want your company's services to be reachable by others on the Internet at your site, networkutopia.com. What do you need to do? Well, clearly the DNS is going to be involved, since in order for users to reach your network they'll need the IP addresses of the servers in your network, and even if the name networkutopia.com becomes fabulously famous, no one's going to know the IP addresses of your servers. So of course you'll need to use the DNS for that.

First, you register your name, networkutopia.com, with a DNS registrar like Network Solutions, a company that's a registrar of the kind we mentioned earlier. You'll need a set of IP addresses also; we'll discuss in Chapter 4 how you get those, so let's assume for now that you've got a range of IP addresses for your servers. You then need to give the name and address of your authoritative name server to the registrar. The registrar will insert your name server's name in an NS record, and its IP address in an A record, into the global DNS database. That's all that needs to be done with the registrar. The addresses of all the other servers in your network will be provided by your authoritative name server to queriers who know the hostnames for those services. And lastly, you need to bring up your authoritative name server and populate it with resource records for the servers in your network.

Let's wrap up here with just a quick word about DNS security. Now that you understand what the DNS does, you can see how absolutely critical it is to the functioning of the Internet. If the DNS stopped working, it'd be impossible to contact any host unless you knew its IP address, which means practically never. And so it's critical
the DNS be protected. The DNS is protected against denial-of-service attacks primarily by firewalls. The DNS also needs to ensure that records that are entered into the database are from authorized sources, and so authentication services will play a critical role in protecting the DNS. We'll look at authentication services when we get to Chapter 7.

So that wraps up our discussion of the DNS, an absolutely critical network function that has to work at amazing scale and with amazing performance. We took a look at several things here: we talked about the function and the structure of the DNS, we talked about how names are actually resolved, and we took a look at the resource records inside the DNS database as well as the DNS protocol message formats. Coming up next, we're going to take a look at another highly decentralized, high-performance distributed application: video streaming.
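As a companion to the resolution example in the captions, here is a minimal Python sketch of an iterative lookup with TTL-based caching. It is a toy simulation, not a real resolver: nothing touches the network, the lookup tables stand in for the root, TLD, and authoritative servers, and the names `LocalResolver`, `ROOT`, `TLD`, `AUTH`, and `edu-tld-server` are hypothetical. Only gaia.cs.umass.edu, dns.umass.edu, and 128.119.40.186 come from the lecture.

```python
import time

# Toy referral tables standing in for real servers (hypothetical data).
ROOT = {"edu": "edu-tld-server"}                          # root: TLD suffix -> TLD server
TLD = {"edu-tld-server": {"umass.edu": "dns.umass.edu"}}  # TLD: domain -> authoritative server
AUTH = {"dns.umass.edu": {"gaia.cs.umass.edu": "128.119.40.186"}}  # A records


class LocalResolver:
    """Simulated local DNS server doing iterative resolution with a TTL cache."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl        # seconds a learned mapping stays cached
        self.clock = clock    # injectable clock so expiry can be tested
        self.cache = {}       # hostname -> (ip, expiry_time)
        self.messages = 0     # running count of query and reply messages

    def _ask(self, table, key):
        # One query to another server plus one reply back.
        self.messages += 2
        return table[key]

    def resolve(self, hostname):
        # The host's query to this server and this server's reply to the host.
        self.messages += 2
        now = self.clock()
        hit = self.cache.get(hostname)
        if hit is not None and hit[1] > now:
            return hit[0]     # unexpired cached answer: reply immediately
        labels = hostname.split(".")
        tld_server = self._ask(ROOT, labels[-1])                          # ask root
        auth_server = self._ask(TLD[tld_server], ".".join(labels[-2:]))   # ask TLD
        ip = self._ask(AUTH[auth_server], hostname)                       # ask authoritative
        self.cache[hostname] = (ip, now + self.ttl)
        return ip


if __name__ == "__main__":
    resolver = LocalResolver(ttl=60.0)
    print(resolver.resolve("gaia.cs.umass.edu"), resolver.messages)
    print(resolver.resolve("gaia.cs.umass.edu"), resolver.messages)
```

In this sketch the first lookup costs eight messages, matching the count in the lecture, while a repeat lookup inside the TTL costs only the two messages between the host and its local server. A stale entry is never purged early; it is simply ignored once its TTL has passed, which mirrors the best-effort behavior described above.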
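The message format described in the captions can also be sketched in a few lines of Python. The layout below follows RFC 1035 (a 12-byte header holding the ID, the flags, and four 16-bit counts, followed by a question section), but the helper names `encode_name`, `build_query`, and `parse_header` are hypothetical, and the sketch only builds a type A query and reads back a header; it does not parse answers or send anything.

```python
import struct

QR_REPLY = 0x8000  # flags bit: set in replies, clear in queries
AA_AUTH = 0x0400   # flags bit: reply is authoritative
RD_WANT = 0x0100   # flags bit: recursion desired


def encode_name(hostname):
    """Encode a hostname as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(hostname, query_id, recursion_desired=True):
    """Build a query message with one question of type A, class IN."""
    flags = RD_WANT if recursion_desired else 0
    # Header: ID, flags, then the question, answer, authority, additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(hostname) + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def parse_header(message):
    """Decode the fixed 12-byte header shared by queries and replies."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR_REPLY),
        "authoritative": bool(flags & AA_AUTH),
        "recursion_desired": bool(flags & RD_WANT),
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }


if __name__ == "__main__":
    msg = build_query("gaia.cs.umass.edu", query_id=0x1234)
    print(len(msg), parse_header(msg))
```

A reply would carry the same ID as the query it answers, with the reply bit set and a nonzero answer count; that matching ID is how a querier pairs responses with outstanding queries.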
Info
Channel: JimKurose
Views: 82,522
Id: 6lRcMh5Yphg
Length: 19min 8sec (1148 seconds)
Published: Sat Jan 15 2022