Hey everyone, in this video I want to explore how DNS works: what's happening behind the scenes when we type in that URL. So say we just typed in www.youtube.com. There is actually a trailing dot at the end, a period that we don't have to type because it's added for us. We type in that name and magically we can go and browse the videos, including this one. But the name isn't really where that content is; it's served up through an IP address. Everything has an IP address, and there may be multiple IP addresses serving up the content. So what takes that friendly name and turns it into an address? As humans we're better with names. It's the same for people's phone numbers: we have an address book, we type in the name, and it calls the number. Well, same with the Internet. There's a nice 32-bit IPv4 address, but we're terrible at remembering those. There are other problems too. Addresses could change, and a name might be hosted by multiple things. We want something that abstracts away the actual IP address, and the magic that does this is DNS, the Domain Name System. It enables us to map a name to an IP address. It works the other way as well: we can give it an IP address, and there are reverse zones (a zone is something that holds some portion of the namespace) that can give us a name from an IP address. Now, there are also things we may have locally on our machines, hosts files that map a name to particular IPs. I'm not going to talk about that. I'm going to focus on the core DNS servers. And we have to realize there's not just one server or one copy of a database that holds every single record in the world. That would be impossible; there are so many different records. Instead, there's a hierarchy of servers that together can answer any query about any name.
But we have to take a step back and look at this name for a second and realize what we're actually looking at. The first thing is, at the very top level, there's this root that we don't have to type in, but it's that period at the end of the name. What's more useful is this idea of com. This portion here, com, is the top-level domain name, the TLD, and there are many of these: com, net, edu. There are sponsored ones like gov and mil. There are country-specific ones like uk and us. And then there are specific servers that are authoritative, i.e. they hold the records and can give an answer, for the various top-level domain names. Then there's, in this case, the youtube component. This is the second-level name. Often we put these two pieces together (and there could be multiple other pieces as well) and think of that as the domain name. So if I said youtube.com, that's made up of the second-level name and the top-level domain name, but we think of it as the domain name. Now this www part, well, this is a subdomain. It's totally possible there are other subdomains, especially internally in organisations. We might see a blog.youtube.com or a studio.youtube.com. In my company I can have many different child subdomains of a domain name. But in this case this is the end, so we might also think of it as a host name: the final service we're trying to get to that lives within this domain name. If we put all of these pieces together, we get the FQDN, the fully qualified domain name. All of that is really just to identify the fact that there are different components, a hierarchy, that make up that name. Now, sometimes I don't have to type www.youtube.com; I can just type in youtube.com. In this case, youtube.com is the domain apex, i.e. the root of the domain.
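To make the pieces of the name concrete, here is a minimal Python sketch that splits a name into the components just described. The function name `describe_fqdn` is made up for illustration, and it naively assumes the last label is the TLD and the one before it is the second-level name, so it would not handle names like natwest.co.uk where the registered domain has three labels.

```python
def describe_fqdn(name: str) -> dict:
    """Split a domain name into the parts named in the talk (naive sketch)."""
    # A trailing dot marks the root; strip it so it does not yield an empty label.
    labels = name.rstrip(".").lower().split(".")
    if len(labels) < 2:
        raise ValueError("expected at least a second-level name and a TLD")
    tld = labels[-1]
    second_level = labels[-2]
    return {
        "root": ".",
        "tld": tld,
        "second_level": second_level,
        "domain": f"{second_level}.{tld}",
        # Anything to the left of the domain is the subdomain / host part.
        # It is empty when the name is the domain apex.
        "subdomain": ".".join(labels[:-2]),
        # The fully qualified form always ends with the root dot.
        "fqdn": ".".join(labels) + ".",
    }


if __name__ == "__main__":
    for part, value in describe_fqdn("www.youtube.com").items():
        print(f"{part:>12}: {value!r}")
```

Calling it with youtube.com (with or without the trailing dot) gives an empty subdomain, which is the domain apex case.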
And I, as someone hosting a website, can make that work so that people don't have to type www. I can create certain records at that domain apex, the root of my domain, that send requests to certain content. And once again, I don't typically have to type this trailing dot. The trailing dot is really just saying, hey, this is the end, there's nothing coming after this. It's implicit; it just gets added for you. Sometimes it can be a factor. Internally on our network, I could type nslookup with just a host name part and it would work, because our network configuration has a default domain name that automatically gets added on to anything I type in. If I typed in a dot at the end, that wouldn't happen. I'm saying no, no, this is the end of it, and it would fail. So when I actually go and do a query to DNS, this dot at the end is implicitly being added. This structure, these components, are really important for understanding what's actually happening when we use DNS. So let's try and use DNS. Think about my machine: I want to go and look up youtube.com, so how is that going to function? I have my box down here. It's a very advanced, fantastic PC, and I am the client in this scenario. On my machine I have configured DNS servers. Now, these could be many different things. They could be something internal. In my company especially, if I've got Active Directory, my DNS servers would probably be my domain controllers. They are authoritative: they host the zone files, the records, for whatever my AD domain is. And then they would also act as a recursive resolver: they will go and recursively look up and find answers for things they're not authoritative for. But I have some DNS server (it could be multiple) that has been configured, so that when I'm trying to find something, this is who I go to for the answer. So great, I have a DNS server configured.
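The default-domain behaviour mentioned above (a bare host name gets the network's domain appended, while a trailing dot says "this is the end") can be sketched roughly like this. It is a simplification: real resolvers have more elaborate suffix search rules, and the domain corp.example.com and the function name `qualify` are hypothetical.

```python
def qualify(name: str, default_domain: str) -> str:
    """Return the absolute name a client would query (simplified sketch)."""
    if name.endswith("."):
        # The trailing dot means the name is already complete; add nothing.
        return name
    if "." not in name:
        # A bare host name gets the configured default domain appended.
        return f"{name}.{default_domain.rstrip('.')}."
    # Assumption for this sketch: multi-label names are treated as complete,
    # so only the implicit root dot is added.
    return name + "."


if __name__ == "__main__":
    print(qualify("fileserver", "corp.example.com"))
    print(qualify("fileserver.", "corp.example.com"))
    print(qualify("www.youtube.com", "corp.example.com"))
```

The second call is the case that fails in the demo: with the dot typed in, the lookup is for a top-level name "fileserver." that does not exist.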
Now I want to go and look something up. So I'm going to make a request, and in this case my request is going to be for www.youtube.com. That's what I'm actually sending. Now, my DNS server (forget about caching for now) has no idea. No matter what name I type in, it knows absolutely nothing. So it has to find somewhere to start working this out. How does it know where to start? Well, remember this root that I drew over here. There is a set of root servers that are defined, and we can actually look at what they are. If we jump over to here, this is the list of root servers, and we can count them: 1, 2, 3, all the way up to 13. There are thirteen of these root servers, and notice each of them only has one IP address. There's an IPv6 address as well, but there's one IPv4 address each. I think the reason this is limited is to do with the size of the packet and the number of addresses you can return as part of a DNS response. So it's limited to these 13. You can see some of them are operated by Verisign, one by NASA over here, the US Department of Defense, another Verisign here, and ICANN, the governing body. So these are the root servers, and your DNS server probably knows about them. They're built into the DNS server to give it a hint of where to go. Let's look at my DNS server for a second. This is one of my domain controllers, and if I look at the properties of my DNS server, we'll see a Root Hints tab with those exact same servers and those exact same IP addresses. It's configured as part of the DNS server, so it doesn't have to know anything else. All it knows is these root hints, but that's enough for what it needs to do. I need to start out somewhere, and I now know a list of root hint servers. Those root servers do not have a catalog of every single record, but I'm going to go and ask them.
So over here there is a set of root name servers. I'm not going to draw 13, but we know there's a group of these. My DNS server, very nicely trying to help me out, is going to say, hey, do you know the answer for www.youtube.com? The root server has no idea, but it can say: I don't know, but I saw the trailing part is com, and for com you could go and talk to these name servers. OK, that gives me a hint and a place to start. What I thought might be fun is to actually mimic what DNS is doing, so let's jump over to a command prompt and do exactly the same thing. Now remember, I don't know any servers other than the root. I'm going to flush out my cache because I don't want to mess things up. And remember those root name servers that we had available to us: I could pick this first one, 198.41.0.4, or I could just say a.root-servers.net, but we'll start off with the IP address. So I'm going to use nslookup, which is a command-line utility. I want to query for type name server (NS) records, I'm going to ask for com, and I give it that IP address of one of the root name servers. What it's responded with is a whole bunch of name servers for com. Any of these I could now ask for an answer, so I've got a step closer to solving my problem. It's given me a list of name servers that are specific to com. There will be a different set for edu, and different ones for net or uk. But it's now given me a hint. I was asking for www.youtube.com; it said I don't know, but com is a good place to start, and it's given me the list of those servers. So now on the drawing (wrong color, let's stick to the same color) there are the com name servers, and again there's a whole set of those, as we saw. So what my DNS server does is ask the same thing again: hey you, I've been referred to you; do you know www.youtube.com?
And once again it doesn't know the answer, but the com name servers do know all of the authoritative name servers for domains under com, so they know which name servers host youtube.com. So it's going to say: I still don't know the answer, but I'll refer you to the youtube.com name servers. Great. So let's mimic that as well and carry on down this chain. I'll just pick one of the com name servers. Notice the order changes every time it returns the responses, because it doesn't want one server to get hit more than any others, so it kind of muddles them around. Now, at this point those records that got returned are names, so I may first need to find the actual host record for one. OK, I'll pick the first one, i.gtld-servers.net. And who would I ask? Well, I'd ask that root name server again, the same server I asked for com. I'm asking it: what's the IP address of this thing you gave me? So now it gave me the IP address of one of those com name servers, and I can go and ask for the name servers. I'm doing an NS query again, this time for youtube.com, and I'm going to ask this 192.43.172.30. And now it's responded, very logically since Google owns YouTube, with Google name servers. They are authoritative for youtube.com. So I've got another step closer: now I know all the youtube.com authoritative name servers. So finally my DNS server can talk to these and ask, do you know who www.youtube.com is? And it says yes, and here are the IP addresses. We might as well finish the journey, so let's go back over to the command prompt. I don't actually have to go through the whole thing of getting the name server's host address; nslookup will just do that for me. So now I say, give me the host records for www.youtube.com, and I ask one of those google.com name servers. And there's my answer.
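The referral chain walked through above (root, then the com name servers, then the youtube.com name servers) can be sketched as a toy simulation. Nothing here talks to the network: the `SERVERS` table, the server labels, and the address 192.0.2.10 (from a range reserved for documentation) are all made up to show the shape of the process, not real DNS data or a real resolver implementation.

```python
# Toy, in-memory stand-ins for real name servers (all values hypothetical).
SERVERS = {
    "root": {"referrals": {"com.": "com-servers"}},
    "com-servers": {"referrals": {"youtube.com.": "youtube-servers"}},
    "youtube-servers": {"answers": {"www.youtube.com.": ["192.0.2.10"]}},
}


def ask(server: str, name: str):
    """Return ("answer", addresses) or ("referral", next_server) for one query."""
    zone = SERVERS[server]
    if name in zone.get("answers", {}):
        return "answer", zone["answers"][name]
    # "I don't know, but go and talk to these": refer to the server that is
    # responsible for the longest suffix of the name that this server knows.
    matches = [
        suffix
        for suffix in zone.get("referrals", {})
        if name == suffix or name.endswith("." + suffix)
    ]
    if not matches:
        raise LookupError(f"{server} cannot help with {name}")
    return "referral", zone["referrals"][max(matches, key=len)]


def resolve(name: str, start: str = "root", max_hops: int = 10):
    """Follow referrals from the root until some server gives an answer."""
    if not name.endswith("."):
        name += "."  # make the implicit trailing dot explicit
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        server = value  # move one step closer to the authoritative servers
    raise LookupError("too many referrals")


if __name__ == "__main__":
    addresses, path = resolve("www.youtube.com")
    print(" -> ".join(path), "=>", addresses)
```

The only thing the resolver needs to know up front is where the root is, which mirrors the role of the root hints.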
Now what I can see is that www.youtube.com is actually an alias, so it's a CNAME, and it points to a whole bunch of different IP addresses, which makes total sense. It's such a big service, it's going to be used by many different things. Now, if I didn't ask specifically for host address records, which is type A, and instead asked for any, it would probably respond with just the CNAME record. And that's this youtube-ui.l.google.com, which is all of these different addresses. So now we actually have an IP address. Woohoo. And my DNS server, after doing this recursive resolution and all of this work, will finally and triumphantly respond with: hey look, here are the IP addresses that I can now take and use to go and talk to the service. Now, one interesting thing. You might say, I've got the IP address, so I can just go to https:// followed by the IP address. It won't work, or at least it won't work in any way that we want it to, and we can try it quickly. If we jump back over, I could take one of these IP addresses at random, because all of them technically should respond and give me an answer. And it's upset: certificate authority invalid. It's not matching the certificate. Now, I could still carry on, but it's telling me up here that this is not secure. The certificate is not matching this IP address, and we would absolutely expect that. Now if I carry on, notice it took me to Google instead of taking me to YouTube. There are a couple of different things happening here that you have to think about. At the end of the day there are some actual servers hosting YouTube, so there are actual boxes under here that are hosting the web service. These are the web servers. But remember, we use certificates to guarantee the authenticity of who we're speaking to. That certificate has the fully qualified domain name in it. It has www.youtube.com.
It probably has youtube.com as well. But if I just start talking to an IP address, even though the name ends up resolving to that IP address, the certificate doesn't have an IP address in it. So now there's going to be a mismatch, and the browser says, hey, I don't trust this, this isn't secure. And there's an additional problem: I ended up at a different site. Many times one IP address actually hosts multiple different sites, and there's a technology called Server Name Indication (SNI) that makes that possible. Now, what used to happen with plain HTTP is that the request itself would say the domain name. But we use TLS now, and we don't send the HTTP request, with its host and path, until we've already established a TLS session. So think about the two sides of the conversation: there's me on my computer here, and there's them on their side. What actually happens is that first I establish TCP, so there's that whole three-way handshake going on. I send a SYN, I get a SYN-ACK, and I send an ACK back. There's a bunch of things back and forth, and only then do I go and establish TLS. The first thing we do with TLS is send a Client Hello, and what we include in it is this SNI: basically the fully qualified domain name we're trying to talk to. The other side gets this as part of that initial TLS negotiation, so it knows what we're trying to talk to, and it knows which certificate to use with this TLS session so that it matches. Also, maybe this is a content delivery network, and now it knows which batch of servers should actually receive this and be used as part of the communication. If I put an IP address in here, it has no idea, so it doesn't work. So just because we have an IP address doesn't mean we can use it directly, because again, I might have one IP that's actually fronting multiple different sites.
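As a rough model of what the server side does with the SNI value, here is a small Python sketch. It is a simulation only, with no real TLS involved: the `SITES` table, the example.com host names, the pool names, and both function names are hypothetical. It assumes that when the client connects by IP address no server name is sent, so the server falls back to a default site.

```python
from typing import Optional

# One IP address fronting several sites (all names here are hypothetical).
SITES = {
    "www.example.com": {
        "cert_names": {"www.example.com", "example.com"},
        "backend": "web-pool",
    },
    "studio.example.com": {
        "cert_names": {"studio.example.com"},
        "backend": "studio-pool",
    },
}
DEFAULT_SITE = {"cert_names": {"default.example.net"}, "backend": "default-pool"}


def select_site(server_name: Optional[str]) -> dict:
    """Pick the certificate and backend from the name in the Client Hello."""
    if server_name is None:
        # No SNI (for example, the client connected by IP address):
        # the server can only guess, so it serves its default site.
        return DEFAULT_SITE
    return SITES.get(server_name.lower().rstrip("."), DEFAULT_SITE)


def client_accepts(requested_host: str, site: dict) -> bool:
    """The client checks that what it asked for appears in the certificate."""
    return requested_host in site["cert_names"]


if __name__ == "__main__":
    site = select_site("studio.example.com")
    print(site["backend"], client_accepts("studio.example.com", site))

    # Browsing to https://192.0.2.10 : default site, and the names don't match.
    site = select_site(None)
    print(site["backend"], client_accepts("192.0.2.10", site))
```

In Python's own `ssl` module, the name a client sends is the `server_hostname` argument to `SSLContext.wrap_socket`, which is the client-side half of the same mechanism.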
So I need the Server Name Indication to tell the far end which particular site to forward to and which certificate to use, based on what I'm actually trying to reach. This is really important. And it's very common, so let's try it. If we jump over for a second and go back to what we got earlier: we looked at www.youtube.com and got a whole bunch of records. Well, there's a completely different service, which is studio. Let's see if it's the same records. It would take a while to go through them, and it reorders them every time, but take the first, 11722171.238. I should see it down here as well, probably. Maybe it's doing something slightly different this time; there are so many records that it can only return so many. But let's pick a different one, 1422532174. There we go, there's that one. So I can see some of them are reused. The same IP addresses are used for multiple different services, because the far end is looking at that SNI to work out who we should actually be talking to. So realize this is why, many times, I have to get DNS working correctly: if there's any kind of secure TLS session between me and who I'm talking to, I can't do it based on the IP address. It has to get the host name to match the certificate name, and maybe to know which set of services to send the request to. So that's a really important point. Now, we got the answer, remember, but this was a lot of work, and I do not want to do it every single time. So what actually happens is that when my DNS server gets its response, it caches it. It has a cache, and these records came back with a time to live (TTL), so it caches each one for whatever that time to live was. And on my machine I have my own cache as well, which again holds records for that period of time. If we look at my box now, I could for example run ipconfig /displaydns. This is what I have now, and there's a bunch of records.
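The caching idea, keep each answer only for as long as its time to live allows, can be sketched in a few lines of Python. This is a minimal illustration and not how any particular DNS server or client implements its cache; the class name `TtlCache` and the address used are made up, and the clock is injectable only so the expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Keep records only until their time to live runs out (minimal sketch)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._records = {}  # name -> (addresses, expiry time)

    def put(self, name, addresses, ttl_seconds):
        # Store the answer together with the moment it stops being valid.
        self._records[name] = (list(addresses), self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._records.get(name)
        if entry is None:
            return None  # never seen: a full lookup is needed
        addresses, expires_at = entry
        if self._clock() >= expires_at:
            del self._records[name]  # expired: forget it and look up again
            return None
        return addresses


if __name__ == "__main__":
    now = [0.0]  # fake clock so the example runs instantly
    cache = TtlCache(clock=lambda: now[0])
    cache.put("www.youtube.com.", ["192.0.2.10"], ttl_seconds=300)
    print(cache.get("www.youtube.com."))  # served from the cache
    now[0] = 301.0
    print(cache.get("www.youtube.com."))  # TTL has passed, so nothing is returned
```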
Remember I flushed it at the start, but I've already got records in here. Some of them are to do with the queries we did; we can see the Google ones in there, which makes sense. But you can also see some Microsoft things. There are things going on in the back end just to make the machine function. There aren't that many yet. In PowerShell I can do it a bit more nicely with Get-DnsClientCache, and now I can see those different records. Now on my DNS server, if I have the advanced view turned on, which I do, I can see Cached Lookups. These are the lookups that it has done. I cleared this just before we started, but already, just for things on my domain, these have all been queried, just by things running in the background. Notice it knows those name servers for com, so it's already gone and got a response for com. And then it has also gone and got responses for, well, Azure; it found the name servers for Azure. So my DNS server has cached these records, and each of them, like I say, has a time to live. For net, it's also gone and looked things up; Akamai is being used, and it's got various pieces of information for that as well. So we have this caching going on for all of the different things we do. Now, what's interesting, and it's not such a problem today, is that you may have heard in the past about DNS cache poisoning. Here's how it works. Let's imagine this is maybe an ISP's server used by many, many different people. DNS is a packet over UDP, and UDP is part of the problem. There's no TCP session to establish some kind of consistent communication. It's a UDP packet, so it's stateless. Imagine I was a bad guy, so I'm going to be evil. I'm a bad server, I'm angry, and I've got a certain IP address. What I want to happen is that when someone tries to talk to NatWest bank, or Google, or whatever it is, DNS responds with my IP instead of the proper servers down here.
So what would happen is I would make a request for google.com or natwest.co.uk, whatever that might be. Remember, it takes time for the DNS server to go out and make its own requests and get responses; it may take hundreds of milliseconds. As part of this, when I send my request, the server sends me back my identifier. Now, the identifier is 16 bits, so 2 to the power of 16; there are 65,536 different numbers. But what used to happen is these would just be incremental: request one, request two, request three. So once it sent me back what my request ID was, I would instantly spam it with a whole bunch of, well, actually responses, for the name with my IP in them. Because I'm assuming that if it's just incremental, the next IDs will be mine plus 1, plus 2, plus 3, plus 4, plus 5, allowing for other people having made a few queries to it in the meantime. So I spam it with fifty of these, and hopefully I get my response in before the proper response. It would then store that record in its cache for whatever time to live I gave it. So I might give it a really long time to live, so it doesn't try and get a better record. Well, I've now poisoned the cache with my IP instead of the proper IP, and it would give that record out to the people that ask. Now, depending on what I have on my box, maybe I try and harvest credentials. Maybe I scan the browser and dump a bunch of rootkits and malware on it. Who knows what I can do, but I can do bad things. I've essentially redirected requests to my box. Now, this is not as common today. Firstly, those identifiers are not incremental anymore; they're random. So I have to be very good at guessing that one out of 65,536 options. And the port we make the request from is randomized as well. The port is also 16 bits, so we have another 2 to the 16. You can't use all of the numbers for ports, but let's say 64,000 usable. So now it's 2 to the 16 times roughly another 2 to the 16, which is in the billions, so this is far less likely to work. It would take a huge amount of bandwidth, and it's still not likely.
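The numbers behind that argument are easy to work out. The short Python sketch below computes the size of the guessing space once both the 16-bit identifier and the source port are randomized, and the chance that a burst of fifty forged responses contains the right combination. The 64,000 usable ports figure is the rough number from the talk rather than an exact count, and the function name is made up.

```python
ID_BITS = 16
PORT_BITS = 16
USABLE_PORTS = 64_000  # the talk's rough figure, not an exact count


def success_probability(guesses: int, space: int) -> float:
    """Chance that one of `guesses` distinct values hits a single random target."""
    return min(1.0, guesses / space)


id_space = 2 ** ID_BITS                      # 65536 possible identifiers
combined_space = id_space * 2 ** PORT_BITS   # 2**32 = 4294967296
realistic_space = id_space * USABLE_PORTS    # 4194304000

if __name__ == "__main__":
    print("identifier only:      ", id_space)
    print("identifier and port:  ", combined_space)
    print("with ~64,000 ports:   ", realistic_space)
    print("50 guesses, ID only:  ", success_probability(50, id_space))
    print("50 guesses, ID + port:", success_probability(50, realistic_space))
```

With predictable, incrementing identifiers the attacker does not have to guess at all, which is why the old attack was practical; randomizing both values pushes the space to roughly four billion combinations.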
Additionally, DNSSEC, the DNS Security Extensions, is leveraged. It uses public key cryptography to sign and validate the records for these domains, so that someone can't pretend to be youtube.com or google.com, because the signature would not match. So there are things now to protect against that type of pollution. And just finally: my DNS server, especially if it's for my local company, may not want to go and do all these requests directly. There may be a higher-up DNS server, like one from the ISP, and I can just do forwarding. I can say: if I don't know the answer, I can't be bothered to go and do all this work. You're a much bigger set of DNS servers. You've probably already gone and got answers, and you've got your nice cache with tons of answers in it. I'm just going to ask you. Maybe you can fulfill the request from your cache; maybe you go and get the answer for me. But rather than doing it myself, I'm just going to ask you. In Azure, my DNS server forwards to the Azure DNS service, so then I can also resolve Azure Private DNS zones that have been linked to my network, and things like that. I can also do conditional forwarding: I can say, I want to forward to these particular DNS servers for this particular domain name. There's a lot of cool stuff I can do. But that's DNS. It's a hierarchy, and all my DNS server does is recursively resolve to get closer and closer to the actual name servers that can provide my record. It starts off with these root name servers, whose addresses are built into the DNS server; there's a standard set of 13. A root server gives me a hint: I don't know, but go and talk to these servers that are responsible for the top-level domain. They will be able to tell me which name servers are responsible for the domain name. Then I can talk to those name servers to actually get the record. And then I can go and establish the connection and establish the TLS.
We use Server Name Indication so the far end knows which certificate to use, because very often, especially with CDNs, there might be multiple sites offered from one address. So even once we have the IP, and although we're using the IP to do the TCP and the TLS, we still send that server name so we get all the right configuration. So that's it. That's DNS. That was a bit of fun, and I hope it was useful. As always, until next video, take care.