Internet - CS50's Understanding Technology 2017

[MUSIC PLAYING] DAVID J. MALAN: The internet. Odds are, you use this every day, and odds are you have internet connectivity at home these days, or at work, or at school. But how does it all work? How is it that you can use your phone wirelessly, how is it that you can use your laptop, and your desktop, and so many other devices all, somehow, on a network? Well, let's consider what you yourself might have at home, or in your office, or at school, and let's assume for the sake of discussion that it's a home network. So over here is, of course, your home, and inside of that door some number of devices that actually get you on the internet. But what are those devices? Well, odds are, inside of your home for instance, you have a device that might be called a cable modem, or a DSL modem, or a FiOS device these days, and that device is something you generally pay some number of dollars per month for because you're paying for an ISP, an internet service provider. So that device is somehow connected to the internet, which for our purposes right now we'll just draw as a cloud, and that there is the internet. And that device comes from an internet service provider like Verizon, or Comcast, or any number of other providers, and somehow they themselves are on the internet. But how do we now get the rest of your home on the internet if all you have is just this one device? Well, depending on how this device functions, it might just be all that you need. And wirelessly, somehow now, your phone and your laptop, and all of your other devices just work. Or, maybe you need a second device, that we might call a home router, that's somehow connected to that cable modem, or FiOS device, or the like, that in turn makes network connectivity possible in your own home. And maybe this little home router does a little bit more, and maybe it's got a couple of antennas that actually provide the Wi-Fi service. 
Meanwhile, maybe it also has some jacks or some physical ports in back into which you can plug cables so that if you have a wired device like a DVR, or an Xbox, or something else that's not necessarily wireless, you have some place to plug those devices into as well. But this is so high level, and in a sense this is so poorly drawn. What is actually going on underneath the hood, so to speak, and how is it that bits, zeros and ones, can transmit themselves from my house to everywhere else in the world and back? Well, let's take a closer look. Every computer on the internet, it turns out, has something that looks like this, a so-called IP address, or internet protocol address, which really is just a number dot another number dot another number dot another number. So four numbers separated by dots, and each of those numbers is a value between zero and 255. So there's 256 total possibilities for each of those values. Now it turns out there's other types of IP addresses today that are actually much bigger than this, but more on that in a bit. So these IP addresses, much like our postal addresses, uniquely identify computers on the internet. So if you have a laptop, if you have a desktop, if you have a mobile phone, if you have an Xbox on the internet, that device, by definition of how the internet works, has an IP address. It has a unique address that allows other computers on the internet to talk to it, much like you might live at 123 Main Street in Anytown, USA, or the computer science building down the road at 33 Oxford Street, Cambridge, Massachusetts 02138, USA. These very specific phrases describe uniquely some building in the world, much like these numbers define uniquely some computer in the world. But where does this number come from? If I open up my laptop, or turn on my desktop, or take out my phone, how do any of those devices know what IP address to use? 
Because, it might have been some time, but I don't remember ever having typed a value like that into my phone, so it's got to be coming from somewhere else. But where? Well, this is one of the things you get from your Internet Service Provider, or ISP. You get an IP address. And back in the day, not all that many years ago, there would actually be a technician that would probably come to your home or business, and actually configure your computers to use this numeric address. But these days, software is a bit fancier. There's actually something called DHCP, Dynamic Host Configuration Protocol, which is software that ISPs, Internet Service Providers, run and really provide to you that allows your Mac, or your PC, or your iPhone, or Android device to say upon turning on, hello world, I need a unique address. And that DHCP server responds to those open-ended questions with a specific IP address that the internet service provider controls and has allocated specifically for your home. Well, that's all fine and good. But if my ISP is only providing me with one such address, how is it that I can have multiple devices at home on the internet at the same time? A whole family, indeed, could be on the internet simultaneously, and yet if that means four, or five, or more separate devices, gosh, that means that somehow each of those devices needs its own IP address. So where do those come from? Well, those too come from DHCP, but not necessarily from your ISP, your Internet Service Provider. Those additional IP addresses come from a device in your very home, that home router to which I alluded earlier that's probably connected to your cable modem or your FiOS device. It's this home router that might have those little antennas that itself also supports DHCP. 
So when you turn on your laptop, turn on your desktop, power up your Xbox, or take out your phone, and those devices say, hello world, I need an IP address, odds are it's this device within your home that's answering that question, but it's providing other answers as well. It's not just giving you an IP address, it's also telling you how to communicate, it turns out, with the rest of the world. Because indeed, when I type an address into a browser, it's not numeric last time I checked. Indeed the last thing I typed into a browser was not something dot something dot something dot something, it was like Facebook.com, or Twitter.com, or Gmail.com or any number of other domain names. Because indeed, recall that most any website, certainly these days, that you'd visit has a domain name. It's something.com, or something.edu, or something dot any number of other Top Level Domains, or TLDs. So we humans are much better at remembering, I would think, words and phrases like dot com and dot edu, than we are at remembering arbitrary numeric addresses, like 1.2.3.4, or 5.6.7.8, or completely arbitrary numbers that aren't even so simple to remember as those. So how is it that when I type in Facebook.com or Google.com, my computer knows how to find that computer in the world, if in the world there are computers with just these IP addresses? Well, it turns out that computers not only have IP addresses that they get from DHCP servers, they also have what are called DNS servers. And indeed, DHCP provides us with access to exactly that as well. So in addition to having a DHCP server somewhere out there in the world from your ISP or maybe even your home, you also have DNS servers. And DNS servers are Domain Name System servers, and their sole purpose in life really is to convert domain names like Facebook.com and Gmail.com to corresponding IP addresses. 
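To make the DHCP exchange above concrete, here is a minimal sketch of a home router handing out addresses. It is a toy model, not the real protocol: the names `Lease` and `ToyDhcpServer`, the `10.0.1.x` address pool, and the idea that the router also answers DNS queries are all assumptions chosen for illustration. Real DHCP identifies devices by hardware address rather than by a friendly name.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Lease:
    ip_address: str   # the address the device should use
    router: str       # where to send traffic bound for the outside world
    dns_server: str   # who to ask for name-to-address lookups


class ToyDhcpServer:
    """Hypothetical stand-in for the DHCP service in a home router."""

    def __init__(self, pool: List[str], router: str, dns_server: str) -> None:
        self._free = list(pool)
        self._leases: Dict[str, Lease] = {}
        self._router = router
        self._dns_server = dns_server

    def request(self, device_name: str) -> Lease:
        # A device that asks again gets the same answer it got before.
        if device_name in self._leases:
            return self._leases[device_name]
        if not self._free:
            raise RuntimeError("address pool exhausted")
        lease = Lease(self._free.pop(0), self._router, self._dns_server)
        self._leases[device_name] = lease
        return lease


server = ToyDhcpServer(
    pool=["10.0.1.34", "10.0.1.35"],
    router="10.0.1.1",
    dns_server="10.0.1.1",  # assumed: the router also answers DNS queries
)
laptop = server.request("laptop")   # "hello world, I need an IP address"
phone = server.request("phone")
```

The point of the sketch is that one answer carries several facts at once: an address for the device, plus where to find the router and a DNS server.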
And these DNS servers, therefore, can help our computers talk to computers that, by definition, have IP addresses but that we humans would never know if someone didn't tell us. So there's already so many acronyms piling up here. Just to recap, every computer has an IP address. That IP address typically comes from a special server called a DHCP server, that lives within your ISP, Internet Service Provider, whoever that is, or maybe even within your own home, more on that in a bit. And meanwhile, there's also DNS servers in the world, also controlled by your ISP, that convert domain names to IP addresses so that when you actually try to go to Facebook.com your computer, Mac, PC, iPhone, Android, whatever, knows what the actual IP address is. So why is that? Why does that matter? Well, turns out that the way computers intercommunicate on the internet is by sending packets to one another, or virtual envelopes. Much like you might have in the past sent someone a physical letter, a handwritten letter inside an envelope with an address on the front and probably even a stamp, so can computers communicate in very much the same way, but it's all digital. It's all zeros and ones. So what do these envelopes look like? What do these packets look like? Well, why don't we go ahead and construct something a little more physical? All right, so I really like cats, and I want to find myself a cat on the internet. And so, I'm going to send a request to someone, a server, in fact. Maybe someone like Google, and I'm going to say literally get me a cat dot jpeg, where jpeg is a common file format for cats, so this is the message I want to send to some server. Of course it doesn't have any information on it, so who is actually going to field this? Well, I also have to go ahead and put it in an envelope, so I might go ahead and do this, put this message here in an envelope. Just a moment, I'll make the envelope and message disappear. 
Now, we'll go ahead and address the envelope to the destination. So as a destination on the internet, this server is going to have its own IP address, and little old me as a computer on the internet, laptop, desktop, phone, or whatnot, I too am going to have an IP address. And so, what I'm going to go ahead and do is put my IP address in the top left corner-- doesn't really matter since this is imaginary-- and my IP address shall be 1.2.3.4 just for the sake of discussion. The server, meanwhile, I don't know what the IP address of the server is. I know my own IP address because that came from my ISP's DHCP server, but the other server's address, unless I already know Google's IP address, I wouldn't know it myself. So I'm going to have to rely on DNS. So I, as a computer, would actually send a request to my ISP's DNS server, saying, hey, DNS server, what is the IP address of Google.com? Hopefully, my ISP knows, and a response will come back, and maybe it's 5.6.7.8, and so I'm going to go ahead and write 5.6.7.8. And frankly, if my ISP doesn't know-- which is unlikely these days just given how popular Google is, but smaller web sites might not be as well known to an ISP-- well, my ISP is going to be configured by the owners of the ISP to know about some other DNS server in the world. And so, they will simply escalate it to another DNS server, and maybe that DNS server will escalate it to someone else. And thankfully, by nature of how the domain name system works, there's going to be some number of root servers, special servers, that in the worst case at least know who else knows what the IPs are of all of the dot coms, or all of the dot edus, or all of some other top level domain. So there's this recursive system, this tiered system of questions, that can be asked until finally someone knows, and then my own Mac, or PC, or phone, can remember it. 
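The escalate-and-remember idea described above can be sketched in a few lines. This is a simplified model under stated assumptions: the class `ToyDnsServer` is hypothetical, the address `5.6.7.8` is the made-up one from the discussion, and real DNS involves referrals between servers rather than a simple chain of forwarding.

```python
from typing import Dict, Optional


class ToyDnsServer:
    """Hypothetical DNS server that answers from its own records or asks upstream."""

    def __init__(self, records: Dict[str, str],
                 upstream: Optional["ToyDnsServer"] = None) -> None:
        self.records = dict(records)
        self.upstream = upstream
        self.cache: Dict[str, str] = {}

    def resolve(self, name: str) -> str:
        if name in self.records:      # this server knows the answer itself
            return self.records[name]
        if name in self.cache:        # or it remembers an earlier answer
            return self.cache[name]
        if self.upstream is None:
            raise LookupError(f"no server knows {name}")
        answer = self.upstream.resolve(name)  # escalate the question
        self.cache[name] = answer             # remember it for next time
        return answer


# Made-up records: only the far-away server knows the answer at first.
far_away = ToyDnsServer({"google.com": "5.6.7.8"})
isp = ToyDnsServer({}, upstream=far_away)
my_laptop = ToyDnsServer({}, upstream=isp)

address = my_laptop.resolve("google.com")
```

After the first lookup, both the ISP and the laptop hold the answer in their caches, so the same question does not have to travel as far the second time.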
So this message is going to go to 5.6.7.8, which I'm presuming is the IP address of Google.com as per the response from my ISP's DNS server, and it's going to be from little old me at IP address 1.2.3.4. So I'm going to go ahead and seal this, all right, and I'm going to hand it off on the internet. Now where does it go? More on that in just a moment. But some number of seconds, or hopefully some number of milliseconds later, I'm going to get back a response, and indeed I'm going to get back, of course, a cat, this here happy cat. But it's not going to be as simple as just being handed a cat off the internet. This cat too, meanwhile, is going to be in one or more envelopes. That is to say Google's own server is going to put this cat into an envelope. But maybe, when Google tries to do that, oh, maybe it doesn't quite fit. And frankly, maybe this image is so big that it would just be rude to other customers to cram this whole big image of a cat into just one envelope, thereby blocking other customers' data from potentially getting to them as quickly. And so, what Google might actually do, and this is very common, is divide the cat into fragments. So hang in there little guy. But we might chop up this larger image into four or so smaller fragments, so that now these are much more reasonably sized, and what Google can do is put one of these into one envelope, can put another of these into another envelope, and then of course if there's four fragments in total, we can put like a third in this envelope, and then we can go ahead and put the fourth in a fourth and final envelope. Now of course, I'm going to have to write some information on each of these four envelopes. So what goes on the outside here? Well previously, my IP address was 1.2.3.4, and Google's was 5.6.7.8. 
If they're responding to my original request with this response, those numbers are going to have to be reversed so that this packet is going to be coming from Google at 5.6.7.8, and it's going to be going to me, which is 1.2.3.4. And they're going to go ahead and put that same information on every one of these envelopes. But that's not quite enough, it turns out. It's not quite enough for them to just put my address on these envelopes, because there's four of these packets. And so, you know what, they're going to have to provide another clue. They're going to have to tell me how many total packets there are in the response. So I'm going to put one of four, and this one will be two of four, this one will be three of four, and this of course will be four of four. So what Google has put on each of their envelopes now looks a little something like this. To 1.2.3.4, which is me, from them, it's 5.6.7.8, and per this mark down here, this is packet number one of four. So this is to say that IP goes beyond addresses. IP, Internet Protocol, is really a set of conventions. It's a set of rules that computers and servers are supposed to follow, so that when they intercommunicate, one knows what to expect from the other, and the other knows how to respond to the first. And so, this support for fragmentation is also a feature of IP. Now what is the benefit of this? Well, this way, if I, little old me, now get off the internet packet two of four-- it's a little strange that it's out of order-- packet three of four, and packet four of four, but I don't seem to have actually received packet one of four, I can logically infer from the packets I did get which of them I'm missing. But IP, Internet Protocol, alone says nothing about what I should do as a computer in that situation. So it turns out that computers actually use not just IP, Internet Protocol, but another protocol, another standard called TCP. 
And in fact, these are so commonly used together that you might have heard or read at some point of something called TCP/IP, or TCP slash IP, which is just Transmission Control Protocol slash Internet Protocol, which just refers to the combination of these two protocols in order to transmit data on the internet. Now among the roles that IP plays is to support addressing, and fragmentation, and a bunch of other things too. And among the roles that TCP plays is to ensure that packets can get to their destination. And in fact, TCP supports something called sequence numbers, in addition to any fragment identifiers, that also help ensure that data gets to its intended destination. And so upon receiving just three of these packets, clearly missing a fourth, what I, a computer, can do is say, hey Google, I need you to send one or more packets because I know I'm missing them, because they haven't been properly acknowledged. And so, oh, thankfully, Google has retransmitted to me this packet. And so now, I have all four, and I can, of course, on my end, reassemble, albeit with some virtual tape, the cat in its final form, which is going to look like-- if all of the packets indeed came through-- the cat in question. And because, of course, these are all just bits, all just zeros and ones, they can certainly be stitched back together, so that we never actually know that the splitting happened. So, turns out TCP does something a little more. Because what if my original request to Google went to a server that does multiple things? Like Google is obviously a website. They have search results, they have email, they have calendars, and so much more. But they also have email servers, right? Gmail itself, not to mention their own employees' emails. And they probably have chat servers, or video conferencing servers, like Google Hangouts and the like. So when I originally sent a packet to Google.com, it probably needed a little more information than I gave it. 
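The story of the four envelopes, the one that goes missing, and the retransmission can be sketched as follows. This is an illustration of the idea only: the `Packet` type and the helper names `fragment`, `missing`, and `reassemble` are hypothetical, and real TCP tracks sequence numbers and acknowledgements in more detail than a simple "1 of 4" label.

```python
from typing import List, NamedTuple


class Packet(NamedTuple):
    source: str
    destination: str
    index: int       # "1 of 4" is index=1, total=4
    total: int
    payload: bytes


def fragment(data: bytes, source: str, destination: str, parts: int) -> List[Packet]:
    """Split data into `parts` roughly equal packets, labeled 1..parts."""
    size = -(-len(data) // parts)  # ceiling division
    chunks = [data[i * size:(i + 1) * size] for i in range(parts)]
    return [Packet(source, destination, i + 1, parts, chunk)
            for i, chunk in enumerate(chunks)]


def missing(received: List[Packet]) -> List[int]:
    """Infer from the labels which packet numbers never arrived."""
    total = received[0].total
    seen = {p.index for p in received}
    return [i for i in range(1, total + 1) if i not in seen]


def reassemble(received: List[Packet]) -> bytes:
    """Put the payloads back in order, whatever order they arrived in."""
    return b"".join(p.payload for p in sorted(received, key=lambda p: p.index))


cat = b"pretend these bytes are cat.jpeg"
sent = fragment(cat, source="5.6.7.8", destination="1.2.3.4", parts=4)

# Packets 3, 2, and 4 arrive, out of order; packet 1 is lost on the way.
arrived = [sent[2], sent[1], sent[3]]
lost = missing(arrived)

# Ask the sender for exactly the missing pieces, then tape the cat together.
arrived += [sent[i - 1] for i in lost]
restored = reassemble(arrived)
```

Because every envelope carries both its own number and the total, the receiver can work out what is missing without being told.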
It probably wasn't sufficient for that original message for me, get cat.jpeg, to contain only Google's IP address, which again was 5.6.7.8, and my own from address, which was again 1.2.3.4. I, just for thoroughness, could on this envelope say one of one, because it's a pretty small request to just say get cat.jpeg, but I probably need a bit more information to make clear to Google that this is a request for a web page, not a request for an email, or not a chat message, or not certainly a video stream from me. And so I'm going to actually append one piece of information. I'm going to put literally a colon after Google's IP address, and I'm going to go ahead and say 80, the number 80. So it turns out that per TCP, the world has standardized on certain numbers that represent different services, that servers might provide. 80 means HTTP, Hypertext Transfer Protocol, and that's just the language that web servers speak, and it's the language that I've been speaking inside of these envelopes. So that little message I wrote a moment ago, get cat.jpeg, that was an HTTP message. And this cat that came back in several parts, that together was an HTTP response. And so by clarifying on the envelope, this message is meant specifically for port 80. That is the service, known as HTTP, Google's physical servers know we should hand this packet and any others to our web server, not to our email server, or chat server, or video server, or the like. And it might not actually be 80. In fact, odds are these days Google, like many websites, is using SSL, or HTTPS, a secure connection, and that actually happens to use a different number than 80, technically 443. You don't tend to see either of these numbers because they're just assumed to be the default in modern web browsers, but they are there underneath the hood. They are there on the virtual envelopes. Turns out there's other numbers too. 
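As a rough illustration of how a port number on the envelope steers a packet to the right service, consider this small sketch. The table holds only the two port numbers mentioned so far, and the function names `parse_destination` and `service_for` are hypothetical helpers, not part of any real server software.

```python
from typing import Dict, Tuple

# Only the two port numbers mentioned so far; real servers know many more.
SERVICES: Dict[int, str] = {
    80: "HTTP",
    443: "HTTPS",
}


def parse_destination(text: str) -> Tuple[str, int]:
    """Split an 'address:port' label such as '5.6.7.8:80' into its two parts."""
    address, _, port = text.rpartition(":")
    return address, int(port)


def service_for(text: str) -> str:
    """Name the service a packet addressed to `text` should be handed to."""
    _, port = parse_destination(text)
    return SERVICES.get(port, "unknown service")


web = service_for("5.6.7.8:80")      # handed to the web server
secure = service_for("5.6.7.8:443")  # handed to the secure web server
```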
E-mail tends to use 25, TCP port 25 and a few others. FTP, File Transfer Protocol, and many other protocols all have their own numeric port identifiers, and indeed that's all this number is. Whether it's 80, 443, or something else, it's a so-called port number. So this then is a more representative picture of what it is that's going across the internet and coming back to me. This is more of the information, though not all of it, that's going back and forth across the wires, or wirelessly. So these things are protocols: IP is Internet Protocol, TCP is Transmission Control Protocol. What is a protocol? Well again, it's just kind of a set of standards, a set of rules. And in fact, we humans have protocols. And some of them, if you stop to think about it, are a little silly. Like in a lot of cultures, when you meet some other human for the first time, you do something kind of weird and you extend your hand to shake that person's hand, and then you just do this up-down thing for like a second or two, sometimes longer, awkwardly, and that somehow completes the transaction. Well, that's actually what's going on with computers. When I send that message originally, get cat.jpeg, Google, according to the HTTP protocol, Hypertext Transfer Protocol, is going to read that message, and realize, oh this user wants a picture of a cat, let's search for that file, and let's actually return cat.jpeg. And I'm simplifying the format of the message because when you're actually searching for results, the message actually looks a little more complicated than that. But we're assuming we're just getting a very specific cat from the server. And according to HTTP, Google's web server, because it supports that protocol, it speaks that protocol, it speaks human just like I and my colleagues do, it knows to respond with one or more envelopes of its own containing that cat. But there's even more protocols than this. There's UDP, which you don't use quite as often, but actually has value. 
And the biggest difference between UDP and TCP is that UDP does not guarantee delivery, whereas with TCP we're guaranteed delivery so long as the internet is actually up and running between you and some endpoint. How does TCP then guarantee delivery? Well, it knows how to resend packets as needed; UDP by definition does not do that. That is just not a feature you get. You can still use it with IP to get data somewhere, but what you request is not necessarily going to come back. So why would you ever want to send a request, and maybe or maybe not get a response? Well, sometimes this is useful. Like video conferencing-- if you've ever used FaceTime, or Google Hangouts, or Skype, you sometimes see things buffering. But if while you're trying to talk to some other human in real time, so to speak, the video kept buffering, and kept buffering, and kept buffering, and prevented you from seeing that person, or hearing them in real time, frankly it would get pretty annoying pretty quickly and you'd just take to your phone or take a phone off the wall, an old landline, and make a call which is much more synchronous, much more real time. But movies of course do this. If you're watching Apple TV, or Netflix, or iTunes, or something, those videos do tend to buffer because you don't really want to miss a few seconds, or a minute, of a movie or some climactic ending. But in real time when talking to another human, it's not really ideal to just delay the conversation while someone else is there on the other end of the line. And because there's so many packets going back and forth for things like video conferencing, you know what, if you drop a few, literally, like if some of those packets just kind of get lost, don't worry about it, I will infer from context, I'll infer from the conversation I'm having what it is I missed and we'll just forge ahead. Or you know what, I'm just going to say hey, hey, buddy, what is it you said, can you repeat that, and he or she can simply oblige. 
So sometimes, when you want the data to keep coming, and keep coming, especially when it's high volume, you don't want to stop and resend data, you want to just ignore the loss and trust that the users are going to be OK with that. And for live video conferencing that might make sense, for live sporting events that might make sense so that you're not drifting behind the rest of the world. So for some applications that actually does make good sense. But where do these packets keep going as they leave my hand, and where are they coming from when they land in my hand? Well, there's a whole internet out there that uses TCP or UDP, and uses IP, but there's a lot of devices between me and Google, me and anyone else in the world, that somehow route that data left, right, top, bottom. So how does all that work? So we know then that my computer has an IP address, and we know that it's of this format. And this format, again, is just a number dot a number, dot a number, dot a number, and each of those numbers is between zero and 255. And if we dive in a little deeper, if you remember your binary, that actually means that each of those numbers is 8 bits. So that's eight, plus eight, plus eight, plus eight. So that's 32 bits, and-- hang in there-- that means there's two to the 32, roughly four billion, possible IP addresses. But I mentioned a bit ago that there's also a longer format because the world, it turns out, is running out of IP addresses. Even though there's as many as four billion possible, there are so many phones, and people, and laptops, and servers, and internet of things, IoT, devices these days, all of which need an IP address, that frankly, we've been running out for some time. And so instead of using this format moving forward, IP Version 4, or v4, the world is gradually starting to use IPv6, which actually uses 128-bit addresses which are much, much larger. 
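The arithmetic above, four numbers of 8 bits each making one 32-bit value, can be checked directly. The helper `to_number` is a hypothetical name used for illustration; the sketch packs a dotted address into the single number it stands for and then counts the addresses available with 32 bits versus 128 bits.

```python
def to_number(dotted: str) -> int:
    """Pack a dotted IPv4 address into the single 32-bit number it stands for."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not an IPv4 address: {dotted}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # each number contributes 8 bits
    return value


ipv4_total = 2 ** 32    # 8 + 8 + 8 + 8 = 32 bits
ipv6_total = 2 ** 128   # 128-bit addresses

print(to_number("1.2.3.4"))  # 16909060
print(f"{ipv4_total:,}")     # 4,294,967,296
print(f"{ipv6_total:,}")     # 340,282,366,920,938,463,463,374,607,431,768,211,456
```

Going from 32 to 128 bits does not multiply the count by four; it raises the count to the fourth power.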
If you were to actually multiply this out, if you have two to the 32, that's roughly four billion possible IP addresses. But if you use not a 32-bit IP address, but a 128-bit IP address, it doesn't sound like that much bigger of a number, but this is exponents, not just multiplication. And so, that is how many IP addresses, I can't even pronounce that, the world is now going to have access to. So with that said, where can you see this kind of information? Well, turns out that if you have a Mac for instance, you could go to System Preferences and then Network, and then poke around, hopefully without changing anything, and you'll see something like this: you'll see a mention of IPv4, and you'll actually see a mention of this protocol, using DHCP, unless for some reason it's been statically hardcoded or configured by perhaps someone else. And you'll see that at the moment the screenshot suggests that I'm connected with IP address 10.0.1.34, and actually, as it turns out, there's a lot of IP addresses that are actually private. And so, if you have an address that starts with 10 dot something, or an address that starts with 192.168 dot something, or 172.16 dot something, turns out your computer is using a private IP address that most likely came from a home router, or a business router, or maybe even your ISP, but it's private in the sense that only with special configuration can someone talk to your computer. And this is OK, because generally our phones and our Xboxes, and our laptops, and desktops in our homes, and generally in our businesses and schools, themselves are not servers. People are not trying to contact us directly per se, we are trying to contact them. And even when someone sends you an email, it doesn't go to your own laptop or desktop per se, it generally goes to a server like Gmail, or Outlook, or the like, and your phone or laptop or desktop connects to that server in order to get the information. 
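Whether an address falls in one of the private ranges just described can be checked with Python's standard `ipaddress` module. In this sketch the helper name `is_private` is hypothetical, and the three ranges are written in prefix notation; note that the 172.16 block is usually defined as running from 172.16 through 172.31.

```python
import ipaddress

# The three private IPv4 blocks: 10.x.x.x, 172.16.x.x-172.31.x.x, 192.168.x.x
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private(text: str) -> bool:
    """Report whether a dotted IPv4 address sits inside a private block."""
    address = ipaddress.ip_address(text)
    return any(address in network for network in PRIVATE_RANGES)


for text in ["10.0.1.34", "192.168.0.1", "172.16.5.5", "5.6.7.8"]:
    print(text, "private" if is_private(text) else "public")
```

The module also offers an `is_private` attribute on address objects, which covers these blocks along with several other special-purpose ranges.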
If now on Mac OS, you happened to click on Advanced here, you'll see some additional settings and you'll see that my IP address is again 10.0.1.34 in this case, you'll see a subnet mask which is used to decide whether or not some other computer is on the same network as you, and then most importantly, you'll see router, sometimes called gateway. And in this case, it seems that my gateway has an address, or my router has an address, of 10.0.1.1. So that too of course is an IP address. And a router, as the name suggests, is responsible for doing this kind of thing, routing data in some direction. And if you run Windows, here's what a similar screen might look like on that operating system which shows, of course, your IPv4 address, and in this case, multiple addresses for DNS servers. Routers are computers on the internet that have bunches of wires usually coming into them and going out of them, and they have essentially kind of a table, like a big list, like an Excel spreadsheet, inside of themselves, like inside the RAM, Random Access Memory. And that table generally has at least two columns, conceptually. One of which has an IP address or a prefix, the first few numbers of an IP address, and then some explanation of where data should be routed to if it's destined for that IP address or that prefix. So maybe if an IP address starts with one, it should go that way out that cable. Or if it starts with two, it should go that way instead. Routers' purpose in life is to route data in some direction to some next hop, or that is to say to some next router. And so this means that data from this Mac here with IP address 10.0.1.34, which is preconfigured by DHCP-- which again, came from my ISP, or from my university, or company-- is going to go to either local computers on the network if I happen to be talking to another Mac or PC, maybe to transfer a file just a few feet or somewhere else on campus or in the office. 
But if it's destined for somewhere in the outside world like Google.com, well, that's where the router comes in, because a router's purpose in life is to get data toward another destination. And my little old laptop frankly doesn't know where in the world Google.com is, but maybe this router does because that's its purpose in life. And frankly, if that router doesn't know, no big deal. There's other routers in the world. And so long as that router can route data to some other router, well then hopefully that other router can get data closer to its destination. And hopefully indeed, within some number of hops, some number of steps, transmissions of packets from one router to another to another, the data will reach its destination. And frankly, generally speaking, data will reach its destination within 30 or fewer such hops. There will be 30 or fewer routers between me and some destination because humans and software have gotten really good at configuring the internet dynamically, so that data can route across continents, across countries, across oceans even, in order to get from one place to another. So if this then is little old me on my laptop here, and I want to talk to Google.com which of course is a big company over here, inside of whose door is a whole bunch of servers, well between us is the internet, and somehow we're both connected, and somehow or other data is going across the internet from me to Google. And that's because inside of this internet, there's a whole bunch of routers which I'll draw here as dots, and each of these routers is controlled by other big internet service providers, big companies, maybe even big universities, and they all have agreed to connect their routers. That's indeed what the internet is. It's a network of networks. 
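The two decisions described above, whether a destination is local or should go to the router, and which cable a router sends a packet out of, can be sketched with Python's standard `ipaddress` module. Several details here are assumptions for illustration: the subnet mask is taken to be 255.255.255.0 (written as `/24`), the cable names are invented, and choosing the most specific matching prefix is how routers typically break ties rather than something stated in the discussion.

```python
import ipaddress
from typing import List, Tuple

# Assumed subnet mask 255.255.255.0, so the local network is 10.0.1.0 to 10.0.1.255.
MY_NETWORK = ipaddress.ip_network("10.0.1.0/24")
MY_ROUTER = "10.0.1.1"


def first_hop(destination: str) -> str:
    """Deliver directly on the local network, otherwise hand off to the router."""
    if ipaddress.ip_address(destination) in MY_NETWORK:
        return destination
    return MY_ROUTER


# A router's table: a prefix, and which cable to send matching packets out of.
ROUTES: List[Tuple[ipaddress.IPv4Network, str]] = [
    (ipaddress.ip_network("1.0.0.0/8"), "cable A"),  # addresses starting with 1
    (ipaddress.ip_network("2.0.0.0/8"), "cable B"),  # addresses starting with 2
    (ipaddress.ip_network("0.0.0.0/0"), "cable C"),  # anything else: another router
]


def route(destination: str) -> str:
    """Pick the most specific matching prefix and return its outgoing cable."""
    address = ipaddress.ip_address(destination)
    matches = [(net, out) for net, out in ROUTES if address in net]
    best = max(matches, key=lambda pair: pair[0].prefixlen)
    return best[1]


print(first_hop("10.0.1.50"))  # stays on the local network
print(first_hop("5.6.7.8"))    # goes to the router at 10.0.1.1
print(route("1.2.3.4"))        # cable A
```

The catch-all entry plays the role of "if this router doesn't know, no big deal": the packet is simply passed along to a router that might.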
So it's a network of Harvard's and MIT's networks, and UC Berkeley's, and Stanford's, and Comcast's, and Verizon's, and all of these very big entities have connections among themselves, and each of them has some number of routers. And what happens ultimately, is that these routers are interconnected with cables, or some kind of satellite connectivity, or radio waves, or the like, and notice too there's very often multiple ways to go from one location to another, and indeed there might be multiple ways to reach your destination, depending on which path you take. And this is a feature. The internet, of course, has its origins in US military design, and among the goals was to have some resilience against downtime. If one or more cities or one or more routers went down for whatever reason, one of the design principles of the internet was to be able to route around that issue. And so it stands to reason that it's a good thing if data can flow from one point to another, but following different intermediate stops. Which is to say, when Google sent that cat over the internet back to me, that cat's four parts might have gone in four different directions, but somehow all made their way back to me because the routers know how to get data to me again-- based on that envelope, based on that IP address-- but they might take different paths just because. Now what does that mean? Well, sometimes the internet gets busy. Routers get busy, they get overloaded with lots of packets, and so sometimes routers have to say go this way instead. Or sometimes, things are just so busy that the router just gets overwhelmed and it has to literally, but slowly, drop packets on the floor so to speak, deleting the packets without ever delivering them, at which point, hopefully, if the users are using TCP, their computers will retransmit that data so it's not actually a problem. 
And all of this is happening so quickly, that you never really notice some of these delays or some of these reroutings, and so here might be several paths that data takes to get from me to Google.com and maybe a different path back, and each of these represents a hop, and each of these takes some amount of time. So how much time does it take for data to go across the internet? Well let's actually take a look. I'm going to go ahead here and run a program that is called traceroute. And this is going to, per its name, actually allow me to trace the route between me and some other computer. To do this, I'm going to type traceroute into the special window here on my Mac, and I'm going to do traceroute of-- well let's try it-- www.google.com Enter, and I'm going to see some interesting information here. Seems to be a little slow at the moment and that's interesting, it seems stars probably don't mean good things. So let's scroll up here and see what's going on. I'm tracing the route to www.google.com, and it turns out parenthetically, that is in fact Google's IP address at least at this moment in time here on campus, 4.53.56.109. So it's not, as it turns out, 5.6.7.8. It's that instead. And each of these rows of output-- one, two, three, four, five, six-- represent a router between me and Google.com. So what traceroute does is it sends a message to the first router, then a message essentially to the second router, then the third router, then the fourth router, and it asks it, one, for its IP address-- or it figures it out-- or its name. In fact, notice that some of these routers seem to have somewhat cryptic, but English-like names, and it also tells me, traceroute, how many milliseconds it took for the data to get from me to that destination. Look how fast this is. I don't know exactly where all these routers are, but all of these numbers are super small. 3 milliseconds just to get from one point, my computer, to another router. 
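How does traceroute get each router in turn to answer? The talk doesn't go into it, but the standard trick is a hop limit called TTL (time to live): every router subtracts one from it, and whichever router brings it to zero replies to the sender. Here is a small Python simulation of that idea, with made-up router names, rather than a real traceroute (which needs special network permissions).

```python
# Hypothetical routers between the laptop and the destination, in order.
ROUTERS = ["10.0.0.1", "10.0.0.2", "core-gw.example.edu", "border.example.net"]
DESTINATION = "www.example.com"


def probe(ttl):
    """Return who answers a probe sent with the given TTL.

    Each router decrements the TTL; the router that brings it to zero
    replies. If the TTL outlasts every router, the destination answers.
    """
    if ttl <= len(ROUTERS):
        return ROUTERS[ttl - 1]
    return DESTINATION


def trace(max_hops=30):
    """Probe with TTL 1, 2, 3, ... until the destination itself replies."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder = probe(ttl)
        hops.append(responder)
        if responder == DESTINATION:
            break
    return hops


for number, name in enumerate(trace(), start=1):
    print(number, name)
```

A router that is configured to stay silent would simply produce no reply for its TTL, which is what the stars in the real output represent.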
Now, you can infer what some of these are. I don't know where these IP addresses are, but odds are they're on campus. Odds are rows one and two, both of whose IP addresses start with 10, are somewhere on campus, routers on campus. Step three, I'm very confident that it is one of Harvard's routers because it's called Core GW, which I just know by convention means Core Gateway or Core router, and it belongs to the Faculty of Arts and Sciences on Harvard's network. Then there's another one also called Core Gateway, which is probably somewhere slightly different on campus, maybe not the Faculty of Arts and Sciences, but in the core Harvard network. And then it gets a little more interesting. Then apparently gets handed off to a bear on rows five and six, or two routers whose names have the word bear in them for some reason, but odds are they're indeed in Boston-- which is not too far here from Harvard-- on Level 3's network which is a very big common ISP, Internet Service Provider. Level 3. Now thereafter, for whatever reason, the routers between me and Google are not responding to this inquiry. And that's fine. They might just be configured to ignore this type of request, but it's not all that enlightening. I just know that it's taking more steps to actually reach Google.com because their servers are beyond that sixth router. So let's try another destination. When in doubt let's just try again, and let's try someone like our friends at, maybe UC Berkeley who maybe are a little looser when it comes to sharing information. And let me go ahead and hit Enter now, and wow, just flew by. 19 steps later, notice what's happened. Looks like two of Harvard's nameless routers up top, then that ACore router-- this one's a little different-- northeast gateway, so it actually took a different route this time off campus. Then this border gateway, BDR, probably meaning border also in harvard.edu. Row five is some nameless router somewhere else, not sure where. 
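Why is an address starting with 10 a good hint that a router is local? Addresses in the 10.x.x.x range are reserved for private networks, so they never appear on the public internet. Python's standard `ipaddress` module can check this. In the sketch below, 10.1.2.3 is a made-up example address, while 4.53.56.109 is the address traceroute reported for Google above.

```python
import ipaddress


def describe(address_text):
    """Label an IP address as private (local network) or public."""
    address = ipaddress.ip_address(address_text)
    return "private" if address.is_private else "public"


for text in ["10.1.2.3", "4.53.56.109"]:
    print(text, "is", describe(text))
```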
Row six is something in northerncrossroads.org. Nox.org. This is a very big peering point where lots of ISPs interconnect. And then we're going to have to take some guesses here. Then we have SDN, SW. I don't know where this is, but internet2 is a network, a very high speed network of a lot of universities. So that's great. It looks like our packets got on kind of the superhighway academically speaking, which is good because it tends to be pretty fast. And now, I don't know where all of these are. But I'm going to go out on a limb and say, you know what, this router in row eight is probably in Chicago just because of that abbreviation. The next one is as well, rows 10 and 11, maybe if you're familiar with US cities, Denver, probably there, Las Vegas, these next two, Los Angeles here in row 14, 15, probably [? LosLA ?] as well for LAX. For whatever reason, system administrators have historically often named their routers after airport codes like LAX. And then of course, we're in California at that point, so it's not all that far from UC Berkeley. Up north and indeed, it looks like the official name of UC Berkeley's web server is CalWeb for California Web Server. Farm, which means a cluster of computers. Prod, which means production like the official web servers in use. Then ist.berkeley.edu. Now it took me way longer to tell this story than for the actual data to get from here to there. It only took 80 milliseconds for that data to get from Cambridge, Massachusetts on the east coast of the US, to Berkeley, California, on the west coast of the US, and that might take a human like five hours, six hours at least to fly, not to mention waiting in the airport and then getting your luggage. That can be an all-day affair, when if I just want a cat from UC Berkeley, for instance, it's going to take me 80 milliseconds to make that request it would seem, which is less than 1/10 of one second. And then the cat probably takes about that much time to come back. 
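The guessing game with airport codes can be automated, at least roughly. The Python sketch below assumes a tiny hand-written table of codes and some made-up router hostnames (neither comes from the actual traceroute output), and it is only as reliable as the naming habits of the administrators involved.

```python
import re

# A few airport codes and their cities; a real table would be far longer.
AIRPORT_CITIES = {
    "ord": "Chicago",
    "den": "Denver",
    "las": "Las Vegas",
    "lax": "Los Angeles",
    "sea": "Seattle",
}


def guess_city(hostname):
    """Return the city for the first airport code found in a hostname."""
    # Split on anything that is not a letter or digit: dots, dashes, etc.
    for token in re.split(r"[^a-z0-9]+", hostname.lower()):
        if token in AIRPORT_CITIES:
            return AIRPORT_CITIES[token]
    return None  # no recognizable code, so no guess


# Hypothetical hostnames in the style of router names.
for name in ["core1.lax.example.net", "edge-sea-2.example.net", "10.0.0.1"]:
    print(name, "->", guess_city(name))
```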
And notice the variability, though. Sometimes, routers are a little busier than at other times, and so there's variance between all of these various measurements, and each of these, to be clear, is not cumulative. So they might go up and down. It's how much time it takes to go from my laptop to each of those routers. You don't just keep adding them, you keep looking back at the origin. So about 80 milliseconds in total. Well let's try another one, one that's a little closer to home here. Traceroute www.mit.edu, which is also in Cambridge. Already done. Also in Cambridge, Massachusetts, only eight hops away, eight routers between us, and indeed we seem to be going through Harvard's Core network again, then we get connected to Qwest, another ISP, and this is actually kind of interesting. It looks like my data is making a little stop in New York City if these names are to be believed. And that's kind of wild, and yet it doesn't even seem to go to mit.edu, but akamaitechnologies.com. So this is interesting, and my inference here is that MIT has probably outsourced parts of its website to a company called Akamai, which ironically is itself based in Cambridge, but their servers seem to be in New York City or thereabouts, and it seems that MIT is essentially using them as some kind of CDN, Content Delivery Network, which is indeed Akamai's business, to host MIT's website. So even though I think of MIT as being walkable from this theater here just down the road, their servers can certainly be somewhere else. And thanks to DNS, Domain Name System, and thanks to these routers, nonetheless can my laptop reach MIT's web servers really wherever they are in the world. And in this case, they're only six milliseconds away. So not necessarily as compelling when I can still walk to MIT pretty quickly, but six milliseconds is certainly faster than the six minutes it might take me to drive, or the half hour it might take me to walk. And what about places even farther? 
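The point about the measurements not being cumulative is easy to get wrong, so here is a short Python sketch. The timings are invented for illustration; each one is the time from the laptop to that hop, measured from the origin every time.

```python
# Hypothetical times in milliseconds from the laptop to each hop in turn.
times_ms = [1.2, 0.9, 3.0, 2.4, 5.8, 6.0]


def end_to_end_ms(times):
    """The time to the destination is the last measurement, not the sum."""
    return times[-1]


def wrong_total_ms(times):
    """Adding the rows double counts the early hops over and over."""
    return sum(times)


print("time to destination:", end_to_end_ms(times_ms), "ms")
print("misleading sum:", round(wrong_total_ms(times_ms), 1), "ms")
```

Notice that the second value in the list is smaller than the first: a later hop can report a shorter time than an earlier one simply because the earlier router was busier at that moment.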
What if I am interested in the news somewhere abroad relative to here? I might do traceroute www.cnn.co.jp if I wanted to trace the route between here and what I presume is CNN's Japanese web server for news. And here, we again see the data leaving Harvard's routers in steps one, and two, and three, and four. Seven didn't really answer, it seems. And then it got a little private, didn't really respond thereafter. But notice something interesting here is going on. Somewhere among these first few steps, I'm going through internet2, which is encouraging because that's a fast connection typically, then a nameless router, step 7, can't quite make sense of all of these. But maybe SEA is Seattle if those airport codes are to be believed. And then, wow, notice this gap. We're starting at like less than one millisecond, less than one millisecond, one millisecond, 20 milliseconds, then 85, then 106, then like 193, then 213, 191. That's a big jump, and it doesn't seem to just be a bit of variance. It doesn't look like just the routers are busy, it seems to persist because getting to each subsequent router takes about the same amount of time. Why is my connection so much slower all of a sudden? Why is it taking so long between steps nine and 10? Well, Seattle, where is that? That happens to be on the far west coast of the US, and maybe Osaka, Japan is right there across-- what-- the Pacific Ocean? And so it would seem that between steps nine and 10, maybe there's a really big body of water between these two routers, and that explains why all of a sudden there's so much of a delay. And indeed, the internet of course spans the globe these days. It spans oceans, either through big trans-Atlantic, trans-Pacific, trans-oceanic cables that are laid down by really large ships, or maybe it's via satellite, or microwave, or other technologies. The world is so incredibly interconnected, but you can see visually how those interconnections are laid out, and where they actually are. 
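Is an ocean really enough to explain a jump of that size? A back-of-the-envelope calculation in Python suggests so. Two numbers here are assumptions, not figures from the talk: that Seattle and Osaka are very roughly 8,000 kilometers apart, and that signals in a fiber optic cable travel at roughly 200,000 kilometers per second, about two thirds of the speed of light in a vacuum.

```python
def round_trip_ms(distance_km, speed_km_per_s=200_000):
    """Estimate the time for a signal to cross a cable and come back."""
    # Multiply before dividing so the arithmetic stays exact for round inputs.
    return distance_km * 2 * 1000 / speed_km_per_s


# Rough, assumed distance for a trans-Pacific cable from Seattle to Osaka.
print("ocean crossing and back:", round_trip_ms(8_000), "ms")
```

An estimate on the order of 80 milliseconds is in the same ballpark as the jump between 106 and 193 milliseconds seen above, which fits the guess that the gap is distance rather than a busy router.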
In fact, thanks to this animation, we can see even more visually what the internet looks like around the whole world. [MUSIC PLAYING] All right. Demonstration time. So within your home, or campus, or office, we had a number of devices, and one of them was like a cable modem, or DSL modem, or a FiOS device. So what does that device look like? Well if you have a cable modem, maybe from a company like Comcast whose brand name is Xfinity, you might have a device like this, and it usually stands up on your counter like this. It's got some blinking lights in front, and in the back are a whole bunch of connectors. Now what are these connectors? Well the biggest of them, and frankly the oldest one of them, is this metal thing here which is a coaxial connector, and this is what's long been used for TV antennas and cable connections for your own TV into the wall. And the kind of cable that you might use to plug into that generally is pretty thick, and it's got a cylindrical end, and a little pin in the middle, and it's often kind of annoying to screw the thing in there. But if you have a cable modem, odds are you've got a jack that looks also like this somewhere on one of your walls, maybe near your actual TV, and what you really just need is a cable like this. One end goes into the cable modem, the other end goes into the wall, and that's all you physically need in terms of a connection to the wall beyond, of course, the power cable which would plug in down here. And those are going to vary based on the model. But there's some interesting ports up top here too. There's some phone jacks it seems, because it turns out that with a lot of internet service providers these days, especially those who have digital support for not just internet services but also TV and phone, you can actually plug one or two landline telephones in here and get telephone service. And then below that are four jacks that look pretty similar, but they're actually a bit wider, a bit fatter. 
And so these phone jacks, if you never knew, are called RJ11 connectors, and that is what, historically, you would plug into the wall of your home or now the back of this device. And these other bigger ones are RJ45 jacks into which you plug generally the ethernet cables, which is the name given to network cables. So if back in the day, you had a phone with one of these things on the wall, you would have one of these RJ11 connectors, super small, and you'd plug that into the phone and then into the wall, or the back of this device. Meanwhile though, you might have an ethernet cable, which is a little wider. So whereas the phone connector might look like this-- yeah-- ethernet connector is going to look like that, and you can probably tell here just how much bigger one is than the other. And so inside of those cables are just a whole bunch of wires that actually allow the electricity to flow, the electrons traveling across the copper wires from this device into the wall. And from there, Comcast, or Time Warner, or whoever your internet service provider is takes care of the technology there on out. But what you can plug into this device via those cables-- not the phone cables, but the ethernet cables-- is your desktop computer, your Xbox, or some other devices that use wired internet. Or if your cable modem has, like this one does, Wi-Fi support, wireless capabilities, and even though there aren't antennas on this one, they're actually inside the case, which frankly might partly explain why this thing is so darn big. There's absolutely no good reason that these devices need to be this large, but this device happens to be not just a cable modem, but also a home router inside of which is support for DNS and DHCP. It also has Wi-Fi capabilities. So you don't actually need, with this cable modem, a second device. You don't need your own Wi-Fi device in the house. You can get all of that from your ISP. 
Now if you have FiOS, another technology that's in some cities here and abroad, you might have a device that looks pretty similar. This one, frankly, looks a little more elegant, and it probably has very similar jacks on the back. Some kind of coaxial connector that goes into the wall, and from there, Verizon or whoever your provider is might take it from there, Frontier in this case, and then you might again have some RJ45 jacks that allow you to connect devices in your home to this very device. But not all devices are this big. Here is another cable modem made by a company called Netgear, and it's this small. So case in point, ridiculous, not necessary, same technology, much smaller. Much smaller form factor, so the hardware that's inside this device is obviously much smaller. But we still see the coaxial connector, some kind of power connector there, just one jack for an ethernet cable, but that's probably fine so long as you have another device, a home router, or a switch to connect it to. Indeed, if you simply want to provide your home with a bunch more wired jacks, you might use something like this. So this is a Cisco Linksys device. It's a pretty dumb device. It's just a switch that's got a whole bunch of those RJ45 connectors. So you plug one of these into your cable modem, or into your home router, and then you can plug up to seven other devices into this device, thereby creating kind of a mesh network among those many devices. And this switch simply switches data, switches traffic among the several ports based on who's talking to who. Or, you might have something a little beefier that looks pretty darn amazing, I must say. Very geometric these days. This one also made by a company called Linksys, owned by Cisco. This might have these antennas on back, which suggests that this has Wi-Fi support. This device happens to be a home router, and it also has firewalling capabilities, Wi-Fi capabilities, and switching capabilities. 
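How might a switch decide where traffic goes "based on who's talking to who"? The talk doesn't say, but a common approach is for the switch to remember which port each device's hardware address was last seen on. The Python sketch below is a simplified model under that assumption; the class name, the eight-port count, and the short addresses are all made up for illustration.

```python
class LearningSwitch:
    """Toy model of a switch that learns which device sits on which port."""

    def __init__(self, port_count=8):
        self.port_count = port_count
        self.table = {}  # maps a device's address to the port it was seen on

    def receive(self, in_port, source, destination):
        """Handle one frame and return the list of ports to send it out on."""
        self.table[source] = in_port  # remember where the sender lives
        if destination in self.table:
            return [self.table[destination]]  # known device: one port only
        # Unknown device: send out every port except the one it came in on.
        return [p for p in range(self.port_count) if p != in_port]


switch = LearningSwitch()
print(switch.receive(0, "laptop", "xbox"))  # xbox not yet seen: all other ports
print(switch.receive(3, "xbox", "laptop"))  # laptop already seen on port 0
print(switch.receive(0, "laptop", "xbox"))  # xbox now known on port 3
```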
Indeed, in back, it has not just a connection for your home router, or rather your cable modem, or your FiOS device to plug into, it also has a few, but not as many, ethernet jacks, or RJ45 jacks for your several devices. So which devices you need entirely depends on your own situation, and odds are the first person to ask is your internet service provider. Increasingly these days are internet service providers bringing you, or selling you, or renting you a device that takes care of all of this. So odds are you just need these days one device, and not several, but sometimes you might get something lower profile like this one here, and maybe you'd buy it yourself and plug it into the wall, and all your ISP does is take it from there. They don't give you any devices for your own home, so you might have to wire some of this up together on your own. Now at the end of the day though, it is all kind of pretty simple, whether your cable is this to connect your various devices, or this, the coaxial connector, or even this, which is a fiber optic cable which essentially has little strands across which light travels even faster than electrons across these copper wires. Inside of many of these cables, like this one here, is just a bunch of wires and they're actually pretty cheap devices. And in fact, I thought it'd be fun to maybe get our hands dirty here with a cable that hopefully I won't need any more, and see if we can't see inside this here thing. So I wouldn't necessarily do this more than once because scissors aren't going to work very well on this one. And actually, we can see what's starting to happen before I even finish. Notice that as I pull back the blue part of the cable, which is really just a rubbery sheath, you can see that there's eight different wires in there, two of which I've cut, so hopefully those were the bomb-defusing wires I cut. If we just keep pulling, you can see a lot of the wires inside. And these wires all are different colors. 
Some of them are striped, some of them are solid, some of them have been cut so they're shorter than others, and so long as the right colors on this end line up with the right colors on this end, your two devices will be able to talk because some of these wires are used for transmission, some of them are used for receiving, some of them might not technically be used at all. They're really used for insulation and cancellation of what might otherwise be interference. So inside of here is pretty simple technology, and much like we've seen in other contexts, there's just this layering, and layering, and layering of complexity so that at the end of the day, this is what's carrying your data, but there's just so much software and so many interesting advanced ideas on top of it, all of which ultimately make the internet work. Now how about some homework? So your homework for tonight, perhaps, is when you go back home, whether it's your house, or your dorm, or maybe your company if you're staying late, find a device that looks a little something like this. Maybe it's your cable modem, or your FiOS device, or maybe it's your home router, or maybe it's someone else's home router, and turn it around carefully, take a look at the various connectors on the back, see if you don't recognize some of the various shapes, and some of the various labels, and some of the, ultimately, technologies that we've been discussing here. If you really want to be brazen, go ahead and hold your breath and unplug everything, and see if you can, via a bit of pattern matching, plug everything back together. Of course in the process, you'll take down your entire internet most likely, or your company's, or your neighbor's, in fact, very possibly your neighbor's as well. And that's OK if you're sort of confident you can reassemble that. I mean, if you're really daring, and you have an extra ethernet cable lying around, go to town on one of these things here. 
You're not really going to be able to put this back together without special hardware and a spare little clip, but that would be the extreme form of getting your hands dirty here with the internet.
Info
Channel: CS50
Views: 70,834
Rating: 4.8958783 out of 5
Keywords: cs50, harvard, computer, science, david, j., malan
Id: n_KghQP86Sw
Length: 51min 50sec (3110 seconds)
Published: Fri Sep 01 2017