Internet History, Technology, and Security - Full Course from Dr. Chuck

Dr. Chuck is one of the best people in the world to create an internet history and technology course. He's lived through it and has interviewed many of the most important people who helped create the internet we know today. You are going to learn so much in this course.

Welcome to Internet History, Technology, and Security. My name is Charles Severance, and I'm a faculty member at the University of Michigan School of Information. I have been working on this course, one way or another, for nearly 20 years. For me, it actually goes back to when the internet first started. That was in the mid-90s, and in the mid-90s I had the good fortune to have some friends who decided to help me go on television: John Liske, Amy Leahy, and Richard Wiggins. We had a television show. First it was called Internet TCI, then later it was called Nothing But Net, and then we called it North Coast Digital. It gave me an excuse in the mid-90s to make a bunch of videos, film folks, and ultimately create an interesting oral history of the internet. This was in the 90s, and we didn't even know how it was all going to turn out, so Rich Wiggins and I were walking around and filming all kinds of things.

Another thing that I did in the late 90s was to build a technology for teaching and learning on the internet. I hold the distinction of being one of the first people to stream an entire semester online. I streamed it so early that I couldn't even stream video. I could stream audio using RealAudio, and I had a set of slides that would flip every few minutes, and, because there was no video, you would see a terrifying picture of me staring at you in the upper left-hand corner of the screen. I went from Michigan State University to the University of Michigan in 1999, and I built a new lecture capture system that captured cameras, document cameras, and screen capture, and allowed for annotation. I called that Clipboard 2000. Everything back then was called something 2000.
In 2010 I was part of a group that was going to convert IEEE Computer magazine from a paper magazine to a digital magazine, and one of the ideas was that we should have some videos. So what I would do, for about five years, was add a few days to a business trip, take my camera, and go somewhere. I had two cameras, a little backpack, and a lighting kit back in 2010, 11, 12, and 13, and I would interview various people, and that let me complete my set of interesting people in the internet. This all culminated in 2013, when Coursera was founded, and I was invited to be one of the first six people at the University of Michigan to be on Coursera and create a course. So I created this course, Internet History, Technology, and Security. Back then we didn't know where this was all going to go, but I just looked at it as a fun opportunity to share nearly 20 years of my oral history materials and also try to teach people something about the internet. The idea is that the course is supposed to hook you with a little bit of history but then teach you TCP/IP and other things.

One of the things that I did, since I always have a media bent to everything I do, is go around the country meeting my students. You can find these at dr-chuck.com/office, and you can see 70 videos from all over the world of me meeting my students since 2013, when I first started teaching. That leads us to this version of the class, which is an updated version. Again, this is part of the beginning of a technology-oriented curriculum with a very liberal-arts kind of focus. You may have seen some of my other courses: Python for Everybody, Django for Everybody, Web Applications for Everybody, Postgres for Everybody, and who knows, I've got a whole bunch of things in the hopper that I'm working on, so by the time you're watching this there might be more. Keep an eye on Programming for Everybody, www.pr4e.com, for this and upcoming materials. So let's talk a little bit about the
course at ihts.pr4e.com. That's Internet History, Technology, and Security, Programming for Everybody, so ihts.pr4e.com. As I said, the idea is that we can't look at technology without understanding history, and I want those with a liberal arts background to have a way into understanding technology, because technology does not exist in a vacuum. Technology has continued to change and is going to continue to change, and if we don't take a look back, we will have no good idea about how to deal with the future. So we need to understand the past as we look toward the future.

This material came from a course I taught from 2009 to 2017. Most of the course was a Python course, but about halfway through, right after the midterm exam, I gave everybody a break from Python and taught a couple of weeks of internet history, technology, and security. So that's what this course is: this course is the break from Python. Now, in my Programming for Everybody curriculum, I kind of think of this as the first course, the place that you start.

The outline starts with World War II and goes through 1999 as our history. Then we look at the technology, from the ARPANET to the internet, TCP/IP, and all those protocols. Then we take a look at the layer of security that makes all the kinds of commerce and so on that we do on the internet possible. So we start back in the beginning. I think of the beginning of computation and communication, that is, the internet and the technology we use on the internet, as going back to Bletchley Park. I love Bletchley Park. I've often met my students at Bletchley Park, and I look forward to the next time I get to go to Bletchley Park.

In the second part of the course, and believe me, you'll be able to get it, don't worry, we're not going to teach you any math. We're going to teach you things like what the layered network model is, what TCP is, what Ethernet is, what Wi-Fi is, how those things work, how IP addresses work, and how various
applications work. That's the middle part. Then we take a look at what it takes to protect the information that's moving back and forth across the shared internet. The key to the internet is that it's shared by all of us, but we've got to keep our data separate, and then we've got to keep our data secure from prying eyes. So we talk about the technologies and the math, again, no big math, just the basic concepts.

There's a companion website, ihts.pr4e.com, and it gives you the same course material in a nice modular form, with a bunch of links to all the references and quizzes to test your knowledge. If you want, there are some threaded discussions that you can participate in if you log in. It's not practical for me to interact with a scalable class with hundreds, if not millions, of students, and my classes do have millions of students. I watch things, but Twitter is probably the best way to get me, and I do want to know if something's broken. I want to fix things as fast as possible.

So, just a little bit about me, if you're still listening. I do have two tattoos. These are real tattoos. On my left shoulder is my research and on my right shoulder is my education. I've got three degrees from Michigan State University in engineering, and my faculty position is at the University of Michigan in the school of library science, which is currently called the School of Information. On my right shoulder is my research, and my research since 1997 has been in building and using educational technology to teach. All the logos are what I call the Learning Tools Interoperability ring of compliance, and LTI is how we plug all of my autograders and so on into these teaching systems. So my research has been how to best teach with technology, literally from 1995 until the present time. I have a race car, which is one of my hobbies, and my other hobby is hockey. So that's a little bit about me. I have a
number of folks to thank. A lot of the material here was the work I did for IEEE Computer magazine while I was a writer for IEEE Computer magazine. They have given me permission to reuse some bits of it. Richard Wiggins, my lifelong friend who sadly passed away, allowed me to use all of his video material. Copyright clearance is an important part of any free online course, and Open Michigan helped me with all of that. So with that, let's go ahead and get started at ihts.pr4e.com.

Hello, and welcome to our first real lecture of Internet History, Technology, and Security. We're going to start, of course, with internet history. I'm Charles Severance, and I'm your instructor. We're going to start at the dawn of electronic computing. Computing started early, with abacuses and humans, but we're going to start with electronic computing in particular, because it was the moment where computing and communication were sort of co-born at the same time. Then communication before the internet became normal, then early internet research, and then the internet itself, which was in academia and then of course went out into the real world. Then the web, which really took all this connectivity and made it easy to use for everybody. Our view of this network now is very much through the web. From that point forward, we look at the commercialization of it and the ubiquity of it, the widespread use of it.

The first video that I want to show you is a little longer than most of the other videos. It is about Bletchley Park. There are many heroes at Bletchley Park. Of course, this was a top-secret code-breaking effort by the British government during World War II. If you think about it, perhaps there's no time in history quite like World War II. If you go back to what kind of technologies we were using in 1910 and 1920 to
the technologies we were using in 1940, it's an amazing difference. Jet airplanes, radio, radar: so many things were invented and made usable and made production quality during that period of time. War, of course, is terrible, but it does cause governments to fear for their lives and invest very heavily in research. So in some sense, even though war is a terrible thing, we benefit from the extensive research. They were trying to solve wartime problems, but they ultimately solved problems that have changed our peacetime world in wonderful ways.

Bletchley Park is north of London, between Oxford and Cambridge, in England. It was a code-breaking effort, and at one point there were over ten thousand people working on top-secret efforts to decode encrypted messages, initially from the Germans as they were using radio. They call World War II a world war mostly because it touched more of the world geographically than any other war before or since: Alaska, the United States, Italy, Africa, Russia, Japan, the Philippines. It truly was geographically distributed, and it required unprecedented communications just to effectively pursue the war. This meant that communications had to be wireless, and the problem with wireless is that anyone can put up an antenna and listen to the wireless signal. There's no way to hide the wireless signal, unlike a wire, which you can hide: if no one has access to the wire, no one can see what's in it. But if you're using wireless, with fast-moving armies over long distances, then someone can intercept the wireless signal. So the trick, of course, was to create an encrypted wireless signal, so they could see everything you sent but it didn't make any sense to them. It all sounded like gibberish, unless of course you knew the code. So a key technology was building codes and code-making machines. A good example of this was the Enigma
made by Germany. We'll talk about these in the very last part of the class, the security part, how these codes and ciphers work, but they scramble material in a way that's generally unintelligible. Of course, the really bright folks at Bletchley Park, Alan Turing being one of many really bright people there, used mathematics to say, you know, these codes may be more crackable than we think. There were folks from Poland who also informed them, saying, look, let us show you what we did to crack it. Right before Poland was invaded, they were working heavily on the mathematics of cracking. Then they built these machines.

In this video that I'm about to show you, what I try to do is contrast the two machines. One is an extremely fast mechanical computer, with relays and switches and things that spin and gears that move back and forth. It's a very physical computer that's looking for patterns, and as it's spinning, it's checking possible encoding combinations, to do a brute-force check of lots of different possibilities. So the Bombe was a mechanical computer. Then, as the German encryption improved and they used different techniques with more sophisticated encryption, they just couldn't decrypt it with a mechanical computer anymore, so they were forced to build something faster. That faster thing was the first truly powerful general-purpose electronic computer in the world. Of course, it was kept secret, certainly until the 60s and the 70s, and much of it was still kept secret even until the 90s, so its place in history is kind of a recent understanding. You can look at early history texts that talk about the first computer, and they don't mention this one. Well, that's because it was a secret for a long time. So the Bombe was a powerful mechanical computer; the Colossus
was a powerful electronic computer. I had this picture drawn by an artist for me. In addition to showcasing the moment where a mechanical computer, no matter how hard you tried, wouldn't work fast enough, and the electronic computer was forcefully created out of a tremendous need, what's also really interesting is the fact that Bletchley Park during this time, with 10,000 people, had all kinds of people: language experts, mathematicians, engineers, welders. It was a really cross-disciplinary activity. They were solving a problem of decrypting German transmissions, but in the pursuit of that they ultimately solved a problem of electronic communications and computation. So this picture is really trying to show how Alan Turing was really very critical, but there were other people, like Gordon Welchman and Doc Keen, and the folks from Poland who informed all of this, and university colleagues. This whole thing was very much a connected, collective group of really bright people, highly motivated and well funded, and they created this. So let me go ahead and pause now and let you take a look at this film.

[Music]

Today we're at Bletchley Park in honor of Alan Turing's 100th birthday. Bletchley Park, just north of London in between Cambridge and Oxford, was considered by many to be the birthplace of modern computer science.

In 1938, again in anticipation of war, the Government Code and Cypher School, in the main through their operational director, a gentleman called Alastair Denniston, had drawn up a list. He had sent two people out to trawl through the people at primarily Cambridge and to some extent Oxford, and drew up a list of people whom they approached and who agreed that, in the event of war, they would report immediately to Bletchley Park. Britain declared war on September 3rd, 1939. The following day, September 4th, Turing and several others reported to Bletchley Park. As the requirement
for more people grew, predominantly a gentleman called Gordon Welchman, who had been on the initial list and had arrived the same day as Turing, went back to Cambridge and started recruiting all his best students. In effect, what they created here was the world's first skunk works: a secret, innovative organization where there were no rules. Eventually their main benefactor became the Prime Minister, Winston Churchill, and they persuaded Churchill that this place needed more resources. Churchill agreed. A famous letter was written to Churchill by Welchman, Turing, and the two deputies, and one of their number, a gentleman called Stuart Milner-Barry, actually delivered it in person to Downing Street. Churchill, amazingly, read the letter the same day it was delivered. He put a famous "Action This Day" stamp on it, with a handwritten note to his chief of staff, a chap called General Ismay, which said: expedite with extreme priority and report to me when it is done. Really, from that point, September 1941, Bletchley Park got all the resources that they needed. They threw, in effect, the smartest people in Britain together and said, here's the budget, this is the end game, and that's why they invented some of these technologies that probably wouldn't have been invented for years.

In many ways, Bletchley Park was an early version of a multidisciplinary science center, much like CERN or NCSA is today. Many brilliant people with different skills and backgrounds were brought together to solve difficult problems. The combination of the skills and the collaborative environment resulted not just in solving the problems of cryptography that they were facing, but in addition solved broader problems for all of computing and all of society.

When the people arrived here, they knew that the Germans had taken a machine which we tend to call Enigma. Enigma was a particular variant of an encryption machine. That machine was modified
from the commercial version, and it allowed them to encrypt messages. This was going to become a machine to be used for operational communications: very short, 200 to 250 character messages. Hitler and his generals had conceived a type of warfare never seen before, which became known as blitzkrieg: very fast movement, 50 miles a day, particularly when they invaded France and in the latter stages of their invasion of Poland. You couldn't use fixed-line communications; you needed to use wireless communications. Here was a device which was portable, weighed 25 pounds, and ran off a battery. You could encrypt messages, and then, as a separate process, they could be sent using the fairly new technology of wireless radio.

It was the Poles who were the first to recognize that the age of machine cryptography had arrived, and that the sort of people who would be good at dealing with it were mathematicians. In fact, they even went so far as to put on a course in cryptography at the University of Poznań and invited 20 or 30 young German-speaking mathematicians to enroll. It was a very difficult course, and by the time the course was finished there were really only three graduates, whom they recruited. Their names were Marian Rejewski, Henryk Zygalski, and Jerzy Różycki. These three were the core of this team, and they are the ones who really made the early breakthroughs.

They convened the conference at their secret headquarters. This was actually July '39, and at that conference they gave the British everything. They revealed what they had done, and they gave them a replica Enigma machine and all of their work. That information came back to Bletchley Park, and once they were established in Bletchley Park, they then used the Polish method to break the Enigma machine. The Poles had actually built some machines themselves, one of which was called a bomba. The name apparently comes from a Polish ice cream dessert of the same name. It's in effect vanilla ice cream with chocolate sauce around it. This machine was
called the Bombe in honor of what the Poles did. But the Poles' technique was based on the particular way that, at that time, the Germans encoded their messages. They at that point were repeating the message header, and that was the attack which the Poles developed on it. They believed, correctly, that when the war started the Germans would change the way they were doing that, and they called for help. They told Turing and the people here everything they'd done, for which we are and should be very grateful, but they also proved that it was possible, and that, I think, was the spur that kicked the Brits into actually doing something about it.

Now, Turing understood the weaknesses of the way they'd done it, and developed a crib-based mechanism, based on the fact that every military organization in the world cannot stop itself sending stereotyped messages. He then talked to a brilliant team of engineers, led by Doc Keen, who told him roughly the speed at which it would be possible to examine potential stops. So between them they designed the machine called the Bombe, which would look for that and do so in sensible time.

Here is an example where we have intercepted this message, SNMKG. We're pretty sure what that is, because this particular operator always did this: this is his morning weather forecast. So here we have the guessed German plaintext, Wettervorhersage, "weather forecast", and the two are lined up. At that point the Enigma's inability to encode a letter as itself is useful, because you can make sure you've got no clashes here. From these guessed letters, the crib makers would derive a picture like this, called the menu, where you'll notice that at position ZE here, G is encoded as E, so we have G going to E. At ZP, at the end here, E goes to V. This is reversible, of course; we can describe it the other way up. So here's E to V; that's ZG. Here S is going to A, sorry, S is going to B, excuse me. So here's another link, and eventually we can go all the way around and we can close this loop like
that. It was the closed loops that they were looking for. The operation of the Bombe was simply that if there is a letter which you can feed in at this point, and which comes around through a string of one, two, three, four, five, six Enigmas and comes back as that same letter, then it's possible that the positions of these wheels match the positions of the wheels when Gunther started encoding his message on his Enigma. It turned out to be a bit more picky about loops and menus than perhaps they had hoped, although it worked, and at that point Welchman added a completely left-field masterstroke called the diagonal board, which I think it is fair to say made the difference between success and failure for Turing's original idea.

Here are my three diagonal boards. That's one, that's the other, and the third one is in a place where you can't see it. Can you see Z down to A? There's nothing connecting that to the machine; it's just bolted there for convenience. The only connections to the machine are when you connect X into one of the Enigmas. When we turn the machine on, you'll see it run up to speed. The motor will be running and the clutch will drop in. Then you'll see the various carry mechanisms happening, and when we stop the machine, you'll see it slow down and then the clutch drop out. So, Tony, if you'd run it up for us, please. OK, would you turn on carry hole? OK, so what you saw there was the middle carry; then we switched on carry home to make the slow carry work all the time. So these bars are just pushing around all the drums, and you can see that all 36 Enigmas are in step. They're all just turning, and they are a resource. These are the very fast relays, the three banks, actually four, for the chains that do the detection. Here, this is the control logic. And this is the 26-bit register that notes a stop and then allows the machine to slow down before it stops. Our favorite statistic is the sheer
amount of wire in the thing. We've lost count, but it's somewhere between 10 and 12 miles of wire in each one of these.

The Germans really had two main systems they were using. Enigma was their operational system, used for all specific operational communications; again, very short, a maximum of 250 characters. Instructions and orders for a panzer division to move from one location to another, orders for a submarine to attack a convoy, for aircraft during the Battle of Britain to attack specific targets. The Germans started to introduce, probably through 1941, a different system. The intercept service, which Britain had put into place to intercept these messages, started hearing traffic which was clearly not Enigma traffic. Enigma traffic was Morse code, easily identifiable; people can be trained fairly quickly to listen to Morse code and transcribe the letters. This was a totally different signal. It was not discernible; it was not transcribable. This was Adolf Hitler's communication channel to his army generals: very long-winded communications, top secret. So this was an even more daunting task, arguably, than Enigma, because of the encryption machine that was used with this system. There were various ones of them. Hitler referred to these as Geheimschreiber, his secret writers. Several companies were making such machines: Siemens, which exists today, and Hagelin, but there was also the Lorenz company, and that was the primary machine, particularly the Lorenz SZ42. That machine, unlike Enigma, which had three encryption wheels, had 12 encryption wheels.

They approached the Post Office research lab at Dollis Hill in London. A young engineer called Tommy Flowers was brought into the project. He was in effect told, well, go ahead if you can get approval. Flowers went back to the Post Office, got approval, put a small core team together, and in less than a year had a working prototype, quite a remarkable achievement. The first Colossus, as this machine became known,
the Mark 1, had 1,500 valves, or vacuum tubes, in it. They brought it to Bletchley Park, and it worked almost right away. The people here were convinced, and they asked for a more powerful computer. Flowers had already anticipated this; he already had it in production: 2,500 valves. In an era when no machine that used valves had more than about 15, here was a machine with two and a half thousand. Flowers was told that they needed the Mark 2 by June 1st, 1944, because there was a certain date in the diary: June 5th, 1944, was the scheduled day for Operation Overlord, D-Day. It actually ended up being June 6th because of bad weather. Flowers delivered the first Mark 2 to Bletchley Park on June 1st. They switched it on, and it worked straight away. They immediately started decrypting messages between the German high command, and much to their delight they discovered that all of the subterfuge about the D-Day landings had worked. Hitler had believed that the main invasion was going to be at the Pas-de-Calais, not at Normandy. That intelligence was fed back to General Eisenhower at Allied supreme command headquarters, and there is reasonable evidence to conclude that D-Day would not have gone ahead on June 6, 1944 without the intelligence from Bletchley Park.

This is the Colossus computer at Bletchley Park. We're in H Block at Bletchley Park, which was one of the first purpose-built computer centers, and some 10 of these machines were installed here, starting in January 1944.
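
As an aside on the Bombe demonstration earlier in the film: the crib makers relied on the fact that an Enigma never enciphers a letter as itself, so a guessed plaintext (the crib) can only sit at positions where no crib letter lines up with the same letter in the intercept. What follows is a minimal Python sketch of that clash check only, not of the Bombe or of menu building. The function name and the sample intercept are made up for illustration and do not come from the film.

```python
def valid_crib_offsets(ciphertext: str, crib: str) -> list[int]:
    """Return the offsets at which the crib could line up with the ciphertext.

    An offset is rejected if any crib letter sits opposite the same letter
    in the ciphertext, because Enigma never enciphers a letter as itself.
    """
    ciphertext = ciphertext.upper()
    crib = crib.upper()
    offsets = []
    for start in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[start:start + len(crib)]
        # A "clash" is a position where plaintext guess == ciphertext letter.
        if all(c != p for c, p in zip(window, crib)):
            offsets.append(start)
    return offsets


if __name__ == "__main__":
    # Hypothetical intercept, invented for this example.
    intercepted = "QFZWRWIVTYRESXBFOGKUHQBAISE"
    print(valid_crib_offsets(intercepted, "WETTER"))
```

Ruling out impossible alignments this way is cheap and needs no machine at all; the surviving alignments are the ones from which a menu of letter links could then be drawn up.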
Now, in the years previously, a new way of encrypting messages had been discovered that was being used in Germany, for those high-level messages between Hitler and his generals. They were encrypted on a machine called the Lorenz machine, and it used teleprinter codes to actually transmit those messages. By '43 it was taking some six weeks to decode those messages, laboriously, by hand. The problem with that is that within six weeks the usefulness of the intelligence that you gain from those messages is obviously gone, so that process needed to be speeded up. Some electronic techniques had been tried here at Bletchley Park, but this machine, when it was put into use in early '44, reduced that six-week period to decode the message down to six hours, and that's really just a phenomenal jump.

The messages themselves were received at various intercept stations around the country and were punched onto paper tape. There are two paper tapes on the machine here now: one that we're running, and one which is ready for the next run. On this tape, if I take one, each row across the tape is punched with five holes, and each of those is an alphanumeric character. That's the encrypted message that was received, punched onto tape. Now, although Colossus has really all the elements of a modern electronic computer, it doesn't have the memory that we expect in a stored-program computer. So what happens is that the tape of that message, some 5,000 characters, is actually formed into a loop and is read over here with a series of photoelectric sensors and lights. Each character is read, at 5,000 characters a second. Each clunk of the machine we can hear there is another 5,000 characters being read into the machine, so we're reading that same 5,000 characters each time.

This is the control panel of the Colossus
computer, so it's this side of the machine that the Wrens in wartime would actually have been using. I can set up the particular algorithm I want to test, and the routines I want to test, using these control panels, and each clunk of the machine at the moment is another state where we're trying a new algorithm and repeating that algorithm on that repeating 5,000 characters. The results that the machine detects are presented here. We're doing a statistical analysis of that encrypted tape. The machine was run for some six hours.

There are two and a half thousand valves in this machine. You can't see that many from the front, but once we go between the racks of the machine you get an idea of the scale. Each valve has a heater, a hot wire in the center of the valve, and operates at about two to three hundred volts DC. Two and a half thousand valves plus the power supplies use over 8 kilowatts of electricity, and at this point, when we're standing in between the racks, you can feel the heat coming off them.

[Music]

Colossus is a combination of electronics, and we can see the electronics here in these valve chassis, and also electromechanical switchgear. There are banks of relay panels here. These big single-motion selectors are a sort of motorized switch that will actually spin around; there are banks of those in the machine. The technology is equipment that was well known to the British Post Office. This machine was designed by a chap called Tommy Flowers, and Tommy Flowers was an engineer at the research center in London for the British Post Office. Tommy was asked to look at the problem of how to automate this manual task, and Tommy came up with the idea of using electronics to do this. At the time, the idea of using more than half a dozen valves in any circuit was simply pooh-poohed. Valves had a bad reputation. People had valves in their
radios at home, and they failed when they switched the radios on. So the idea of using two and a half thousand was just phenomenal. But Tommy understood that it's almost certainly the thermal shock that kills valves in the first place, so if you leave the machine on and don't subject it to that shock, you won't have a problem. All 10 Colossus machines here at Bletchley Park were left on permanently. The machines were operated by Wrens in three shifts throughout the day. We have to be equally careful. This is obviously a replica of the machine itself, but the same problems apply: we need to be careful in bringing the supply voltages up gradually so we don't suffer from thermal shock, and in the same way when shutting the machine down.

Tommy had the idea of generating the key, because it's a key that we're comparing with that encrypted tape, electronically, and these counters here, which are called the thyratron rings, the thyratron valves, actually hold that count. Now, we describe Colossus as not having a memory. We say not having a memory in the sense that modern computers have a memory that's common for data and for programs. The program here is set on those control panels at the front, and that's switch and plug programmed, much like, say, its contemporary, the ENIAC machine. But these valves at the back do act as a store in the sense of counters, and they're actually counting the score that we're getting on each pass through the algorithm.

We can see at the back here, well, certainly some modern technology. I suspect the wartime guys who were building and debugging these machines would have given anything for a reliable oscilloscope, let alone a logic analyzer. But if I look into the machine down here at the back, you can see the projected image that's coming off the tape, and that's shining onto five photosensitive valves, and it's those valves that are actually reading the 5,000
characters every second paper tape was used well into the 1960s and 70s but this speed was simply phenomenal and hadn't really been thought of before and our original museum director tony sale wanted to rebuild the colossus and build a replica of the colossus machine and this was done 20 years ago there's very little information out there there are a few pictures that were kept possibly illegally a few scraps of circuit diagrams and just the memories of some of the original pioneers who worked on the machine and when tony started the plan tommy flowers was still alive and remembered and was able to draw circuits of course finding the parts as well but the important thing is the machine was designed by a post office engineer at the time an engineer that was used to designing systems for telephone switching so a lot of the components here everything from the relay banks to the switches to the power supplies were common to british post offices in the pre-war period so when tony then wants to find the components to rebuild the replica he's lucky in the sense that the last of those exchanges are being decommissioned from post offices around the country tony was therefore able to really back up a pickup truck at the back of the exchange and take away all of those components and dismantle all of those components and they are they were absolutely perfect and it was just that the timing really couldn't have been better so alan turing actually left bletchley park after about three and a half years his involvement really uh finished he went to the united states turing was not actually involved in the development of colossus some of the statistical work that he did was used by the people predominantly bill tutte who's the guy that actually reconstructed the lorenz machine quite remarkably a remarkable feat indeed uh turing got involved in things like speech scrambling systems and went to the united states and was involved in other projects so he wasn't really involved 
in colossus at the end of the war turing went to work for the national physical laboratory again pursuing these sort of ideas and then of course he ended up at the university of manchester where max newman who had run the department where the colossus computers uh had been located at bletchley park also ended up as the head of mathematics and obviously then at that point turing became involved in the very early uh computer developments in britain uh gordon welchman emigrated to the united states at the end of the war in 1948 he emigrated to the united states and became involved in many of the early american computing developments project whirlwind welchman worked at mit taught the first course in computing science at mit so turing then was involved in things like the manchester baby and the ferranti mark 1 and the early uh computing developments uh i guess this is well known of course turing tragically uh committed suicide in 1954 [Music] so i hope you enjoyed that again i tried to showcase the the juxtaposition of the mechanical computers moving to the electronic computers and how it was a collective effort by by many different people and so one of the things that happens sort of after this is um after the war they didn't need bletchley park anymore so they shut bletchley park down and all these people who had worked on all these things and built all these things went to various academic places like mit manchester harvard and what happened was is what they had done was a secret and they could say nothing about it but they had realized and they couldn't forget that electronics could do computation and electronics could do computation far more rapidly than anything we've ever imagined before so a series of computers came up really fast after the war they kind of you know relaxed the war's over we feel good but let's start thinking about what we would build if we were going to do computations like weather or other kinds of cool computations instead of 
just wartime computations and and so these next generation of computers that are often thought of as the first computers in the world were really kind of like the follow-on computers to the computers built at bletchley park and so the immediate post-war period in the 1940s took the us and uk code breaking there were similar things in the united states as well and i'm sure elsewhere that pushed the envelope of computation but the folks that had built all these things for wartime purposes switched and moved into academic and peacetime purposes and so they built i mean if you think about it they have less pressure but they already know it's feasible and they could kind of sit back and relax and be like okay that would be a little better we kind of compromised there so they built some really elegant computers and and you know they can they can all sort of fight over who's the first um but ultimately they all pretty much came out uh very quickly one after another because the idea had escaped and so this is the beginning of sort of electronic computation right the fact that electricity can be used to represent data and that can be used to change data very rapidly as compared to other things that used physical storage for uh computer information so this was a great time electronic computation from the specialized to the general purpose and these people really built some exciting computers many of the architectures of which are kind of still sort of with us a lot of the architectures were innovative the innovative architectures that we still use today were conceived and imagined there [Music] but students get a better understanding of something if they know the history of it and know where it know where it's come from that's my my view so when i go to class when i go to the interface design class for instance i can take them down to the museum and show them what an early interface was you know we talk about interface design and they think of their 
their handheld computer their laptop their desktop and they look at the screen and they think that's that's an interface and sure it is but where did you know how did we get there what was the early interface with computers i'm going to take them down to the ferranti sirius and they say well what's the interface here with the computer so it's it's um giving them the idea of what an interface is by by taking them back to something that's that looks very primitive and and then we look at the evolution of interfaces over time we look at the different um applications you know we go back and look at visicalc and wordstar and things like that and to progress through the the different versions of these different products and you actually have like running physically no we don't know it's also from the web but i i think it just broadens their outlook on it and i think it's a danger with computing students that they become very can become very focused on the latest and and quickly discard the technology they're using when the next one comes along and forget about it and even before we had the museum we were lucky we had here csirac i've mentioned csirac the world's number four digital stored program computer it's left over used till about 65 so about 72 or so it arrived on campus just for storage purposes but we put it in a display area when i first arrived for my teaching here in about 88 to teach i would always take my introduction to computer architecture students past that it had so much to tell students about the origin of operating systems the primary function of operating systems to allocate resources and the efficiency of use of resources with the programming students i can show them things like punch cards and things like this you know how do we get information into the computers and show them things like that so it might be part of one class in a semester that i use the museum in particular for for my classes and others use it too in similar ways but we do have 
school children coming on visits and then we have about an hour or two and we take them through the museum starting with the the calculating machines and we talk about what did people do before computers and i've got things like this what is the computer where did we get the name of computing from and i've got this picture of women using slide rules in 1948 i think it was in this is in america doing their calculations and the idea that the first computers were people okay and and often they were women so that's kind of an interesting bit of social history there so trying to connect the computers and state of the technology to what was happening in society at the time the physical artifacts are very important that idea of the physical we still carry forward even today as we're looking to see what we can do with the full museum that we've created now we've toyed with ideas of a web-based series of exhibitions photographing all of the artifacts and making sure that people can sort of browse through and move through halls we feel that even with that web-based teaching methods we would still like to have physical artifacts we've toyed with um a box of artifacts that we would send out to schools that are working on our program that do an educational program so the physical hands-on is a very important thing museum curators understand it people who aren't museum curators tend not to we had a staff member in our in our school and when we started the museum i was talking to her and and said oh we will have slide rules in there now she had done an honors degree in maths and she had not heard she not that she hadn't seen a slide rule she didn't even know what one was and that to me was amazing because when i went to uni everyone carried a slide rule you know we all had this my slide rule from our universities we all had these as science students engineering students we all had these tools and she was probably 10 years younger than me and suddenly she had no idea what i was even talking 
about hadn't heard the word so i think that really made me realize that histories are so quickly forgotten and there's a danger that um we just um yeah we we forget about these things and if we um don't have them around for people to see they forget about the history of uh you know the the technology that they're using and i think that's a real danger [Music] also in the uh post-war period in the 1950s as a bit of an aside um there was a there was a realization by the certainly the federal government of the united states that academics were kind of useful right i mean uh the people that really had won the war for them were academics not just computer people but also people that built designed airplanes and radar and things like that and so there's a sense that the safest country was the country that had the smartest people and in many ways the entire us educational system was built to find really bright scientists and educate them very effectively and so um and so there was there was a a real boom in building science departments uh the national science foundation was founded uh created right after the war and uh and so the united states and and other governments really greatly invested in research and there's this movie a beautiful mind and i've got the url there at the bottom it's not required but i love the movie and it gives you a sense in particular the scene that i'm pointing to gives you a sense of how mathematicians were in many ways uh treated like celebrities so i'll pause just for a minute in case you want to pause this video and then go take a look at the john nash video [Music] so of course john nash was a real person he received his phd in mathematics at princeton at 22 years old he was a prodigy he was on the mathematics faculty at mit in 1951 through 1958 and he had adult onset schizophrenia uh right before he was 30 years old and uh and it was quite a severe case and it took it he he spent a number of years in sort of mental 
institutions homeless etc but in his work in princeton uh in an area of economics called game theory he did something that was quite innovative and it really revolutionized in many ways our understanding of economics and and he sort of he he got his schizophrenia under control in the mid 90s and uh and he was awarded the nobel prize for economics in 1994 and so this is just kind of this moment where this all this stuff was blossoming in the 40s and the 50s now in the 1960s we started seeing a different way of of computing this was moving from the research into the mathematics of building computers to the applications of the computers and we needed to have connectivity because you would build a computer and you know in 1955 there might be one computer available to a university researcher in the entire state and so you had to connect them together and so we would often end up with these things we call terminals in our office and they would use these dial-up modems and they would connect perhaps for a local call to a local computer or even a long distance call a very expensive long-distance call to a remote computer but these computers were so central and so precious that we didn't care about the cost of phones and we couldn't share them i mean we didn't everybody didn't have a computer the way they do uh today and so there was a couple of different models one is that you had people doing their work connecting to computers over these modems um and then more rarely you had computer to computer connections using what are called leased lines these leased lines were very expensive they were kind of like making a long distance phone call 24 hours a day seven days a week they were rarely used in academic situations because they were so expensive it was more common for a bank to use them to say move all their data from their branch locations to the central locations once a day they were generally slow and very expensive and they were justifiable in certain 
situations and so that's kind of how our world was we we had something on our desk and then we used that to connect through the phone lines to a central computer that was shared with many different people we called it time sharing and so oops so this next video that i want to show you uh just the first part of it you don't have to watch the whole thing um is kind of in the 60s and in the 70s when i became involved in computing this was kind of the way that we perceived our computers you know we would dial them up and they made these clunking noises uh this teletype that you're going to see in this video is a world war ii artifact basically um and and my first programming personally was on this kind of a computer making this exact noise that you're going to hear in this video and i i think the the key to to emphasize here is that you know you it's kind of fun to look back at this old guy with his horn-rimmed glasses um but but we really were enjoying ourselves and you think well i got this fancy cell phone and it does all these wonderful things and i play angry birds on it and weren't these people not having fun and the answer is we were having a great time we were connecting communicating calculating computing doing all these wonderful things we just were doing it sort of more slowly with simpler equipment but but even from the beginning in the 60s and the 70s people talking to people were an important part of the use of this technology so uh take a quick look at this video and then come on back i'm going to show you the use of the michigan terminal system a large computing system at the university of michigan computing center using the ibm model 360 67 we will be using a standard teletype and we will be dialing in through the ordinary university centrex telephone system this system can be used by a telephone system anywhere in the world as long as one can dial directly into the computer these teletypes are not wired in but use the ordinary telephone system we start 
by pushing the originate button which turns on the teletype then we will dial using the regular telephone dial here now there is a handset which is available with every teletype system you can dial this number if you like with the handset either one works but as soon as you are connected to the computer do not use the handset for any purpose at all now i'll push the originate button i can control the speaker volume with this little knob and so you can hear it now i will dial the number 763 o 300 or in the centrex system just 30300 and now the computer is coming in i will turn the volume down the response from the computer is the identification of the system the who are you is the standard teletype communication identification and the teletype itself answers with its own built-in code number in this case the teletype at the tv center now i must sign on to the computer and identify myself the number sign at the beginning of the line is the indication that the michigan terminal system is waiting for me to communicate something to it the first thing i must do is issue the sign on command all commands in the system start with a dollar sign now the keyboard on the teletype is very much like the keyboard of a typewriter including a shift key for the characters on the top of the key we do not have upper and lowercase letters however so the shift key is only used to get an additional set of characters and in this case the dollar sign is a character above the number four now i will use the word sign-on which is the official command s-i-g-n-o-n there must be a space at least one after every command and now i will give my own personal computing center number which happens to be w010 each user of this system is issued a computing center identification number through his instructor if he's a student or through the department for faculty now this line has been typed it is being held by the computer but i can if i see any errors in it do some editing and correcting we'll see that 
later i am satisfied with this line and now i will give the end of line character which must be given after every line and i will remind you each time that i'm doing it for a while the character for ending a line uses the control key on the keyboard which is a special key which indicates that i want to use even additional characters besides those that are added by the use of the shift key i hold the control key down and while i hold it down i can hit one of a number of other keys and in this case i will use the control with the letter q it's possible to use the letter s also but that's right next to another key which disconnects a teletype so i recommend using q i will use ctrl and q to terminate the line and now i'm asked to give my private password i'll give an incorrect password just some nonsense characters and of course end of line because i must indicate that i'm satisfied with that i'm prompted that i better give another password and now since i intend to give the correct password i will turn off the printing so that no one can see what mine is to turn off printing we use this switch which is labeled hdx and fdx for half duplex and full duplex we normally operate the teletype in half duplex mode if i push it to full duplex printing is suppressed and now i will give my correct password push to half duplex again and end of line character control q and it is accepted by the system and i am now signed on the information that's coming out here is time and date of the previous sign-on and time and date of this sign-on hello and welcome back i just wanted to give you sort of a sense of the actual physical experience of connecting and communicating and computing in the 60s and the 70s the 70s was when i got my start and uh the lucky people got to use a teletype like was being used in that video and other people used punch cards um you also heard the squealing sound of the data being converted into sound and back and forth um you know it's the the key goal 
was there would be if you were lucky one computer within a 20 mile radius of where you were sitting so you could use a local phone call and you could be on all day and do things all day long and that was really exciting and a lot of fun um and this was sort of a common if you were sort of you know today you have a 2000 dollar laptop the fancy people had these teletypes right in their office because then they could conveniently sort of do computation and and and not have to go somewhere to do that computation so that's pretty exciting so dial up was kind of like the way the campus operated but you can also do data transfer with leased lines and the typical thing about leased lines is that uh the phone company if it's going between two cities um digs a hole and puts a cable in and has some number of copper wires in between those cities this is back then um of course it's fiber optic now but back then it was copper and every time you would make a call between those two cities it would have to find a free piece of wire and connect it up so that your call the audio of your call would go through that wire and if you wanted to you could pay them to lease one of those pieces of wire it meant that people couldn't use that wire for phone calls you would sort of take it out of service and if they had 500 wires between two cities and you leased one of them then they had 499 wires and so it was rather expensive and really very importantly the cost was based on distance right a mile versus 5 miles versus 20 miles versus 100 miles the cost would kind of go up linearly based on distance and so that's a that's a key thing that distance is important in these very limited resource copper wires where you had to dedicate a wire either to a phone call or to data and so we ended up evolving in the academic world what we called store and forward networking with the sole purpose of compromising everything so that we would keep our costs that we paid for our phone lines to a minimum we 
wanted to communicate in academia with people all around the country and all around the world so we had to keep the cost low and so we would lease a leased line you know so so so here would be us sitting with terminals in our offices and we would use dial-up to get connected and so this might be a few miles like one to two miles or one to two kilometers um and that's what you had you had to live near a computer or have an office near a computer and then you would use the local telephone network to dial and to do your work and then there would be one connection out the back of that computer to another computer that would also have users on it and then other computers with users on it other computers with users on it so this might for example be michigan this might be stanford stanford would have a computer michigan would have a computer the stanford people would be connected to the stanford computer stanford would have one little connection out its back end and to the rest of the world so you sitting here want to talk to somebody sitting here so this is how it worked you were sort of always connected so your data would be inside you know you type an email into the computer and then you'd say send that out to my colleague at stanford and so the problem is is everyone else was sending and so let's say for example that somebody sent a pretty big thing big purple circle is what they're sending and then somebody next to you in the office down the hall sends a kind of medium-sized orange circle and they're sitting there being stored in the local computer and then you finally finish typing your email messages and it's kind of a small green circle and now the problem is is you're in a line the way this worked is the software would keep a queue or a line of those who are going to use the uh the line and then it would sort of start streaming this stuff out and once it started it just kept doing the same file over and over and over now if it died it would just start from the beginning 
again and so it takes some time for that thing to move across the network and then once it had moved across the network the next message in line would start to be sent across the network and it would use that link um come on finish up there we go okay it takes a while for these things to move across the link and then finally and this could be ten minutes later finally it was your turn and your stuff would work its way through the network and get to the next computer the problem is is once it's in that computer you face the same problem because if all these messages are going from uh michigan oh forgot to hit the button thing from michigan to stanford and they're all on the way they got to go through all these links because there wasn't a direct connection because the direct connection to stanford would be very expensive and so we find ourselves in a situation where you're stored you sat on these computers and if there was an outage of a network or say this one this might be hours you might be sitting on this computer for hours on their disk drive right not even in their memory you'd be sitting on their disk drive for an hour and then finally this would come back or maybe this computer was down they'd come back up and then the data would start flowing again so it was haphazard i keep telling you it was awesome it was just slow it was awesome but slow so don't don't don't feel sorry for us we loved it we enjoyed it it was great talking to colleagues all over the world so we ended up with a network architecture that encouraged more hops which means it encouraged more latency and encouraged the likelihood of getting stuck behind something large as it worked its way through the network so for example um we were going to send a message from east lansing to say the east coast somewhere and we would have two connections maybe we would have a connection between uh michigan state university in east lansing and ann arbor and then we'd have a the data would 
get stored and forwarded in ann arbor and then would get uh further sent to cleveland and if you think about it the the wire between ann arbor and cleveland goes like this right they probably buried wires under the ground and if we could actually convince toledo the cost of the big wire versus the cost of the two shorter wires the cost of the two shorter wires is almost exactly the cost of the big wire because it's not really a big wire and so if we can convince toledo to join our little club then we don't increase the cost of our communications because we're it's about the same right the the wire length is about the same but we've got one more school for the same cost and if you keep doing that and you keep saying oh well let's put a school here and put a school here and put a school here in the middle of a lake and put a school here in the middle of a forest the economics said that you could connect more schools with effectively fixed cost by shortening the distance every time you made a connection and sometimes they would sort of make two connections but the way it often worked was from the perspective of this whole thing is the data would sort of tend to go you know up to some central location and then kind of fan back out from the central location and uh and so there weren't a lot of redundant links in this whole situation so we find ourselves in a situation where we can save money by just simply adding hops so in this environment we had a lot of email back in those days we didn't have a lot of images computers you know barely could do upper and lower case sometimes it was impressive when we finally got to upper and lower case let alone images and so for the longest time everything we did was very text oriented sometimes we would send small files like programs to each other but it was either text and these days those are tiny it's almost like sms right it's almost like the size of an sms message was the size of a typical mail message back 
in those days images we didn't have computers that could even display them and so it hardly mattered so you focused your life on this one computer within a 20 mile radius and it had this connection to like a snake-like connection that sort of went over hill and dale and to somewhere and then it finally got to stanford or wherever your mail had to go but again it was awesome to be able to sit at a keyboard and talk to people all over the world even though it took four hours we didn't we didn't send as much email back then and a number of networks came up one of them that became sort of pretty widely used was a thing called bitnet and it was uh where everything kind of came back to princeton was one of the main hubs of bitnet and uh it worked really well and and you know we could there was a way to say oh i need one more school and we'd go find the ones that are closest and we'd make the connection and so that's pretty much how the average academic saw networking one campus computer and sort of this weird slow store and forward networking that worked well enough but during that exact same time from the 1960s to the 1980s the us department of defense was investing in a research network called arpanet and arpanet was funded by the military arpa defense advanced research projects agency um and we'll meet in the later in the lecture we'll meet vint cerf who'll tell the story much better than i can tell but i'll just tell you a short version right now a lot of people wonder why the department of defense started this project and a lot of people say oh it's because of the fear of nuclear war and the project was very long and there were various motivations throughout the various phases of the effort but if based on what i've been told and what i've read i think it's safe to say that the primary motivation was to improve the use of their computing equipment that they used for military purposes make it easier to access them make it so people didn't have to travel as much made 
it so that people could work from their office on many different computers and also so that people could actually talk more effectively so they could send email real fast so that the email would move faster for military people so the first thing is really kind of an end user focus to make them more useful to people and get people working together more effectively that was the first kind of heavy motivation the second motivation actually had to do with reliability and redundancy and resistance to partial failure and that really wasn't so much about nuclear attack that was more about battlefield attacks so the notion that once you start having all these links and you make them wireless and you put them in trailers on a battlefield and you know you got a few hundred miles and the data is moving back and forth then we have to worry about in a battlefield situation one of those trailers might get hit and how would you keep the network running if one of those trailers got hit so the notion that we have sort of a whole nation and we are worried about communications i'm sure there was some worry about that but it wasn't really the main reason and i mean the exciting thing to me is the main reason was a very very people-centered reason of people working more effectively but this was a rather exclusive club rather exclusive network because it was only for the people who were either military or funded to build it so in 1969 there were four hosts on it again that's research and these were leased lines the same ones we were all using to do our email and they were expensive very expensive but because it was a large grant they just paid it because the research was not about the money the research was if you had these connections what would be the best way to use them and how to deal with redundancies and how to deal with multiple paths to the same place all that stuff is is interesting research questions that computer scientists spent many years conceiving of and so 
by 1972 looks like we got about 20 we got a couple of cross country links and multiple connections and and this looks like a great test bed to see you know so all of a sudden this part goes down how quickly can we reroute around the outage and literally these things these days reroute around outages in far less than a second um and so that's really impressive our current internet owes its history to the research that was done in the 1960s and 1970s but what was fundamentally different between the store and forward networks of bitnet and this arpanet research network was the notion of packet switching so the it's a real simple concept it just takes a little more complex software to solve the store and forward network would start sending data across the link and then keep sending it until it was done that would seem to be very efficient you didn't want to like you wanted to use it as fast as you can and then put the next one out put the next one out with the next one it was really simple and it dealt with outages in a simple manner by storing them all locally on the disk but what if instead you made it so you broke every message let me see if i can even draw this i should probably make a slide about this so if there's a network here and here's here's my computer and it's got a network connection it's got one really big message broken into pieces many pieces and it sends the pieces one at a time if you show up with a really short message with only three pieces all you have to do is wait for one piece to go and then your piece goes in and then you share for a while right you share for a while and then you have made it through because you only have three packets this has like 1000 packets and then once your message is through they resume sending this data in order and so you were able to sneak and bypass the traffic jam that was this large amount of data and so that's the idea of packets and packets are these chunks to say what we're going to send is a piece 
and then at the end of that we're going to maybe send a piece of a different message it was allowing multiple messages to be in flight at the same time that's packet switching and what happened then is we also ended up with these special purpose computers called routers they're still computers but they weren't doing storing in the same way they were just forwarding they weren't store and forward they were merely forwarding so we'll get back to that in a second [Music] when you walk into the back entrance of boelter hall on the ucla campus you may notice a seemingly random pattern of floor tiles in the entrance if you spend a little time looking at the pattern it might dawn on you that the tiles represent zeros and ones and then you might even figure out the tiles represent ascii characters the characters in the floor tiles spell out lo and behold to commemorate the building where l and o were the first two packets ever sent on the arpanet from ucla to stanford research institute on october 29th 1969. 
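The packet-switching idea described above (a three-packet message slipping past a 1000-packet message by taking turns on the link) can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how any real router is implemented: every packet takes one time slot, the link alternates strictly between the waiting messages, and the names `packetize`, `share_link`, and `store_and_forward` are made up for this sketch.

```python
from collections import deque


def packetize(name, n_packets):
    """Split a message into numbered packets: (message name, sequence number)."""
    return deque((name, seq) for seq in range(1, n_packets + 1))


def share_link(messages):
    """Packet switching sketch: send one packet per message in turn.

    Returns the order in which packets cross the link and the time slot
    at which each message finished (one packet = one slot, an assumption).
    """
    queues = deque(packetize(name, n) for name, n in messages)
    sent, finished = [], {}
    while queues:
        q = queues.popleft()
        name, seq = q.popleft()      # send exactly one packet of this message
        sent.append((name, seq))
        if q:
            queues.append(q)         # more packets left: go to the back of the line
        else:
            finished[name] = len(sent)
    return sent, finished


def store_and_forward(messages):
    """Store-and-forward sketch: send each whole message before the next one."""
    finished, clock = {}, 0
    for name, n in messages:
        clock += n
        finished[name] = clock
    return finished


messages = [("big", 1000), ("short", 3)]
sent, shared = share_link(messages)
whole = store_and_forward(messages)

print(sent[:6])                          # big and short packets alternate
print(shared["short"], whole["short"])   # slot where the short message finishes
```

With these assumptions the short message is done after 6 slots when the link is shared, but has to wait until slot 1003 when the big message is sent whole first. The big message finishes at slot 1003 instead of 1000, so it pays only a small delay for letting the short one through.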
arpa wanted a network so that they could share the large computing resources they had given to their researchers across the country university of utah had a terrific graphics operating system sri had database we had simulation university of illinois had high performance computing and every time arpa brought on a new researcher they'd offered to buy him a computer fine but the researcher would say i want the same capability all those other guys have i want the graphics the database and all the rest and arpa said we can't afford that if you want to do graphics you log on to the machine at utah through a network that we think we're going to make so the need for the network was to do resource sharing and not to protect the united states against a nuclear holocaust when bob taylor came in as the next director and he recognized this need for sharing resources by the way notice the phrase i'm using sharing resources is exactly what i built into network design now they wanted to share the external resources same idea you have it you're not using it somebody else should be able to so they brought in larry roberts another classmate of mine in fact an office mate of mine at mit to manage this project he came to me because he knew my work he watched me do the simulation in fact i used his compiler on the tx2 computer and said len we need to know if this thing is going to work he knew that i had the theory so i could show it to him it's going to work in fact he even says he would never have decided to spend millions of dollars of the us government's money if he wasn't sure this thing would work so the design began to be laid out by a few of us in 1967 in 68 they sent out a request for proposal the end of 68 bolt beranek and newman a cambridge massachusetts firm won the contract to produce the first switch of the arpanet and we became the network measurement center early on so we could test it out during the design phase some great people were there throwing their ideas out 
herb baskin was there a time sharing expert and he said if this network can't deliver short messages within a half a second i can't use it for time sharing specifications half a second by the way we got 200 milliseconds and uh wes clark said a separate computer for communications i said look if this is going to be an experiment and i was also interested in the research and experimentation we have to build software in so we can run experiments artificial traffic generators measurement hooks a place where the measurements can be evaluated with the software and so howie frank began to talk about network reliability he said this if anything fails the network shouldn't collapse so we built that in so we didn't say this should be five nines of uptime we were much more pragmatic we said if any single thing fails everybody else can still talk so to do that you need something called a two connected topology two independent paths between every pair of nodes built in so all those specs went to bbn they built the darn thing they delivered the switch here at ucla on schedule eight months after they got the contract they were to deliver this new technology new applications new device they did it on time on budget came here we plugged it in and bits began to move back and forth between our timeshare machine and that switch on the day after labor day september 2nd 1969 but that was just a one node network the schedule was that another one of these switches will be delivered at stanford research institute 400 miles to the north and they would connect that to their machine and that happened in october so in october we had a two node network my machine my switch another switch 400 miles away and the sri host okay and it wasn't one single line it was a gang of 4.8 kilobit per second lines so now what do you do you have a two node network well now you can do something so we decided one night one night late in october programmer charley kline and i were in the room and said look let's 
let's communicate between the two machines so we got a hold of bill duvall their programmer there and we said let's simply log in from a terminal connected to our host to that machine the idea is these are both time sharing systems they expect terminals to connect in and use the services of the machine the big thing was sit at the terminal here log on to your machine here and through this wonderful network log on here as if you're a local user well that's easy enough so we got all set got charley down at the terminal over here and just to be sure this worked we had a telephone here instead in fact i actually think i've got the here it is i just happen to have it that was the telephone set that's the telephone that was a telephone we plugged it in we drove we derived a uh we weren't using skype for a knife plugged it in we used a piece of the high speed line for the for the phone connection but the interesting thing is we were using the defunct circuit switching technology to prove out the new packet switching technology and it really helped us so we could understand what's going on so charley typed the l and said you get the l bill said yup got the l type the o you get the o we're trying to do l o g for login get the o got the o type the g get the g crash so the first message ever on the network was lo as in lo and behold yeah now that's especially interesting because if you go outside this hallway here down under the alley and come into another entrance to this building and you i just discovered this about a week ago you you walk on on a platform and there's a mosaic of tiles down there and they're a strange pattern it turns out it's the ascii code for lo and behold i have no idea who did that you know it's about a year and a half now some very clever person put that in that was the first message october 29 1969 at 10 30 at night entering 1969 right now we reproduced this room to look as it did and smell and feel like it did some 40-odd years ago and if you look 
over here you're looking at the first piece of equipment ever on the internet this is that first interface message processor imp number one at ucla a honeywell mini computer adapted by bolt beranek and newman bbn to operate as a switch which handles the functionality and this is the same physical four square feet where it served as the opening node of the internet the first piece of equipment ever on the internet and that's the actual one that's it i kept it for years i tried to throw it away many times most of the people who had imps have tossed them this is one of the one or two left around the world but this is number one this is the first piece if you open this machine you'll be privileged to smell it yeah it's got an unusual odor and just brings you right back yeah the emotion it's great you can't slow it through it this is a military hardened machine this this machine was essentially a state-of-the-art mini computer which was adapted by bbn and i first saw it in 1968 at one of the joint computer conference shows thousands of people on the big big exhibit floor and you see these sky hooks up here they had one of these machines hanging from the from the ceiling swinging in the air running and there's a guy big guy stripped to the waist oiled skin with a sledgehammer and he was going whack right whack to show that it was military and it was but the most important document of the internet the most important document of the entire internet is right here you talked about who is working with me well one of my software programmers was jon postel in fact he's in this picture and he was not a hippie even though he appears to be he was the one who basically disciplined my my staff to do things properly and keep records he said we have to keep a record of what's going on so beginning in october basically a month after the imp arrived we started keeping an imp log you know this is an engineer's log this is not you know madison avenue a piece of doctor just scribbled we use an old 
sds log and in here we kept a record of what's going on and the most important entry happens to be right here on october 29 1969 at 10 30 at night charley kline the programmer who's in the room with me made this entry talked to sri host to host this is the only record of the very first message ever on the internet right here you know we had the technology we started making measurements we were the first experimental node so we we saw things happening how come we have a 50 kilobit per second line the routing procedure either goes one way or the other one at a time how can you get more than 50 kilobits per second between two nodes if there's only one path at any time and we say oh it's obvious that path is on now when it gets backed up you change paths so this guy's empty and it's packed back up while you're sending this way they go close to 100 kilobits per second right too so i think but we could break it in words i said and every time we did we would call bbn and say fix it we did this fix it because they wouldn't give us the code they kept it proprietary until arpa said we paid for that code you have to open it they did we saw it every time we got it and it would take them six months to fix anything right we discovered a fault this time we had the code we showed them how to fix it still took six months one of the things i was very much interested in with design was distributed control why right when i was a student of shannon and shannon's great work came when you had a lot of things interacting long code words for example that's when these emergent properties arise so i said i want to design large networks if it's a large network you can't have a single point of control you have to distribute it right so what does it mean to distribute control you're delegating authority to all the peers when arpa started funding the principal investigators they had the same philosophy say look you're a smart guy here's some money go do the thing you do best we're not going to sit 
on top of you make good things happen so here we are i'm a recipient of that kind of money what do i do with it i've got my graduate students they're brilliant kids look we need a host host protocol here i'm not gonna sit on top of you go run with it that is not a product mentality right that is a research and development to create them and it worked so well [Music] so my favorite analogy to talk about how packets work is postcards so let's say that i want to send a 30 character message to my friend daphne at stanford but all i have is 10 character postcards so off i go i break my message into three bits of ten characters i write each piece on a different postcard i address with a from and a to address from chuck to daphne and then i give a sequence number so that when she gets the postcards she knows what order to put them back together so i have to label each of these and these are fixed width and fixed length they you know it's like a packet right it's i'm breaking a long message that might be very long into pieces so that each piece is a certain size so that we can all share it right and so then what i do is i send these i put them all in my mailbox and then i wait and this magical thing called the post office picks them up puts them in buildings people sort them they end up on trucks they end up on planes on trains on donkeys by hand and each of my postcards might take a different route i mean one of my postcards i should have a map here you know one of my posts not that one oops do that one of my postcards you know goes from michigan to chicago to kansas city to san francisco another of my postcards goes from michigan to chicago to dallas to denver to kansas city to chicago to san francisco and my third postcard goes uh really crazy it goes to cleveland and it goes to atlanta then it goes to washington dc and then it goes to chicago and then it goes to kansas city and then it makes it to san francisco now the fact that they took different paths and they might 
not even arrive in order some might get past each other they might get stuck somewhere it doesn't matter because each one of these has been labeled both with the from and the to destination so each of these intermediate locations doesn't have to get it to the final destination they just have to get it closer they make choices and even if they make the wrong choice it can be sort of dealt with one way or the other okay so this is some big mystery it's called the post office it just takes these little cards with labels and moves them it doesn't know if that's a youtube video or if it's an email it doesn't really care it just has these little chunks to move around that have from and to addresses i mean that's the essence of the network the internet the cloud that's why that's why we draw this picture you'll see like oh it's a cloud and that's because it's don't worry about what's inside here it's all super mysterious and don't worry about it it just comes out so after all that happens daphne who happens to have the exact same mailbox as i do it's kind of rare we both have a mailbox that's copyright creative commons that's what's cool about this mailbox at some point later these things start appearing she goes like oh looks like chuck is sending me a note but because of his limitation of postcards he's only sent me one i know who it's from and i know who it's to and i know that's the first of some number of messages and then the next day out comes another one because that was the one that sort of went the more southerly route and she goes well i'm missing something i'll just sit and wait and see what happens and then finally many many days later the uh the one that took the most circuitous most roundabout route um comes and now daphne can reassemble the messages because she has the sequences so she simply puts them back together one two three boom so she puts them back together she has it she sort of throws the postcards away and she has the ultimate 
final message that's packets packets are breaking a big message into small parts labeling each one of them individually and then throwing them into the shared network and so this is what it looks like the post office in this is called a gateway okay come on gateway okay so that's a gateway so here's michigan here's stanford stanford has a gateway too that's like their post office in the middle is the post office box right and the post office box basically the post office has intermediate locations and it has trucks and locations these are links and routers and messages can take different paths right and then they arrive and they're reassembled and we have like their ability to hook from home and stuff like that so there's a local area network lan lan and then there's a sort of like internet the network of networks it's like the post office and it still breaks things into packets they still come out as packets and then daphne has to reassemble them on her computer to create the message so so the research network arpanet solved a lot of engineering problems [Music] so i was a student at mit and i was scheduled on the wonderful scholarship program supported by mit lincoln labs to get a master's degree and it was a very rigid schedule you're going to be done in two years get a job at lincoln lab and do research and so i was on the program and i was one of the few people who actually finished on time and as i was finishing i scheduled to have my first child the summer of 1958 when i'm going to be done in that summer my professor at the servomechanisms lab at mit said you have to get a phd i said i don't want to get a phd i've done this program i'm scheduled my child has just been born so you got to do it so he kept twisting my arm and finally i said okay i'll do it but i'm going to do it i want to work on something that's really significant and not piddle around for the next three or four years so i decided i'm going to work for the best professor i know at 
mit and that was claude shannon the infamous wonderful magnificent person claude shannon so i called him up and he invited me to his house and we chatted a bit decided yes we're going to work together first thing i started working on actually was a chess playing program just working with a man who was a delight he was a great engineer great mathematician smart as heck could solve a problem like that i looked around at my classmates most of whom were working for him and most of them were working on the kind of work he had developed namely the field of information theory and coding theory and i looked at them and i'd taken those courses and i said you know the work that they're doing is the work that he left behind those problems that are left over hard and probably not of great significance little bits and pieces left i said that's not what i signed up for i wanted something to be fun exciting challenging and have impact meanwhile being at mit and lincoln laboratory i was surrounded by computers and i recognize you know one day these computers are going to have to talk to each other because this is way early nobody was thinking about that at the time but if they're going to talk to each other what technology would support their communication and the interaction and the answer is there was nothing available so i had an approach i figured the way to provide the communications would be to provide the ability for shared communication links because i knew that computers when they talk they don't talk the way i am now continuously they go blast and they're quiet for a while and then suddenly come up and blast again and you can't afford to dedicate a communications connection for something which is almost never talking but when it wants to talk i want immediate access so we had to not use the telephone network which is designed for continuous talking the circuit switched network but something else and i had an approach using some mathematical thinking namely queueing 
theory there was a model i could develop and it just made sense to do that which we had done in time sharing in time sharing you have a big computer and a lot of people share it when you're using it i'm not when you're not i am so we sort of interlace each other why not use communications the same way we're going to set up communications capability let everybody jump in and share and they only get to use it when they have something to send this is a new technology so there's an approach i have a mathematical tool to do it it's clearly important it will have impact and nobody's looked at this yet so there's a lot of low hanging fruit around it's not that hard it's not that hard so it's perfect for me so i started working on that problem now this was years before anybody needed this technology in time sharing for example which is the underlying technology that i adapted to communications in time sharing you want to let the little jobs go before the big jobs yeah because why should a little guy wait a long time for a big chat he's going to get in and get out so when jobs come in you ask them how long are you and they're all going to say i'm tiny so you say okay you're tiny i'll give you a little bit of time and if you're tiny you'll be out of there no i'll send you to the back of the queue and you get another tiny shot the notion of round robin so you get a little bit of the time we're going to break you into little chunks you get a chunk at a time i said that's a great idea for sharing communications we'll give everybody a little bit of communications time the little ones will filter out the short jobs will filter out when i say job i mean message a short message will filter out and the long ones will take a little longer and they don't mind being interrupted by the little guys the important thing in this technology was not to make the very short messages wait behind very long messages this automatic round robin which is now called packet switching you chop things into 
fixed lengths you give them a little bit it's not that's not enough give it another piece and they go flying to the network on their own so the idea of packetizing was important the idea of distributed control the idea of large shared systems in which you get some terrific design benefits came along and so this got published mcgraw-hill book nobody cared i published early yeah this is published my dissertation ended in december 62 i graduated in june 63 this book was published in 1964. and nobody cared nobody cared in fact i went to at&t the biggest network of the time and i explained to them you guys ought to give us good data communications and their answer was what are you talking about the united states is a copper mine it's full of telephone wires use that i said no no you don't understand it takes you 35 seconds to set up a call you charge me a minimum of three minutes and i want to send 100 milliseconds of data and their answer was little boy go away so little boy went away and with others developed this technology which ate their lunch but nobody cared and they said it wouldn't work and even if it did work they want nothing to do with it so that was the environment we faced it wasn't until years later when the government decided they needed a network that suddenly i saw a way in which i could implement the technology i had but getting back to what i said earlier i set up this mathematical model it was analytically intractable and still is by the way i had two choices one is give up and find something else or two make an assumption which allows me to move forward so i introduced a mathematical assumption which cracked the problem wide open from that point i could just sail through the solution get the get the performance behavior get the design principles going forward but then the question is was that assumption any good what was the assumption the assumption was what i call the independence assumption and it's absolutely a false assumption it says when a 
message travels through the network it changes its length independently every time it hits a new node on the path through the network right mathematically that creates an independence which allows you to proceed with the analysis but it's clearly not true right so what i had to do was to simulate a network with and without the assumption i simulated many networks on a machine at lincoln laboratory on the tx2 transistorized computer there and i spent four months writing the simulation program without debugging a single line of it a 2500 line assembly language simulation and i knew if i didn't get that simulation right i would get no dissertation ran it tested with and without the assumption and the results were amazingly close so i had my solution i could prove it worked the packets wouldn't fall on the floor i could tell when things would work well and when they wouldn't got published again still nobody cared about it as i said lincoln laboratory sent me for my master's dissertation they also sent me for my phd so they supported me for years wonderful economic financial research environment support and i felt an obligation to work for them and i was prepared to take a job at lincoln laboratory which is a great institution and so i went when i went to work there the first thing they said is look len why don't you look outside before you can commit to work here make sure there's nothing out there that you really would like better this was a gen a magnificent step on their part so i took a trip to the west coast i went to some of the aerospace companies i was not interested in university position at all really but it turns out when i was going up to san francisco to look at some of the high-tech companies up there a friend of mine suggested i interview with berkeley so i did i interviewed i came back they lost my case they changed the uh the chairman i never heard from them but on sabbatical one of the professors that interviewed me came to mit while i was just 
finishing my phd then and he sees me and he says kleinrock how are you and he thinks that i'm looking for a p a an academic position he contacts one of his friends here at ucla who then invites me out here offers me a job and now i've got a dilemma do i want to teach do i want to cross the united states it's almost like on a wagon train 3000 miles away from the east coast where the world is family is for a job paying half what i could earn back at at mit lincoln labs and try something i'd never tried before so i went to the boss at lincoln lab and i said look i've got this offer it looks attractive it's it's a new challenge what should i do and there was a wonderful answer they gave they said len try it if you don't like it come back wow wow it's right yeah well i came here in august of 63 50 years ago and i'm still here [Music] so here is an example problem to solve now none of the mailboxes none of the routers none of the post offices know exactly where the thing is going and they don't transport it to the final destination all they do is they transport it to the future to a further down destination but what if they get it wrong what if one this is like michigan this is chicago this is dallas and what if michigan thinks sending to chicago is a good idea chicago thinks sending to dallas is a good idea and dallas thinks sending to michigan is a good idea well then michigan is going to get it again and send it back to chicago and so then what you end up with is this situation where your packet your data is in a loop right this wouldn't be good for the post office and it's not good for the internet so you have to say how do we solve this problem when something's wrong right it's you wish it were perfect but it's not and so this is the kind of research question as to how to solve this kind of problem and we'll talk more about this in a bit uh when we get into a little more technical detail but i just wanted to sort of say that's the kind of research 
and engineering that took literally 20 years and four different versions of it of the arpanet to get right [Music] there was this whole world of of coders and hardware guys in the 60s and i thought to myself i bet there's a really interesting story here so my editor at simon and schuster was totally 100 behind it uh we the working title was really bad it was called building cyberspace so we knew we needed a new title and so and matt my late husband didn't come in on the project uh he was actually a better writer than i am and a wonderful just had an absolutely incredible mind and could grasp anything and uh he didn't come in on the project until about maybe a year into it when i was getting overwhelmed i had this new baby and i was working at newsweek and i was uh and i realized that there was just more to the story than i could do myself it would have taken me like double double the time so um that's when he came in on the project and i found indeed that there was this incredible untold story of not so much the internet so we have to keep it really clear the arpanet and how that started and um in fact matt's the one who wrote the chapter the fastest million dollars which is about how it got funded um at ipto the information processing techniques office at what was then darpa and then became arpa and darpa and arpa again and uh these people were all as i had hoped incredibly interesting really smart and uh and had this new idea for how computers could talk to each other and um one of the most amazing things was uh visiting um larry roberts who was running ipto at the time that this whole thing was drawn out we were at his house in woodside and we went out into the garage this is how unresearched this topic was because you know who'd a thunk right and we go out into his garage and there are all these boxes of old mildewy papers and he starts pulling them out and they're old letters between him and people at mit and then he pulls out this amazing set of sketches of him 
just it's like remember when um trudeau doonesbury did inside reagan's brain this was like inside larry roberts's brain he was like just sketching out all the possible configurations of um what this network could look like and i've got these pictures in the book which you can show but here you know you know these these are all his very early diagrams um just sort of popping out of his head this is when this is you know the the early nodes on the arpanet and i just loved finding all this incredible primary source material and it was a it was uh i felt like i should have been paying somebody to be able to do this rather than right because it's just in a it's in a garage that millions of people drive by right and it's also this important artifact yeah of the what we're living right now right and we were able to sort of you know capture it and bolt beranek and newman was a big force behind the book the book was in part their idea and so these guys because what we tend to forget is that bbn they built the first imp the interface message processor in fact there's a very funny story in the book where ted kennedy you know senator from massachusetts sends a letter to bbn congratulating them on getting the contract for the interfaith message processor i know uh so a completely uncharted territory and they had to build this piece of hardware which they did and program it which they did a whole new thing and uh it was uh so i spent a lot of time with the folks at bbn got to know them well and that group they were completely unschooled in talking to you know reporters or a book researcher and there's this one guy ray tomlinson everybody said oh you should go talk to ray because he uh came up with the at sign i said oh really okay and so you know there's this wonderful really quiet guy ray tomlinson sitting in his office and i knocked on his door i said i understand you came up with the at sign he goes oh yeah yeah and then he explains to me you know how it is that he did this and 
then you know and now it's a huge legend that ray tomlinson did this and uh and uh other and some the software guys uh really i have this image of you sort of just driving and driving and going and going back and forth i mean what how long was the research how long did you research this was this i think yeah the research took a couple of years and it wasn't so much driving that the hacker book was like driving all over all over the place because that was three different stories of three different hackers this was more in fact email between the hacker book and this book and the and wizards so 1988 and like when i started this book 1990 email had come it was much more popular so there were a lot of we did a lot of emailing back and forth i spent a lot of time i did go from i was living in austin at the time i did go from um uh austin to boston a lot and i spent a lot of time actually i did fly to la and spent a lot of time with jon postel who came up with the domain name system and um uh and he died in 90 i want to say 98 99. 
that was shocking uh i spent a lot of time with him tell me about it um tell me about jon well jon uh so anyone who's ever seen a picture of jon maybe you have one you can dig up this very long kind of santa claus like beard and what's amazing is so i spent a lot of time at his house he lived in a teeny tiny little house in uh in the la area and uh with his girlfriend this very nice woman named susan and uh on the refrigerator there were all these family photos and they all they were all like these uh people with long beards long white beards and i thought what an interesting family and he drove i think this kind of beat up volvo or something like that you know something completely unpretentious and lived so unpretentiously and uh and had all his files like right in his study in this little house all this history and also at isi where he worked in marina del rey and uh and we spent hours and hours and hours together first in boston um at where we were both at a an anniversary party of the um of the arpanet and we spent a lot of time together and i thought you know this guy is key and he was so quiet that sort of getting him to open up really meant spending a lot of time with him and he was oh and we met at one of the very first inet conferences in 1995 in hawaii and spent a lot of time there and he just was so patiently trying to explain to me um things but i'll never forget this one thing that happened with him in hawaii at the inet conference it's the internet society conference one of the very early ones and uh and i just was asking him some questions and he goes uh he looked completely disturbed and distracted and he said no no there's someone over across the room i have to talk to and it was someone from a very small country sort of wanting to talk to jon about their domain the country's domain name or something but to jon it was far more important that he deal with this person's problem than you know than cement his place in history in this arpanet 
history in fact when i was working on the book i sent him an email i'll never forget this i think i don't have the email anymore and i said just out of curiosity why haven't you ever like wanted to get rich because that's when people were just starting to get rich so this was in 95 and he said it just that's just not what this is about isn't that a wonderful thing it is yeah and that's who jon was one of the things that i've been curious about is how this got by at&t right how this made it past at&t i know talk about no vision right and uh well you know paul baran had happened he couldn't convince at&t to build the very uh marvelous network that he uh designed and then these other guys the next guys the next crew that came in uh and they just they were very very you know think about it back then it was they had a monopoly what did they really care they had um they had the um bell uh research labs and they it was a not-invented-here kind of syndrome and they they didn't see they didn't see it i mean talk about no vision they simply didn't see it but i think the fact that not invented here part of it was um like what would we need uh what would we possibly need this for yeah they were in a context they were in their own context yeah but thank goodness i mean thank goodness they didn't get their hands on this thing the same with ibm sorry you know just we don't we didn't want corporate america or dec for that matter um the original hardware was um based on a honeywell machine and thank goodness honeywell doesn't exist anymore uh it sounds terrible that i'm saying this but it's but it was all kind of happy accidental sort of coincidences i think al gore is oh what oh totally underestimated and underappreciated is that what you were going to say oh yeah i mean why mock him every time every time i tell people i wrote a history of the origins of the internet um people say very mockingly oh did al gore invent it and i won't play into that he played a huge role in terms 
of policy and direction and uh and when uh clinton was uh running for president you know they were out there with this very important uh technology white paper that uh uh i mean you can't do things like this unless you have the support that people like al gore provided [Music] by the end of the 1970s there was quite a sophisticated network like i said they'd rewritten it four times it had gotten to be very sophisticated there'd been a good investment they had been careful about understanding how to improve it each time they rebuilt it and so it was a really a fine working piece of software and the people who used it at these research universities and um at the military it was kind of like a futuristic world right you could send email and a second later it would appear right it was really quick and everything was nice and you could even have instant message like interactions where it went right then and the problem was is this was a small group of people this is maybe looks like 60 or so that was all the computers on the entire internet right i mean now we have our cell phones and we got billions of them right but this is like 60. 
um so real question is how did this go from 60 computers of a very narrow group to a much larger group and the answer to that question is at urbana-champaign illinois and so at the same time that this kind of goes back to bletchley park right where the computers were part of science throughout the 50s and 60s and 70s scientists much like the military realized the computers were mighty useful to advance scientific research and so the national science foundation would fund universities to buy computers and they'd say give me 10 million dollars to buy a computer and another university would be like give me 10 million to buy a computer and then three years later the first university said my computer's obsolete i need another 10 million dollars now the research questions were good research questions and they were important to society but the national science foundation got tired of giving 10 million dollars to everybody i was part of this because i used to be a high performance computer guy this is a convex c3800 supercomputer uh i would be about this tall on this super computer this is a lovely and very expensive model that i was given afterwards they were going to throw it away because this is from like 1987. 
um and and basically i wanted one of these terribly badly each one of these is about two million dollars this is like two four six eight ten million dollars and i so badly wanted this and i thought that i i deserved a ten million dollar toy and i could do such great research but unfortunately everybody else wanted the exact same thing and they were important too so the national science foundation said to themselves hey this isn't going to work very well if i can't find my pen um you know we're not going to we're going to make a network how about we put a few of these things in and then make a network and connect them together of course it's never as simple as that so now we're going to meet larry smarr at the national center for super computing applications ncsa at the university of illinois urbana-champaign and larry was the director of ncsa and one of them one of the many people but one of the most instrumental people in creating the national network that we now think of as the internet and getting it moving it from being a research project to being a project that we all both academics and regular people that we all can can make really good use of and so let's take a look at larry smarr [Music] well i am a trained relativistic astrophysicist and i got my phds always in physics departments working on astrophysics problems involving general relativity gas dynamics and so forth and so in the 70s when i was developing what is now called numerical general relativity which is the way to solve einstein's equations for dynamics of black holes colliding black holes or in astrophysics that require general relativity like supernova events and so on uh i had uh to go and get a top secret nuclear weapons clearance to get access to supercomputers because in the mid 70s the only place to get a supercomputer was either at livermore los alamos or uh in one area which was weather now astronomy has always been a driver of supercomputing and in fact on johnny von neumann's computer that was built 
by the institute half of that and went to army to aberdeen half of that was used of course for the army for trajectory calculations but the other half was used for stellar evolution so astronomy and von neumann's interest in weather meant that astronomy and weather for 50 years basically have been dominant drivers of supercomputing usage and yet uh to do pure astrophysics you're having to get a top-secret nuclear weapons clearance now nobody seemed to think there was anything unusual about this um and so i just went ahead as a postdoc and i would get a few months in the summer work 100 hours a week and then the rest of the year i'd have to live off that go back to harvard where i was a junior fellow try to explain to them about you know one could solve the laws of physics that we had been laying down for 300 years in incredible detail for engineering devices like nuclear weapons that put on earth temperatures of the center of the sun stresses beyond anything that we could imagine in academic problems that we're trying to solve in lots of disciplines so this thing could revolutionize academic research well nobody got it it was like i really felt like i was transitioning in a flying saucer between this advanced civilization at livermore and the stone age culture at harvard and harvard was as advanced as any place in thinking about this so it was a what i did not figure this out until it was in the early 80s and by then the first cray had gone into the continent of europe in an open scientific institute the max planck institute of physics and astrophysics so i'm over there in the summer along with people like dave arnett who is one of the great supernova super computer guys in our country and people in chemistry and it was like paris in the 20s with all these expatriates sitting over there and and we're like trying to figure this out like it's an american-built supercomputer right why are we in munich right i mean this is very strange so um but you know in america we 
don't question the infrastructure somehow i mean it's just like it's either there or it isn't there or that's just the way it is but i was having a mass of beer late one night actually think it was a second mass of beer with my german host who had also been born like i was post-war and he finally turns to me and he says aren't you ashamed of yourself you big rich occupying country you come over here in our little country and and we finally get enough money scraped together after world war ii to buy one of these super computers and you americans come over here and use up our time he says how did you guys ever win the war you know what what is going on here and so this finally sort of just stimulated me to say what is going on here this is nuts and i went back and i found out for instance that after the sputnik program the federal government had funded the universities built the science buildings started the supercomputer centers ibm would go around and give away almost the mainframes and and so the scientists in the 60s took it for granted that they had the fastest computers in the world in academia but about 1970 with the starting the vietnam war and with all kinds of guns and butter issues and everything else that stopped and in fact to give an example by 78 half the number of phds and engineers in engineering was being generated in our country as there were in 1970. 
so there was this complete severing of the sputnik era partnership with the federal government and it was particularly bad in computing to give you an example when i was at livermore in in the 70s there were four cdc 7600s which were just one of the finest supercomputers ever built no american university ever took delivery of a single cdc 7600 in fact the university of illinois when i first got here had a cyber 175 which was a retread a second design manufacturing of of this thing and we were one of the first universities that had it and people all thought that you know illinois naturally was way ahead of everybody else so it's like we were just completely divorced from the private sector that was generating these wonderful machines because of federal policy which was to say these things are only are so valuable that we can only afford to put them into war environments so after this german encounter i came back and i said well gee i wonder how many other scientists like me are there so at the university of illinois i started calling cold calling my colleagues and saying hi you don't know me i'm a little assistant professor but i bet you that your research is blocked by lack of access to super computers uh and they'd sort of say like who who is this you know crank call and but we'd start talking in chemistry and biology and agriculture and so on and sure enough it turns out that that that was true they knew how to do the science they just didn't have access so i said well send me a little prospectus of what science you could do if you had a supercomputer well i ended up with 65 faculty in 15 departments on one campus and i thought this has got to be this way all over the country so it was i really started saying well gee somebody's got to raise this issue and about that time there had been a lax report that had the federal government had done to begin to uncover some of this stuff but they still weren't i remember peter lax he was one of the 
greatest mathematicians uh head of one of the top people in the courant institute a long time advocate of things computational and i had this long long battle with peter because in the draft report of the lax report it was not it was going to say well yeah we ought to get one of these super computers and make it available to the universities but you know let's put it at livermore put it somewhere that people know how to do this stuff and a university was not on the list of what the lax report considered to be appropriate sites for a supercomputer so i had this long battle over the telephone i remember with peter lax sort of david and goliath because i mean he was this giant of the field i'm nobody and um i i finally convinced him not to exclude universities as a possibility even though he felt that it was fairly unlikely that any of them would be able to play with sharp instruments and not hurt themselves so i mean you gotta understand the world was totally different when the supercomputer centers program was coming into being and people it's so hard for people now on the web and everything to go back to that time there was no internet you know there was there was these wonderful people vint cerf and bob kahn and all these people bill joy who had developed tcp and and and embodied it in the arpanet this was a few computer science departments and military okay nobody in a physics department or chemistry department ever heard of the arpanet much less had any access to it but it was obviously the right idea and so once we got the congress to put through money for a national supercomputer program then there was a national competition and and and so on and the five centers were selected the first thing then the nsf realized well okay now we put these in place to be providing access to academic scientists and yet uh like they have to fly to champaign-urbana like i had to fly to livermore okay this isn't right so there were a lot of discussions then but the trouble was 
that the telecom lobbyists in washington would block any discussion of the federal government putting in the kind of network we have today which is what people wanted i mean everybody knew that they wanted to have a ubiquitous email person-to-person network but as soon as they'd start talking about that the telecom lobbyists would come up and say no way guys that's private sector don't get the federal government involved in that stay out of it so what we learned early on is real interesting it's like if we had argued instead of the super computer program let's get the federal government to buy everybody a personal computer which was ibm personal computer was two years old in 1985 and put on people's desk okay again this would be an interference with the private sector um so but we said oh we'll just take a few of these esoteric supercomputers and they said okay that's right there's no market there that's okay the federal government has a role well the same thing with networking what we said is we just want to put a high-speed backbone across the country to connect the five centers and and the telecom people said okay that's cool you know 56 kilobits was the national high-speed backbone less than isdn today yeah that's not a market okay and and we said we got a few of these weird super computer types who are out at universities who want to hook into that they say yeah that's okay that's not a market you can do that well that was the nsf net backbone then the regionals got funded and then the campuses were afraid that if they didn't dig up their quad and put in some fiber then the professors who wanted to get access to supercomputers would go to a university that would do that so gradually the whole internet emerged out of the the the sort of policy vice you get into washington where you can't do the right thing you have to do something that seems irrelevant but has a logic to it that will gradually bring the market forces into play that will spin out ultimately a whole 
industry so i hope you got from that that it wasn't trivial it wasn't like somebody just walked out and said hey make a network i mean there were forces powerful forces the telecom lobbyists that did not want this to happen and we will we will see these telecom lobbyists a couple times as we progress through this lecture so larry smarr and the folks that made the supercomputers convince congress to authorize the the giving of a grant to build the national science foundation's network it was going to use the tcp protocol that the arpanet had built it was supposed to be inclusive at least for research academics and it was you know research universities would we want to get them connected and so our story now leads from the university of illinois urbana-champaign to my university the university of michigan now if you take a look at this arpanet picture that was sort of right before well 1972 but if you look at all of the arpanet pictures there is a glaring omission from my perspective a glaring omission on all of these i wonder why aren't we there well there's a very good reason why the university of michigan is not on any of the arpanet pictures and that's because we built our own network we built a network using leased lines a three-node network between michigan state university university of michigan and wayne state university we were sharing each other's compute resources we were using each other's computers interactively we were playing online multi-user games we called them adventure they were all based on text but they were great fun and they were great community building we had chat forums we had all kinds of things and we used each other's software and it was a is a cool little world and the arpanet wasn't seen by many as all that significant it was kind of a experiment right it was a research project and we weren't research we were production merit was production and arpanet was research so there you go right so there was actually a lot of these sort of 
like nascent networks out there early days and so the university of michigan sort of never really was significantly involved in the arpanet work but the university of michigan being a large university did want to get a supercomputer center and because of some strategic blunders as doug van houweling will soon explain to you um michigan didn't get a super computer center and so the idea was is we would do what it was going to take to get the network because that might be more fun so doug van houweling at the time was the chief information officer at the university of michigan having recently arrived from carnegie mellon university and he was also the chief of the merit network the network that connected our three schools and so they they figured that once they didn't get a super computer center they better really pull out the stops so to get so that they would get the nsf net and of course they did get the nsf net and doug van houweling was the principal investigator for the nsf net starting in 1988. 
so let's let doug van houweling talk to tell you the story and i'll be back in a moment [Music] back in the mid 80s the national science foundation decided to establish super computing centers and the university of michigan was one of the organizations that made a proposal for a national super computing center the michigan proposal had as a as its primary hardware artifact a machine that was built in japan and i explained to my colleagues my new colleagues the university of michigan having just arrived that it was highly unlikely that the proposal would be funded a short while later i was visiting the national science foundation and i had gotten to know not well but had gotten to know erich bloch who was then director of the national science foundation and erich and i had a conversation about michigan's proposal and um it was clear to me from my conversation with erich that there was no prospect that the michigan proposal would be funded i said to erich i said well it occurs to me that what might be even better for the university of michigan than having a supercomputing center is to run the network that connects all of the supercomputing centers together i had an old friend who worked for ibm research by the name of al weis who's in charge of all of ibm research's computing facilities and i called al and i said al uh this is a great opportunity but ibm is going to uh is is not going to be successful here and um i need your help and so i also rallied some folks from ibm research where there actually was work going on in tcp/ip protocols we got tentatively an agreement from ibm that they would contribute the hardware and the software to create the routing structure for the network but we still needed the communications facilities and at that time the cfo at ibm was a gentleman i think i remember the name correctly by his last name was crow and um so through ibm we went to him and he had contacts with a former ibm employee who was then the chief technical officer 
and essentially the chief network operations officer for mci his name was dick liebhaber and so ibm approached dick liebhaber and asked him if mci would be interested in providing the communications facilities for this network well as you may recall at that time mci was this fledgling organization some people had described it really as as a as a law office trying to create an environment where they could actually offer telecommunications services up against at&t's lobbying efforts and they were they had just been successful in that they were establishing uh facilities across the united states and uh and dick liebhaber saw this as an opportunity to sort of move mci into the big time um to be part of this nsf net proposal and so we wound up with an agreement that we would file a joint proposal with merit as the principal organization in partnership with ibm who would build all the routing hardware and software and with mci who would provide the nationwide communications facilities and while we were and then we got governor blanchard to commit a million dollars a year from state funds in addition so we wound up being able to submit a proposal to the national science foundation i think for something like 14.7 million dollars because we knew the ceiling was 15 but in fact by including all this in-kind uh activity it was actually more like a 55 million dollar proposal it was designed to start at t1 or 1.5 megabits uh with planned upgrades over the over the period of the network network's life um we've subsequently learned that uh the proposal was received with considerable skepticism by the reviewers of the national science foundation uh people really wondered about our technical ability to pull this off but uh the that review was conducted without reference to the actual funding pattern and then when the wraps got pulled off of the amount of resource that was being committed by the partners to this proposal it immediately went to the top of the list at the nsf and and a 
short period later we received uh informal word that they wanted to negotiate with us about sort of working this all out but we had to do a lot of innovation uh the border gateway protocols had to be developed to allow multiple networks to interact with one another and we had to build increasingly more capable routing and communications facilities when we started the network we had t1 circuits but there were no cards for computers that would go at one and a half megabits we used the t1 circuits we subdivided and built a mesh network among all of the routers that we put in place it wasn't for about a year that ibm was actually able to build prototype cards that would go at one and a half uh megabits when we put those when we put those cards in our test network we discovered they they worked just fine we put them in the production network the network started failing on us and we discovered after uh after a very tough period that the folks who had built the t1 hardware for mci had planned on using certain bit patterns to do diagnosis on the network and it never anticipated the notion that anybody would ever use a full one and a half megabits as a single channel they had always thought that it would be broken up into a set of voice uh circuits at 64 kilobits each and so they didn't have any worry about these patterns ever appearing on their network well it turned out that that happened with some frequency uh on the nsf net and it took the communications line down when it did so they had to re-engineer their hardware to undertake this we actually had for a short period of time put some translation circuits in our routers that actually looked for these bit patterns took them out and replaced them with something else uh and then reinserted them at the other end uh to make it work over these t1 circuits so we had a lot of adventure finally in i think it was around 1990 network was growing so fast that it was clear that these t1 circuits were not going to we're not going to enable what we 
needed to do so we had to go to the next step which was ds3 or from one and a half megabits to 45 megabits which was a very large step a 30-fold increase in capacity and in order to do that um we wound up creating another not-for-profit organization called advanced network and services merit was still the principal investigator on the grant but it subcontracted the development of this new network this 45 megabit network to advanced network and services which was headquartered in armonk and uh and ibm mci and uh nortel who uh all contributed three million dollars to the founding of this new uh organization so it had the uh the staff and the facilities to do the innovation that required us to go up to 45 megabits that did accommodate our capacity needs over the life of the nsf net the nsf net was the fastest internet network to the end it finally was decommissioned in 1995 when the congress decided that the federal government should not be in the business of supporting something that by that time in their view should have become a commercial facility i'll not ever forget sitting in a house hearing room in the capitol and next to mitch kapor and um some internet uh some small internet company uh startup ceos who were complaining uh to the congress that it was inappropriate for the nsf net to be funded by by the national science foundation when they could provide this service as a commercial service at the very same time they were making that complaint they were using nsfnet as their backup network to carry traffic when their much less reliable networks failed on a national scale mci of course turned out to be a major internet service provider also using the same technology in a in a classic um innovator's curse moment ibm who was at that time the leader in routing technology for uh internet for internet backbones managed to decide that they should kill all of the work they had done in developing these routers because it would threaten their proprietary network 
efforts it's probably almost single-handedly responsible for the fact that cisco became the dominant router company in the united states rather than ibm [Music] so as you can see none of this is easy it wasn't easy for larry smarr and now we basically have the lobbyist once again the lobbyists are deciding that there's only going to be 15 million dollars because if it's 15 million dollars they can only build a 56 kilobit line but michigan thought outside the box some people might say we cheated other universities wanted to get the grant kind of thought we did cheat we didn't play by the rules like in star trek we rewrote the rules so we got it and we slid underneath the nose of at&t a one and a half megabit network that started started large and started fast and got faster and as doug talked it was a very very very successful network and it it didn't it had follow-on effects just like larry talked about it had these follow-on effects of causing campuses to build networking because in would come this really fast national network so you better fix the on-campus networking so it was really quite wonderful and the partners ibm mci the other thing is is at&t was being beat up by mci at the time and maybe they weren't noticing so we might have been lucky it could easily have been two years earlier or two years later it might not have worked so whatever it did work and and we can be very grateful so the university of michigan was sort of the center of they ran the network operations center and did the network design worked on the software with ibm and built the network and um it had a very exponential growth as it went forward this building right here is on our north campus and it was for me personally the very first place that i typed the first characters that went across the internet i walked in this door to i wasn't at the university at that time but i had friends here and so i came in and i was typing on you know hello on the internet so this building 
was where the network operations center was and um where i touched the internet for the the nsf net for the first time so it started out real simple but its goal its goal was to enable connections to other networks the regional networks got formed larry smarr mentioned that and so then the backbone became sort of a forcing function to get more campuses on and more and more states etc etc [Music] in the case of ghana private sector led government was not aware government was not interested and so the private sector took it out but the private sector we're discussing is me i'm not exactly private sector i'm academic only i did this job through the private sector meaning that i realized that you can wait for the university forever because they depend on government and you can wait for the government forever they have no clue so i actually have been known to have said publicly that i cannot wait for the government to do it for the country so i know how to do it i would do it no matter how small it is just to make the point that we are able to do it in so doing maybe i can create an avalanche that will carry through the momentum and that is what i try to do in the case of them now on the continent as a whole they have variety like in the case of egypt where the minister there happens to be one of our counterparts from internet society isoc and that's tarek kamel and so there was a heavy government success built in other places like south africa there was good academic dose so it has varied depending on on the settlements but i think in much of west africa the thrust had been principally from private sector because i also went around helping togo private sector helping gambia private sector and so even nigeria giving them intellectual property so they could also begin to do it so it became private sector helping private sector but for the common good because we really felt that we had to go together otherwise things would just disappear i allow some countries to transit 
their data through me and togo was an example but in other places i actually sent engineers to install nodes for them and gambia was an example in other cases i actually trained people like swaziland they brought the telco people to me to be trained in other cases we did consulting services to help ethiopia and so on and so forth so that was roughly the progression in that the names had to confess but i was meaningful and we had to move mail and so we had to do that and then the next bit was get the capacities connected and download was a madras but it was a good community activity in the sense that we wanted to help each other and that has continued till now because now i actually run the african network operators group and the principal function of this group is to sort of help operators support themselves and usually we support ourselves with building our capacity in many different areas mostly in infrastructure related things like routing type of issues or server services type of issues for those who want to build information resources um afnog has now become the meeting place for most of the technical community and we meet only once a year and when we meet it lasts for a period of two weeks within which we spend one week to train whoever has been admitted into the workshops and we run four workshops in parallel now it didn't start as four it started as just two and then we've added as we went along we even have a french track in the routing side of things um and when we meet after the workshops for the students then we try to sort of induct the students into what the real world is so we follow the workshop with meetings and the meetings include lots of parallel sessions those who are following the afrinic meetings will get to do that those who may be discussing some tutorials maybe in security will also get to do that and then we also have a day for like conference presentations of 20 30 minutes from a variety of different areas and
so that has now become a major meeting place for the engineers and once a year we all congregate and it's supposed to be service to community type of thing in that nobody gets paid but we try to raise money for the student participants to come and we've been getting funding cisco has been a regular funding source and we've got funding from idrc internet society francophone and a whole number of others but over time we're trying to change our programs so that we become more self-sustaining so participants who are coming in from let's say operators telcos and so on we insist that they pay more of their way so that we can raise funding for academic and research people to participate and so on and that was really how things are and it still continues to be a major meeting point for many of our operators of course once we've got the connectivity in then there's a whole host of other issues that began to show up but we thought all of them can be solved within the same environment so you have necessity issues oh you have aftld let aftld focus on that you have a research and educational network challenge or build a consensus around the same meeting place so you understand afnog has moved from being a workshop type of thing to a meeting place where many new things could be incubated and allowed to blossom and take on their own life [Music] now the particular case about afrinic was interesting in that it took us 10 years to get to accreditation of course in the beginning we actually knew what it was we knew it was very important and for engineers we knew that that was what to focus on none of the other things so we began to build consensus around it you know a proposal was made first in 95 then 97 a formal document and then we moved on but we realized that we had to build something more than just casual meetings so we went deliberately to tottenham to agree on specifics and the specifics was who becomes a member of the board and we decided to do it geographically okay so that
we don't get too much caught into why is that there are so many people from the north and nobody from the south so okay each one of the sub regions gets to elect one person so it's the members who elect that person and that person would serve for a period of one to three years depending on the staggering and that is how that one was okay but the difficult thing with that was as we're learning that meant more and more were coming so whenever we thought we had an agreement on something we're about to move forward like in operation then more people come one case was very disturbing and then that's uh sort of a moment and that one jon postel was also there this was very early and he had agreed that we just do it you have to do it has to happen so i invited them all to a meeting with the african engineers and everything and suddenly somebody got up and said no no no why don't we share it up into the countries and have this all go our separate ways and i mean at that point i thought that was the end of the world and to be honest with you i've never felt so ashamed because i was a bit of an interface between the african technical community and the global community i knew jon postel myself personally and they've come to sort of help me and my people were not ready so all jon could say to me was get your people ready and you will have it so i had to continue until i was able to build sufficient consensus and in 2005 it began and at that point it was also evident that if it has taken so long there are maybe two factors i have to consider in 10 years i would have upset a lot of people and this could be real in the sense that the three registries had to give up something for afrinic to exist so it's inevitable that i would have made a number of people in those three registries uncomfortable so for me it was obvious i could not continue in the same capacity of course i didn't tell my colleagues that this was my real feeling but it was also true that if i were to stay
there i'll get less participation i'll have less motivation for people to you know make an effort and what i needed more was more people making effort than just one person driving things because i knew that the space was so large that even if i had nothing to do with these things there was plenty for me to do so the strategic thinking was find a way to exit without telling anybody make sure you don't give up too much information for people to capitalize and game on it so everybody thought i was indeed going to be the next official chair not the you know startup chair but official chair and i let them think so but when it came to the board meeting i decided to dissolve the board and i got vote for it and then i decided i would not run again and at the same time certain people have been you might say controversial in their contribution i try to urge them not to and in some cases i even told them that if you tried i would marshal the community to vote you out and in some cases they tried and they were voted out but that created a certain flow of uh maybe new entrants wanted to contribute service to the community and that has continued till now and i think that that has been a very good thing and of course we also observe in the case of afrinic that the ability to make local policy uh does affect the effectiveness of you know the task at hand so for example we know that uh soon after accreditation the type of membership we had began to skew more towards the smaller size because majority of the operators were actually small operators and they were trying to fit into a large market a large operator community environment when we were working with ripe and arin and so on and that wasn't working so well so by us changing the policy to recognize the size the true size of our operators we were able to increase the membership and beyond that we realized that our allocation also we changed the policy so that our allocation sizes were smaller than the normal that a person a
typical operator will get so by knowing the community and making the policies the community making the policy to suit itself you're able to get a better match to the needs of the community and that has shown in the growth and uptake of at least ipv4 addresses from afrinic [Music] now one of the things that was quite interesting is we in the academic field had the right to use this network but when it first started out it was supposed to be academic only and then once it was established in the late 80s and early 90s there were some people that started sort of like bending those rules or bypassing those rules or ignoring those rules and it wasn't really clear who was supposed to police those rules and the people like at the university of michigan their heart wasn't in reducing who would use the internet they wanted to expand right and so there were places um often bulletin board systems uh would somehow get a connection to the internet and then they would give internet access to a bunch of non-academics and so it became increasingly in the late eighties understood by everyone around the world but this is pretty much nerds right they can sit with a screen with a command line interface with the text based screen and they can get excited about that i'm like that i can get excited about things that are non-graphical because i just make it up in my head but we needed something to pop this into the collective consciousness of all humanity rather than just academics first and then nerds second but that was about to change so i started our picture where we had super computers at university of illinois we went to university of michigan we got the nsf net funded and now we're going to fly across the ocean to cern cern of course is a high energy physics laboratory they do this 26 mile circle and smash particles and take pictures of the smashing particles and looking for the higgs boson or found the higgs boson um it's a really cool place i recommend that you visit
the cern is a place that physicists from all over the world visit live at and collaborate with for experimental physics it really revolves around those experimental facilities and so regardless of whether you live in russia or australia or germany or america or japan if you're an energy nuclear physicist you have got to work with cern or spend time at cern and those who spend time at cern tend to make the best discoveries and so these projects have such long lead times and they take so many different kinds of talented people you know they're metallurgy welders physicists engineers designers project managers there's a ton of people involved in it and it's a pretty well funded operation and everyone's pretty smart so one of the things that they do is they have fun they have these clubs like the softball club the cricket club the blues club and and it's kind of like these people are somewhat away from home a lot and so they just have fun with each other this is a picture that i will show you of the cernettes they are famous for being the first band photo on the web some people wonder if they're the very first photo on the web but they uh are a doo-wop group and they sing sort of 50 style doo-wop songs they are very fun to watch but their songs are about like particles and internet and modems and stuff like that stuff that i care a lot about um they sadly they've been they've been doing this since the 1990s um you know 1991 92 kind of time frame um and they all grew up and their kids grew up and their kids were all in college and so they they did their farewell tour in 2011 and i and one of them's moved to australia so we don't know if the cernettes are ever going to get back together and sing but for now they're on permanent hiatus but for your viewing pleasure take a quick look at one or more of the cernette songs [Music] so like i said you should go visit cern i have had the great fortune to visit cern i have visited cern with an on professional uh in a professional 
role i help them record lectures with my sync-o-matic software that you may have heard me talk about um i have visited sort of in helping other technology things and uh teaching and learning with technology and things like that and so they've got a wonderful cafe and if you're working with people you get to go in the back and hang out that's pretty cool they also have a wonderful museum that you can go see the first web server and all kinds of other cool stuff and uh in 2006 i was lucky enough to have an invited talk at cern where i talked about sakai and how they might use sakai to do collaborative work and i brought my family with me and so my wife theresa and my daughter mandy and my son brent and there is me we are in the pit and this pit is like eight or nine stories underground it is six stories tall this is where the beam comes in right here it's three stories from the bottom of the pit the pit is six stories tall at this point it was only less than one-third complete and so we could go on a tour and so i have a family photo with hard hats in the cern pit and uh that's pretty cool another time i visited um i went and sang with one of our university of michigan physicists and that would be this guy right here his name is steven goldfarb and he's a physicist that works on the atlas project um but he also is the bandleader of an all physicist band called the canettes blues band and he let me sing a song i was coming to do some video work for him i happened to be in the area and i just stopped by on one of my trips and me and another michigan staffer we grabbed a couple of cameras we made some music videos for him and put some of the canettes' music videos up on the web and then they let me sing one song called i got my mojo working and so i'll share with you the video i've got my mojo working so everybody in this band is a physicist and pretty much everybody dancing is a physicist too
now the reason i'm showing you all this is to give you a sense of the energy and joy in addition to the hard work that goes on at cern you guys are all going to be on youtube soon this guy's got more channel than that though he tells me he doesn't just do the gym i'm going to get him up here if you guys help me you make him come up here southern michigan bluesman's mom you can do the mojo let's go [Music] [Applause] so i don't know if you noticed halfway through i knew that song by heart but i had still written the lyrics on my hand and so halfway through you can see me look at my hand to check the lyrics that i knew made me look like a dork i really want to sing i'm just not a very good singer and thankfully steve let me sing with the band so back to the topic at hand in 1999 i visited cern uh as one of my first tasks at the university of michigan to help them with lecture recording and uh i said hey i got a camera you know are the inventors of the world wide web here and we still had a little bit of the television show going back then so i went and interviewed uh robert cailliau who was still at cern he was just sort of across the street from the cafeteria and we uh walked into his office and gave him a microphone and just started talking about the beginnings of the world wide web and robert cailliau is the co-inventor of the world wide web along with tim berners-lee who built it at cern so let's take a listen to robert cailliau [Music] hi charles severance here i'm at cern in geneva switzerland one of the world's preeminent high energy physics facilities we were lucky enough to talk to robert cailliau one of the co-founders of
the world wide web the big collaborations at cern and steven uh is a member of one of them uh had people spread all over the world and use cern as the infrastructure to do the experiments and so um obviously the whole of high energy physics has been the sort of miniature information society since way back when as soon as there were networks essentially and so because we have this need for spreading documentation around we built these things like centralized databases there was cerndoc you know you could use it but whatever and we had a lot well we still have a large database of high energy physics articles uh kept by stanford and you could get at it before the web by knowing exactly what computer to log into over the network blah blah blah blah when the web came all that necessity of knowing which computer to go to what to say to that computer and so forth just disappeared you know people put up these pages with the links and you could just follow links and get to places where you wanted to be and find everything and it was also all in the same format so that was very important too that you know we broke this proprietary commercial system of vertical markets which you know don't let you get at anything except if you stay with this particular company or with that particular company so that horizontal split that cut that we made between the browsers on top and the databases at the bottom was i think essential uh to make it useful for us but also to make it useful for everybody else right and so that was what it was like in the beginning and tim and i we did this all on this next machine here uh in about 1990 so the first server was up about end of 1990 the first uh server in the united states came up about a year later uh at stanford because of that database that i was talking about before did you and tim feel like this was the big one or it was just well in a sense of course as i always say we called it world wide web in 1990
and and tim had in fact a name like that just before that so there was in fact no way of building it smaller than the internet already was and the internet was everywhere so there was no way we could build it smaller but the thing that we probably did not expect or did not aim for definitely in the beginning at least was to have this be useful outside the community of academics and internet people that existed then see the internet came outside the academic world only what after i would say also after 94 right roughly what i think at the beginning of the yeah 94 is what i call the the the year of the web it's when we had the first conferences where when you know mosaic got off the ground really and and when commerce started to notice it and companies began to be formed and so forth exclusively with that in mind and that was only then before that it was mainly universities and academics gopher was simpler and easier to install and easy to populate and that explained its easier success for a while because both took off at about the same time and because the web was somewhat more difficult it took somewhat longer but gopher became also integrated in the web browsers almost instantaneously and so you know after that after after short while you saw what you could do with together what you could do with gopher you went for the web but it was easy much easier to install gopher this is a very which is why it had a sort of bump where it it went ahead of the web for a while the same is true of mosaic mosaic was just no good but easier to install right and so it went ahead of what we were trying to do we were completely killed in the browser environment because what we tried to do with our browsers was more difficult than what mosaic was trying to do and so you know this proves that a better thing sometimes gets killed or takes much much longer to come up because the easier thing so like a virus outgrows the other one i think the the real problem was that this development 
system is so much better than anything else that porting what we had here to any other platform took an order of magnitude more time and for example every time you clicked here you had another window right every time you clicked on a diagram you had a diagram in another window when you clicked on the map you got the map in postscript scalable perfectly printable and so on and so forth and try to port that to another system you go berserk and this is the reason why in mosaic you had only one window and every time you clicked you replaced the content of that one window which was not what we wanted every time you see a page you've got the images in line so you scroll they are gone and so this is all not what we wanted it is horribly complicated for the user and it's not efficient but you know it was the easiest way to do it on the next system and so you know there you go so that thing spreads and if you want to make you know there is a big difference between making an editor and something that just puts up a page and you can't do anything with that so um our system from 1990 was also the editor i mean it's only after next stopped making hardware and i had to go back from a next to a macintosh that i had to learn html right i mean before we produced all the documentation and stuff but we never saw any html we never saw any urls right because you linked by saying link this to that not by typing in the url there was a special window you could call up in which you could type a url if you needed to but it wasn't the usual thing i mean this navigation now which says http you know i learned all that the hard way afterwards that you have to use that because you've lost that system right the interesting thing though is that i find html a glorious language no i'm not the average user but i like to write it i mean i enjoy almost word processing it's like tex it's like yeah right right it's exactly as bad as tex that's exactly what it
is it's exactly as bad as tex and can you imagine i mean the headings have levels which are absolute true you need a plus what um i mean come on html was we just didn't have you must realize that we were never more than four people here tim and myself and a student each okay so things like uh putting in serious effort into thinking about html was a low priority business and then of course by the time there was time to do that it had spread beyond repair right that's sort of like unix right right but of course we had like a virus uh yeah uh in a sense we um we have xml coming now fortunately so that'll help this machine here uh it's um well it's more than 10 years old now it runs unix but it has unix with a nice visual interface it had some other interesting things which permitted us to do the development of the web in such a small time such a short time and that was it had a completely uh or it has a completely object-oriented development system in which there is already supplied as part of the library an editable text object and that was what tim used to make the first browser and this was all nice because it got us there very quickly and then we realized that you know somehow the real world uses these horrible machines and you know porting it from here to make it available on these horrible machines in the same elegant way is an enormous amount of work about halfway through 93 i think we made a last effort in outputting browsers that were also geared to becoming editors but then there was just no hope so i really like that video it's real precious to me i shot it in 1999.
i don't know if you noticed but i got a ponytail in the back and i got a village people mustache um when i first came to university of michigan i thought i was pretty uh pretty special and so i grew the ponytail uh i don't have ponytail anymore i got all my hair's a little darker now um it was i really liked the moment where he started yelling at me because i liked html and html these days with html5 is so much better than html was he was exactly right that html wasn't elegant but it was amazingly powerful at the same time and the fact that we could see it meant that people believed in it and it was it wasn't magic um the other thing that you might have noticed is that he had some very strong opinions about the design of what a web browser should be and they might not have seemed logical when you first listened to them one thing he said was every image had to pop up in the same screen and mosaic had one page that replaced the whole thing and you saw the images in line which is not what we wanted to do that's what robert said that might sound illogical today with farmville and facebook and you know all these things but you have to understand that in 1990 the network was very different even than it was in 1995 it was a very slow network and if you put images on every page they would slow down terribly and so what that user interface looked like is you had a document that was the text bold italics these kinds of things and when you clicked on something you got a new page and then it would take a while for this page to display because the networks were really and truly slow and the computers were slow as well so that might sound to you as an irrational design choice but in the time of 1990 it was a completely rational design choice and one of the things that sort of changed over time was that became less and less a rational design choice as the networks became faster the computers became faster an image became images became more of a natural thing that the technology 
was capable of handling so continuing our story we started at university of illinois urbana-champaign we went to university of michigan we got the nsf net up in 1990 and cern creates the world wide web the web took a great step forward at stanford university when the very first web server in america came up now the fact that it's the very first web server is not actually all that important what was important was what was on that web server um it was 300 000 physics papers at slac the stanford linear accelerator stored in the database on a mainframe and uh robert cailliau mentioned this in his conversation now what happened was paul kunz who you meet just in a moment he you know he said i'll put my database the database was well known and people had many ways of using it but with the web it made it much more easily used and so i think in a way paul kunz inadvertently created the first search engine the first reading of something that's mostly content that people read up to that point tim berners-lee and robert were really trying to build something that allowed collective editing of information stored on servers all around the world so let's take a listen to paul kunz [Music] well the database that was here at slac was used by people around the world but with great difficulty because they had to have an account on the mainframe most people weren't familiar with mainframes and second of all the database language that you type in was difficult so before there was a web i invented a way for people to do what's now called instant messaging and to do a query to the database without logging in and that improves access to the database but you still had this terrible language of the database machine to type in in order to do your query a little later on people added an email interface so you could send your query by email and get your response back by email so when i was at cern in september 1991 and tim berners-lee
dragged me into his office to show me give me a demo of the web at first i wasn't very interested but when he demonstrated doing a query to a help system database on a mainframe i immediately put two and two together and says well if you can query a help system on a mainframe you can query a database on the mainframe and i started getting interested the thing is that we couldn't change the query commands because that was built in the database but the web page could give you examples and remind you what the query would be so did you have to write it all from scratch i mean did you write it from the protocol or was there software that you reused to make your first web server well i used the cern server software which was written in c fortunately we had a c compiler on the mainframe at that time it wasn't very long we had a mainframe c compiler but we had one so all i had to do was to write some extra c code to uh get the query that the user had made and turn it into the database query it was december 12 1991 we installed our web server and we informed tim berners-lee that day to give it a try the big boost came about a month later in january in southern france where there was a workshop on computing topics for high energy and nuclear physics and at that workshop tim berners-lee had a plenary talk so he gave his hour-long talk to about 200 physicists from around the world and as part of his talk he gave a demo and at the very end most people i think were bored most of the time i mean the worst thing that software people want to think about is uh documentation and he was pitching documentation but at the end of the talk he connected to the slac web server and made a query and that really dropped a lot of jaws because everybody knew the database everybody knew how hard it was to access okay and here he just clicked away typed in a few things for the query term and bang the results came back and nicely formatted so the way i
say the interest in the web went from about 20 people to 200 people in that hour okay now those 200 people went back home and if each one of them told 10 people then within a week the interest in the web grew to 2 000 people so uh that was the big turning point and i think tim recognizes that that was really the kickoff and i think the reason that the web took off so quickly once commerce appreciated it and started to realize it it's a win-win situation okay it's a win for the customer obviously because he can do price comparisons he can browse this airline schedules on his own and visualize what he wants to see quickly he can cut and try different things as much patience as he has to get the price down so it's much much better for the consumer but what about the provider the airlines well it's much better for them because just software is running on machines okay it's much lower cost for them and so they're winning too i point out in my talk about the web near the end sort of a punchline that in doing big science we're finding solutions to problems that the general public don't know they have so who would predict that out of high energy physics research something like the web would come up i think that would be unpredictable but in hindsight you can see that it was a natural place for uh for the web to have been invented so again i think paul really created for us the first search engine and showed that the web could be a mostly consume environment and be extremely useful when you think about it if there was no web and there was no content how would you know that the web is a great way to view content so paul kunz had the advantage that he had a lot of content and it was content that at least the physicists found extremely valuable and then people go oh i can see in fact i remember when i first saw the web i was like okay fine it's got pictures who cares um because other things like gopher were just as good and then i saw the
ability to go to federal express and track a package and go like now that's a cool idea right gopher couldn't do that as a matter of fact in 1993 sort of three years after the web was created the web was actually not all that popular 1991 was the paul kunz but that was mostly physicists and a few nerds myself included but in 1993 gopher was a much better product and it was a much more beloved product and the problem again came down to the fact that the network was so slow that simple text-based things worked better than highly graphical things and there's an apocryphal story that happened in march of 1993 where there was a meeting of the internet engineering task force that does standards for all these things and they had a birds of a feather session for gopher and a birds of a feather session for the world wide web and the gopher session was full of people and they couldn't have enough in the room and they were sitting on the floor and peeking in the door and then later i wasn't at this meeting by the way rich wiggins my co-host on the television show was at this meeting and what he tells me is at the web birds of a feather meeting there was almost nobody there and tim berners-lee was like i've been working on this thing for three years and it's better than gopher but nobody wants to use it and then people in the room at that point told him they said it's just too complex it's too hard to get working and so this is a long time ago and the web was not assured to be a success at that point it just wasn't really clear the university was getting the idea that we had spent a lot of money on a campus-wide network and that maybe we should use it for something besides connecting to the mainframe so most of the top of the computer center was saying maybe we should do something with publishing information that'd be an interesting sort of use of the network that more people could use than just the people who wanted to talk to the mainframe were you
surprised at how fast gopher took off yeah absolutely shocked uh because we basically wrote something that we thought would be good enough for our campus and hoped that a few other places would use it because then if somebody at a different institution used it we'd have some credibility locally that hey we're not just saying it's good somebody else is saying it's good so that it was accepted is probably more a sign that it just happened to be the right thing at the right time the environment was ripe for something like that and we had an okay product at the right time yeah basically we put it up for anonymous ftp and sent out a couple announcements on using it saying hey here's a thing that does kind of a campus-wide information system the fashion back then was to say everybody needs to have a cwis a campus-wide information system and so gopher we thought made a pretty good one because you didn't have to centralize publishing on one machine you could let individual departments publish off their own machines and control their own content that model turned out to be a pretty good one so lots of other places started using it so we had kind of the right idea at the point where this particular issue campus-wide information system was something that all of the all the university computer centers were saying i've got to have one of these the technology we built gopher to work with were fairly slow machines with limited graphics and slow modems that was our design point because that's what people had six seven years ago and while we added graphics and movies and those sorts of things gopher wasn't designed to be hyper text and really media rich documents it was aimed more at run fast over limited bandwidth on slow machines so 1993 was kind of like it was a real there's a lot of things that could have been very different in 1993 and i want to show you a commercial and there's a couple of these out there from 1993 that one of the large telecommunications vendors put out as a 
national ad campaign and this is what the telecommunications industry was thinking about they saw it all happening they saw this all happening they knew it was going to be big they knew it was going to allow interactions in many ways they knew it they they weren't dumb it wasn't like whoa what's going on no not at all they absolutely knew so this this really wonderful series of commercials is um [Music] it's just quite amazing so take a look at the end you'll see actually what the large telecommunications company's name is so take a quick look have you ever renewed your driver's license at a cash machine nice picture you will and the company that'll bring it to you at&t have you ever watched the movie you wanted to the minute you wanted to learned special things that's all taken from jazz now any questions from faraway places so where did jazz come from good question or tucked your baby in [Music] from a phone booth you will and the company that will bring it to you at&t so again sort of looking back at this and looking at things that might not have been um there's nobody that really makes the connection that steve jobs might have had some impact on the world wide web but in a way he was actually quite influential steve jobs of course founded apple and then was kicked out of apple um after the macintosh um and he started a new company called next and if you listen closely to what robert cailliau says and you listen closely to what paul kunz says and the computer that was on my desk during this entire period of 1990 to 93 were next computers next was a bold unix-based highly networked high-definition display high-performance computer that steve jobs built when he formed the company next after he got fired from apple of course he eventually came back to apple and the next technology is macintosh that's the macintosh operating system and so if you make a mistake in macintosh you might see an error message that starts ns something that's called nextstep which is the 
operating system on these next computers for the first three years of the web it was pretty much only on the next computer people would even the server was you know on the next computer and the browser was on the next computer and and so it's the next computer really kind of did that i i wrote an article in uh sort of a couple months after steve jobs died for ieee computer magazine's january 2012 issue that really sort of tried to at least from a historical perspective point out how important steve jobs may have been in helping the internet get formed [Music] hello and welcome to computing conversations in this column i take a look at some of the second order effects of the technology that steve jobs produced throughout his career and how those technologies from apple and next often served as an inspiration to many of the early innovators in the internet and world wide web i found out that we'd lost steve jobs on the evening of october 5th right in the middle of a lecture on using regular expressions in python i was recording the lecture for a later podcast because you might want to go down the nerd rabbit hole later and that's okay there's nothing wrong with avoiding the nerd rabbit hole but this is a nerdy thing it's extremely nerdy i think it's awesome a nerdy announcement [Music] the very next week in the same classroom i was giving my lecture on the inside story of the history of the internet and the world wide web sharing some of my interviews with early pioneers as i gave the lecture and watched my video interviews thinking about steve jobs i began to realize how important apple and next technology was to those early innovators in some ways the very existence of those technologies helped propel the internet and web revolution forward in my 1999 interview with robert cailliau the co-inventor of the world wide web we are sitting in his office by the next cube that ran the very first web server as robert describes how the nextstep development 
environment allowed them to quickly build prototype versions of a web browser in 1990 you get the sense that their next hardware and software was very much an equal partner in their early visions of the web obviously the the whole of high energy physics has been the sort of miniature information society since way back when as soon as there were networks essentially and so because we have this need for spreading documentation around we built these things like centralized databases there was cerndoc you know you could use it but whatever uh we had a a well you still have a large database of um energy physics articles uh kept by stanford and you could get at it before the web by knowing exactly what computer to log into over the network blah blah blah blah but it was all very difficult and then so when tim invented the web and i had a separate proposal i dropped it because tim's proposal ran over the internet and that was clearly much more efficient um when when the web came all that necessity of knowing which computer to go to what to say to that computer and so forth just disappeared you know you people put up these pages with the links and you could just follow links and get to places where you wanted to be and find everything and it was also all in the same format so that was very important too that you know we broke this proprietary commercial system of vertical markets which you know don't let you get at anything except if you stay with this particular company or with that particular company so that that horizontal split that cut that we made between the browsers on top and the databases at the bottom was i think essential uh to make it useful for us but also to make it useful for everybody else right and uh so that was that was what it was like in the beginning and tim and i we did this all on this next machine here in about 1990 so the first server was up about 1990 the first end of 1990. 
the first server in the united states came up about a year later at stanford because of that database that i was talking about before the the real problem was that this development system is so much better than anything else that porting what we had here to any other platform took an order of magnitude more time and the for example every time you clicked here you had another window right every time you clicked on a diagram you had a diagram in another window when you clicked on the map you got the map in postscript scalable perfectly printable and so on and so forth you try to port that to another system you go berserk you know there is a big difference between making an editor and something that just puts up a page and you can't do anything with that so our system from 1990 was also i mean i started it's only after next stopped making hardware and i had to go back from a next to a macintosh that i had to learn html right i mean before we produced all the documentation and stuff but we never saw any we never thought in html we never saw any urls right because you linked by saying link this to that not by typing in the url there was a special window you could call up in which you could type a url if you needed to but it wasn't the usual thing i mean this navigation bar which says http you know i learned all that the hard way afterwards that you had to use that because you've lost that system right steve jobs's next workstations were essential to the creation of the earliest web software their advanced development environment and rich display capabilities led early innovators to think of the internet less as a text only and more as an engaging multimedia experience in my 2007 interview with paul kunz who brought up the first web server in the united states on an ibm mainframe at the stanford linear accelerator in december 1991 we can see that he still has a working color next workstation in his office when paul was developing software for the ibm mainframe there were 
around 20 web servers in the entire world if you wanted a web browser to test your server having a next workstation was essential so when i was at cern in september 1991 and tim berners-lee dragged me into his office to show me give me a demo of the web at first i wasn't very interested but when he demonstrated doing a query to a help system database on a mainframe i immediately put two and two together and says well if you can query a help system on a mainframe you can query a database on a mainframe the database itself had about 300 000 entries and was it heavily used was it was it heavily used before you put it on the web and then continue oh yeah it was heavily used people had signed up to have computer accounts on the mainframe so they could do their queries and from all around the world i think there was some 4000 registered so did you have to write it all from scratch i mean did you write it from the protocol or was there software that you used to make your first web server well i used the cern server software which was written in c fortunately we had a c compiler on the mainframe at that time it wasn't very long that we'd had a mainframe c compiler but we had one so all i had to do was to write some extra c code to get the query that the user had made and turn it into the database query now wasn't the original web server on a next and that's the the uh all the web software was originally developed on the next computer at cern while the web and web technologies were gaining traction in the academic and research communities who often had access to powerful unix workstations on their desks the rest of the world was browsing the internet using the text oriented gopher clients and servers in 1993 the national center for supercomputing applications at the university of illinois at urbana-champaign released ncsa mosaic mosaic was the first graphically rich web browser that ran on unix macintosh and windows in my 1997 interview with larry smarr 
he points out that the success of the ncsa mosaic browser occurred in part because of earlier work developing the ncsa image library to take advantage of the graphics capabilities of early apple macintosh computers what we basically did during that late 80s period was to make the world safe for images what ncsa image did was basically we said we want to build a world of infrastructure in which it's as easy to move an image around as it is to move a word that was our design parameter back then and that's the way we talked about it back then so that meant we had to scale the network scale the disk drives scale the compute power we had to go to full color when the mac 2 first came out 256 color levels uh we got 50 of them apple gave us 50 mac twos which was like stunning uh in those days we were in fact we're the largest funded group in the academic group in the country for the apple advanced technology group ibm at that time was telling their customers you don't need color we've already provided it as i said you have four of them black white cyan and magenta why would you need more so what we did was we took things that were on 100 000 dollar computer graphics workstations for image processing that medical imaging people used satellite reconnaissance people used and we took all that put it into software in ncsa image on the mac and so you could just move the mouse and do what would have otherwise cost you a hundred thousand dollars to do and you'd have to be an elite specialist again taking things that elite people knew how to do could afford to do and making it available to the masses the day after i found out that steve was no longer with us i happened to have a morning meeting in new york city so i went to the apple store in manhattan to get a sense of our collective reaction to the loss of such a brilliant visionary who's affected us in so many ways hello everybody this is chuck i'm here at the apple store in manhattan i'll show you kind of what we're 
seeing we're seeing press we're seeing an impromptu memorial for steve jobs uh that's got apples it's got flowers it's got all kinds of really really cool stuff and but the the crowd seems really quiet there's media that's interviewing all the people you can see uh see the media there grabbing people off the street and interviewing them sure i'm sure they're capturing their thoughts and so here we are sort of one day one day after uh the passing of steve jobs throughout his career at apple and next and then again at apple steve jobs was never interested in the least expensive nor the most profitable technology he pursued the best and most advanced technology in many of his designs and decisions he provided a road map for the entire consumer technology marketplace it's no wonder that many of us would wait with breathless anticipation for each and every apple announcement they were always an exciting glimpse into the future of technology over the past 30 years the products that came from jobs's companies were a joy to use and perhaps more importantly they inspired many innovators to keep their focus on the exciting and unexplored future [Music] so we started our journey at university of illinois urbana-champaign we got the nsfnet created at university of michigan we created the web we created the first web server and now we're going to come back to where we started we're going to go back to the university of illinois at urbana-champaign these are the folks that basically exploded the web moved the web from the academic space to the commercial and end user space and there's all kinds of things the network was growing things were getting faster personal computers were getting better at a very rapid pace because the ibm pc was almost 10 years old and many computers they were getting quite fast in the in the 90s so in the 1990s our computing and their display capabilities were getting better very rapidly and in this environment ncsa at urbana-champaign university of illinois urbana 
champaign built an open source web browser that worked on mac windows and unix it was the first web browser that did this and it it was the thing that really made it so average people who had pcs or macs in their home could get a network connection and start enjoying all of this new content that was coming up on the world wide web hello my name is charles severance and i am showing you right now a desktop application that works on my macintosh called ncsa x mosaic this was a software released in 1993 and it was the first mosaic it was the x windows mosaic and i'm running it on my macintosh and you can download it from my github account uh csev is my github handle and it's the x mosaic dash 1.2 so what we're looking at here is the original default home page at the national center for super computing applications at the university of illinois and we see it as we saw it circa 1993 we can scroll through we can see the old good old days the visited links in purple and so if we scroll down you'll see some neat pictures and the wonderful gray background list tags were were real back then you know we can go see something about visualization at ncsa i've done a little tiny bit to this to teach it a few things you can see all that i've done on http um on my github site you can see everything i've done on my github site i just made a few more things work like png images and http 1.1 so you could visit more more pages another another wonderful page to uh another wonderful page to take a look at is the original first page on the world wide web the info.cern.ch and again we're experiencing it now circa 1993 to see what's out there take a look at the people involved in the project so here's some of those pages tim berners-lee of course robert cailliau of course and here's some of the other people that were working at cern so there you go that is a quick experience in how we surfed the web in 1993. 
after this the ncsa staff student programmers software developers all formed the company netscape in order to commercialize all this joseph hardin was the supervisor of the software development group at ncsa that was responsible for building and releasing both the mosaic web browser and the httpd web server as open source and gave them away free [Music] the ncsa mosaic web browser was developed at the university of illinois starting in 1992 ncsa mosaic brought the web to apple pc and unix systems joseph hardin managed the software development group at the national center for super computing applications [Music] ncsa was kind of a special place right one of those things that happens very infrequently where there's lots of energy a large amount of resources visionary leadership and a lot of free space for people to play and larry smarr has to figure in any discussion of it quickly because he's the person that created that space and that gave a bunch of us people in visualization people in computational science people in collaboration technologies people in all kinds of stuff networking in the middle and late 80s a space to just figure out what it was that we thought was fun and what was interesting you know what i thought was interesting was how people were using these new technologies to work together we've been working we started out working with tools that supported simulation and computations on the main supercomputing systems that's what the software development group did and there was a group before i joined it that did ncsa telnet right this is back when we were trying to convince the physicists that decnet wasn't such a great idea that we wanted this crazy open nobody owns it right ip um model for the network and that was kind of a bit of a struggle because if it was an open source type thing and nobody owns it why how's anybody going to have responsibility for it and why not use decnet which is this perfectly good proprietary well-developed you know system 
yada yada so it was a long time it took a while for us to explain to the physicists why we wanted an open model for um tcp/ip for the interconnection of all these networks and i spent time for a while with a program at ncsa called the affiliates program where we gave people mac ses and told them about tcp and put the ncsa telnet on the machine and said now here's how you get to the super computers larry recognized from the beginning and we all loved the idea that these small little things on the desktop were really the gateways for everybody to the big machines in the background and that all of this would turn into one cloud behind the the screen that we needed to figure out how to get the user involved in as much as possible so we were interested in building tools for end users that worked with pcs and that allowed them to do their work better in the computational science communities that they were part of on the super computers and it's an easy extension from that to think about collaborative technologies in the large how do people work together not only with these tools but also with simple communications email papers data sets that they want to share and how do they do that initially our interest was in synchronous tools um when we were thinking of collaboration tools so we were building something called ncsa collage which was a set of tools that worked and one of the big deals was that they worked across the three platforms the x windows for the unix people the windows environment and the mac and that was one of the things that was sort of part of the underlying culture there was we wanted to make them available to as large a community as possible so started working on collaborative tools and there was a set of people on each of those machines that was working to make it possible for people to share in real time images of their data the spreadsheets of their data and papers that they had run across interesting references to uh with their 
colleagues who were remote from them geographically so that's the context in which dave thompson who was one of the developers one of the x windows developers the lead x windows developer i think for the collage tool um pulled down one of the early um web browsers and it was the one from slac and i can't remember its name and he went through the effort of getting it working and brought it in and showed it to marc andreessen and i and both of us looked at the screen dave described what he had in front of us and we said we can do better than that that's a complicated system and the interface looks terrible and it dave said that it was a real pain for it for him to get it working for him to download it and install it and compile it and everything and it only works on the x windows box and wouldn't it be cool if it worked across all three of the boxes and if it was something that was just a plug and go like the rest of our tools so marc and another developer there ran with the idea dave wanted to work on it but i said dave please finish up with the collage stuff and this was before we really understood what was meant by open source so we just and we wanted everybody to be able to take the software and do whatever they wanted to with it we weren't that concerned with commercial advantage we were more interested in it being open and people being able to make contributions back to the code and taking the code and doing what they wanted to with it so we just put it all in the public domain which as the folks from apache will tell you was kind of ambiguous it's not clear what that means um but at least it got the code out there so marc and eric were working on that um what marc and eric were working on was the first generation of ncsa mosaic and this was the x windows app and nobody was working yet on the windows or the mac version and we saw the first version of that come out when was it early 93 late 92 and it was probably early 93 and the response of 
course was fantastic it's wonderful to be able to just click on something and see it right away and indeed the combination of um hyperlinks in a document as far as navigation and retrieval of documents as a user interface is just great and a lot of people got it immediately especially people that were working with the tools at ncsa and the companies i remember an hp exec coming in one time and marc and eric had written a little filter that took unix um documentation and made it into html and made all of the references into links and they went to they hit an hp site and the exec said where's this coming from because he was able to see all of his hp documentation there in the room at ncsa and navigate through it real easily and we said well you've got three or four folks back there that have put up httpd servers you may not know what those are yet he said i've never heard of this and we said but this is the kind of thing that um is probably going to be really useful in the future for people who are trying to manage documentation in a distributed fashion yada yada right we went on with a story like that and this guy was bouncing up and down in his seat and this was the kind of response that we got with it and for a while we tried to integrate the earlier work with collage with the work with mosaic and early versions of mosaic have a collaborate button at the top and that was something that would allow you to pull in something from a synchronous session the idea was that you'd be working with your colleagues synchronously and then asynchronously go off and use the browser and pull something into the um the collage session or vice versa be working on something in the collage session and be able to access it through the browser or just use the browser as a component and uh in the session well um the browser of course took off we had tom redman who was the lead for the mac version though uh the person who really built it was aleks totic who was an excellent 
developer and a lot of times ran ahead of the other people that were working on stuff especially on the mac and chris wilson and jon they were working on the windows version so all of a sudden in 93 late 93 early 94 there's this full suite of mosaics that work across x windows the mac and the windows system and that's the point at which the guy who is the president at the time of the internet association said ncsa has fired a shot heard around the world because it's available now across all these platforms and anybody can use it we were convinced deep down inside that all of the new technologies and the digitalization of the world and everything was going to make a huge difference we just weren't sure exactly how at the same time when people would come to us early on in the mosaic experience and say we want to commercialize this and do this with it or do that with it there was a lot of you know we're not sure i remember sitting in rooms with commercial folks that came in and pitched to us and i'd bring the whole team in um the whole software development group and say listen to these guys and you know tell me what you think about this um and we were nobody was sure what it was gonna do and there were people who started off building browsers and just kind of never got off the ground it wasn't until the netscape effort started up that there was sufficient energy and sufficient resources i think to really get on and ride and push um and and you know crank up a group of x hundred developers uh in a matter of months and then they were you know quickly overshadowed by the effort that um microsoft put into it i remember one of the netscape guys saying um i came back from a meeting with some folks up in seattle and they said that microsoft now had this is when netscape is riding at the top of its form right top of the game this guy says microsoft just told me that there's somebody from microsoft just said that they've got like 2 000 developers working on this said at that 
point i realized that we were you know going to have some difficulties um and we of course always felt that there should be more than one browser um and so we wanted this because we were interested in standards and openness right at the time if there's only one browser then that company gets to determine what the standards are there were all kinds of hassles early on about putting in different features and the browsers driving the standards rather than the standards driving the browsers and all this kind of stuff so we wanted some diversity there was a feeling very early on that this was going to be a real gas this was going to be hot that this was something that was the response was just so immediate if you go back and you ask people who were sitting in front of machines in 1993 94 or 95 right and you say do you remember the first time that you used mosaic right you remember the first time you used a browser if it wasn't mosaic and the vast majority of them that i that i've run into say yeah i can remember and i remember one of the nsf program officers calling me up one day just out of the blue and saying i just wanted to tell you you know you guys probably this is like after we've been playing with this for months right not not years just months he said you know sometimes we just sit around and click on things because it's still such a surprise to have a picture or a whole other page of information pop up right and we don't know where it's coming from and we don't know who's putting it up but we still haven't gotten over the gee whiz aspect of this yet [Music] so at the end joseph is really talking about how microsoft had to react to netscape and how all of a sudden the market began to sort of crash together and tons of investment a lot of intensity triggered mostly by the release of the windows and the mac version of mosaic 1994 was distinctly different than 1993. the staff at ncsa went and founded netscape in april 1994. 
the first world wide web conference was held in switzerland shortly thereafter the other first world wide web conference was held in chicago there's some interesting stuff about that in these books the uh books robert cailliau's and tim berners-lee's book talks a little bit about that in october 1994 tim left cern and went to form the world wide web consortium and and by the end of the year windows 95 beta 2 with an internet browser was there with tcp/ip built in and so if you really think about i mean it's not even a whole year it's almost six months where the world changed forever and so at this point now money is coming into it up to that point it was mostly ideas and research but now money is coming in and we start seeing a transition so netscape was on the forefront of doing this and netscape basically took the open source product they started kind of competing just to build a browser but they quickly decided that they would go turn the browser and their web server more proprietary and try to create distributed computing applications using proprietary things that netscape would build unique to themselves and they would attack microsoft the moment it became clear that netscape was going to produce a way to to to develop software on mac windows and linux portably then microsoft got worried because then the operating system wouldn't matter and microsoft put so much into the windows operating system that if the operating system didn't matter it would really tremendously threaten microsoft's business hence windows 95 with tcp/ip and a free web browser so netscape sort of scared well microsoft tried to buy netscape netscape refused to sell wanting more money and then microsoft vowed to destroy it and as microsoft was coming after netscape first netscape tried to compete by building better and better software like the javascript language we'll meet brendan eich in a moment and then after that they tried to sort of switch from being the proprietary bad guys and 
go become more the open source good guys and that kind of blew up in their face and they built this open source mozilla which eventually became the mozilla foundation which eventually became firefox so now what i want to do is i want you to meet two people both of whom now are the senior leadership at the mozilla foundation brendan eich who's the cto of the mozilla foundation but he'll really tell us about 1995 where he invented javascript in 10 days as part of netscape and then we'll meet mitchell baker who was the founder of one of the founders of mozilla and she will talk about how netscape kind of fortunes declined got purchased by aol and fortunes declined even worse and how they basically pulled the netscape code base out to form the mozilla code base which then became the firefox codebase and firefox of course had this great idea that mitchell will tell you about of having a search box and they made tons and tons of money as mitchell will tell you about so up next brendan eich of the mozilla foundation and mitchell baker of the mozilla foundation [Music] i was hired at netscape in april 1995. 
netscape had already launched its 1.0 mosaic killer if you remember ncsa mosaic was the big browser until netscape took over so when i came to netscape they'd been going for about a year i actually had an option to go in at the beginning i passed that up but i went in in time to do what i was tempted to do which is a programming language for html for web designers and programmers to use embedded directly in the web page not something uh that was like coming along at the time called java which was more of a professional's language where you would write real code with type declarations and you'd have to write that code in a way that compiled i was writing something javascript that could be used by people who didn't know what a compiler was they were just going to load it it was like basic and that was really the pitch that we would make two languages not one analogous to microsoft's visual basic for c++ so javascript for java bill joy of sun actually liked that idea and agreed with it and he was the guy that signed the trademark license by which what i had created became javascript the name is a total lie it's not really related to java so much as to a common ancestor c in syntax and to the extent that we made it easy to use and something you could copy paste program or start small from scripts and grow to programs javascript has succeeded massively it was also an incredible rush job so there were mistakes in it and uh something that i think is important about it is that i knew there would be mistakes and there would be gaps so i made it very malleable as a language and that has enabled web developers to make it be what they want it to be to project their own style of not just api but almost language pattern on it and create their own innovation networks to use eric von hippel's phrase or innovation toolkits on top of it so it's not a language that tries to restrict you to one paradigm it's a multi-paradigm language i mean there's 
all these things where a language comes out and immediately trips and falls and you have to sort of it's okay at the same time but there wasn't really a trip and fall in javascript why so you could say that there hasn't been a version two and that's true because i've tried to work on what might be a big version two that was called the fourth edition and that failed uh there has been evolution the web is all about evolution people don't quite see this while you're in the midst of it because it's microevolution but web pages from the 90s don't all render properly these days they don't all work right and a lot of them are lost and only available through web.archive.org javascript had enough at the beginning enough good parts to use crockford's phrase or enough genetic material from other languages you know first class functions prototypal inheritance from self the inheritance of first class functions from scheme is really kind of a fraud because scheme is different in many ways and i couldn't make those uh differences manifest i couldn't do the scheme thing in javascript i was under these marching orders to make it look like java i had 10 days to prototype it so scheme was more of a spiritual than an actual influence but first class functions are very powerful and they fit with an event handling sort of programming model i was inspired by atkinson's hypercard so that's why you see onclick in javascript and hypercard had this pattern for event handlers called on you know on page down or whatever so javascript had enough good at the beginning to survive now if you think back though to the mid 90s javascript was cursed because it was mainly used for annoyances like little scrolling messages in the status bar at the bottom of your browser or flashing images or things that popped up windows massively we could have put in controls for those we should have eventually browsers firefox kind of championed this led this automatic suppression of annoyances that made it all much better and 
with moore's law compounding and with javascript getting some evolutionary improvements in the standards process it became really quite fast enough and good enough in 2004 and 2005 to get the web 2.0 revolution that was i think tied in with firefox's retaking market share from ie and developers realizing there was a client side to the programming stack that could be expressive and powerful and could be fast enough thanks to faster computers mainly you must have had some training some kind of oh i've done a lot of set experiences that kind of got to the point where you could pull from scheme and i had implemented uh i was sort of a language buff when i first started computer science i was a math physics major originally and ended up math computer science when i finally got my undergraduate degree so i was programming uh formal language theory uh applied to recognizing languages like lexical analysis parsers automatically constructed parsers from grammars i love that stuff because it was all very pretty and clean theoretically and it still is it hasn't changed a lot there's been only one or two innovations since my time in university in the early 80s so what um what that gave me was the ability to quickly knock out uh you know a sort of a language interpreter i could do the parser and the scanner i could generate bytecode because netscape wanted to do a server-side embedding of javascript even though it could have been a tree walker something that interpreted parse trees i made a bytecode for it and it was an internal bytecode not the java bytecode that's become a handicap for java i think and i knocked all that out really quickly because i'd done it before i've done it at silicon graphics to build sort of network monitoring tools to capture packets based on expressions over fields at the various protocol headers i'd done it for fun just to make my own languages and finally i got to do it really quickly the speed was an issue for me it was partly we 
were all feeling like microsoft was going to come after netscape because they had tried to buy netscape in late 94 for too little money i heard about this before my time at netscape but we also were in a weird game theory with respect to java because even at netscape some people thought well if we have java do we really need a second language they didn't see the benefit of the visual basic companion language for a much larger cohort of programmers or amateurs designers beginners to write java as to write c++ for the microsoft platform took a lot of education and greater pay it was a higher priced proposition getting people gluing components together and designing pages and filling gaps using javascript as they did with basic and visual basic in microsoft's windows was cheaper and more widespread it also enabled this uh user innovation toolkit approach uh to use von hippel's phrase again because javascript was malleable because there were so many web designers you would see different schools of thought on how to use it emerge and this has become quite clear over the last 10 years with the various javascript libraries and i think that's actually an advantage as i said earlier to javascript that we're not telling you here's the one way to write it here's the one true object-oriented paradigm here's the only way you should ever make a reusable abstraction it's not unmixed right it's hard for beginners people reinvent certain wheels and make mistakes doing it or don't like having to acquire a library but you see jquery a library of great power because it gives people this very sweet query and do paradigm and again it's not mandatory with javascript but a lot of people learn that they think that is javascript they think jquery is a language or they think you know jquery is the tail of like the dog jquery is great and john resig used to work with us at mozilla but there are so many good libraries out there now and they're actually shrinking and becoming more 
compositional which is a good trend so javascript by being malleable and sort of fostering user innovation i think has uh played a unique role if i had done something more rigid i think the odds are greater it would have failed i just can't imagine how you would have escaped from the object-oriented pattern of c++ partly i had to because if i'd done classes in javascript back in those 10 days in may of 1995 i think i would have been told this is too much like java you're competing with java you know somebody at sun would have yelled at bill joy more than they did and it might have killed the deal so i was definitely not only under time constraint but under marketing orders make it look like java but don't make it too big for its britches it's just this sort of silly little brother language right the sidekick to java but then you put in primitives i snuck self in there you put some primitives you know like closures and all those other things that it's like you can build what you want yes and that kind of went under the radar for a lot of people and it wasn't all even there in a good working order in the first release but over the next few years it became not only well um known uh i would say more standardized and well known over the next 10 years it became evangelized like crockford is the exponent of the closure pattern and good uses you can make of closures so people find the malleability and the expressiveness and the power compelling enough that some people actually resist any version two they say i don't want you to add cliches or common special forms for patterns that i'm more happy writing myself or acquiring as a library myself you've created an abstraction that implementers can do crazy things with right and people can rethink what the interpreter is really supposed to do they can and they can just say okay here comes v8 that's right differently right optimizations that haven't been tried i knew about these optimizations 
because i had studied smalltalk and self but nobody had gotten the time and the money google was maybe first there were other efforts going on in parallel and apple and mozilla have kept up as best they can but v8 deserves a lot of credit for pushing this forward it wasn't quite as first on the scene as they like to claim because it was all coming together in 2008 but it has been very helpful it has shown people what can be done what's interesting to me is that you then go and put different more intensive workloads on the language and you see there's a new v8 that should come out of somewhere it may not come out of google because they may have tired of optimizing javascript in fact i believe dart is a response to that by the principals who did v8 they want to do a language where they don't have to worry about all this crazy compatibility may not succeed and it also doesn't give javascript the next level of performance but i believe that level is there and it's still improving in performance much more dramatically than a language like java where the gains are percent or a fraction of a percent on the standard benchmarks but a lot of html5 development now quote-unquote html5 javascript css web apis beyond what's in html5 is taking off you're seeing like zynga doing html5 only games it's really coming faster than some thought and i talked to the venture capitalist fred wilson of union square in new york said yeah it's here he thought it would take years it's we've turned that corner and so you call it html5 what it really means is it's the web stack it's the same stack you use to write web pages and hosted web apps you can write apps that run in your device apps that are maybe hosted maybe offline maybe the line is blurred so that you can associate them with a url but you can also take them on the plane without any fear that you're going to lose anything by disconnecting from the internet [Music] netscape is of course known for its netscape navigator 
product first commercial browser on the web so first tool the first generation really used to get on the web and that was successful for a while and then competition became quite stiff from microsoft the different way to compete would be by building a shared asset instead of netscape versus microsoft that netscape would gather contributions of volunteers and other commercial commercial partners you know and build a product that would be shared and that group of people knew that to be open source you had to be real you couldn't just say oh we're open source love us you really had to manage it differently so at the time there were six or seven or eight of us who were employed by netscape as mozilla.org staff and another 100 or 150 employed by netscape as netscape engineers building the netscape product and contributing their work to the mozilla open source project so we are all in the same buildings we all went to many of the same meetings but we had slightly different goals because although we call it the netscape product netscape had been purchased by aol already and so the client itself was diminishing in importance and the importance of the netscape client was to push traffic to the aol websites but we of course at mozilla.org were an anomaly because our charter was to build a successful open source project and so that worked for quite a while you know there were tensions you know these were the dark years for mozilla we came to understand we needed to rebuild our core technology that took a long time while we were doing that market share continued to slide and there was a set of tensions over what does it mean to run a real open source project and as it turned out there were two competing views and one view was that open source is phenomenal we're very happy that the code is out there anyone is very welcome to take that code and use it we absolutely want to build a large and engaged contributor community but all decisions and all authority will be held by the 
management of the client group and we'll make those decisions as we think best which is of course to benefit the client we're trying to build which was the netscape browser we at mozilla.org were convinced it wouldn't be successful that building a product to benefit aol only wouldn't generate the kind of interest either from individual volunteers or from commercial partners that we needed to be successful and that we found the difference in perspective from the volunteers made the quality of the project much higher they were under immense pressure to ship this product because the company needed it and not only was it difficult technically and in an engineering sense but we at mozilla.org were saying and there's these other things right you may think it's good enough but we're telling you it's not because this is the feedback that we're getting and you need to change some of the development processes and management practices and so they thought well i'm not sure what they thought but we experienced it as no we really need to keep going on the way that we're going to get this product shipped but unfortunately you know the mozilla community was correct because when that product shipped it was a failure that was netscape 6 universally acknowledged as a bad product and pretty universally acknowledged as the end of the netscape product line you know that all those people that had held on and hoped through the netscape 4 days and waited and waited and waited for the update once they got it pretty much gave up so that's what we experienced so that certainly didn't help us that was unfortunate it did reinforce our view that you know our view of the world wasn't crazy and so we continued on and it was still long and slow to get technology and products that we liked and the tensions continued the management tensions continued you know the failure of netscape 6 didn't make anything easier and certainly for the aol folks it really didn't make things easier either and we fought a 
lot about the ui i think ui is a constant source of tension but in our case it was worse because some of these fights would be it makes sense to aol to put something in the product and the interface of the product either a button to an aol site or something that has an ad in it or some feature that a partner is paid for so it generates revenue we think it might send people to our site you know some of our users might really like it so it all made sense from their point of view but if you're not building the aol client to benefit aol and you're trying to build an open source product that is used for many people then of course those those features are a big negative and so those are very fierce fights because because the management of the client group had very clear opportunities in hand that they wanted to build into their product and we were this small group saying no diametrically we're pretty opposed in that case and we would say no you can't put it into the core product you're welcome to have a build system on your own and add it in later and that's how we ended up doing these things so so we did develop a system for but of course it's awkward and you're always rubbing against what should go where and it turned out even in the very early days that the open source mozilla versions of the product got a lot more testing than the netscape versions and we also tried to build in some quality mechanisms that again seem normal today so you can't you get hired as an employee you don't automatically get the right to check your code into the tree right someone has to you know look at your qualifications and say and mozilla and the way we work we understand your coding level we understand you know how we work we understand if something goes wrong with your code you know where you need to be all of these things and that's a very difficult setting for a hiring manager to say well i've just hired you and you can't be as efficient as you think you can because there's yet 
another vetting process by this other group so i understand you know why it was difficult but on the other hand as an open source project if you can't control the quality of the code and you don't have any say in it you're not very real aol client fortunes declined and the netscape browser declined and declined precipitously after this netscape 6 and so aol was interested in laying off people uh and in one of the big layoffs i was included and that was in 2001 and that was seen as a power struggle as well because by that time the fights back and forth about what we were building and who was making decisions were pretty well known within the engineering organization it's hard for them not to be i mean it's true anywhere right but in some cases the fights and the bugs are all very public as well and so i was laid off or fired depending on your you know how you want to describe it and as it turned out it wasn't really possible to take my place and i continued as a volunteer and the netscape engineering organization was very clear whose leadership they were most interested in following and so we had to work out a way in which the management at netscape aol that remained and i worked out a you know sort of a way to work together enough so that they could ship the product they wanted to ship but that i continued to lead the project and i think that was a surprise um not not so much to me i think and so i worked as a volunteer on the mozilla project you know for a number of years including the years in which we shipped our first product which we called mozilla and which we were very proud of technically and for its day many people were surprised at how good it was but it didn't have a good user experience but eventually i ended up working on another open source project with mitch kapor for a while and that was lucky he had started this open project and reached out to talk to brendan and i about it to learn something and the day 
that he was scheduled to come down and talk with us was the day that i was laid off and so it was an odd time yeah that was chandler i was preaching on there but oh sad so i worked on that for a while and mitch was always a supporter of mozilla even before firefox when he didn't like the product but he recognized that we were an important part of the ecosystem and that went on until 2003 when aol decided to stop investing in the client almost completely and fortunately they knew that just killing it wasn't good they knew enough about mozilla and the name and the brand to think it would be good to do something with it and so eventually i ended up working with them and some of the people there knew mitch as well and i was working with mitch and they knew him and so we all you know spent a chunk of time trying to figure out what was possible and my partner brendan was still at netscape very eager to make a move and many of the key people we wanted were also desperate to keep working on mozilla and so we ended up getting a two million dollar seed money from aol and mitch was helpful with that and a few other things trademark name mozilla and the four giant servers which were so important to us at the time and which had taken us almost 18 months to get through the purchase cycle at aol and so uh i think we still have those boxes sitting around because they were so important to us at the time and so in 2003 the mozilla foundation was formed and mitch was the first chairman because he had done so much and had so much to offer and brian behlendorf and uh chris blizzard who worked on mozilla forever and brendan and i were on the board and aol understood that there were three or four people that would be leaving when they did uh some kind of reorg when they closed down the client group and that those people would be coming to join us and of all the people that we thought would be most helpful at mozilla they all came that was about nine or ten 
so we were and mitch continued to support me and uh one other person in mozilla part-time so we were maybe 10 or 11 people and i and brendan and i had always felt that we needed employees it was exciting it's a little scary because we knew that two million dollars wouldn't go that far and that we had a lot of work to do to make ourselves real we were still 18 months away from shipping 15 months away from shipping firefox so that part was scary but it was immensely exciting one of the friends of mozilla showed up and they had a big lease and a little bit of space they could sublease to us so we found a funky little room way in the back not even a sink very odd but but we were really happy with it and we worked away on building our product at that time we made what was a pretty fundamental change that that was critical and we decided unambiguously to be a consumer product that seems obvious but when you're a bunch of developers it's not that easy and it means that you have to strip out a lot of the things that are clunky for a general consumer you really have to be determined that you're building for a general consumer not yourself and i think many open source projects probably don't want to do that and shouldn't try if you don't want to because you have to keep at it so we started doing that out of the blue some visual designers appeared from prince edward island in canada and so we started to get nice looking logos which we'd never had before and visual elements and then we tried to figure out what to do with the start page because we knew that mozilla development tools weren't the right answer and we just took us forever and we looked at all sorts of things and finally decided the one thing we knew that everybody did was search you know we'd have people who say well it's obvious you should have the bbc at least in english language versions it's obvious and then you know our engineering guy would look up and say well it's not obvious to my 17 year old daughter that 
that's what she wants it's not obvious to my son either so but everybody used search so we thought we have the search box maybe we should put that on the start page too make it easier maybe it be more obvious to people search so we went to talk with the search providers we had a very fruitful discussion with google i think they also saw the value of having a mozilla browser in the world they're very explicit about that and i believe them and so they're also open to doing a a business arrangement and so often people ask how mozilla gets paid and would like most businesses on the web search search and ads you do a search you know you'll see ads if you follow those ads revenue is generated you know it goes through the system and we get a little piece of it as do other browser makers so it's very similar to the business model on the web and we did something that i believe had never been done before which was to make sure that google and yahoo were right there next to each other like you don't see them next to each other you have to click on the arrow but they're there and today people laugh at me when when i say that because you only see one but but i negotiated that and that was an absolute utter i will walk away from the deal term to everyone because i don't think i'd ever seen it before and i wasn't you know if you get on a plane and you want a diet coke but the plane only has diet pepsi because that's their deal you're angry and if you want to diet pepsi and you want to coke and i i used that example i said i'm not going to have firefox users angry at us because they wanted one or the other you know you have to agree you're both going to be in the product but it also is your values right it is yes your values front and center yes because the choice is there we can do it and to not do it would be to take away a choice that somebody really wants and it matters to someone so so uh we we have that choice in in the search box but we still didn't really know what to 
expect so we shipped firefox and our sense that we were onto something turned out to be far more true even than we knew we'd seen a rise in interest from 0.8 and 0.9 that was pretty noticeable but once we hit the release version it just exploded and part of that was it was the perfect product this time we had a product at the right time in the market the internet capabilities had grown enough people could actually download a browser easily and more people had bandwidth and ability to do so and a comfort level to try it so that was good we had a beautiful product and it was an important product and the alternative was horrendous and dangerous and awful and so all of that combined to create this giant excitement and so firefox market share climbed by you know a percentage point immediately and started moving in these huge numbers so it was a completely viral storm with nothing driving it other than you know the product and the need itself and so that took everyone by surprise you know we were hoping to be able to generate a few million dollars to support ourselves for the next year you know we had 10 or 11 employees you start to add that up so we were hoping to make enough money to do that and it turned out you know we generated that amount of money before the end of the year the six weeks or eight weeks or whatever it was before the end of the year and so things actually got even more stressful and more hectic at that point because now you've kind of got the proverbial tiger by the tail we're still 12 people right so that immediate need to grow and the sense of pressure and stress combined with elation and success is probably you know i don't know maybe it's like having a child right but it's that combination doesn't immediately get easily to be anonymous purpose yes yes that's for sure we certainly see that and so by the end of firefox shipped in 2004 just actually we just passed the eight year anniversary and you know by the end of 2004 it was 
pretty clear that we are a different organization we were we were i would say still struggling with a different set of problems but we had changed the nature of the problems really quite fundamentally and you know we've done so in a way that that our users loved by 2005 we were in a really different world where we began to be able to actually influence others and that's always been the goal i mean market share is nice and it's nice when people love your product market share is a validation that you produce the right thing so that's awesome on its own but the more important i'm maybe not more important equally important goal is to be able to influence not only ourselves but others in the industry so that we start to see more of the things that we care about [Music] so that was a series of action-packed videos uh brendan and mitchell have done a great deal and if you look at the current html5 javascript really some of the work that brendan and mitchell have done have created this new world that we live in right now with these highly interactive applications javascript ajax it can really all trace its moments back to its creational moments back to these folks who a created the clever technology and b fought to keep it from becoming too proprietary and so a lot of people gave up on say their own personal benefit to to believe in a cause that was the greater good of society through open source now the thing to also think about just kind of in the back your mind is if it wasn't for microsoft would netscape have been successful in its original strategy to make the web and both servers and the clients make them proprietary they were going to sell web browsers they were selling web browser from 70 to 100 you used to be able to go to a store and buy a web browser your your computer didn't come with a browser as a matter of fact the first time that microsoft put the browser in with their operating system they got sued for it but by microsoft trying to catch up they had to 
give their browser away free which made it impossible for netscape to charge for the browser and so in some ways not that microsoft was trying to make the world safe but they basically put hundreds of millions of dollars into blunting netscape's strategy and making sure that well they didn't make it but the net result was the creation of firefox and mozilla and the world wide web consortium and a much more fair marketplace so the world wide web consortium it was created in uh october of 1994 cern made a conscious decision even though it knew that the web had been created there that the sort of future curation of the web and what the web meant was not something that a physics lab should take on and so tim berners-lee went to mit mit welcomed him with open arms they funded him they helped him create the world wide web consortium and have done a wonderful job in leading the charge in defining the standards that define what html is what css is etcetera now in october 1994 there was no way to know if this was going to be successful again it was somewhat of a reaction to netscape who had this sort of like you know bunker mentality that netscape was going to take it all over and own it and push all these open people away tim and mitchell and joseph and all these people did not want that to happen they felt it should be free for everybody and open and many browsers and many servers and everything should interoperate and standards were the key to that and the world wide web consortium was going to be the vehicle through which that happened but they didn't have any power compared to netscape and netscape had all the money but ibm and microsoft were not particularly fans of netscape either and so they early on joined the world wide web consortium and gave it enough credibility to eventually sort of take over uh in terms of being very respected and the standards produced by the 
world wide web consortium are high quality and very respected and they're the guiding shining light to keeping the web safe for the rest of us and let proprietary companies do their proprietary thing but the world wide web consortium is a real important part of today's web and so we finally come to the point where you can assume the web where the web is just there there was a bunch of browsers there was a bunch of servers there were some open source stuff and non-open source stuff and um i want to briefly introduce you to tim berners-lee this footage was taken by richard wiggins not me um caught him at a conference and we only got about a minute of his video last month i was off in boston for the fourth annual world wide web conference and i had the opportunity to talk to the inventor of the world wide web tim berners-lee and i asked him a little bit about where we're going with this web stuff and we have just a little bit of footage from tim telling us where that's going to be you didn't see nothing yet but you wait until people assume the web when the web becomes something you can assume when the web or the infrastructure is all laid down or you have an information space then it will be time for the next revolution and who knows what that will be it's hard to imagine a greater revolution than what we've seen i mean i go to the smallest places and everybody's putting up a web page right the web could only take place the revolution the web revolution could only take place because the internet which had been a quieter smaller revolution but that had been an internet itself quietly being deployed throughout the world had happened because the internet was something people would assume the web revolution could take place when the web is something you can assume then maybe a cultural revolution a nicer cultural revolution any cultural revolution perhaps or maybe we'll find ways of doing things better than we hadn't imagined pretty neat rich 
one of the things that i liked about it was his idea that the next step is something completely different a cultural revolution as compared to just a little more technology i think you can see that tim berners-lee is a real bright guy with a real strong vision of the future and a real deep commitment to the web and the internet being an agent of change in the world and i think that it's going to continue to be that unless proprietary companies can get an upper hand and so it's really incumbent on all of us to protect the freedom of the internet and believe me there are organizations that think they should run the internet and the web they never have stopped so we're at the point where the web is all here it's quite amazing and before we move on i want to characterize just a little bit about the people you've met so far and talk about some of the common things that they have between them and in a way i've got the book the wisdom of crowds here one of my favorite books each one of these people when you listen to them talk they would quite often talk about all the other people doing all the other work and they rarely consider themselves in any way heroes they rarely consider themselves as the great innovators they always see themselves as part of a collective curiosity and there were many things that i haven't told you about that were curious explorations that helped inform it and this was really research and you know until sort of the late 90s it was just a question of how to do it and at some point after the late 90s it switched to what to do with it and at that point we have a whole different group of people that start really entering the world and one of the earlier ones who has benefited greatly from it is jeff bezos and he's the founder of amazon.com and if you listen to jeff you will hear that he talks very differently he
comes at it from a business perspective he's not necessarily trying to make the world a worse place he's trying to make the world a better place the fact that you can get you know all these books like the next morning that's kind of cool so what amazon does is really wonderful it's more efficient for society it's economically beneficial but he has a profit in mind he is saying this web exists browsers exist protocols exist how can i take best advantage of it richard wiggins caught up with jeff in 1997 so just look at how smart this guy was in 1997 everyone else figured this out much later but jeff bezos is a really really bright person so here's jeff bezos hi there who are you i'm jeff bezos and what is your claim to fame i'm the founder of amazon.com where did you get an idea for amazon.com well three years ago i was in new york city working for a quantitative hedge fund when i came across the startling statistic that web usage was growing at 2300 percent a year so i decided i would try and find a business plan that made sense in the context of that growth and i picked books as the first best product to sell online after making a list of like 20 different products that you might be able to sell and books were great as the first best because books are incredibly unusual in one respect and that is that there are more items in the book category than there are items in any other category by far music is number two there are about 200 000 active music cds at any given time but in the book space there are more than three million different books worldwide active and in print at any given time across all languages with more than one and a half million in english alone so when you have that many items you can literally build a store online that couldn't exist any other way and that's important right now because the web is still an infant technology basically right now if you can do things using a more traditional method you probably should do them using the more traditional
method what kind of inventory do you keep we inventory the best selling books at any given time we're inventorying in our own warehouse only a couple of thousand titles and then we do almost in time inventory for another four hundred thousand titles or so we order those electronically from a network of wholesalers and distributors we order those today they're on our loading dock the next morning then for another 1.1 million titles we get those directly from 20 000 different publishers and those can take a couple of weeks to get and then there are a million out of print books in our catalog we have a catalog of two and a half million books all together those million out of print books some of them we can get and some of them we can't but we find them if we can and then we ship them to our customers we do kind of a search on those what's almost in time inventory almost in time inventory is the phrase we use to describe a whole selection of books that we offer it's basically the things that are you know below the 2000th best-selling book up to the 400 000th best-selling book those are titles that we can get from a network of more than a dozen different wholesalers so if a customer orders a book from us today we order that book from our wholesalers today and that book shows up on our loading dock the next morning and then we can ship it to the customer they say one of the toughest things to do on the internet is to capture mind share what was your secret how did you do that yeah i agree with you that you know capturing mind share on the internet is extremely difficult even more generally it's the late 20th century not just the internet you know attention is the scarce commodity of the late 20th century and one of the ways that you can capture it and it's the way that we did it was by doing something new and innovative for the first time that actually has real value for the customer
that's a hard thing to do but if you do that then newspapers will write about what you're doing customers will tell other customers and you'll get a huge word of mouth fan out and that can really drive and accelerate businesses and that's what happened with us in the first year of opening amazon.com to the public we didn't do any paid advertising and all of our growth was fueled by word of mouth and media exposure i saw little ads at the bottom of a column of the new york times that was our very first advertising we don't do that anymore but at the very very beginning we did little tiny ads at the bottom of the front page of the new york times i thought that was very clever it's sort of using a url as a macro because i read it and it expands we're a bookstore click here right that's a great way to think of it and it worked very well apparently i don't know you know the problem with that kind of advertising is it's extremely difficult to track put a different url for every ad that's the problem you want people to start to learn your url so you don't want to actually use a different one one of the great things about online ads we do advertising today on maybe 40 different websites we do banner ads and that advertising is very easy to track in terms of knowing how effective it is so we know for each piece of creative in each venue not only how many click-throughs we get but how many sell-throughs we get how many dollars of revenue it generates per ad dollar spent on that creative in that venue and that is sort of a marketer's nirvana in a certain sense well it's an exciting place to be on the web right now oh it absolutely is i mean it's just incredible what's really incredible about this is that this is day one this is the very beginning this is the kitty hawk stage of electronic commerce we're moving forward in so many different areas and lots of different companies are as well in the
late 20th century it's just a great time to be alive you know we're going to find out that i think a millennium from now people are going to look back and say wow the late 20th century was really a great time to be alive on this planet so if you're a stockholder of amazon i think you're in good shape because he's a bright fella and he's always looking for how to really do the right thing and you know give us products that we really like now in this we sort of evolve the modern internet and much like jeff said that investment was growing at 3000 percent a year or something like that there was this mania in 96 97 98 where people just bought and bought and bought and the notion that communications was going to be so valuable led to in a sense an overbuilding of the fiber optic plant we have a massive amount of fiber optic in the ground right now and our ability to send more data through the existing fibers by just upgrading the equipment at the two ends that's growing very rapidly as well and so we're in this really weird world now where a fiber optic between ann arbor and chicago 180 miles is not that much more expensive than a piece of copper wire that went 30 miles in 1960 and so distance is less and less important in the internet it's why long distance phone calls don't cost any money anymore because long distance and local are not that different because there's so much fiber in the ground but just one thing i want you to think about and wonder about is will we run out we're living off of the overbuilding that happened in the late 90s and they put so much in and they've been able to upgrade it so effectively but we might run out and where will the money come from to dig the next set of trenches to put the next round of fiber in we'll have to see so if we take a quick look this is a graph of the growth of the servers on the world wide web and if you take a look we start in 1990 with one web server december 1990 with tim berners-lee having one
web server and then it sort of doesn't grow and then you sort of see this take off in the last two years of the 1990s where you go up to about you know 20 to 30 million from i don't know in 1996 it's less than a quarter of a million hosts then you see kind of the boom and the crash where it wasn't growing and then you sort of see another new growth which is really when everybody had to be on the internet every little store started being on the internet everyone had to have a web server this was kind of the big ones coming on and this was sort of like the flame out of the exuberant growth of the late 90s and this exuberant growth gets so exuberant that it really changes the stock market in the united states and the rest of the world to the point where you know from the early 90s to 2000 was the longest period of expansion and there's a lot of factors for this but one of the factors is as we began to use communication and computation to make commerce more efficient that triggered a great deal of growth it also triggered over speculation right and so this part here the crash the dot-com crash was a bit of the over speculation working its way out probably if they hadn't overspeculated it would go like this instead but it can't be like that you go crazy crazy crazy crazy crazy oops and then we sort of had a crappy economy for a while after that but again the web the internet and all the technology that i just talked about really put a stamp on the entire world's economy in the 1990s so i could go on and on and on and on about this stuff open source is a big part of this i will put links up for these videos that i've got these are three open source luminaries that i've interviewed richard stallman is from the free software foundation and gnu which is how many of the unix utilities are written the compilers are
written this way he is a tremendous advocate for open source software also we meet brian behlendorf from the apache foundation and the apache foundation was formed based on the web server that was created at ncsa which became the apache web server so that lineage goes all the way back to joseph hardin and the team at ncsa and then rasmus lerdorf is the inventor of php one of the many very popular web programming languages for building websites and so i'll put the links up for these and you can take a look at them [Music] we got our start in the early days of the web as a group of disaffected webmasters who were using a piece of freely available web software but i had difficulty with it we were fixing bugs we were sharing these bug fixes with each other like baseball trading cards if you will these patches as they're called and one day we discovered the group that put out the web server that we were using basically folded when all their developers left to go join a brand new company called netscape so a bunch of us decided that hey we're dependent upon this software we don't want to become full-time web server developers but we want to be able to use this thing that we've had for free and be able to improve it and all that kind of stuff we looked at the license of the code and the license said here's the software do whatever you want with it don't blame us when it breaks right and we said hey that's a pretty good bargain why don't we pass the same bargain on to the next group of people right so we formed a mailing list right and this was mostly again webmasters and people working at some early internet service providers or website design companies or places like amazon or the internet movie database and we combined our patches together and decided to call it a patchy server for that reason and it went forward and really the model of how we worked was based upon kind of us as a group as peers
proposing ideas you know vetting each other's ideas and patches and fixing bugs you know as a group as a team none of us had met in person well some of us had met but as a group we didn't meet in person until 1998 really three years after we got our start and long after by the way we'd become the most predominant web server product on the planet and yet at this time still no money not a dime direct to us from this piece of open source software but plenty of us you know made our living off of building things on top of this piece and that's really the story i think of kind of successful open source projects writ large which is people working together on common technologies to solve common problems so they can go off and make money in other places or so they can have fun they can try new ideas they can you know be experimental right and that's really the same story of apache and of linux and all these other open source projects it turns out to be not that hard to be able to work together when people have the same common goal which is let's build a product that does all this great stuff one thing that we did do that made it easy to make some of these decisions was to have a very modular api which made it easy for us to be able to say hey if you want that special cool feature do it as a separate thing and make it successful and we'll decide whether to bring this into the product once it's become successful or not right another key thing that plays into this that is true of all open source projects is that an open source license like we had on ours that linux has etc carries with it something called the right to fork which means that if i were to go all you know colonel kurtz on the project and started saying we're going to go here you know and no one else wanted to follow well all of those other people could decide to pick up the code and go start a different project somewhere else you know if they couldn't kick me out
which is probably what they would have tried to do first right this right to fork you know means that you don't have to have any tolerance for dictators you don't have to deal with people who make bad technical decisions you know you can take that future into your hands and if you find a group of other people who agree with you you can go on and create a new project around it so i think that rule that right to fork limits the kind of excesses that we see whenever we start to talk about how do groups make decisions and conflict arises how do you deal with that conflict and it means that your style of leadership isn't so much one of control and plotting you know moves ahead of time but instead one of being able to get people on your side convince them that you're going to value their efforts value the contributions that they make [Music] and up next we're going to start digging into how this all works rather than how it came to be and so thank you for your time so far and i look forward to seeing you at the next lecture [Music] oh oh hi sorry i was in the forums i guess it's time for this week's video so this week we're moving from history to technology and in a sense we're gonna tell the same story again you know we'll sort of start at the beginning but we're going to start from the bottom and work upwards so we're kind of talking about the same things but just from a more technical perspective don't worry you'll be fine there's no math i promise there's no programming don't worry about that so of course i appreciate ieee computer magazine who i write for letting me use the articles that are associated with some of the videos as well as the folks from open michigan who have helped me so i'm gonna say no hang on got a phone call i'll be right back okay well that's been sent
sorry for that interruption again i thank everybody for the use of the copyrighted materials part of my hidden agenda in this class is that you will be able to look at this xkcd comic and you'll understand the humor so i'm going to stop for a moment and let you look at it and see if you get the humor okay so maybe you did maybe you didn't but hopefully by the time we're done you'll have a better chance of getting the humor so if you recall for most academics throughout the 60s 70s and 80s the best they had access to was this store and forward networking where you would send a message and it would go into a computer and it might sit for some time then it would find its way across on a network it would sit for a little while longer and hop again and there are these multiple hops we call these hops and each of these would be a couple hundred miles per hop and to go from say michigan to stanford there might be 15 or 20 hops hops were as much defined by geography since we were all trying to optimize the cost of these long distance connections so the thing that really characterized store and forward networking was that there wasn't a lot of sharing one message was being sent across one of these links at a time and all the other messages just kind of waited in line now the innovation that happened in the research networks in the 60s through the 80s ultimately embodied in the arpanet was the fact that they were going to be packet networks first how do we share one link so that a long message doesn't clog it up and how do we deal with outages more dynamically with the idea that eventually you could send a packet across the country and back into a computer and back out in maybe half of a second and that was seen as something to do rather than somewhere between 10 minutes and a day or two so how to do this efficiently was a research project and it was a research project that lasted 20 or more years and the one
of the neatest things about this is they were able to throw it away and rewrite it a bunch of times so that they could put something in production see its flaws see what was good about it and then they had the money from darpa to throw it away and rewrite it now by the late 1970s it had you know 100-ish computers on it here's a picture and these are not the schools these are the computers this is listing all of the computers that are there and so the innovation that they spent 20 years perfecting was the notion of packet switching so how can we move data simultaneously across the connection and the essential brilliant idea is break the message into packets and again to sort of review here i have a message and i have postcards that can only handle 10 characters and i'm going to send this hello there have a nice day message to daphne in california and so i have three 10 character postcards i basically put a from address and a to address on each postcard and a sequence number on each postcard one two and three and i put ten characters on each and then i stick them in my mailbox i just put them in put the sign up and then i wait now let's just say that the post office person comes and picks this first one up and it goes to chicago from michigan and then it goes to omaha goes to denver and then the second one kind of falls on the floor and it ends up going to charlotte north carolina and then they're like why is this here and then they send it to atlanta to get checked and this one goes and number three ends up going to st louis and then tulsa and then colorado and this one gets routed back to charlotte because they weren't sure why they sent it to them in the first place and then they send it to memphis tennessee and then this one makes it to california and then this other one sort of goes to phoenix and this one finally gets to dallas texas and this other one
goes finally and then this third one goes to tucson arizona and then las vegas nevada and finally makes it right so these poor little postcards that i put in each have a different journey right it's a little adventure for each of the postcards and so i don't see how this works on my end i put in three postcards i numbered them one two three but on the far end in california daphne opens her mailbox and has scribbling all over her mailbox oh the shame no actually get rid of the scribbling so daphne opens up her mailbox and she doesn't know how these packets got there i mean postcards how they got there out comes the first one she goes oh looks like i got a message from chuck but i've only got part of it then out comes the second one shortly thereafter and then after a long period of time finally the third one comes out but it's actually the second one and now she understands that she's got the entire message and she reassembles it and away we go and so this is the basic notion of how all those postcards can share the infrastructure they can all be in flight at the same time they can take different paths you don't have to connect them all together like trains basically and this led to a shared network infrastructure and so the computers that were in a store and forward network went from sort of big powerful computers with disk drives to really tiny computers with a single purpose of forwarding packets rather than long-term storage of messages and so the store and forward network had long-term storage of messages in the routers i mean not in the routers in the store and forward the old one long-term storage was somewhere in the network but this was only short-term storage and so when a packet comes out it simply has to find its way through a series of hops it's still hops and it's still connections but basically it has to find its way across the internet and then we would take whole
campuses like university of michigan say or stanford for example and then we'd have sort of various kinds of local networks on those campuses and computers on the campuses and servers and various things and they would route all their packets all their data out to the internet and then the packets would find their way across the internet and then we might have a home connection as well that might connect and so the shared network infrastructure focuses only on packets not reliability or anything else so this notion of hops didn't go away as a matter of fact i don't know how many hops it took to get from ann arbor to palo alto using bitnet i'm going to guess it might have been 16 or 20 and these days it takes 16 or 20 hops the difference is in the internet the 16 or 20 hops happen in a hundredth of a second or a tenth of a second so the notion of hops and the notion of intermediate computers is still present in the internet and the tcp/ip networks that we use today and so your message sort of leaves the host that you're on hops to the first router and then hops to however many routers this is more of a dot dot dot thing and then finally hops its way out to the far host and so this is sort of your computer and maybe stanford's web server and then there's this series of routers in the middle that it hops through okay and so to get data between here and here you've got to solve a lot of problems right all kinds of problems and so in order to simplify the solution or break the solution into simpler more manageable parts they came up with a layered network model now this is a cartoon and computer people love drawing cartoons and saying this is our architecture this is our framework this is our approach sometimes they're helpful sometimes they're not helpful sometimes the cartoon is just kind of a cartoon but what's usually being communicated in these kinds of pictures is they're taking a big problem and breaking
it down into some set of smaller problems so the whole problem that they've got to solve of getting data reliably across the whole country is that big and if we can break it into four pieces and come up with ways to let these pieces interact and work separately on each one then maybe we'll have a better solution okay so take a problem that's so large and so complex that we might not be able to solve it and break it into four smaller problems that gives us a better chance of solving the four problems now there's a certain art to picking how you break the problem up and there's more than one network model the one that we use in the internet is the tcp/ip or internet protocol suite model and we'll meet some of the people who designed that there was also a model called the seven layer osi model the open systems interconnection model so there's a model out there that has one two three four five six seven layers and they all have names and they all have purposes and definitions this was not very popular and i doubt that it's used in many places because the tcp/ip model is the one that kind of won and so that's the one that we're going to study but that doesn't mean that it has to be the only one but it is the one that has become popular on the internet so once you break the problem down from a big problem into four small problems you have to develop documents about how these layers work together how various computers work together how routers work all these things and so early in the process of the development of the arpanet they created an open process to build these specifications where they would invite engineers from all kinds of companies and universities and who knows what with any expertise to show up at a meeting several times a year i mentioned the ietf meeting when i was talking about the history of the internet and the tim berners-lee bof that only 15 people showed up to that's this meeting it's the
internet engineering task force which is just a bunch of engineers that get in a room get in many rooms and solve a bunch of problems and so the standards that come out of this you can go read them they're all open documents they're very throwback documents they've got this text thing and it's got all this like beautiful uppercase that reminds me of back when printers didn't have lowercase and these are documents that describe how most of the components of the internet work together so the layered architecture that i just described where you have a computer and then it sends data out the back of it and then hop hop hop hop hop through routers then the data arrives at the destination computer and then the destination sends it back it's hop hop hop hop hop coming back and it comes to you we're going to expand on this a little bit so each of these hosts on the two ends is going to expand to this much right here so that's one host that's another host and each of these routers is going to expand into this so this bottom picture is just an expanded version of this upper picture so that we can see how the internet layers work together so when you send a message on your computer this is all software application transport internet link so this is all software and then it comes out the little plug on the back of your computer right so maybe your computer has a little plug like this on the back of it right that's an ethernet plug and here's an ethernet wire and i plug my local area network in right and so this right here is that right there that's kind of the plug it might be wi-fi which means it's kind of like air but if you have a wire like this then it's a little more tangible and there's something on the other end of that wire this is a very short wire but there would be a router on the other end of that wire the router takes the data off the wire and then forwards it just like an intermediate post office would do on to the next link and on to the next link
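
To make the picture above a little more concrete, here is a minimal sketch in Python of data leaving a host, being wrapped for one link, and being re-wrapped by each router for the next link. It is only an illustration under simplifying assumptions, not how any real network stack is written: the class names `Packet` and `Frame`, the function `send_across_path`, and the host and router names are all hypothetical, and the transport and application layers are treated as opaque bytes.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Internet-layer wrapper (hypothetical): end-to-end source and destination."""
    src_host: str
    dst_host: str
    payload: bytes  # transport + application data, opaque to the routers


@dataclass
class Frame:
    """Link-layer wrapper (hypothetical): only meaningful on a single hop."""
    src_link: str
    dst_link: str
    payload: Packet


def send_across_path(data: bytes, path: list[str]) -> tuple[bytes, list[tuple[str, str]]]:
    """Carry data from path[0] to path[-1], building a fresh frame for every link."""
    packet = Packet(src_host=path[0], dst_host=path[-1], payload=data)
    hops = []
    for here, there in zip(path, path[1:]):
        # The sender on this link wraps the unchanged packet in a new frame.
        frame = Frame(src_link=here, dst_link=there, payload=packet)
        hops.append((frame.src_link, frame.dst_link))
        # The receiver on this link strips the frame and keeps the packet.
        packet = frame.payload
    return packet.payload, hops


if __name__ == "__main__":
    # Hypothetical path: two routers between the two hosts.
    path = ["my-laptop", "router-1", "router-2", "stanford-server"]
    data, hops = send_across_path(b"hello", path)
    print(data, hops)
```

The point of the sketch is that the packet's end-to-end addresses never change, while the link-level frame is thrown away and rebuilt at every hop, which is why the routers only need the bottom two layers.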
and this would be multiple hops in here they're only showing two hops and then finally it's on the last link going to that computer at stanford and then it goes up through these four layers of software and then goes and does whatever it's going to do now when they send you the response back it kind of goes the other way and back up and back to you so if this was you in a web browser you know and then here's me and here's a slide i've got right this web browser and you're moving your cursor in the web browser sending stuff back and forth to the coursera servers kind of here and that's what's happening right so as we look at this picture over here this left column is the host you're coming from the right column is the host you're going to and the routers are these things in the middle okay so let's start looking at all four layers we're going to start at the bottom at the link layer and then we're going to move up so if you recall the basic reason why we even have a layered architecture at all is to simplify the problem so the idea of the link layer is it only worries about getting the data across one hop right it only worries about getting the data across one piece of wire like what voltage goes on these little wires and how we send the stuff and if more than one computer is using the same wire how do we share it's a complex enough problem but we don't worry about the whole world we worry about this one link layer and so the link layer is sort of like the connection out of one computer and into another computer and this might be fiber optic it might be you know 40 miles or it might be 40 feet for all we know but it's one link okay and then a router pulls it off that link and then forwards it onto another link if you go to the post office you could think of the person who picks your mail up from your house with the thing on their shoulder that's one link then they put it in a
truck which takes it to a place that puts it in a semi truck and then it takes it to another place that puts it on a train and the semi truck and the train and the postman or post woman are these little link layers each person or semi truck is not taking it all the way they're just getting it a little farther so that's the link layer so the link layer doesn't worry about the rest of this stuff the stuff that is defined on this wire really doesn't even care if there's a world wide web or anything its job is to get data across one foot that's what its job is is it up is it down how do we share it it doesn't care about the rest it's got a very narrow view so what it means is we can zoom in on this problem right we can zoom in and focus and ignore everything else so the link layer basically asks questions like you know i've got some data inside the computer and i want to send it out how do i encapsulate it what if this is shared how do i deal with that and common link technologies that we see are like ethernet or wi-fi or cable modem dsl satellite or optical right these are all link layers of one form or another so going back to this ethernet link layer which is one of our favorites because it's kind of ubiquitous other than wireless ethernet is probably the most ubiquitous link layer so when the manufacturer builds an ethernet or a wireless adapter they actually build into it which is probably right here i don't know it could be right here they build into it a serial number so they mark this as having a serial number and whether on windows or mac you can work your way down to find the serial number for your actual manufactured piece of equipment so there's one on the mac that's an example one on the mac and that's an example one on the pc they tend to be six two-digit hexadecimal numbers concatenated together often with colons in this case in windows it shows dashes and so this is a raspberry pi and all the raspberry pi's are going to actually have
similar prefixes and then serial numbers within that prefix and so the raspberry pi's get manufactured they come out of the manufacturing line with this sequence number just kind of going up up up up up and um then they get shipped to all corners of the world and so these numbers are not the numbers for this to get across the world all the number all these serial numbers at the link layer are good for is to get across one connection in case that connection is shared and if you were to plug this into a hub and you're plugging other computers into that same hub well then you're sharing that connection and this computer might see the traffic for that other computer and so we use the physical addresses so that this computer knows which of the packets belong to it on the piece of wire so wired ethernet or wi-fi certainly if you're sitting in a room with a bunch of wi-fi computers you're sharing the air so you have to come up with rules how to share this how to share nicely and how to behave so the way it works is you could and this ethernet could easily have a bunch of other computers hooked to it right because it's really shared and so they're all sitting here they're all talking simultaneously you know but we have a we have a pair of computers that want to talk so it's not just one it's got a bunch of folks sitting here just draw a bunch of them bunches of folks they're all talking or all potentially talking and we got to figure out how in the shared medium like this is think of this as a hub and they're all connected into this hub including the two we want to talk they're all connected in the hub including the two that we want to talk and how can these two make sure that the data goes is this data that's sent with the intention of going to this computer this router how can it make sure that it gets there well if it knows the address of the sending and receiving unit it just encodes that in the packet and that way it sort of goes by all these folks let me change 
colors here the packet we send goes by all these folks and even if they all heard every one of the packets they all know that it doesn't belong to them because they all have a different number and this is the only one that responds to it so that allows them to share the wireless or the wire and have it all work out for them so that's the idea of the link layer now again remember this might only be 50 feet or 50 yards or 50 meters we'll make it meters it's only 50 meters but you still have to share the network in that 50 meters and then the router takes it off of that and forwards it on a different link and that's how we get across the country but that's the next layer up and let's not worry about that for now so the idea is think about one link whether it's fiber optic or cable or wireless or wi-fi and so what are the kind of problems that you have to solve on this link layer well so one of the things that's cool about many link layers like wireless and ethernet is they can be shared and it makes it really easy to just plug new computers in so you just sort of like here's a hub and you just plug another computer in and it's on the network but they have to come up with a way to avoid the chaos when they're sharing and so the way ethernet does this is with a technique called carrier sense multiple access with collision detection and as you'll see a lot of the things that we do on the internet have a lot to do with courtesy meaning that we're just nice and if everybody's nice it all works out it's kind of like driving in a car just everyone can't run through the stop sign at the same time somebody's got to stop got a green light your turn my turn whatever this probably wouldn't work so well with cars because sometimes you do have a collision of packets you just have to wait packet collisions don't crush your car they just slow your data down so here we go this is basically carrier sense multiple access with collision detection the first
thing you do if you want to send some data say we got some data inside a little computer we're going to send it out knowing that there might be other data going by the first thing we do is we listen we listen to what's on there if it's not silent we just wait until it goes silent and then we start sending so if someone's already using it why crash into their data and crush their data because it's shared there's only one you can't both be sending at the same time so first you listen once it's silent you begin transmitting data and then what you do is you also listen to your own data and if your own data is coming back then it's sounding pretty good and then if there is a collision because there's a chance that two computers will want to send at the same time and they'll collide then what they have to do is back off and they have a sophisticated random number calculation so that they back off not always the same amount each computer backs off a different amount and they make it so that it's fair so one computer is not always backing off the most so it's very fair so when you detect a collision you re-transmit after a random back off first you avoid collisions but in the rare case that you have a collision then you do this and so um i want to introduce you to the person who invented ethernet this is robert metcalfe and he was working at xerox palo alto research center parc everyone calls it parc and they were building what most consider the first personal computer the alto computers they were connecting line printers or fast printers to these xerox was building the first laser printers and they needed something faster and so they just started building it and they built this thing called ethernet now it's a little different than the ethernet that we see the first ethernet that they built is a little different you can see a picture of it down here if you ever go to xerox parc and get in you'll go see this little museum
they have it used a single piece of cable that ran along they just ran it down the hallway and they would connect a tap in and then they would use that to send the data so it would go down the hallway and the tap would come out in each office and they were truly sharing the media this is not shared except when you plug it into a hub that's what makes this shared and it was inspired by an earlier wireless network where they were doing not collision detection but re-transmission called aloha and a lot of early network packet network inspiration uh comes from the alola net in uh you come from university of hawaii and because they didn't have anything except wireless but they did have wireless they did a lot of really cool early research on wireless and so uh let's go ahead and meet bob metcalf the inventor one of the inventors of ethernet [Music] so [Music] i was extraordinarily lucky and happened to be at the xerox research xerox palo alto research center when a problem evolved that had never before occurred the problem had never occurred and that was the problem of having a building full of personal computers and i was a networking guy so they turned to me and said network these puppies and we had just finished starting the internet it was then called the arpanet which was packet switching and it was pretty clear we wanted this network to connect to the it wasn't yet called internet thing so it would be packet switched we're pretty sure at the same time we were building uh arguably i don't want to get into that argument the first laser printer which in our case the first one whose name was ears that's a whole other story it was a page per second 500 dots per inch and if you do the math that's about 20 megabits per second so the existing methods of interconnection had a a lot of problems one is they were all home run so all these wires one from every desk would all come to this one place in the building and put a big rat's nest we called it a rat's nest by the way 
two is they generally ran at 300 bits per second or if you really rev them up they go to 14.4 kilobits per second and that wasn't even close to 20 megabits per second so and we wanted to be able to keep the printer busy by sending documents from all these pcs which hadn't been built yet except at parc well they hadn't been built even at parc at that point so we were building the printer and we were building the pcs all at the same time so i um started work on this and there was an effort before mine called signet at parc and it was being done by charles simonyi my friend and uh but he wasn't a networking guy so they sent me in to pick up signet from charles and he was going to go off and do something else and by the way the other thing he did is he wrote a text editor a word processor called bravo which then became microsoft office wow so charles is a billionaire and he's been to the space station twice the international space station so maybe i shouldn't have kicked him off signet maybe i should have done bravo i don't know anyway so i uh immediately decided that signet by the way signet stands for simonyi's infinitely glorious network was in fact infinitely glorious and had much too many moving parts to be a lan by the way the word lan wasn't invented until 1990 and this is still 1973.
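The "do the math" step for the laser printer can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the page size (8.5 by 11 inches) and one bit per dot are not given in the interview, and the function names are made up for this sketch. The 500 dots per inch, the page per second, and the 300 and 14,400 bits per second link speeds are the figures mentioned in the interview.

```python
# Rough bandwidth estimate for a page-per-second, 500 dpi printer.
# ASSUMPTIONS (not from the interview): US letter page, one bit per dot.
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0
DOTS_PER_INCH = 500
PAGES_PER_SECOND = 1


def printer_bits_per_second(width_in, height_in, dpi, pages_per_second):
    """Bits per second needed to keep the printer busy."""
    dots_per_page = (width_in * dpi) * (height_in * dpi)
    return dots_per_page * pages_per_second


def seconds_per_page(bits_per_page, link_bits_per_second):
    """How long one page takes to cross a link of the given speed."""
    return bits_per_page / link_bits_per_second


if __name__ == "__main__":
    needed = printer_bits_per_second(
        PAGE_WIDTH_IN, PAGE_HEIGHT_IN, DOTS_PER_INCH, PAGES_PER_SECOND
    )
    print(f"printer needs about {needed / 1e6:.1f} megabits per second")
    for link in (300, 14_400):
        hours = seconds_per_page(needed, link) / 3600
        print(f"at {link} bits per second one page takes {hours:.2f} hours")
```

Under those assumptions one page is roughly 23 million bits, which lines up with the "about 20 megabits per second" figure, and a 300 bit per second line would need most of a day to deliver a single page.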
so the word lan was way in the future so that's an anachronism anyway signet had too many moving parts for a lan and it was in the course of investigating how to organize this lan that i ran quite by accident into a packet radio network at the university of hawaii it was called the aloha net and what was beautiful about the aloha network is it solved a distributed problem that is how would we share a radio channel back to the mainframe in hawaii at the university if we were just a bunch of terminals scattered among the hawaiian islands and they couldn't really easily talk to each other to get coordinated how would they coordinate their sharing of this inbound radio channel and norm abramson there at the university of hawaii devised this very simple randomized re-transmission procedure where a person would type by the way what they would type was a card image it was 80 columns wide back from the days of card batch processing so you would type in your card image and hit send and then your terminal would send it in toward the mainframe and then would wait a short time to see if there was an acknowledgement returned on the outbound channel and if there was everything was hunky-dory but if there wasn't that probably meant that two terminals had decided to send at the same time so then as many terminals as participated in that collision of transmissions would then randomize and then re-transmit at a random time in the future thereby if they overlapped here they very likely would not overlap again in the future because they would choose different random numbers to count down so that's randomized retransmission multiple access so i since i was trying to avoid this big rat's nest of wires and only wanted to have one wire just one wire not 16 or 32 or whatever the alternatives were just one i wanted a distributed solution to how to share this cable and the aloha network produced that using randomized retransmissions so then i um actually you
know there's sort of two stories here one is a hardware story we've learned about the hardware there's a hardware story and there's a software story and the hardware story is uh not that interesting but it's there so one of the first things i did was to buy a kilometer of cable or maybe a mile i don't know if i'd gone metric yet but i saw a spool of coax cable about this big with the two ends sticking up so i got a pulse generator and i hooked it up to one end and hooked it around back into the oscilloscope and started launching square waves down the cable and watching what came out i thought that would be good preparation for building a network and what came out the other side wasn't a square wave it was sort of sort of lazy rise times and lazy fall times but if you put a a digital gate you could recover the square wave that is you just said use the digital threshold and so out of this gate came a square way so you could recover at the end of a mile of cable this square wave so i kind of knew i kind of had some confidence then that we that various stations connected to this cable could inject their square waves and the other guys could recover them so that hardware wasn't that hard but it um it was a straightforward so the first hardware was kind of relatively straightforward very straightforward and the and the square wave by the way was called manchester u what we do is we take the bit we'd make a packet of bits and then we'd send them one at a time down this cable and we would encode them and the encoding was also simple it was manchester coding meaning for each bit the first half of the bit would be the complement of the bit and the second half of the bit would be the bit so you would have a transition in the middle of each bit cell which is a very simple modulation scheme so as you're sending the packet the cable's wiggling and you can recover the signal at the other end and clock those bits into a shift register then click to clock the shift register into 
the memory and collect the packet that way and that all took just some soldering to make that all happen basically well i don't wanna it got a little bit complicated when you wanted to put 255 of them on a mile of cable you had to be a little bit careful about the impedance of the taps because a square wave coming by a tap might generate reflections that would then interfere but the beauty of manchester encoding was that while you were sending a packet there were constant transitions so if you listened you could tell whether a packet was going by and you didn't have to listen for long you only had to listen for about a bit time and then you which which turned out to be 340 nanoseconds so you could wait that long and tell whether there was a packet going by so one of the first differences between the ethernet and the aloha now there are a lot of differences but one of the first ones was carrier sense in the aloha network you couldn't tell if somebody else was transmitting at the same time as you but on the ethernet you could and the advantage of that was you might as well if you're sending and somebody else is sending at the same time you might as well give up because you've destroyed each other's packets so punt then that recovers that bandwidth that would otherwise be lost to just continuing to transmit a damage packet and the the other thing was this this manchester code meant that the cable was on half the time and off half the time that is the the driver sort of an open what we used to call an open collector driver would either yank the cable up to three or four or five volts i even forget the voltage now but it was under five i'm sure that i had a rule i never won above five volts uh you would yank the cable up and then during the other half the bit when you weren't yanking you let go so so for example in each bit cell you got to look to see if anyone else was transmitting when you had stopped and if there was somebody else then you had a collision so that was 
the second feature so it was really a digital signal it wasn't a modulated signal in any way right no well it was uh best you could send a digital signal over coax on off with the manchester encoding so the manchester encoding is akin to a modulation scheme but it's the simplest one you can think of it's very baseband yeah and it is very baseband but it's a sort of modulation scheme that a computer scientist would come up with as opposed to one of those fancy radio people right so and by the way i took a lot of gas from all those radio people why did you use such a simple scheme you wasted all that bandwidth that cable was capable of carrying hundreds of megabits per second and you only carried 2.94 what a waste of the cable well there was plenty of cable to waste so we've talked about carrier sense and collision detection and the scheme would be that each packet would carry two addresses the address of the destination and the address of the source and each of these would be eight bits and so on the back plane of these little personal computers with wire wrap you would wire wrap in a code between 0 and 255 and that would be the serial number of the machine and then you would read that off the back plane and put it in the packet so each packet had two addresses this is different from the aloha net incidentally which had one address because the channel was only going through two channels so two addresses and we added uh a cyclic redundancy checksum on the end of the packet which we implemented in hardware so you could tell if the packet had been damaged so if there was a collision and the contending stations backed off there would be a hunk of garbage on the cable zipping around but when it was received the checksum wouldn't add up and you just throw that chunk of garbage away the day that i was launching pulses down this spool of
cable i was doing some soldering and i was doing some knife work to get the insulation off the copper and and across the room was a young grad student who was doing something else and he noticed me not being good at this and turns out he was very good at it because he had worked in a television studio and he had worked in cable television so he knew all about skinning wires and coaxes so he came over to help me and i was his name was david boggs and then we started working together and invented ethernet together so he was and he was he was slightly more hardware and i was slightly more software but there was then a third guy who was even more hardware than david who was the picofarad guy the guy who would put those last few passive components on the end just to be sure that that connection to the coaxial cable would be clean for transmissions you wouldn't every time you tapped into it you didn't put a big lump of impedance on the tapping seems like a counter-intuitive thing to me i mean think that i mean everyone would think that a star is the right thing but you were going to tap into that just well one of the problems we decided to solve was the rat's nest problem we did not want a rat's nest and every time we installed a new pc we didn't have to home run a cable back to the rat's nest so we wanted to put one cable down the middle of the corridor and then every time you want to put a pc you just run up and tap into it and we didn't want the network to go out and go down while you were tapping into it because we wanted 24x7 access to the network so there had to be a way to tap into the network without bringing it down so that led to a a device we found in the cable tv industry okay it was called a geralt tap and it was basically a vampire tap you'd you'd drill a little hole in the outer casing of the coax and then you would screw in this tap that would puncture the insulation and go right to the copper and tap in and you notice in that operation you're not 
breaking the copper so the network continues sending you tap in and you're now part of the network so that came from the cable industry it did oh so a guy named david liddell who had done cable tv installations when he was in grad school in toledo uh suggested that we use the gerald tap since it was already being made in volume and worked just fine as far as he was concerned and it allowed us to solve the rat's nest what we perceived to be an important problem the rats nest problem the aloha network ran in the kilobit per second range like 4 800 or i don't remember the numbers but kilobits per second ethernet then started at 2.94 megabit and in those days by the way t1 was 1.544 megabits per second so in 1973 he said it was already twice as fast as t1 of course t1 still around oddly but then we went from 29 2.94 we briefly went to 20 megabits per second inside of xerox and then when we bumped into deck and intel for the standardization process in 802 we decided on 10 we came down from 20 to 10 so the chips would work and then we went from 10 and later i helped found a company called grand junction networks that introduced the 100 meg ethernet and i remember being at my coffee table in palo alto i think it was i forget the year no maybe it was in woodside but we were trying to think of how we would make a faster ethernet and uh there's some math that shows that if you as you go faster the efficiency depends on the diameter of the network and as you go faster and faster the efficiency goes down the diameter network in bits and we're trying to go up a factor of 10 and and then i don't know who it was one of us observed that wait a minute we've been assuming that ethernet is a kilometer in diameter but we're going we're all going to hubs now so that we only need 100 meters not a thousand meters and that was the factor of 10 right there so by changing the collision interval you maintain the same efficiencies uh theoretical efficiencies by just assuming going 100 meters 
instead of a kilometer so that got us to 100 megabits per second and now gigabit 10 gigabit which i guess is the mainstream now and then 100 gigabit is now from beginning to run 100 gigabits per second you can't you can't be a computer scientist and build that kind of digital now you have to actually be an engine a hardware engineer to run at 100 i think at that point you're pretty much a radio person yes but then after 100 gigabits comes terabit so i've already begun giving talks on terabit ethernet [Music] welcome back so i hope you uh enjoyed that now i want to make it real clear that when i give you a 15 minute video of an amazing inventor and computer scientist you don't have to remember every word that that person says okay it's more important to get the gist of it i try to cover the things i really want you to know in my slides and you might want to listen to it more than once or listen my slides and then go back but you're not i don't want you to memorize it i i wanted to give you smart people i want you to hear from the smart people who did all this cool work in their own words but when we hear from their own words they're sometimes pretty technical so just relax and enjoy listening to these people and understand that you're not going to get everything but hopefully you'll come back and you'll get more and more later so the idea was that he really started with this idea of wireless which which was we're going to share one medium and the wireless and the wire the ethernet wire that he built the coaxial cable that he built was like a giant radio except it ran in the ceiling and so the design ended up really simple and that's that's been good for ethernet over the years and later wi-fi is a variation of ethernet um and so it turns out to be uh sometimes it's best to build something simple but that just make it go really fast and make it really cheap he also formed the company 3com which was one of the first manufacturers of pc cards and so back in the day you 
would go buy a pc and you would go buy a 3com card and plug it in the back your computer and so there was a time when san francisco stadium candlestick park was named 3com stadium and our friend bob metcalf was involved in all of that so he's done many things throughout his life and we're honored to have met him so we started with this four-layer architecture that says we're going to break down a big problem of cooperating applications across various kinds of networks we're going to break it down and we just got done talking about the link layer it's literally 20 000 maybe 50 000 engineers have spent the last 20 years figuring out how to make this work because the layered architecture lets them think although it was informed they have twenty thousand fifty thousand i don't know twenty thousand engineers that think about that problem and they ignore the rest of the problems okay they ignore the rest of the problems that's great because they've gotten really good at this one problem now we're going to go like you know i don't know how many engineers let's just make it up 5k engineers think about this next problem the next problem is called oh my color's got to change the inter network layer it is the notion of forwarding each of the postcards with a from in the two address forwarding enough times to get them all the way across the network that's the next problem we're going to solve we're going to stop thinking about the link layer we're just going to assume it works it's just magic that's what a layered architecture gives you you don't worry about the stuff above you you don't worry about the stuff below you you focus like crazy on the stuff that you're focusing on so the link layer only works on one link right it worries about one link and there might be 15 or so of these links but the internet layer worries about all the links and the proper sequence of links to follow to get from um to stanford that's a kind of complex problem we're not going to worry about 
reliability we're just going to worry about if i had a postcard with a from and a to address a packet can i get it there how will i forward it i assume the link layer is perfect so the internet layer is the first end to end layer because if you recall this vertical set of boxes is the starting host computer and this vertical set of boxes is the destination computer so in this little raspberry pi all four of these things application transport ip and the link are all part of this one gadget okay so ip is best effort and it's okay to drop data if things go bad and that's one of its most charming features but what it had to introduce in addition to the ethernet addresses or the media access control mac addresses is an address of the destination now if you remember the mac addresses the addresses for the link layer come from the manufacturer the moment this equipment is manufactured a serial number is burned in but these move all around the world if this runs here at the university of michigan it needs one address to connect to the network one ip address if it runs in stanford it needs a different ip address so we have to be able to change these they're assigned differently so it's the worldwide number that is like your postcard address it's like a phone number right so wherever i'm at you call my phone and this gadget rings well that's actually kind of a bad example because i use the same phone number okay ignore that phones use magic they use crazy magic so in computers they're not as cool as phones you have to change the address everywhere you go we'll talk about how you change the address everywhere you go the ip addresses are based on where the station is connected they do get reorganized once in a great while but not very often and you can even go to various websites on the internet and say ip address lookup is the most common search and it will tell you something about where you're coming from right and it'll kind of be sometimes
really weird because you might be at a starbucks and it might send it all to st louis and it might say oh you're coming from st louis even if you're not so you can look it up and you can look up other addresses as well you can say oh here's an ip address i'm going to look this one up first it starts by looking yours up but you can put in different addresses like in this particular one if i put a different address in here it will actually go look that address up so the ip address format it's four numbers that are separated by dots each of the numbers can be between 0 and 255 and it's just a representation of a 32-bit number back in the day they kept the numbers small because they didn't want to use all the computer memory and so this is an example of an ip address four numbers between 0 and 255 separated by dots okay now the concept the address is broken into two parts there is the network number part which is the prefix and then there is the computer number within the network okay and it's kind of like phone numbers were before cell phones where the area code was where the phone was and then this was within that area code where to find it and even in the older days these were actually geographic too right so these numbers would be neighborhoods or whatever these days it's all electronic so the mapping of a phone number to a geography is less precise and that's why you get a cell phone number and you move to a whole new state you keep your old cell phone number because it's become electronic but in the early days when it was actual relays and switches making the phone numbers work they actually had to do with where the phone was it's actually kind of fascinating how phones work i ought to teach a class on like how phone switches worked in the 1800s actually quite fascinating because they were surprisingly simple that's the interesting part how simple turn of the century well turn of last century phone numbers were so let's get back to ip addresses
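The address format described here, four numbers between 0 and 255 standing for one 32-bit number, split into a network number prefix and a computer number within that network, can be sketched roughly in Python. The function names `parse_ipv4` and `split_address` are made up for this sketch, the sample address is used purely for illustration, and treating the first two numbers (16 bits) as the network number is an assumption, since the split point varies from network to network.

```python
def parse_ipv4(text):
    """Turn 'a.b.c.d' into the single 32-bit number it represents."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four numbers separated by dots")
    value = 0
    for part in parts:
        number = int(part)
        if not 0 <= number <= 255:
            raise ValueError("each number must be between 0 and 255")
        # Each of the four numbers contributes 8 of the 32 bits.
        value = (value << 8) | number
    return value


def split_address(value, prefix_bits=16):
    """Split a 32-bit address into (network number, computer number).

    prefix_bits=16 is an ASSUMPTION for illustration only.
    """
    host_bits = 32 - prefix_bits
    network = value >> host_bits
    host = value & ((1 << host_bits) - 1)
    return network, host


if __name__ == "__main__":
    address = parse_ipv4("141.211.144.188")  # illustrative address
    network, host = split_address(address)
    print("32-bit value:", address)
    print("network number:", network, "computer number:", host)
```

Routers in the middle of the network would only need the first value returned by `split_address`, while the destination campus would use the second to find the actual computer.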
there's four numbers some part of those four numbers is a prefix we call that the network number and when the packet is in the middle of the network it doesn't really look at these numbers it only looks at the prefix so it kind of thinks of all the packets going to a piece of university of michigan as 141.211.*.* and then we assign these numbers within michigan and then this number is assigned to us by the internet authority that assigns numbers so we say hey we need some more and they give us a prefix then within that prefix we get to set up all the other things so this network number which is the prefix of the ip address is the way that a packet is routed as it progresses through the internet and so it's sort of like if i start with my computer here at um and that's my number that's my actual number and here's the stanford computer and that's its number right and i send a packet and i say okay i'm gonna send it from 141.211.144.188 to 67.149 yada yada as soon as the packet enters the network it doesn't throw the data away but it stops thinking about it it only looks at the prefix and so it greatly simplifies what the routers have to do as it goes across the hops in the internet because it only looks at the first part it doesn't have to look at the whole thing and then it starts making decisions as to how to get there says okay i'll hop it over here then it's like in kansas city and says ah what am i going to do here and i'll pick one of those two links and i'll go over here and then as it exits the network it dumps it somehow onto stanford's campus and then it finds its way oh didn't find it to this one it found its way to that one there you go unpredictable computers now so within the stanford campus these numbers mean something and the stanford campus uses those numbers to get the thing to the actual real computer and so it's sort of use this prefix you know it
has a real number but then it uses the prefix in the middle and then the real number reappears at the end so that's called a network number and it greatly simplifies it's complex enough to be in the middle and have to worry about all these things coming from kansas city and beijing and like where these things go to have to look at each one and know where every computer is is crazy so that's why they call it the inter network protocol because it says there's one network there's another network and the only thing i care about is the network number and i am the internetwork protocol i p internet work protocol i'm just getting these things from one network to another network and then it's up to that network to figure out how to get the darn thing to the right computer within it and it's up to stanford to figure out how to get it to the right computer at stanford so these things are called the network number it's the prefix of the ip address now the key thing that it's really a beautiful beautiful design because it because the center of the network is both exceedingly complex in one way that it's so big and so fast but exceedingly simple in the other it just has to move the data from point a to point b and if you think about it's even simpler if we look at it from the perspective of this router the router here i've got circled right it just received a packet it looks at the network number and it actually doesn't even care to some degree where it really belongs it only has one of two choices it's going to go this way it's going to go that way right which one's better and it turns out if you look at it either one would work turns out this one is better but if i went this way it would just take an extra tenth of a second or something right so it turns out the decisions that are made don't even have to be perfect you can make the wrong decision and the network automatically corrects that's part of the goal of the network and so it really simplifies and limits the the the 
need for each router to understand the entire network which makes these routers i draw them small on these diagrams on purpose because we think of them as small and fast and only solving a really tiny problem but doing it really super awesome so routers maintain what we call router tables and they maintain a list of network numbers and the best outbound route for each of the network numbers now the other thing that they do is they pass routes back and forth and this is how they adapt to errors right they get updated dynamically they ask each other for the best place if they see a network number they haven't seen before they ask their neighbors and their neighbors' neighbors and they go oh okay i got a good way to get to 67 149 and so there's all kinds of communication but it's relatively slow and it doesn't have to be perfect and so router tables are what routers have but they're indexed by the network number of the packets not the host of the destination so that's an amazing amazing improvement in the performance and efficiency of the internet core the ip core so it's really quite simple right you basically have a local area network on your campus this might actually be your house too a house is kind of like a campus and you can do all the crazy things you want hundreds of servers and thousands of laptops and you get one address that is the address of your campus it's the network number of your campus to the rest of the world and then all over the world people can send to you and by simply looking at the prefix data makes it to you keeps this really simple and really fast okay so it's beautiful that's what the network number is one area code one network number it's the only thing that has to be kept track of a whole campus can be characterized within the core of the network as basically a single network number now the reality is it's usually a few of these because they get given to you in smaller chunks so you end up with you 
know 20 or 30 of these things for your campus if you're a medium to large campus okay so now i want to talk a little bit about the problem of computers that move around so you have a laptop and you use it at your local coffee shop and then you close the lid and you go home and open it and it works at home and then you close the lid and you go to school and you open it and talk to wireless at school and it works it's like hey dr chuck you just told me that i have to have this address and i can't talk on the internet if i don't have the right address and the packets are routed based on the prefix of this address why is it that i can be three places coffee shop home and school and it seems like everything works well that's because while your computer has an ethernet address that's baked in at the factory most computers are configured when they first open up and connect to a wi-fi to ask instead of having an ip address configured in your computer they send a request out to say hey i'm new here is there anyone who will give me an ip address that i might use for this particular location and if there is an access point say at your starbucks it says sure use this one you're at starbucks that's a good number for starbucks now it turns out that the prefix of this is exactly the prefix that the the world sees all this traffic to starbucks so there's actually kind of a real address okay and it gives you an address okay so you ask when your thing comes up it asks what can i get and then it's told what address to use so using one address while you're at starbucks a different address at home and a different address yet again at school so that's called the dynamic host configuration protocol now it turns out with schools and homes and there's just too many computers to give every computer a real address and so we have these special addresses they're called the non-routable addresses so you'll probably notice if you go to one person's house you will have a ip address of 
192.168.something.something go to a different person's house it'll also be 192.168.something.something then go to your house it's 192.168.something.something and you're like how can that work i seem to have the same address well that's because of a technology called network address translation where each of the home routers actually has a unique and distinct address but you're not seeing it it's giving you a temporary address an address that really can't run at all on the internet it only lives within the house but then as your packets leave the house or leave the starbucks the real address is put in and then when it comes back the real address is taken out and your local address is put in so they're called non-routable addresses because if they ever escaped into the real internet they'd be like oh those are for your house they're not supposed to go very far and i don't even know where they go they go nowhere in the core of the internet but they go properly inside your house so the way this ends up working is if you are at your coffee shop the coffee shop has an address you associate with their access point it gives you a non-routable address to use locally and then as your traffic gets sent this address is changed to that address you never see this happening it's done as the packet goes through the base station and then when the packet comes back it comes back with this address but then as it goes through the base station it switches back to this address so even though this address is not the real address the network sees they see you having this address here so if you would do an ip lookup from a coffee shop the coffee shop will be identified you will see the ip address here and if you looked at your laptop your laptop would not have that same address because it's being translated in the access point so then you go home and at home your access point has a different address but your computer asks your access point 
for an address and it gets a local address that's generated locally and again there's a mapping it's called nat n-a-t i'll change the color n-a-t network address translation so as the packet goes through it takes out the one address and puts in the second and when the data comes back it takes the address out and puts the local one in and the same thing happens at school right so you're at school finally and you get a different address they're not the same but you look and the prefixes are the same it's like they should be the same place but they're completely different because your school has a different address and there's a translation that goes back and forth as the data goes around and round so just for the distance of the local wi-fi or whatever you use these 192.168 numbers and then they're translated to the real numbers by your network access points so these are illegal inside this network so if it ever saw a 192.168 it would just throw that packet away they're only for very local connections so with that i'm wondering if by now you know why this is funny i'll give you a minute okay so the reason this is funny is she traced the killer's ip address and it had a prefix of 192.168 which means the killer had to be within a relatively short radius using likely the same wi-fi access point which means it was close very scary i hope you think it's funny of course xkcd is never exactly funny but it hopefully makes you smile a little bit so up to now i've been talking about this cloud network that's magic and the packets take different routes and we don't know but they show up remember the mailbox like they just show up or don't show up right so it turns out that sometimes we as engineers want to take a look at what's going on inside the internet and it turned out that there was a feature they added early on to help diagnose problems in the internet that we still use today and it's so convenient that it's built right into your operating system 
if you have a macintosh or linux traceroute is built in and if you have windows you've got to install traceroute so just say windows install traceroute and you'll find something but there was a problem so if you recall i said that each router sees the world very narrowly and simply sees a packet and makes a decision on one place so let's say that here comes a packet on the way to stanford and these routers are sort of strangely configured this router thinks it should go there this router thinks it should go there and this router thinks it should go there well it comes in and goes like oh well i know where that one goes there so if you end up with this misconfigured router situation you end up with your data going round and round in circles it would actually like crush the network because it's like a whirlpool you can't even notice that it's happening because there might actually be you know 10 you know there might be a bunch of them and then it comes back and you send it around again so you're filling up all your bandwidth it's never going to get there unless something changes and unless something crashes it's not going to change because these routers think that's the best thing to do they're mistaken but they can be mistaken because they're operating with imperfect information so how would you solve this problem it's like routers are imperfect so they solve the problem with a thing called the time to live field so much like the network address translation tweaks the addresses on the way in and on the way out the time to live is a field that routers change every time a packet goes through a router it subtracts one from this field and it starts with a number it can be as high as 255 but it's usually like 25. 
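the time to live idea can be sketched in a few lines of python. this is only an illustrative simulation, not how a real router is written: the function name send, the router names a b and c, and the two next-hop tables are made up for the example, and the starting value of 25 is just the typical number mentioned above.

```python
def send(next_hop, first_router, destination, ttl):
    """Simulate one packet. Returns (outcome, where, hops).

    next_hop is a hypothetical table: router name -> where that router
    forwards packets for `destination`.
    """
    router = first_router
    hops = 0
    while True:
        hops += 1
        ttl -= 1                      # every router subtracts one
        if ttl <= 0:
            # time to live ran out: this router throws the packet away
            return ("dropped", router, hops)
        nxt = next_hop[router]
        if nxt == destination:
            return ("delivered", destination, hops)
        router = nxt


# Correctly configured routers: a -> b -> c -> stanford
healthy = {"a": "b", "b": "c", "c": "stanford"}
# Misconfigured routers that send the packet around in a circle
looped = {"a": "b", "b": "c", "c": "a"}

print(send(healthy, "a", "stanford", 25))  # ('delivered', 'stanford', 3)
print(send(looped, "a", "stanford", 25))   # ('dropped', 'a', 25)
```

with the looped table the packet would circulate forever without the counter; with it, the packet is discarded after 25 router visits.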
and um what happens is every time the packet goes through a router the number goes down by one so if it was it would be come in here as 255 then it would go through here 254 and 253 it would come back it'd be 252 and what would happen is eventually it would get to one of these guys and it hit zero and they would decide okay you've been running around too long we will throw you away so the number goes down and it always goes down but then when it hits zero they throw the packet away they say we you have been through 255 hops chances are good you're never going to get to your destination so the traceroute command sort of cheats normal packets are sent with a ttl or time to live of like 30 30 hops or 40 hops but what trace route does it sends broken packets it turns out that when a router throws away your packet most of the time it is courteous and it sends you back a notification says hi i got your packet i decremented it subtracted one and it got to zero and i threw it away sorry about that here's here's who i am i mean i really feel bad about throwing your packet away maybe you want to figure something else out i don't know something must be messed up can't be my fault but i threw it away so what traceroute does is it first sends a packet with a time to live of one so the first router goes like whoa this has been around a long time set to zero throws it away then sends a note back then traceroute sends a packet of two across it goes hop hop and it gets thrown away a little note comes back so you can kind of build a map by sending enough packets and getting kind of a return rejection from one of the routers it would have got there so if for example i do a trace route from the university of michigan to stanford you'd if i did it now it'd be a different set of things i get this output and if i take a look at this this is the hop so there's the first one two three four through fourteen fourteen hops so it takes 14 hops now interestingly again i don't remember what the 
hop count was in the store and forward days but it's quite a few and that's because it's optimizing geography and so you can see the first hop is on my campus then the second and third hop are bouncing around my campus some more now we don't quite know where this one is but now it's on a national network called internet2 internet2 internet2 this is probably going across the country and then it ends up on cenic which i think is sort of california's network it bounces through california a couple times let's see uh hpr lax so that's los angeles this is i don't know where that is los angeles now looks like it's making it to oakland i don't know there's probably some meaning to these things so it's making it through oakland and then it's going to stanford from something oakland i don't know but now it's on stanford campus and now it has three more hops to get across the stanford campus to the stanford campus's web server now what's also going on here is keeping track of how long it's taking it sends each one a couple of times and so these are milliseconds so milliseconds are thousandths of a second so 534 milliseconds is a half a second 490 is a half a second so these are like half a second half a second oh wait no no sorry not a half second one thousand milliseconds would be one second so i got that all wrong so point four nine milliseconds is like half of a thousandth of a second so that's fast fast six milliseconds is six one thousandths of a second 76 milliseconds which is nine hops away that's seventy six thousandths or seven one hundredths so it takes about seven one hundredths yeah so like 77 over a thousand so seven one hundredths of a second it's less than a tenth of a second to get through fourteen routers from michigan to stanford less than a tenth of a second pretty impressive now if i do a trace route from ann arbor michigan to east lansing michigan michigan state university we have a very close 
connection i mentioned the merit network where we've had a close connection for a long time so not only is it fewer hops it's only eight hops to get to michigan state um if you look at the hops here um i'm bouncing around the campus for three i'm bouncing through the state for two and bouncing on the michigan state campus for three so it's a total of eight three hops to get across my campus two hops to get across the state of michigan and three hops to get on campus and it's really fast it's nine one thousandths of a second which is less than a hundredth of a second so really fast you can kind of see it now if you ran this trace route more than once this might change it doesn't change too fast but legally it could change i mean there's no guarantee it's going to be the same it's highly likely it's going to be the same because the most efficient way is not going to change within a few seconds but if you start in the middle of the day and you do it the next day it might change quite a bit so it'd be something to play with run it print it out run it again at midnight run it at noon see if your trace route is different right be interesting so here's an example of a trace route from university of michigan to peking university in china and um and so it's again you know it bounces through my campus for a couple of hops it bounces around the united states for a couple of hops and then it starts crossing the pacific ocean this traffic actually went through seoul korea and then it ended up in beijing now the interesting thing is you can see that it looks like it's taking about sixty one thousandths or six one hundredths just over uh just almost well a half a tenth of a second to get across the country and then it starts taking longer and then by the time it's going all the way to china and back it's doing 256 milliseconds which is about a quarter of a second now the big 
difference here is likely not traffic it is likely the speed of light it is how long it takes light to get across the pacific ocean so it takes a while the reason i think it's most likely low traffic is because these are very consistent pretty much all the time so that suggests that we're not waiting for any traffic we're getting through as fast as we physically can which is some combination of the speed of the link and the speed of light so we just got done talking about how we add to every packet this global number the ip address it has the prefix of the network number in the internet part the internetwork where it's really moving data from one network to another network and leaving it up to those destination networks how to move it um we end up really simplifying the postcard and the postcard ends up being a really apt example but the key thing especially when thinking about the four-layer architecture the reason that i think tcp/ip succeeded was in this real complex problem of moving data between billions of computers it kept the part in the middle real simple it doesn't try to be perfect it doesn't try to re-transmit data it doesn't try to store it it doesn't try to keep it in the right order it doesn't try to say that if this packet went here i'm gonna make sure the next one goes there the two packets if one gets ahead we don't really care um it means it's really fast and really scalable and um by keeping it simple and really fast um it solves really an amazing problem but we yet have other problems to solve so i want to close this lecture by introducing you to another person so vint cerf was a graduate student as the whole notion of packet switching was being sort of examined both in the federal government at the defense advanced research projects agency darpa and in higher education both in the united states and europe uk and elsewhere and so vint cerf was kind of at the right place at the right time and we could be very 
thankful for that he's credited as being one of the fathers of the arpanet which of course begat the internet and so vint is going to talk to us in a sense going back even pre arpanet and bring us up through packets and what packets mean and then how that all flowed into the arpanet and how the arpanet evolved to become the internet enjoy [Music] packet switching was an idea which was specifically studied by len kleinrock at mit who was actually looking at message switching and he did a brilliant dissertation on the use of queuing theory to analyze what networks of queues would look like using this message switching approach and his analysis although he never used the word packet is as equally applicable to packet switching as it is to message switching so that's one important milestone in around 1961. in 62 or thereabouts paul baran is doing work for the rand corporation and is deeply concerned about the ability to preserve command and control in a post-nuclear environment we were seriously worried that the russians would actually launch and that we would suffer a nuclear attack and then we had to be able to respond and we needed command and control for that so paul in 1962 before the existence of integrated circuits or anything else is saying we really should be digitizing and packetizing voice and then using sort of the pole mounted radios that are able to transmit in all directions to create a highly connected environment so that if holes were knocked out of it by nuclear explosions you would still if it's a fabric that's in any way connected information could get from one end to the other so he envisioned the notion of a message block and it was dynamically routed he used hot potato routing if you got something you got rid of it as fast as you could he chopped up the speech into little 20 millisecond pieces he didn't talk so much about data as i remember it and this was supposed to be a highly resilient voice network for command and control i may have just done him a 
disservice because later he was very much conscious of the importance of data communication too so that's around 1962 it gets documented in an 11 volume series called on distributed communications and he can't sell it to anybody the traditional telcos at&t in particular and the people at what was then the defense communication system or defense communications agency laughed him out of the room said this was a silly idea couldn't possibly work and so you know he should just go away so he never got anywhere with that in spite of all the documentation in 66 larry roberts along with one other guy whose name i'm now not remembering does a point-to-point experiment to test packet switching it was between the an/fsq-7 machine at system development corporation in santa monica and uh the uh tx-2 machine at mit lincoln laboratories which was where larry was they demonstrated on a 2400 you know bit per second line that you can move packets back and forth then in the 64-65 time frame a man named donald w davies at the national physical laboratory in london also gets the bug tries to get money from the science research commission in england and gets only enough to build one node you know the one node network so he builds this packet net he invents the term packet to describe what these objects are and it works he's got a bunch of you know terminals and other things hanging off of this one node so in a funny way he built a local area network if you like but it was you know based on physical wires so those three guys introduced packet switching larry and uh whoever it was that he worked with um demonstrate that it's possible to get two very distinct kinds of computers to talk to each other using a standard way j c r licklider is a psychologist actually at mit but he's convinced in the early 60s that computing is important to non-numeric processing that it will allow people to work together and collaborate in ways that they never could before he comes and starts 
the information processing techniques office at arpa with this bee in his bonnet and who does he encounter he encounters douglas engelbart at sri international and the two bond basically because engelbart was all about non-numeric computing and the ability of people to build up the superstructure of communications and documents and interact with each other hyperlinking the mouse the portrait mode display black on white presentations i mean the guy had a world wide web in a box at sri and licklider understands that licklider is sending out notes to his little community of people talking about the intergalactic network you know maybe tongue in cheek so he really gets credit for having put this meme in place at arpa then taylor comes along to pick up the responsibility for running the ipto from licklider and is all hacked off because he's got three terminals in his office at the pentagon connected to three different machines and he says why can't there be one terminal talking to all three we need a network and so as he's pursuing this idea uh with charlie herzfeld who's the head of arpa at the time charlie hands him a million bucks over a 20-minute conversation and now taylor's got the problem who's going to actually do this because taylor is not a technologist either he's another you know kind of psychologist type so he decides to get larry roberts from lincoln laboratories who did that packet thing and larry doesn't want to come so he goes and complains about this to charlie herzfeld charlie calls up the guy that runs lincoln laboratories and says you know we pay for a significant fraction of your research budget every year you should tell larry he should show up and in fact i thought that maybe larry had been forced to do this you know by charlie i think it was probably a little less awful than that but larry was persuaded to come down and eventually of course inherited the operation of the office from bob taylor but 
in the meantime is the guy responsible for initiating the internet or the arpanet project so they write an rfc or an rfq request for quotation a bunch of responses come back probably on the order of a couple dozen i don't know personally for sure how many i know that i wrote one of them with my colleague steve crocker while we were still at ucla as graduate students but we were consulting with a company called jacobi systems in santa monica jacobi systems wrote one of the responses bolt beranek and newman wrote another response primarily written by bob kahn who'd come to bbn from mit so the responses come back and they get evaluated and four of them end up and the jacobi systems one isn't one of them or if it was it didn't get selected bbn got selected so steve crocker and i kind of hiked back to ucla as graduate students and the next thing we know len kleinrock who's at ucla and who'd written you know this original dissertation work on packet switching has come to ucla to teach and explore um queuing theory is uh a close uh compatriot of larry roberts because they were both at lincoln labs together so he gets the network measurement center piece of the arpanet project and steve crocker and i and jon postel all of us from the same high school in santa monica and the san fernando valley end up in len kleinrock's operation running the network measurement center so i was the principal programmer for that steve crocker took the responsibility for managing and leading the network working group that led to the protocols host to host protocols and jon postel eventually becomes the keeper of the documentation he's the rfc editor which steve crocker started request for comments he's the guy that becomes the numbers czar which is keeping track of address spaces and allocations and eventually becomes the domain name manager or the internet assigned numbers authority when the internet happens that hasn't happened yet this period of time of the arpanet 
program it brings us up to 1972. and this is an important moment in this whole history because the first demonstration of the arpanet happens in the washington hilton hotel basement in october of 72 a whole bunch of people from the networking interested networking community packet switching community attend not only in the us but from france and from england and italy and germany and elsewhere that group of about 25 or 30 people convenes sees the arpanet in operation sees applications that were being done including doug engelbart's stuff and then forms this international network working group modeled after the working group that steve crocker managed to build the arpanet system and at this point i become the chairman of that group because steve is busy at arpa doing artificial intelligence at the end of that year bob kahn leaves bolt beranek and newman and goes to arpa i leave ucla where i've been working with kleinrock and crocker and postel and i go to stanford so bob is at arpa i'm at stanford and in the spring of 73 bob comes out from arpa and he says i have a problem i said what's your problem he says well we got this arpanet and i said yep but we also are working on other networking capability to make command and control work for the military if you're going to be serious about putting computers in command and control they have to be mobile they have to be in you know armored personnel vehicles and tanks and all these other things they have to be seaborne so we need ship to ship and ship to shore communications which means satellite so we need mobile radio we need satellite in addition to the fixed wire systems that are represented by the arpanet we have fixed installations that are not moving around so we have all these different technologies and bob's brilliant idea is not to build one network with all those technologies embedded in it instead he breaks them apart and says let's build a packet satellite network which optimizes the 
use of satellite takes into account that it's got a half a second round trip time let's build a packet radio network which optimizes a system whose connectivity is changing with time as things move around and you get variable delay and also variable interference so these were three different packet networks then the problem is how do you hook them together you wouldn't have this problem if you put all the technologies into one net but if you put them all into one net it makes it really hard to do control over all these highly variable parameters so instead he says break them into different networks and connect those together so we design and build a gateway which today we call a router and that concept also introduced a whole bunch of other things like how do you refer to another network each network thinks it's the only network in the universe this is true of the proprietary networks like sna and decnet and so on and you didn't have a vocabulary to say take this packet and move it to another computer on another network somewhere else that you might not even be connected to so we have to invent an internet address space in order to solve that problem we have to find a way to allow packet losses in this path to be recovered which is where tcp now becomes a manager of reliability on an end-to-end basis instead of relying on each net to be reliable the arpanet was built on the assumption you could build a reliable underlying net the internet was based on the assumption that no network was necessarily reliable and you had to do end-to-end re-transmissions to recover so during this 1973 period bob and i get the first paper written and published in ieee transactions on communications may of 1974. 
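the end-to-end recovery idea described above can be sketched as a tiny simulation. this is not tcp itself: it is a simple send-and-wait-for-acknowledgement loop, and the names make_lossy_network, send_reliably and max_tries are hypothetical, chosen only for illustration. the pretend network loses the first copy of chosen segments, and the sender at the edge simply sends again until an acknowledgement comes back.

```python
def make_lossy_network(drop_first_attempt_of):
    """Return a send function that loses the first copy of the given
    sequence numbers and delivers everything else (a made-up network)."""
    already_dropped = set()

    def lossy_send(seq, data):
        if seq in drop_first_attempt_of and seq not in already_dropped:
            already_dropped.add(seq)
            return False          # packet lost, no acknowledgement
        return True               # packet arrived, acknowledgement returned

    return lossy_send


def send_reliably(segments, lossy_send, max_tries=5):
    """Retransmit each segment from the sending end until it is acknowledged."""
    received = []
    transmissions = 0
    for seq, data in enumerate(segments):
        for _ in range(max_tries):
            transmissions += 1
            if lossy_send(seq, data):
                received.append(data)
                break
        else:
            raise RuntimeError(f"segment {seq} never acknowledged")
    return received, transmissions


network = make_lossy_network({1})            # lose the first copy of segment 1
print(send_reliably(["a", "b", "c"], network))  # (['a', 'b', 'c'], 4)
```

the network in the middle does nothing to recover the lost segment; all of the recovery happens at the sender, which is the point of the end-to-end design.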
and i think mostly nobody paid too much attention to it meanwhile arpa is funding us to go make this actually work at stanford i am working with my graduate students some who are at xerox parc some are at stanford on the detailed specifications of tcp we published that in december of 1974 and it's the first time the word internet shows up in print anywhere because the first papers we talked about internetworking so internet shows up in a complete spec in december 74 and it's also got bugs but we don't know that yet until we start implementing it in 1975 with two other organizations so bob kahn says can't have just one implementation bolt beranek and newman becomes one of the implementers university college london and peter kirstein's group in england is the second or third implementation so we have three implementations of tcp/ip in fact by this time it's only tcp we haven't broken off the ip part um three implementations are going we instantly find problems with the design and we start evolving so over a period from 1973 to 1978 we go through four iterations of the design and implementation and test until we have a fairly stable thing and then we standardize and so 78 we fix everything by this time the internet protocol has been split off because people like david reed and danny cohen are saying we need to have real-time communications that is not necessarily reliable but which has low latency so voice communications radar tracks and all that you don't care where the missile was two seconds ago you want to know where is it now so and in the case of voice if you lose a packet you just say say that again i missed something so we split tcp into tcp and ip and we create something called user datagram protocol which is parallel to tcp and it is the real time low latency version of the reliable tcp all of those little components ip tcp and udp now go into the internet architecture starting in 1978 and we start implementing for the next five years we do 
everything we can to get tcp/ip implemented on every operating system we can find it goes on to the ibm machines it goes onto the digital machines hp it goes into unix we have a unix version built by bolt beranek and newman we send it out to berkeley to the berkeley bsd release guys and bill joy says i don't like that code he writes his own puts it into bsd 4.2 and that's the version of unix that carries a lot of tcp to the academic world because at the same sort of time frame sun microsystems comes along and builds these fantastic workstations and they want to use open source or at least open protocols and open operating systems so they adopt unix and the tcp comes with it and they use ethernet as a way of connecting workstations together so they are the engine that's driving the academic community which are all going gangbusters for workstations and high-speed local networking so all of this of course places huge demands on the arpanet backbone which is only running at 50 kilobits a second and eventually leads to the need for higher speed nsf jumps into the fray seeing how valuable all this is for the academic community and concludes that it should build a network that runs even faster and it does so it's called nsfnet [Music] the domain name system it really doesn't have a great place to fit in this architecture it is the thing that converts user friendly names like www.umich.edu to uh like a network-friendly address these network numbers and ip addresses are important because they encode the geography of the connections of the internet but we humans really don't care about the geography of the interconnections of the internet www.umich.edu or www.facebook.com is what we want to remember so i like to kind of think of the uh the domain name system as kind of like sort of somewhere either between the internet and the transport or between the internet and the link or somewhere like in this area you know somewhere here and it's certainly not in the 
application layer it's not the link layer it uses the link layer but basically it's kind of a little add-on to the side so the domain name system is for user-friendly names okay so let's talk a little bit about just how it works because domain names are what we use all the time and ip addresses are what computers use all the time and routers use routers really have no knowledge of the domain names they simply move data based on the ip addresses numeric addresses are tough for people to remember i remember in the old days we would have little post-it notes on our computers and that's how i would keep track oh minnesota just put up a new server let's put that number on my little post-it note so the fact that they were numbers didn't bother us at all because there was only 40 of them but quickly there was more than 40 and each campus even in the early days only would have one network number but now campuses have 20 or even 40 network numbers multiple groups of addresses they're groups of addresses but there are many of them they get reorganized you move a server from one place on campus to another like the michigan web server if it was on our north campus it would have one address if it was on our main campus it would have another address and so you don't want people to know what these ip addresses are so they invented this notion of a domain name system the visible name so that we could switch the mapping from the name to the ip address transparently and so the domain name system is like the internet's address book and it's a big distributed database that's fast it uses caching so that it's locally fast even if the network is partially down ip addresses reflect the technical geography and they read left to right at least to the point where the network number is kind of the physical attachment point for the campus or business and then this part is the attachment point within that campus or business and so the least specific to most
specific goes from left to right domain names on the other hand are like organizational structure the least specific is on the far right and then it goes more specific as we go left so edu means educational institutions umich.edu means the particular educational institution university of michigan si.umich.edu means the school of information at the university of michigan not all schools just a particular one and then dub dub dub is a particular server at that school and it reads like a postal address so if you think of where i teach 2455 north quad it's kind of on earth and it's in the country usa and it's in a zip code it's in the state of michigan it's in ann arbor and then this is the building and then this is the room number in the building so we go from very general to very specific so domain names read in a sense right to left phone numbers and ip addresses read left to right the other thing that's interesting about domain names is that they're owned they're owned from right to left and so there's a hierarchy and basically an organization owns edu the organization's name is educause it has conferences and other things but one of the things that it does is it owns in the public trust the name edu so university of michigan could go to educause and say hey could i have umich.edu and they would say let me think about that and then maybe they would give it and they did because it was the university of michigan nobody else had it and it seemed like a good name and university of michigan was like an accredited institution of higher education so they would give it to them if i wanted to say like hey i'm dr chuck and i want to create dr chuck university could i have drchuck.edu they would go like no you can't you're not a university you're not an educational institution so they might say yes they might say no things like .com and .org they're kind of a catch as catch can and so the first one who gets there uh tends to win there's there's some situations
where if if i got coca-cola.com and had no particular use for it and coca-cola owned the trademark coca-cola they could take it away from me unless i had a legitimate purpose for it it's harder for me to imagine a legitimate purpose for coca-cola but target.com might be a club for target shooting and it would be difficult for target to say sorry i gotta take your target shooting club away because even though it's not as big as target's company it uh it's a legitimate use so at each level once the university of michigan is awarded umich.edu it creates a mechanism to award sub domains within that and so there is actually a committee at the university of michigan and if we want like our learning management system is called ctools.umich.edu if we want that we have to go to the committee and convince them to give it to us a top-level domain within the university of michigan we got one the school of information where i'm a faculty member is si.umich.edu and they gave us that one i could go to that committee and i could say hey i'm going to be doctorchuck.umich.edu and they would say no you can't so then the school of information where i teach has si.umich.edu and they have a committee and they can give out sub domains within that so you get the picture right so i could go there and say hey can i have doctorchuck.si.umich.edu and they would say no but you can have csev.people.si.umich.edu so once you own one of these things you can give out subdomains i do own doctorchuck.com [Music] and if there was a high desire for something underneath doctorchuck.com then i could have my own little business selling domains under doctorchuck.com nobody seems to want any of those things but you know some people have mail.com and other things and so each one of these is a new potential for expansion and so that's kind of how the domain name works right it is this mapping between these names there is sort of this
right to left ownership of these names and uh and you have to ask people to get those names and so that's a real quick overview of the domain name system [Music] hello and welcome to the transport layer we're working our way up from the bottom of our four layer architecture up to the top so we're kind of at the halfway point we've covered the link layer like ethernet and then we covered the internetwork layer which is kind of like the postcard layer so let's review the magic of ip right the magic of ip is this postcard layer that bounces these packets with from addresses and to addresses through you know 15 or so hops getting packets to their destination network from one network to another network as well as it can and when it gets to the destination network it finds the final computer on that network the magic of this is there is no interim long-term storage inside the network all the long-term storage is outside and now we're going to talk about that so one of the things that makes ip so fast is that it is not demanded to be perfect it is not demanded to deliver data in order and there's no requirement that it doesn't lose data it's fast and barely ever loses data but when it does there's a layer to recover and that's what we're going to talk about next so the internet protocol is this multi-hop network packets can take multiple paths they can make use of all kinds of crazy links but now we're going to move up one right we're going to move up to the tcp layer so the tcp layer is both simple and complex the purpose of the tcp layer is to compensate for the possible errors in the ip layer as well as make best use of available resources so if the overall network from here to here is extremely fast we want to send data really fast right if on the other hand the overall network from here to here is really slow we want to send data slowly and be efficient because remember part of the goal of
the tcp networking is to share effectively and so we need to be aware of whether our network is fast or slow and those are the kind of problems we solve in the tcp layer how fast is the underlying network how reliable is it and if something goes wrong what do we do to deal with that so the key idea in tcp is that when we send some data we break it into packets and then we send each one and then we keep them until we get an acknowledgement from the other side and then and only then do we throw them away and at some point if a packet gets lost it can be sent again and again and again until it finally is acknowledged by the destination system and so that's basically what tcp does is it figures out which packets have or have not made it across the internet layer so here's an example scenario so we got a message it's broken up into five packets we've got sort of the first hundred characters in the first packet and then the second and third packet now what tcp does is it speculatively sends a few packets says okay let's get ahead because if you sent one and waited then you might not make best use of the network so you kind of guess and dump a few packets out there and so let's say we send three packets queue them up for sending and as fast as we can we start spooling them out the back of our network and somehow two of them make it across the internet to the destination but somehow that poor second packet just like every time i'm doing a lecture the poor second packet never makes it you don't want to be the second packet in one of my lectures because you're gonna get it but somehow the second packet has gotten vaporized now time passes and the receiver gets this sensation that maybe maybe maybe something's missing and so it sends an acknowledgement back and it says i'm going to send a note back that says look i'm ready for 200 i do have 100 and i'm ready for 200
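The bookkeeping in this scenario can be sketched in a few lines of Python. This is a toy model, not real TCP: the names `send_with_acks`, `window` and `lost_once` are made up for illustration, the "network" is just a loop that drops chosen packets once, and there are no timers or byte-level sequence numbers. It only shows the idea of keeping a copy until the other side says what it is ready for next.

```python
def send_with_acks(packets, window=3, lost_once=frozenset()):
    """Toy sender: keep each packet until acknowledged, resend when needed.

    packets   -- list of (label, data) pairs, e.g. (100, "he"), (200, "ll")
    window    -- how many unacknowledged packets are sent speculatively
    lost_once -- labels the pretend network drops the first time they are sent
    """
    pending = dict(packets)              # sender-side copies, kept until acked
    order = [label for label, _ in packets]
    received = {}                        # receiver-side storage
    dropped = set()                      # losses that have already happened
    log = []

    while pending:
        # Speculatively send the oldest unacknowledged packets.
        burst = [label for label in order if label in pending][:window]
        for label in burst:
            if label in lost_once and label not in dropped:
                dropped.add(label)       # vaporized somewhere in the network
                log.append(f"lost {label}")
                continue
            received[label] = pending[label]
            log.append(f"delivered {label}")

        # Receiver acknowledges by naming the next packet it still needs.
        next_needed = next((l for l in order if l not in received), None)
        log.append(f"ack ready-for {next_needed}")

        # Sender can now throw away everything before that packet.
        for label in order:
            if label == next_needed:
                break
            pending.pop(label, None)

    message = "".join(received[label] for label in order)
    return message, log


parts = [(100, "he"), (200, "ll"), (300, "o "), (400, "wo"), (500, "rld")]
message, log = send_with_acks(parts, lost_once={200})
print(message)
for line in log:
    print(line)
```

In this sketch the receiver keeps packet 300 even though 200 is missing; the lecture's receiver may throw it away instead, and either way the sender resends from 200 because that is what the acknowledgement asks for.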
matter of fact i might even throw away 300 because it's been so long so i want you to start me over at 200 but the sender all of a sudden knows that 100 has been received and so it knows that it can throw this one away now right it's been acknowledged so there's an acknowledgement stream that goes back and forth so now the sender sends 200 and 300 and they make it across the receiver says i've got 200 this is the 300 and then they send 400 speculatively 300 makes it 400 makes it and the receiver says i got 400 which means now it can cross off 300 and 400 right check those off and throw them away and then send 500 across and then at some point check those off it's got 500 and now the sender can sort of like empty check everything off everything's been sent and it's been acknowledged and we know and so that's kind of an oversimplified view of the kind of bookkeeping that's going on on both sides of a tcp connection and so this right here is in a sense the genius of the internet the storage requirements in the middle the routers this is the ip basically the network of networks we didn't design these to require a lot of storage we wanted them to be fast we wanted them to be agile we wanted them to be dynamic we wanted them to be clever but we also gave them the right to fail so we didn't demand storage we didn't say hey hold on to packets you know all over the place just store up piles of packets piles of packets would pile up in these we don't ask that no router has to keep the packets as a matter of fact routers are supposed to throw packets away to communicate back and forth between the systems that maybe things aren't working so well and don't use me to get to california this on the other hand with hundreds uh maybe you know hundreds of thousands or millions of routers but billions of computers on the outside so we need a way to make this reliable so we need to have memory to store the packets while they're in flight so we can retransmit but we store the packets in the
computers that are outside and there are billions of these computers there are billions of them every computer that you carry around every laptop every time we add another computer to the network we add storage for packets that are being sent so when your computer or your phone is sending across the network it is responsible for retaining its own copies it does not expect the inner part of the network to do so and that is absolutely brilliant so it's what really makes this work and so that's kind of a great oversimplification of what's going on and it turns out that there is still a lot of engineering to make that happen so i'd like to introduce you to another of the innovators that we meet in this class uh his name is van jacobson and i met up with him at xerox parc uh he did the work we're going to talk about while he was at berkeley and in the late 1980s uh there was this prediction that the internet was going to die and uh it seems obvious that it's a good idea now but back then there was a bunch of folks that felt like academics weren't smart enough to make a network and uh and they were like ibm and digital equipment that thought they the vendors should make the network and we should just pay them to use their network and um as the nsfnet was coming up and more and more computers were being connected and the backbone was so slow it started to fail and it looked like the predictions of all of the computer vendors that said academics couldn't build a robust scalable network were going to be true and so van jacobson tells the story very differently but the way i saw the story happening and unfolding in 1987 is van jacobson saved us the network was crashing and we all installed van jacobson's patches and the network got better and in my recollection this was the last time it appeared that the entire internet was going to crash so we named it the van jacobson protocol and he doesn't like to call it the van
jacobson protocol because he's a shy and unassuming guy but i think he saved us and so this interview is him describing sort of that moment back in the late 80s where he invented the slow start algorithm that really is part and parcel of every tcp implementation that you have on every computer that you use as a matter of fact it's being used to flow control this lecture right this second as we speak so here's van jacobson [Music] there were lots of campus ethernets because they were really easy to deploy and you could put them in a department and then you could run a wire between two departments and you had a bigger network and so we'd grown up networks sort of by agglomeration in lots of different university campuses and nsf came up with some money said oh we've got a little bit in our budget where we could get some 56 kbit lines and tie those campuses together and they did that made the nsfnet phase one but now you're tying together 10 megabit campus infrastructure with 56 kilobit wires and it was wildly popular because people that couldn't talk could suddenly talk and they're sending emails and moving huge files and uh just everybody's really excited about this technology but um any one of those campuses could over subscribe the net by a factor of a thousand so we had a lot of packets piling up and getting dropped at the time i was a researcher at lawrence berkeley lab which is in the hills up above the berkeley campus and i was also teaching on the berkeley campus even back in those days which was mid 80s for every class there was a messages group you know like a little news group that was set up all the assignments would be put online and i was trying to get course materials from my office in lbl down to a machine in uh evans hall at berkeley and uh there was like zero throughput in the net it was uh one packet every ten minutes or so and it seemed unbelievably bad and i went down and talked to mike karels
who was heading the bsd group the people that develop berkeley unix and he's getting reports of these problems from all over the country in those days the easiest way to start running tcp was to bring up berkeley unix because there was an arpa-funded very nice implementation in it and everybody was seeing poor performance so we talked for a long time that day and on succeeding days about well what's going wrong is there some mistake in the protocol implementation is there some mistake in the protocol this thing was working on smaller scale tests and then it suddenly fell apart i think we struggled for three or four months just going through the code writing tools to capture packet traces and looking at the packet traces and trying to sort out what was breaking and i remember the two of us were sitting in mike's office after we've been pounding our head against the wall for literally months and one of us i can't remember which one said you know the reason i can't figure out why it's breaking is i don't understand how it ever worked uh you know we're sending these bits out at 10 megabits they're zipping across campus they're running into this 56 kilobit wire we expect them to go through that wire pop out on the other side go through how could that function that turned out to be the crucial starting point at that point we started saying well what is there about this protocol that makes it work how does it deal with all of those bandwidth changes how does it deal with the multiple hops so this picture that direction is time this direction is bandwidth so that's a fat pipe and that's a skinny pipe and uh the scale at the time is you know this is a 10 megabit pipe and this is a 56 kilobit pipe so here the difference is about three to one it was really closer to 100 to one and so time seconds times bits per second equals bits so each of these little boxes in there is a packet it's the number of bits in a packet and if you scrunch it down in
bandwidth it's got to spread out in time because the number of bits doesn't change and so uh see the burst of packets a window's worth of packets gets launched it's going to fly through the net until it hits this fast to slow transition and then because the packets have to stretch out in time they'll have to sit there and wait as they're fed into the slower wire and they pop out the other side they get spread out by this bottleneck by the slower wire once they're spread out they stay spread out right there's nothing to push them back together again they hit a receiver it turns every data packet into an ack so you've got a bunch of acks that are going back towards the sender and they remember what's the right spacing for that bottleneck uh so the acks get back to the sender and every ack gets turned into a data packet so we can see the data packets flowing back and this is after one round trip time now the packets are coming out perfectly spaced so a new one goes into the net in exactly the time it takes a packet to exit from the bottleneck so these acks are sort of acting as a clock that tells a sender when it's safe to inject every new packet and they're always going to be spaced by whatever is the slowest point in the net and the key thing is how can you get to the steady state sort of most quickly without wasting yeah and the issue the failure we saw was this works perfectly after you've exchanged a round-trip time worth of packets but when you're starting up when you're here there's no clock and so the hard part on tcp is not getting it running it's getting it started because once you've got it running you've got a clock that tells you exactly what to do so if you turn them on suddenly you get in this repetitive failure mode where you saturate the buffering that was available at some gateway and then when you retransmit you do the same thing again so you're always losing packets but if you turned it on more
gradually then you wouldn't overload the buffering and you'd get enough of a clock going so that you control the amount of backlog to fit the available buffer but you'd still be growing the number of packets in flight so that you'd eventually get a you start with a kind of sporadic clock but you'd eventually fill in the details and get a perfect clock how did you get it to the point where it was in all the tcp implementations on the planet because they kind of have to cooperate in a way so remember it was a much simpler time when you're talking about all the tcp implementations on the planet at that time there were like four so there was the berkeley unix one there's the mit pc/tcp there was a bbn one that was used in butterflies and imps and there was a multics one i took the couple of tcp kernel modules that we've been working on packaged them up in a tar i had this horrible driver hack that would let us snarf packets from the kernel and i mean it was really a horrible driver hack the way you said what you wanted to snarf was by adb in the kernel you wrote in binary some new values these are the ports that i want to look at and the driver would capture those into a circular buffer and you'd read kernel memory to pull that buffer out craig leres and chris torek who were working in my group at lbl and were both long-time kernel hackers were just embarrassed at this and they put together a really nice clean driver thing called bpf the berkeley packet filter that would let you pull packets out of the kernel by a very efficient i/o control interface and so we bundled all of that up and um went on the tcp-ip mailing list which in those days was you know tcp was very experimental it was very leading edge and pretty much everybody who was playing with it was on that mailing list to announce that this stuff was available a bunch of people ftp'd it uh tried it it blew up sent kernel core dumps and bug reports and fixed the bug reports and put new versions out
and somebody would immediately come back and say panicked here do you want the core and i go oh no embarrassed put out a new version somebody else would come back and say panicked here and fixed that and just cycled like that and after about a day we got a version that didn't immediately panic and then started working on the uh the actual algorithms and a little bit of tuning to make sure that it actually did good all the time and didn't do any harm just completely a community effort and you know sort of when the community was saying this uh mostly does good and never seems to do harm that's pretty much what mike needed to put it into the kernel so he took the community developed modules and rolled them into the bsd release [Music] so basically the transmission control protocol has a responsibility of compensating for the imperfections of the ip layer data can arrive out of order it can arrive not at all and so the tcp layer marks the data in a way and stores the data in the source computers until it is acknowledged by the destination computers and so tcp buffers this information and the buffers are kept on the edge and this allows the internet to grow in wonderful ways so this results in a layer that really provides us what seems to be an end-to-end connection so we can send data in one end a stream of data just send it in just roll it in and out comes a stream of data it comes out in the same order it comes out reliably and we can stop thinking about it again the whole goal of this layered architecture is so that we can pretend that the complexity inside this box is simple so it's not simple inside but we the applications that are going to make use of this transport layer can pretend it's simple okay and so i put in an h and an h comes out and i put in an e and an e comes out and i put in an ll and ll comes out and i put in an o and an o comes out and it's hundreds of thousands of engineers and billions of dollars of research that makes it work but it just is
simple it's been oversimplified so now we've talked about the link layer the ip layer and the transport layer and up next we're going to talk about what we do once we have reliable pipes connecting one application to another [Music] so now we're going to talk about the application layer and up to now we've been talking about these bottom layers and we've been sort of working our way up we talked about the link layer the ip layer and the tcp layer and and and each of these layers works with the other the the ip depends on the link layer the tcp layer depends on and adds value to the ip layer and now we're going to talk about the application layer that is going to make use of the services of the tcp layer and the services of the tcp layer are basically to give us a reliable sequenced end-to-end stream that can start in one application in one computer and end in an application in a different computer and have a two-way communication so i can send the word hello hello from one application and then all kinds of billions of dollars of hardware hundreds of thousands of engineers 40 years of engineering design and out comes hello on the other end and what's beautiful and this is the beauty of the layer of the four layer architecture is we just pretend this is magic this is magic we send something and out it comes with that magic available with that billions of dollars and 40 years of research magic available now what would we do right now what would we do with that magic if we had it available to us so tcp is giving us this reliable pipe we're going to think about now we're going to ignore all the rest of it right we're going to ignore all this part we're going to all we're going to say is we have an application on this into our computer now remember this is your computer and say this is the server right so this is you and this is the server and all the software runs in it in in every all the software's running in your computer and all the software is running in the server 
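To make the "reliable pipe" idea concrete, here is a minimal Python sketch of two applications talking over one TCP connection. Both ends run on the same machine over the loopback address purely so the example is self-contained; the function name `run_hello_demo` and the choice of letting the operating system pick a free port are assumptions for illustration, not something from the lecture.

```python
import socket
import threading


def run_hello_demo(message=b"hello"):
    """Send `message` from a client socket to a server socket on this machine."""
    received = []

    # Server application: listen on loopback; port 0 asks the OS for any free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_once():
        conn, _ = listener.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(1024)
                if not data:             # empty read: the client closed its side
                    break
                chunks.append(data)
            received.append(b"".join(chunks))

    server = threading.Thread(target=serve_once)
    server.start()

    # Client application: connect, write bytes into the pipe, then close.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)

    server.join()
    listener.close()
    return received[0]


print(run_hello_demo())
```

Neither side of this sketch deals with packets, acknowledgements or retransmission; the operating system's TCP implementation handles all of that underneath, which is the point of the layering.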
so the question is what are we going to do between these two things okay how are we going to ask for data and this application is the client application and this application is the server application client and server are not necessarily the same client and server are likely very different applications the client is like give me information and the server is responsible for giving the information to the client so the client is often making a request and then the server is making a response back and so with this client server mechanism this application to application communication we can build a mail system world wide web or we can stream videos much like we'd be streaming this lecture back and forth but we're going to focus on the world wide web because it's really simple and really elegant and it's probably the easiest one to understand and it's the most popular but there are many other protocols going on file transfer and others but the world wide web protocol http is the protocol that is the most popular so there are two basic questions that the application layer has to uh solve one is which application gets the data and this is done using a mechanism called ports and ports allow an ip address or a single computer or a single server to serve up multiple services and then for a client to be able to dial up much like a telephone extension and pick the service that they're interested in once you've connected to the service that you're interested in like the world wide web service then you have to know the protocol to talk to that ports in tcp are like a telephone extension number again like i said you can connect to an ip address but then you can connect to a port within it and so if you think about a telephone number and an extension it's sort of a further refinement an ip address gives us a particular server a piece of hardware connected to the internet and then a port within that tells us what application we're going
to talk to so let's talk about ports and connections so if i have a server and this is kind of an arbitrary server here is a single server www.umich.edu and it has one ip address connected into the cloud we can have many clients talking to it and we have many services running on this server like sending email or logging into that server or retrieving web pages or retrieving my mail from that server and there are ports so incoming email is done generally on port 25 remote login is 22 or 23 web servers depending on whether secure or not are 80 or 443 and the personal mailbox is on ports 109 or 110 and so these computers don't just connect to an ip address but they connect to a port within an ip address so some common tcp ports that we have are the ones i just mentioned and you can take a look at these on the internet the various ports okay now once we have a connection to the web server or to the mail server or to the post office server we have to know how to talk to it and that's what's called an application protocol and that is the rules for conversation the rules for conversing so tcp gives us this reliable connection we now can connect to the server that we desire to connect to by using ports and the question is what are we going to say across that connection and what we say across that connection who talks first what do you send what comes back depends on the kind of server that you're talking to we're going to play mostly with the world wide web server because it's the simplest and it's the most obvious as to what's going on so the world wide web clients which are otherwise known as browsers and the world wide web servers communicate using a protocol called http as a matter of fact if you look at the top of your url it's http colon slash slash something right and so basically there is a specification for how this is done so when they wrote down i'm
going to write the first web browser the first web server they also wrote a document about how that piece of software would talk to this other piece of software and it's a very simple concept http has a protocol where you make a connection from the client to the server the client requests the document the server feeds out the document and then the connection is dropped okay so it's very simple we'll actually simulate this this is called the http request response cycle okay and so the way it works is you're in a browser and you click on a link the browser is an application it's the client running on your computer so this is kind of your computer down here okay you click on a link then the browser makes a connection to the web server and sends a request for the document the web server looks up does whatever and sends the document back and then the document is displayed on your screen okay so let me show you sort of how this works with a web browser so here is a canonical web page so here's a browser here's the url that i went and got you can get this url as well you can even look at the source so if you look at this source there is my html but one of the things that's in this source is a link and if i hover over top of this link you can see at the bottom of my screen it's going to tell you that we're about to go retrieve page2.htm now this browser is running on my laptop it is a web client it is an http client and i'm about to click on this piece of software running on my computer and then it is going to make a connection out the back of it retrieve a document and then show me the other document so it's gonna happen real fast i'll click on second page it requested and got a document back and showed me the document so that's the contents of the second document okay and so that is the request response cycle that is triggered by me making a click somewhere in my
browser the browser sees the click it opens a connection this is sort of the internet over here it makes a connection requests a document the document comes back and then the new document is displayed the request response cycle click request response display click request response display so let's take a look in some more detail as to what's really going on here so the command that is being sent up to the server across this connection remember this part here is the internet right there the command is a get command and the document is based on the url i clicked on that the browser knows about and says get page2.htm and then what comes back is html and that is a markup language that describes how this page is supposed to be shown so let's say that we wanted some more detail on how this really worked right let's say we want to write a web browser and we want to be a good citizen and we want to talk properly to the uh the web well we would go back to the ietf the internet engineering task force and we would go grab the document that says hey if you want to write a browser this is what you do go grab rfc 1945 which is the hypertext transfer protocol version 1.0 now i'm sure there's more complex versions that have superseded it but if we read it and read it and read it we spent a lot of time we get down to section 5.1.2 and it would say if you are making a request for a document this is the request line it has the letters g-e-t followed by a space followed by the url you're looking for followed by a space followed by the version okay so that is the rule that is what you're supposed to send down this connection it gives us some more examples on page 24 and so we could read all this and we could write a web browser but what we're going to do is we're going to fake being a web browser we're going to cheat we're going to hack it we're going to pretend to be a web browser we're going to send the commands to the
web server now i'm going to do this on a macintosh you could also do it on windows if you installed a telnet client okay so the key is you install a telnet client now most people would say telnet is insecure and it is it's a very old protocol because all it does is it opens a tcp connection to a host on a port that i specify and whatever i type goes across that tcp connection so it's a great way to hack especially older less secure protocols so because http is generally a public protocol it's not really highly protected so it's much simpler to hack it hacking secure protocols requires writing software rather than typing commands so if you're on windows you have to install telnet t-e-l-n-e-t so what i'm going to show you in a second is i'm going to show you hacking and i'm going to telnet to www.dr-chuck.com port 80. what i'm saying is connect me to this server on the internet and connect me to port 80 because i want to do http connections and then i will send one command i'm going to send that command and then the web server will give me back the page okay it's probably easier for me to just do it so let's take a look i am going to simulate what the web browser does when it's requesting well actually we'll go to the second page and when it's requesting the first page it's doing a get for page one okay so that's what we're going to do except now we're going to do it in a terminal so i type telnet oops i always type this wrong t-e-l-n-e-t www.dr-chuck.com port 80.
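As a rough sketch of what telnet is doing in this demonstration, the Python below opens a TCP connection to a host on port 80 and sends a single request line in the form the lecture describes: GET, a space, the URL, a space, and the version, followed by a blank line. The function names `build_request` and `fetch` are made up for illustration, and the host and page in the usage note are assumptions. A modern server may also expect a Host header or may redirect to HTTPS, so treat this as a sketch of the idea and not as a real browser.

```python
import socket


def build_request(url: str, version: str = "HTTP/1.0") -> bytes:
    """Build the request line: GET, space, URL, space, version, then a blank line."""
    # The trailing blank line (second \r\n) tells the server the request is complete.
    return f"GET {url} {version}\r\n\r\n".encode("ascii")


def fetch(host: str, url: str, port: int = 80, timeout: float = 10.0) -> str:
    """Open a TCP connection, send one GET request, and return whatever comes back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(url))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection, so the response is done
                break
            chunks.append(data)
    # The result holds the response headers, a blank line, and then the document.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling something like `print(fetch("www.dr-chuck.com", "http://www.dr-chuck.com/page1.htm"))` (hypothetical host and page, taken from the demonstration) would print the headers and the HTML if the server still answers plain HTTP on port 80.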
okay so now i am connected to the web server on dr-chuck.com i could for example type yo what is up dude now that would suggest i have not read the protocol specification for the http protocol so i do not know that i'm supposed to type the word get here i'm going to type yo what is up dude and it goes you are not from my country you are not from my planet i do not know what you said of course the way web servers say this is they're saying bad request your browser sent a request that this server could not understand the request header is missing a colon separator you do not comply go away but we did talk it just told us that whatever we were saying is gibberish to it and that's okay because we have read the document and so we can do it so let's do it so now i'm going to do a telnet to dr-chuck.com on port 80. go to dr chuck hook onto port 80 so we can talk to the web server and now i am going to behave i've read the specification i'm going to type g-e-t get space http colon slash slash www.dr-chuck.com slash page dot htm space http slash one dot i think there's a slash there oops oh no no no get it right did i write slash one dot zero i think i got that right so that is what i'm supposed to type if i am a real browser now i have to actually type another new line so i'll type another new line and now whoa oh i gave it the wrong page look look look look look i told it to get page.htm so let's do it again do it again i'm not very good at this my browser is much better at it so i type get http colon slash slash www.dr-chuck.com slash page1.htm space http slash 1.0 i believe that's it let's hope i got it right this time and then another enter and another enter yay i got it so this time i type you know i type this then i type this now it gave me some header information it's saying things like 200 okay means i like you what the time is when this file was last modified it's an html file and it actually shows me the html text and so we have
just hacked a web server you can do this to like facebook.com and do a get if you get the right thing it'll give you stuff you could go to various sites you can fake browsers now you're only going to get so far because at some point these servers know that you're not a browser because the browser actually sends a little more stuff than what we just sent but you get the picture okay it's a protocol where we have rules and if we type the right thing if we know what the specification tells us to type we are rewarded with the information on that server if we comply with the server's request we can talk to it if we know how to talk to it we know a port to talk to and we know what protocol to talk to that port we can write a client that meets the needs of that server and extract the data so i like command lines i like sort of old school user interfaces and this is a scene from one of the matrix movies that you may recall i can't put it in here because of copyright requirements i might have the url or maybe students will come up with the url you can probably search for the trinity hacking scene people put it up and then it gets taken down and someone else puts it up so let me tell you a little story about this scene this scene was actually written by a former student of mine who worked on all three matrix movies doing i t work on the movies and they were in australia shooting the second set of movies and he saw this scene and the scene was supposed to use a minority report style way for her to hack into the power grid and shut it down he said that's not the way that you hack into computers all good hackers use a text-based interface some old thing and that's a throwback interface so the wachowski brothers told him rewrite it so he actually downloaded the exact hacking software it's called nmap which stands for network mapper it is mapping the ports it probes the ports and sees what's on the ports sees what
version of the software is on each of the ports figures out what the weaknesses of those things are and even automatically exploits the weaknesses so he wrote a script that downloaded this thing that automatically breaks into computers by the way the good guys use this and the bad guys use this same thing the good guys use it to test all the computers on their campus and make sure that they're okay and the bad guys use it to break into the ones that the good guys aren't smart enough to fix and so this was kind of cool and the people who wrote nmap were really quite amazed when they saw the movie they even figured out what version it was and things like that and so it actually triggered a series of follow-on scenes that used actual real hacking software nmap hacking software so i don't know if we've got a url for this or not whatever url i come up with even the one the nmap people come up with whatever it is it goes away because of copyright but you can find it because someone always puts it back up so okay so you can go and maybe the nmap people will know it or maybe you just have to search for the trinity hacking scene so you can use telnet and you can hack legitimately hack a web server so basically the application layer is a rich place and there's lots of things that happen in the application layer we have this pipe abstraction it's kind of like a string with two tin cans and we can shout into one end and stuff comes out the other end later we'll talk about adding security to that and we use these port numbers to allow multiple different servers or services on a computer that we can connect to so we've kind of now reached the top of the architecture of the internet we've talked about ethernet the link layer and the fiber and all the diverse wireless all the wonderful things that happen there the internetwork layer which is this kind of geographical hopping thing that's unreliable the tcp layer that retransmits automatically if
necessary and then the application layer which is what we do with all these networks it's really impressive that this came out of research work from the 1970s and that most of the ideas the architecture from the 70s are really still present with us today as the nsfnet came out these architectures evolved a little bit the last few tweaks and there have been tweaks going on ever since but really the last few critical tweaks happened in the late 1980s the number of hosts on the internet goes from six in 1969 to sort of a billion in 2011 and it's pretty impressive that roughly the same architecture that was designed with six computers is still mostly functioning in mostly the same patterns all these years later with billions of computers and so if you start looking at the network now it almost looks like a live creature with veins and nerves and it sort of almost pulses and you see things like you can search on the internet for how the internet reacted to hurricane sandy when like new york city was just like shut off for three days the internet just kind of like routed around it yeah a lot of stuff went across the atlantic ocean from new york city and yeah it found different paths and you know yeah new york was down but the rest of it wasn't in many ways pretty amazing and we certainly see other situations where governments try to shut the internet down by going into the server rooms and shutting off you know what they think is the only connections but then more connections pop up and so it's very much almost a living thing it's billions of computers and hundreds of thousands of routers and it's hundreds of millions of simultaneous connections going on all the time including the one we're doing right now probably and it's trillions of bytes of data moving across and it's not perfect but it kind of works you can think of it as like
the largest collective engineering thing that we've done together as humanity we built this it's all one thing and yet it's so many different pieces and we just kind of keep gluing them together and gluing more things on more things and it really was created almost like life itself it's very organic it's designed to heal itself rather than be perfect because things that try to be perfect are fragile and they break too easily but things that are designed to heal can heal and you're never perfectly correct you're never all the way up and you're never all the way down and so that's what's interesting about the internet oops went the wrong way and so we kind of come to the end of this where we started out with like machines and pulleys and gears and spraying oil at bletchley park and typing a few characters on a keyboard to this thing where we sort of just take for granted that we can watch basketball anywhere on the planet just with a couple of swipes of the key and in a sense it's just that same human urge to communicate at a distance and so the last thing that i'll close with is another video from van jacobson now this is about a technique he calls content-centric networking and not everybody agrees with van jacobson on this you can ask people and they'll say it's a good idea that i don't present this as the future i present this more to get you thinking that there is a future that might be different than the present that what i just told you that seems like oh it's perfect and vint cerf in 1969 he figured this out and yeah it was the perfect thing right and the answer is it might not be and we as engineers and we as the people of the internet we must understand that in 20 years it might be all different because 20 years ago it was all different and 30 years before that it was different yet again and so nothing's perfect and so we should never be complacent and assume that what we have is trivial or perfect or even
the right answer so we should continue to question and so it's really interesting to hear again from van jacobson who sort of did this great innovation in 1987 saved us and yet he thinks that these are issues and they're issues that need to be solved again and i think in many ways he's right the question is not so much whether he's right or wrong the question is what will the ultimate solution that we pick be so take a look at van jacobson and what he is thinking about in terms of the issues in the future that we're going to need to solve when he talks about content-centric networking [Music] hello and welcome to our security lecture i'm charles severance and i encourage you to grab the power points and remix them and use them in your own courses use the videos use everything in your own courses you don't have to teach this whole course it's quite a bit so let's start out by meeting some nice people these people are alice and bob and they simply want to communicate some information you know left to their own devices and they don't want the rest of the world to know about what they're communicating and so they want to encrypt it some way now in the cryptography world they all call these two people the good people alice and bob and they simply are a and b kind of very computer-esque variable names a and b and alice and bob and then there's some bad guys right there's the bad guys and often they are called c like carol or carlos or charlie or something like that even chuck even chuck is a bad guy or dave or eve eve for the eavesdropper and the idea is somewhere in the middle here you have some person with bad intent here we have trinity from the matrix and she looks as bad as you can look right so alice and bob seem to be real happy people they're just trying to get you know get things done and send some information without being bothered and
then trinity's there eavesdropping but there's other things besides eavesdropping you can sort of redirect the message and change it there's all kinds of things that the bad guy in the middle is going to do and so alice and bob are far away from each other and they have to communicate securely and it's a literal gauntlet of problems that they have to face and various people with bad intent with bad purposes sometimes just wanting to see what's going on versus wanting to change what's going on versus completely redirect communication so alice bob and then a whole series of bad guys and gals with various evil intent [Music] you know i think computer security is the most exciting part of computing right now because it has something that nothing else has it has an adversary relationship when you do graphics or operating systems or anything there's no one trying to thwart you at every turn unless you're in security that's what makes it exciting and interesting and that's what makes it something that's forever changing and involves psychology and economics and computing and law and policy and so many things so i think it's a great area to be in to work in i think it's not going away right as long as we have adversaries as long as we have human beings and ne'er-do-wells and evildoers we're gonna need security so it's always going to be like that you know preparing is interesting in a lot of ways security is a mindset it's a way of thinking about the world and if you think about the original definition of a hacker someone who sort of cobbles stuff together you hacked this tool and it works and you put this piece together and this here and that and it all works and it's a great hack but i'm a security guy i'm going to say well turn this like that doesn't work anymore and you'll say well don't do that and i'll say no no i'm the attacker i get to do that i get to do that whenever i want i get to do that at the most
inopportune time i get to do that in a way that makes your system fail as badly as possible and you have to think that way not about how to build something how to make it work but how to make it fail and how to make it fail in precisely the right way to do precisely the right sort of damage and that's a way of thinking i mean there are some people who go through their lives looking at systems and figuring out oh i can break that oh here's how to break that you walk into a store and you see the purchasing system oh i can steal something here's how you walk into a voting booth oh i can sort of defeat this here's how you might not do it because of course that would be illegal but you think that way and that mindset i think is essential for security once you have that mindset then it's a matter of just learning the domain right learning the systems and whether it's a self-driving car or a voting system or a medical device it's going to be embedded code interacting with the real world in a way that involves people and society and i can teach all that you can learn all that so i remember a class in security i forget who did this one of the assignments was come in tomorrow and write down the first thousand digits of pi okay so two things about this test one you can't memorize a thousand digits of pi you have to cheat and actually the students were expected to cheat but if they were caught cheating they would fail okay that's interesting right that teaches that mindset it allows you to think outside the box how am i going to do this and there are lots of ways people cheated and i sort of urge people watching to go google this and to look at some of the stuff written it's a great way of trying to stimulate the mindset can you teach it formally i don't know it's kind of like it's a way of thinking and i think the more security classes you take the more you exercise that mindset a lot of the hacker conferences will have capture the flag contests i remember an
early one where they had to build their own private network to cut down on both network latency and federal violations right that's why you do it but you're going to learn a lot by breaking other people's systems and yeah that's probably going to involve illegal activity and agreed you know this isn't the best way or maybe it is the best way it's not the most socially acceptable way you know but here we have this clash between the tech imperative and what society wants so many of our systems are black boxes i mean you can go and try to hack this right your smartphone or your computer and there's a lot of stuff you can learn but really it's gonna be more fun if you can hack somebody else's cell phone or somebody else's computer i want it to be open-ended i want it to be you know follow whatever it is you're interested in the neat thing about security is it can go wherever you want there are so many different sub-disciplines i'm often asked should i study forensics or cryptography or network security or protocols or embedded devices or scada systems study what you want and whatever interests you follow that because really what you're learning is how to think like a security expert and honestly if you get a job and they make you do vpns you can pick up vpns that's easy it's the way to think so do what you want and you know what we're learning right now is that demand is greatly outstripping supply right that people who have expertise in security have a guaranteed career because there is such a demand for it and there's such a lack of supply have you written any of your books kind of aimed at those kind of you know pre-computer science students or early computer science students that would sort of be a good read i tend to write my books for a general audience so i think of my parents my friends so computer experts yes but really for a more general audience so going back to something like secrets and lies i wrote in 2000 it's about how network security
works right 15 years out of date but it's still a good introduction on the basic concepts of how to think about security you know later cryptography engineering how to engineer cryptosystems my book liars and outliers how to think about security as a way to enable trust very non-technical but very much here's how security is embedded in society my latest book is about surveillance data and goliath talks about what's going on in the world of surveillance and how we can regain security so to me all of these books are for someone who might be interested in this field because what they're going to do is spark interest in different directions they're going to give people ideas they're going to go and research further and that's how you get your passion that's how you get your calling you know it's not that someone gives it to you it's that you notice it going by and say hey that's kind of neat i want to do more there [Music] so it's always fun to talk about security because you know the question is what does security mean to you and what does it mean to me and whenever you're thinking about security you got to figure out sort of who is out to get you and why are they out to get you and for us average people we are so boring that literally pretty much the only thing the world wants from us is they want to steal our credit card numbers and then buy stuff with them for just a couple of days before the credit card company shuts them down now if you're running for public office there's a whole bunch of your opponents that are going to want to break into your email and that happens right those kind of shenanigans happen so if you're interesting or influential or running for public office you have a different problem than if you're just kind of a mere civilian like us and so at the end of the day you can worry about security and you can kind of wear tin foil clothing or something but at the end of
the day you and i are so boring that nobody really cares but it's good to assume that they're out to get you right that you try to be as secure as you can and be aware of your environment and be as secure as you possibly can with your information and with your communications so who is out to get us well the government so now we're going to actually turn the tables right in the very first lecture we were the good guys the good guys were the ones who were trying to break the cryptography and the bad guys are the ones that are trying to hide their bad messages and the government were the good guys in bletchley park and they broke the codes and they saved western civilization yay good guys well now we're the germans we're the japanese we're trying to communicate people are watching and guess what it's the same people the government the government's most likely interested in what you're doing again most of us are so boring the government doesn't really care at all but the government has probably the best security breaking equipment on the planet right now better than the criminals and so it's just an interesting flip of the model right where now we are trying to communicate and the government with all of their bombes and little rotating devices sitting in some warehouse somewhere that we don't know is taking our private communications and they're trying to break them and we are in a war trying to make our communications more and more secure while the government says well we have to invent like electronic computers because we have to have bigger computers because us good guys now who are doing the encrypting we are trying to come up with more and more clever ways to encrypt and the government has to keep coming up with ways to decrypt and you could cast this as criminals versus good folks you could cast this as governments versus governments at the end of the day it's an arms race and it's rather symmetric as to which
side you're on right i mean we started out with the people trying to hide the information were the bad guys and now we're the good guys why are we the good guys because it's us as soon as we decide it's us we're the good guys here's a little picture i took a couple of years ago when i visited bletchley park this actually led to the video a year and a half later and in this my friend who works at the open university who also does massively open online courses that's joel he's now retired he's the tour guide there and a dear and personal friend and another of my dear and personal friends whose face you can't see is right here his name is chuck he currently works at shazam in london and he and i just got away for the weekend and went up to bletchley park and took a tour just a little interesting connection i sort of mentioned in a previous lecture about the matrix so you notice that chuck here is wearing a jacket from the movie matrix i mentioned he was the one that wrote the hacking scene with trinity so here we all are together we're at bletchley park this is one of alan turing's offices at bletchley park here's joel open teaching and learning here is chuck worked on the matrix and here's trinity right except that now trinity's our enemy instead of our friend although in the movie we're rooting for trinity but we're the ones trying to protect what we're doing from the trinities of the world we're the good guys now and trinity is the bad gal okay she's the bad lady trying to break into our stuff and we're protecting ourselves against trinity so tables have completely turned okay so before we go too far you know you'll run into lots of people in your organizations who are like security experts and i find security experts generally very annoying because they like think they're smarter than they are security experts have this theory that more security is better and the reality is
that perfect security is unachievable if you want perfect security like lock yourself in a room and whisper to somebody oh wait a sec then the guard will come in and hear you or whatever there'll be a microphone so the notion of perfect security and that your company can spend more and more and more money of its budget on security just because the security expert thinks that you should is really kind of a fallacy security is a cost benefit analysis and so if you think of credit card companies you know what's the security of this number i won't show you the number but what if i lose this number what's the consequences i've had many credit cards that got compromised i don't exactly know how they've got software inside of computer companies that catches like when all of a sudden i live in michigan and a bunch of charges at the new jersey walmart are showing up and i never got on a plane i'm amazed at how rare it is that your identity actually gets lost permanently so i'm really impressed with how capable the banks are and so they have this number that makes it real convenient for us to buy we can type it into a web form we can do all these things and one in a zillion times it gets compromised and so they lose 50 bucks but billions of dollars of commerce happen so there is always a security versus cost tradeoff if basically for me to use this credit card i had to have sort of a personal guard from the company with me all the time it'd be very expensive for me to buy stuff on amazon and have it shipped to my house and so understand that when you start getting into conversations with security professionals sometimes they're the kind that understand that security is naturally imperfect and everything is a cost-benefit analysis rather than more security is always better so more is not always better that's just my little speech about security in case you get stuck in a conversation with some pompous security expert they're mostly pompous so database administrators
are pompous too but hope you're not a database administrator security expert but the best security experts that i've ever met are the ones that understand this and they can characterize every issue in terms of a cost benefit analysis and then you know that you've run into one that's really smart and bright so the terminology that we're going to use for the rest of the segments of this lecture we're going to be solving two basic problems one is confidentiality and confidentiality is about the leakage of information so if i was to want to send this credit card number to one of you watching right now without the rest of you seeing my credit card number the mere revealing of that information is the problem so i want to come up with a way to send it to you so that no one except the one person that i want to see it gets it that's confidentiality the other problem is one of integrity and that is in a sense knowing that a the information that you've got comes from who you thought it came from and b that it wasn't modified on the way so there are things like digital signatures that sort of fall into this category and other mechanisms that we'll talk about so these two themes of confidentiality and integrity will run through the rest of the lectures that we have on security okay so coming up next we're going to start diving right in [Music] so the first topic that we're going to talk about is confidentiality encryption and decryption and of course this was what was going on at bletchley park in world war ii so the terminology that we'll use in this is plain text and ciphertext and the idea is whether it's text or other information there is the information that we actually want to transmit whether it's a credit card number or something else and we'll call that the plain text and then there is the encrypted version of that and we'll call that the ciphertext and the ciphertext is what we assume is revealed to
intermediate parties whether they are stopping it and changing it or they're just watching it the ciphertext is the stuff that just by the nature of the communication we are forced to reveal or there is a probability that we'll reveal it so it is hopefully unintelligible and hopefully it is difficult or impossible to go from the ciphertext to the plain text except if you are the actual intended recipient encryption is the act of going from plain text to ciphertext and returning the ciphertext back to the plain text is decryption and there is a key some kind of a key which is really sort of some data plus a technique plus an algorithm that goes back and forth so there are two kinds of systems that we'll talk about in the upcoming lectures one is called a secret key and the other is called a public key the secret key is the one we talked about at the very beginning the secret key was really used from the romans and caesar on up to world war ii the public key encryption really is much more recent in the 60s and the 70s and we'll talk about that later on so the first thing we'll talk about is the shared secret or secret key the secret key is also called symmetric key which means that both parties have to be in possession of the same information you basically use the same key material to encrypt as you do to decrypt the public key is asymmetric which means you use one key to encrypt and a different key to decrypt we'll get to that later and so the problem that secret key has that led to the need to invent public key is the fact that you need to at some point have a secure communication whether you're sitting in a room together and you hand each other code books whatever it is you have to have a way to distribute the key in a secure manner the public key which we'll get to later has a way of distributing the key using an insecure medium and you'll see when we get there it's like so obvious and clever you
wonder why nobody thought of it until you know very recently so here is the path right you have some plain text you have say the word candy that you want to send you're going to encrypt with a shift where you just go to the next letter so c becomes d a becomes b n becomes o and so now we have d b o e z that is the ciphertext coming from alice alice sends it in the dangerous nasty wide world of you know routers or radio with morse code or whatever it is we're going to do where the message might be intercepted by somebody now they're not intercepting the plain text we assume that this part here is secure and this part here is secure it's only dangerous while it's in flight somehow in the middle and we only worry about eve getting it and then at some point because bob has the key which is subtract one bob goes from each of the ciphertext letters back to the plain text letters and voila out comes the plain text again and so eve's problem is eve is only handed well no sorry eve is not given the key she is given the ciphertext and nothing else and she must like bletchley park derive whatever it is derive the key derive the plain text whatever it is that's eve's goal the caesar cipher is kind of one of the oldest most widely used forms of encryption it uses the notion of a shift the shift number is just as i've shown a shift of one means a becomes b and x becomes y and l becomes m so you just take and move a fixed position down this was used for a surprisingly long period of time and there's some pretty good youtube videos if you want to see more about sort of how this works and the math behind it and how you break it it's pretty fascinating i mean finally it's completely breakable and we're actually going to break it here pretty soon ourselves so the caesar cipher so i want to pause and let you see a youtube
video here from a beloved movie called a christmas story where little ralphie gets his little orphan annie secret decoder ring and little orphan annie sends an encoded ciphertext through the radio everyone can hear it but only those people who have the secret decoder ring can decrypt the message and you can see that it is a caesar cipher it has a shift you'll note that the first thing they say before they say the encrypted message on the radio is that you're supposed to connect b and 13 or something like that and then you rotate the two wheels of the secret decoder ring to the b13 and then you can read across the secret decoder ring and decrypt the message and then he slowly decrypts it and then of course there is the delightful moment where he realizes the crass commercialism or the complete lack of interesting meaning in all of this so without further ado let's take a look at ralphie and a christmas story be right back so i hope you like that and so off we go we're going to have a secret decoder ring for this class i would love to be able to send you all a little mechanical wheel to move the stuff back and forth but instead i use the internet and i'm going to send you a pdf and at this point you might want to pause the video and grab this pdf okay grab it secretdecoder.pdf doctortruck.com secret decoder.pdf and download it and you might even want to print it out because we're going to at this moment do a code breaking exercise okay and so let me tell you how to use this secret decoder ring so the top line here is the plain text and if you recall a caesar shift has a shift number and so to encode you go from the plain text let's say i want to encode chuck and i want to encode it with a shift of 2.
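The shift idea behind the decoder ring can be written down as a minimal Python sketch. The function names (`caesar_shift`, `caesar_encrypt`, `caesar_decrypt`) are made up for this illustration and are not from the lecture; the sketch assumes only the 26 English letters are shifted and everything else passes through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 letters a through z


def caesar_shift(text, shift):
    """Move each letter `shift` positions down the alphabet, wrapping after z."""
    result = []
    for ch in text:
        if ch.lower() in ALPHABET:
            base = ord("a") if ch.islower() else ord("A")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # spaces and punctuation are left alone
    return "".join(result)


def caesar_encrypt(plaintext, shift):
    # Encoding: plain text row down to the shift row.
    return caesar_shift(plaintext, shift)


def caesar_decrypt(ciphertext, shift):
    # Decoding: shift row back up to the plain text row.
    return caesar_shift(ciphertext, -shift)


print(caesar_encrypt("chuck", 2))  # full five-letter word, shift of 2
print(caesar_decrypt(caesar_encrypt("chuck", 2), 2))
```

Both parties need the same shift number, which is what makes this a secret key (symmetric) scheme.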
so shift of 2 means we select this row and we basically go c is our plain text and then down we go to e so e is our first letter and then h is our second letter so we go down and that means h becomes j yeah and so u becomes w so now i'm doing my encryption so e h w is the encrypted ciphertext so let me clear that and write it back down here in a different color e h w now to decrypt remember you need to know the shift so we somehow communicated separately and securely what the shift was and so now we have received our ciphertext and we need to decrypt it okay and so we know that the shift is two so we go to e in the shift row and then we go up to the plain text row and that says the first one is c then we go to h in the encoded text row and we go back up to the plain text row and so that's an h oh wait wait wait it's not h w what am i thinking this should have been a j i got this wrong sorry about that so that's wrong here's a j it moves up to the h then the w let's see if i got w right yeah so w is my last ciphertext letter and it goes up to the u okay dot dot dot so you see the pattern our encoding is plain text down to shift position and our decoding is shift position back up to plain text okay so this is our secret decoder ring so go grab it and download it so that you can participate in the next exercise so now you are going to be cast in the role of bletchley park okay ready so here is your first code breaking exercise okay so here we go so you're bletchley park right you just intercepted this ciphertext u p b t u whoa it's encrypted it's clearly meaningless so how are you going to decrypt it well the technique is take a look and decrypt it with all the shift numbers right you're going to do this by hand you're going to be a computer yourself you're going to do all the shift decrypting and just like in bletchley park you know you succeeded when the plaintext makes sense right when the plaintext makes no
sense then you haven't succeeded but at some point the plain text makes sense so what you need to do is take your secret decoder ring and you need to decrypt it with a shift of one a shift of two a shift of three a shift of four and if you have your family members around you can print out multiple copies of the secret decoder ring and you can assign different shifts to different family members so you have to decrypt this 26 times and then you look at all the 26 decryptions and then you decide which one makes the most sense okay so don't peek decrypt this one i made it easy on you okay so we'll stop now and give you a little bit of time to decrypt this one pause and don't start again until you have actually decrypted it okay okay this is your last chance for spoiler alert so here we are we're about to decrypt it here we go i did make it easy on you it was a shift of one if you started at 12 you're kind of foolish right so you started at one and you go like oh great so now i'm gonna decrypt it i'll start at one here's the plain text row this should be p that's the plain text so i'll start with u if it's one then i go up and it's t and the second one is p so i go up and it's o t o keep going it says toast so you say to yourself well that's a word so it must be it well hello what are you doing here you want to say hi to my students this is a cat this is my eddie cat he likes to come up into my office and look so do you know anything about encryption so you use a shift of one and then you go from the ciphertext up to the plain text wow okay you are clearly not interested in my lecture so that was my cat hello sorry can't open the window for you because i'm doing a lecture okay so you're going to just keep bugging that window until i kick you out of my room aren't you so you're going to get kicked out out you go he was gonna keep hitting that until i
opened it for him okay so now you've broken this code and again just like in bletchley park you only knew that you broke it if it made sense and so luckily in bletchley park the messages were longer and they were often looking for canonical things that they would say every day so here we go and so that's the breaking of that one and it turns out that shift of one was the thing that we did so here is your second task this one's longer and it's not a shift of one and so this is a situation where you really would have to get your whole family going on this right where you got to do 26 decryptions of this until it makes sense to you right and so this one's going to be harder i guess you could just decrypt one word but it's just not a shift of one but now we're going to do another trick okay so i don't want you to try all 26 because there's a mistake in this there is a leakage of information that makes it so that you can figure out what the right decryption to try might be so this is english this is an english sentence so stare at it for a while and find a more optimal way to decrypt it than trying all 26 shift patterns okay so there's a way to optimize this a way to cleverly figure out what might be the best shift or how not to have to decrypt the entire message 26 times to reduce the complexity and that's because we've leaked some information here that should be pretty obvious to you okay so let me give you a moment to break this one it shouldn't take you too long and you shouldn't have to force your whole family to decrypt this stuff okay so here we go give you a chance to decrypt it okay here's your last chance before the reveal you ready here we go so here is the decrypted text the shift turns out to be 13. it's a shift of 13.
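Here is a small Python sketch of both code breaking approaches: trying every shift, and using a leaked one-letter word to jump straight to the shift. The helper names are made up for this illustration. The first ciphertext is the one that decrypts to toast in the exercise above; the second ciphertext is an assumed stand-in, since the actual text on the slide is not in the captions.

```python
def shift_back(ciphertext, shift):
    """Undo a Caesar shift, leaving spaces and punctuation untouched."""
    out = []
    for ch in ciphertext:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base - shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def all_decryptions(ciphertext):
    """Brute force: one candidate plain text for each of the 26 shifts."""
    return {shift: shift_back(ciphertext, shift) for shift in range(26)}


def guess_shift_from_capital_word(ciphertext):
    """Shortcut: assume a capitalized one-letter word is the English word I."""
    for word in ciphertext.split():
        if len(word) == 1 and word.isascii() and word.isupper():
            return (ord(word) - ord("I")) % 26
    return None  # no leaked one-letter word to exploit


# First exercise: a human (or a word list) picks the candidate that makes sense.
for shift, candidate in all_decryptions("UPBTU").items():
    print(shift, candidate)

# Second exercise, with an assumed ciphertext: the lone capital letter gives it away.
secret = "V arrq n wrg"
shift = guess_shift_from_capital_word(secret)
print(shift, shift_back(secret, shift))
```

The brute force loop is the whole family decrypting 26 times; the shortcut only looks at one letter, which is why leaving the spaces unencrypted leaks so much.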
and the weakness of this whole thing is this right here in the english language what is a single character word we're not encrypting the spaces you'll notice because the spaces come across so what is the one single character word that we have in the english language that's capitalized well that's usually i i need a jet money and a jet what is the one single character word in the english language that is lowercase typically unless at the beginning of the sentence and that is the letter a so basically you didn't have to decrypt the whole message you see some weird pattern and you go like this i just have to figure out and then you go look in the plain text row and you go look at the i because you guessed the plain text and then you just look down until you see the v and within seconds literally within seconds if you do it right you know it's a shift of 13 and then it's a trivial matter to convert it so you could figure out the shift code within seconds and this was how bletchley park figured it out and this is why the known plain text was so important because you'd only have to figure out like one letter if you knew what the plain text was and often they would know by length and certain other things oh i think we can guess what this plain text was that this particular operator would send and this has to do with the leakage of information it's not the mathematical perfection or lack of perfection in the secret key it's some other leak some other thing where they're going like oh wait a sec i can take advantage of something it was just equally encrypted as any other message but because i gave you this clue of an uppercase single character word and a lowercase single character word i greatly reduced the amount of effort that you had to put in okay now
what's cool about this you go to this website www.com is that long ago before facebook and before twitter and before all these things we had these things called news groups and they were kind of this weird kind of collective email list that we had and this was like in the 80s and it was even used in store and forward networks where it was kind of like facebook on store and forward networks meaning that it might take four hours for you to see the status update but we kind of would subscribe to these collective things and there was one that was basically the dirty jokes and the thing about dirty jokes was part of what we were trying to do in this thing was you weren't supposed to swear and there was software that would filter out swear words if you put a swear word into a dirty joke it would not forward the message and so we had to have a way to encrypt messages that included swear words so that we could tell dirty jokes to each other for those who wanted to subscribe to the dirty joke list and so they came up with this rot13 so we came up with a simple caesar cipher with a shift of 13. 13 beautifully of course is 26 divided by 2.
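Because 13 is half of 26, shifting by 13 twice lands back on the original letter. A minimal Python sketch of that property follows; the function name `rot13` is just a label for this illustration.

```python
def rot13(text):
    """Caesar cipher with a shift of 13; other characters are unchanged."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


message = "hello world"
scrambled = rot13(message)
print(scrambled)
print(rot13(scrambled))  # the same function undoes itself
```

There is no separate decrypt function here: applying the same calculation a second time recovers the plain text.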
so it's a symmetric shift shifting in by 13 is the same as shifting out so of all the caesar ciphers with a shift of 13 the encryption and the decryption are exactly the same calculation and so we would type our dirty joke in and convert it to rot13 and we would send it in rot13 and then we would decrypt it if we wanted to but what became funny after a while was we were so used to reading rot13 that it almost became like a second language right we could start reading the dirty jokes in rot13 and we would laugh before we translated them so rot13 has an interesting sort of historical thing and you go to rot13 and sort of like encrypt whatever you want to say and i'll probably have some questions to ask you where you will have to do some rot13 encryption and so that's the end of this lecture where we talk about caesar ciphers and the various techniques and how caesar ciphers work and so we'll be back and talk about cryptographic hashes so now we just finished talking about basic confidentiality using a simple caesar cipher and we'll be more sophisticated than that because caesar ciphers are trivially breakable obviously but now we're going to switch from confidentiality to integrity and we're worried about the message just to review confidentiality means we're hiding information we just don't want eve to see it because eve sees the ciphertext and we want her to never be able to extract the plain text and as we saw in the last lecture if i'm just using a caesar cipher i mean there was little or nothing i could do to stop you from doing it all you'd do is enough work and you would figure out the shift and then you'd have everything you'd write a program it would take you like a thousandth of a second to check all possible things and you're done so that's confidentiality now we're going to talk about
integrity right and we're going to kind of assume confidentiality or perhaps assume it's not necessary because we're transporting it in a locked box or whatever but let's just say i had some piece of paper right and you wanted to know if this piece of paper really came from me well we would use things like a signature right or we would use a bit of wax where we would push our seal on it and the seal only belonged to us but really signatures can be forged you know people can steal the wax imprint thing i mean in roman times they would wear it around their neck to make sure that no one stole it but you could also just create a fake one so you could seal your letter with the wax and you break the wax seal but in the computer world we need something right we need a situation where perhaps i mean my wife had to get a prescription and they sent her an email with a prescription in it and this prescription had on the end of it a digital signature it was just a bunch of numbers and you think to yourself wow you know here's this prescription and at the end it just has a bunch of numbers and that's the digital signature from the doctor and how does that work well it works surprisingly well and that message can be forwarded she can print it out she can scan it she can send it to a pharmacist via email and as long as that signature's in there we can know that the data originally came from the doctor and that nobody went in and modified it to be a different kind of prescription like a different amount or a different drug because that would invalidate the signature and how exactly is that done well it's done using a technique called cryptographic hashing and it is a bit of computer software a bit of code that takes a large amount of text and reduces it down to some small set of numbers a large block of data to a fixed length set of numbers and the message is the big thing and the digest
is the little thing it's like it digests it and gives you this little tiny thing now the key is there are many different techniques to map from a message to the hash or the digest and some are better than others and it turns out that there is a whole long-term field of mathematics and computer science that's dedicated to understanding what a good cryptographic hash might be and so there are these well-known cryptographic hashes that you may have heard of like sha-1 or md5 each of those is the result of many many years of research of thinking through what a good cryptographic hash is so for example one cryptographic hash might be i sign it with the number of characters in the message that would be something that says like well at least they didn't expand or contract the number of characters in the message but then at some point that would be such an obvious thing that you would change the signature as well so you want to make it so you can't change the message and change the signature because then you're sort of properly forging a signature so the hash function that takes the message to create the digest that's something that is a scientific mathematical research effort to get the thing right okay so here is an example of a hash function now this hash function takes message input on one side right and this can be short medium or extremely long messages and the digest is always a fixed size right some get longer than others different hashing functions give different lengths but they're all fixed and they're fixed even if the input is megabytes it can be megabytes hundreds of megabytes and you can still run all that through the hash function and get a digest okay and so the key thing is to make it so that for any change in the input the digest also changes okay and so here we have the red fox jumps over the blue dog right and that's the hash that comes out and if we
change one letter you know the v to a u the hash changes dramatically so this suggests this is a good cryptographic hash function right the length didn't change all the other characters are the same but one character changed and the hash changes completely okay and here is another flipping of characters right where from here to here let's give you a better color here which color is that yeah from here to here the v and the e were just swapped and yet from here to here the hash is completely different and so the hash function needs to generate quite different hashes even with tiny perturbations of the input you're not even allowed to change one character or add and remove a character and they know when a cryptographic hash function is bad if they can take two different messages and send them through and get the same digest and so as soon as they come up with one of these things there is a massive amount of research to try to disprove it to say that's a bad one and the bad one is if two different messages come in and they come out with the same hash then that's bad because it means that provably the same signature can be used to sign two different input messages okay so this hash function is a bit of computer code right and you know sha-1 you can go look on wikipedia for sha-1 or md5 these are kind of classic hash functions and when you read the sha-1 or md5 wikipedia pages they'll talk about the fact that it's been decided it's kind of flawed and so there's like sha-1 don't use it it's not cool now you can use it for less critical things you just have to be aware of what its limitations are it's like sha-256 is better than sha-1 so what happens is there's continuous research and there's continuous improvement in the mathematics of these hashing functions and they're getting really good at it because we've been using these things for
a very very very long time now if you want to play whoa let me clear this up so i've got a simple sha-1 calculator and you'll be using this in the homework the sha-1 calculator takes as input some kind of a plain text and it produces output when you hit the thing so you can put in pony or fluffy or whatever or even a whole bunch of stuff and it produces a sha-1 okay so basically get ready to use this because the next upcoming exercises are going to use this and you can read about sha-1 on wikipedia and you can find out like yeah it's seen as less than perfect and you should use sha-256 or whatever but actually lots of applications in less than critical situations know that you know for short and reasonable length strings the flaws are mathematically found but they're not commonly run into so sha-1 is not horrible it's just been proven sort of less than ideal and so for highly sensitive information you would never use sha-1 but for simple things sha-1 and even things like md5 are commonly used like for hashing passwords [Music] one of the things we've learned from the snowden documents is that cryptography broadly applied gives the nsa trouble at least at scale so the nsa does a lot of cryptanalysis and they break a lot of systems but well-designed well-implemented cryptography does stymie them and it's important to understand how it does because if the nsa wanted to be in my computer they'd be in my computer done period no question about it they would hack into my computer they have a lot of tools to do that if they are not in my computer one of two things is true one it's against the law and the nsa is following the law or two i'm not high enough on their priorities list now what cryptography does is it forces the attacker whether the nsa or the chinese government or cyber criminals or whoever to have a priorities list and depending on their budget they'll go down the priorities list and the hope is you're not there
right you are below their budget line without cryptography an organization like the nsa can bulk collect data on everybody with cryptography they are forced to target and that's extraordinarily valuable because it means the fbi will go after the criminals the nsa will go after the agents of a foreign power the chinese government will go after the u.s government officials that rise to whatever level they want to spy on the cyber criminals will just go after a few of us and the rest of us are protected that makes cryptography a very important tool now cryptography doesn't actually provide any security because cryptography is mathematics when we say we trust the cryptography what we're saying is we trust the mathematics and i think there's a lot of reason to say that i trust the mathematics everything i know about cryptography tells me the mathematics is good certainly there will be cryptographic advances certainly some things will be broken in the future but by and large the math works but math has no agency math can't do anything it's equations on a piece of paper in order for math to do something someone has to take that math and write code and embed that code in a program and embed that program in some bigger system and put that bigger system on a computer with an operating system on a network with a user and all of those things add insecurity when the nsa breaks cryptography by and large they don't break the mathematics they break something else they break the implementation they break the software they break the network they break the hardware the software is running on they do something somewhere else and again and again we learn this lesson right the math works but putting stuff around it is much harder now there's an important corollary here that complexity is the worst enemy of security right what these things do is they add complexity the more complex you make your system the less secure it's going to be because the more vulnerabilities
you'll have the more mistakes you'll make somewhere in that system and we learn again and again when we see analyses of voting systems embedded systems your cell phone messaging systems email systems that it's always something around the crypto something that the designers the implementers the coders the users got wrong and the simpler we can make systems the more secure they are so what nist is doing is they're trying to build standards around as much as possible right so they have a standard for a crypto algorithm aes is the standard crypto algorithm it was a public process where multiple groups submitted algorithms and the community as a whole picked a winner it wasn't dictated from on high there weren't secret criteria the aes algorithm was the one that most of us thought should be aes actually there were several we thought were good candidates they picked one but there's a lot of trust in the process because it was a very public open international process right sha-3 the new secure hash standard the same sort of process now it's really fun as a cryptographer being involved in this process i mean i think of it as a great crypto demolition derby we all put our algorithms in the ring beat each other up the last one left standing wins it was kind of like that you know we would all publish papers analyzing each other and one of the ones left standing won but you know that's just a small part of what nist does they have standards for random number generation they have standards for key agreement for different protocols i mean trying to standardize these components so the implementers make fewer mistakes but there's still a lot you can't standardize and those bigger pieces are where you're going to still find most of your vulnerabilities i believe that's where the nsa finds most of its vulnerabilities that are out there recently we learned about vulnerabilities in the key agreement protocols that are used to secure a lot of the vpns and internet connections right and
if you look at where that vulnerability was it's because of a shortcut that was made that allowed for a massive precomputation the math worked great if you want to make a standard worse you make it super complex and you're just building in vectors at that point and this is why the normal ietf process for internet standards doesn't really work for security because those standards are compromises right let's put in all the options make everyone happy let's put in as much flexibility as necessary to make the system as comprehensive as possible that is sort of anti-security security needs as few options as possible right as simple as possible you don't want to compromise you want one group to win because that group has a self-contained vision when you have a piece of this and a piece of that and a piece of that there's going to be some interaction you didn't notice and that interaction will be the interaction that breaks your system you didn't win aes right you were in it you were in the demolition derby with your helmet on tell me a little bit about what it's like to be in the demolition derby toward the end and what it's like to sort of not win the demolition derby so aes was an interesting process it started out with 64 algorithms of which 56 met the submission criteria then nist whittled it down to i think it was 15 or 16 and then in the next round whittled it down to five and then chose one so it's a constant winnowing process and twofish which was my submission made it all the way into the top five and those top five were all good algorithms i mean there were no bad algorithms there and the arguments were more about security margin and implementability in hardware versus embedded systems versus constrained systems 8-bit 32-bit so we were making distinctions about how we thought it would be used and to me it came down to i think three algorithms that i thought were all good choices right twofish was one rijndael the eventual
winner was one and actually at this point i forget what the third is and what i said on my blog at the time is you know any of these three are good and sure it would have been great to have been the winner but you know something there's a lot of value in nist picking a non-us algorithm right by picking an algorithm from belgium it said to the world that nist is picking what they thought was the best and not trying to pick american so that was an important consideration i hadn't thought of at the time so i can't fault nist's process at all it would have been great to win it actually was really fun to participate and you know i would do it again and i participated in the sha-3 competition which again i mean someone else won my entry was called skein and you know these are lots of fun for cryptographers and also for students because they give students a whole bunch of targets one of the hard things if you're a crypto student is you have to break stuff the only way you learn how to make things is by breaking things it's back to that security mindset right anybody can create a security system that he can't break so if you show up with a security system and say i can't break this my first question is well who are you why should i trust your attestation that you can't break it as something that's meaningful what else have you broken and these competitions give a whole bunch of targets so students can start breaking things that haven't been broken before get papers out of it get publications get cred in the field as someone who can break stuff and therefore as someone who can design stuff it's a source of new problems it's a source of new targets but this is what i said to start security is inherently adversarial and that adversarial nature makes it different unlike any other field in computer science you go to a security conference a crypto conference and there are going to be papers of people who break each other's stuff and you
have to get a thick skin you have to understand that we are all learning now if i produce a protocol and you break it sure i'm unhappy but i've learned something and so have you and so has everyone else and that knowledge is more important than my particular creation surviving and you have to understand that and accept that and that has to excite you [Music] the first application of this that we're going to talk about is hashing passwords so you go to a new site even coursera.org and it asks you to create an account and create a password now you're not supposed to but lots of people use the same password for lots of systems and so if your coursera password were somehow mistakenly revealed they might get your linkedin password as well and so it is considered very bad form very very bad form to actually ever store your password in the coursera database in plain text because if the database was somehow compromised then all the bad guys would get all of the plain text passwords and again maybe use them not just on coursera because you could change your password on coursera but use them on linkedin and twitter and youtube and whatever and steal all your accounts by compromising one account because you made the mistake that lots of people do of using the same password in a lot of different places because we're tired of making up a new password for each place so you're not allowed to store the plain text password in the database that's bad practice and we don't want to do that so the best practice is to store a hashed version of it take the plain text of the password when you're creating your account run a cryptographic hash on it store the cryptographic hash and then when you log in next you present to the system your plain text password and it runs your presented password through the same cryptographic hash and then compares it with the hashed password in the database if they match you must have presented the same plain text both times this is a
way that they can verify that you've presented the same plain text again without them ever storing the plain text okay that's why a respectable system will never send you your password i've almost started doing this where as soon as i go to a new system and i set my password i set it to crap and i have them send me a message to reset my password and if they send me the actual plain text of the password it's like well i used some crap password i've actually got to the point where i'm tired of reusing my passwords and my technique i don't know if it's a good one or bad one is i just put crap in for my password and then every time i use the system i have it send me a new password to reset the password i mean i really think we should just change it so that when you log in it just comes to your email and you click a link i don't know i'm not an expert on this stuff you know but a respectable computing system will never ever ever send you your plaintext password because they don't possess it and they can't derive it these cryptographic hashes are not reversible you can't make them go backwards let's go back here they're only a one-way hash because this might be one megabyte of data and this might be well let's see four times this is 40 characters 40 characters of data one megabyte squeezes down to 40 characters there is no way to go backwards the information is lost the hash is you know distinct and unique right but you cannot go backwards it's a one-way operation you can go from the plain text to the hash but you can't go from the hash to the plain text which is very different than encryption and decryption right in decryption you had to be able to pull the plain text back out from the encrypted text this is not encryption this is calculating a special digest that is uniquely connected to the plain text message but you need to run the plain text through the hash again and then compare okay so let's do some homework well so now let me first
show you how this works with hashed passwords so let's say for example you're logging into coursera.org and creating your profile for the first time and it says please give me your password and you choose a singularly bad password fluffy okay so fluffy and you can go type fluffy into doctorchuck.com sha1.php in another window if you type fluffy and you hash it with sha-1 you get this as the hashed password and then this is what they store in coursera's database and so they never store fluffy they would not do that that would be so bad if they did that so they store this and if i get this it's very difficult to reverse engineer it to fluffy it also is even harder if you make your password long best passwords are like sentences not just eight characters but they're like long sentences of stuff that's rather difficult to predict so this is what's stored in the coursera databases some ugly string which is a cryptographic hash digest of your password so now you log out and this is gone that's only in your mind and this is sitting in the database so you log back into coursera and you forget your password so you type in pony if you run pony through sha-1 you get this as the cryptographic hash of the word pony and you look and you compare and you go nope that is not the right password i don't know what the right password is i can't give you a hint i can't tell you hey you put shift on your password why don't you try taking that off you know you seem to have caps lock on because it doesn't know what your password is but it does know that pony is not your password right so this is what's stored in the database then you go oh my secretary that's right i use fluffy for my password on coursera so then coursera runs that through sha-1 it gets the cryptographic hash of the plain text that you entered and then it compares it to what it has stored
as your hashed password and it matches so then it says yay i'll let you back in the fluffy only existed in your mind unless you foolishly like wrote it on a post-it note and stuck it up on your computer which you shouldn't do as well but whatever it was coursera never stored my coursera never stored it coursera only stored this and that again is why coursera can never tell you what your password is or any reasonable site can never tell you what your password is it can only tell you it can only let you change it again which is easy changing it again you just it decides oh it sends you mail you give it a new password and it recomputes a sha one for that one and stores that sha-1 so that's why you have to get password resets to happen okay okay so the next thing um that i want to talk about is i want to talk about digital signatures how we can use this for message integrity so we've got the notion of a cryptographic hash which is a calculation takes a large block of text so far we've only used it on small blocks of text okay but now we're going to use it on larger blocks of text where we're going to ensure message integrity which means we're going to figure out if this message actually came from the person we think that it came from okay so we're going to use integrity now i mean in a sense what the system was doing when you were typing in a password was it was ensuring that you were really the person on the other end of the line right hi i'm i'm logging in as dr chuck and here's my password fluffy that's by giving you the password i'm proving that to coursera that i'm really dr chuck so it's a form of integrity right author of identification is form of integrity right it's no different than showing your driver's license says yes this is really me okay but now we're going to do it in a way that we're going to send a message so it's not just a password we're not really solving just the password problem but we're actually going to use it to make sure that the message a 
came from the right person and b was not modified in transit this is kind of the doctor signing the prescription digitally and then sending you an email with your prescription that you can just print and take to your pharmacist so again message integrity when you get a message who did it come from did it come from who you really thought it came from or was it altered in transit okay so if you go back to our little example from the christmas story the message from annie was eat more ovaltine now the question really becomes did it really come from annie right because little orphan annie didn't necessarily say it little orphan annie handed the message to somebody else and then they read it so did that person change the message because maybe there was actually a secret message from annie and maybe annie wrote it and it really was an important secret message but then somebody like the advertiser changed it and sent it to you as if it came from annie so we're not really worried so much now about the plain text the plain text is fine the question is did it really come from little orphan annie because we are receiving this from an insecure medium annie's out here somewhere but annie handed it to somebody to hand it to somebody and to somebody to send across radio yada yada the question is did the message originally come from annie long long ago handed through many people or not this is again like the seal that you put on with the wax did it really come from that person or not is it really annie just saying yeah this is annie that's too easy you could say eat less ovaltine or maybe annie wanted to say i hate ovaltine right but we don't know if annie said eat more ovaltine or not right because all we saw was eat more ovaltine and it seemed to come from annie but we gotta know that it came from annie or not okay so simple message signing using a shared secret and we'll
move to a better technique later but we're going to start with a simple technique of shared secret is that we have a shared secret that we're going to use for message signing it'll probably be different than the crypt encryption secret okay so now we get together with annie in a shared room and she tells us what the shift is going to be and then she tells us what our shared signature secret is going to be and then we separate so the technique that you do is before we send the message we concatenate the secret to the message right so eat more ovaltine and then put the secret on the end of the message and then you compute the digest of the message plus the secret concatenated together then you remove the secret from the message and then you send the message plus the digest across the insecure and in my wife's example this was the little signature numbers that came from her doctor was the digest but it was the digest not just of the message but of the message plus the secret the secret didn't come across the message plus the digest came across so let's look at this when we look at how when we receive a message so we receive a message and we see a digest at the end of the message and it's across an insecure transport so we take the digest off the message take the digest off the message and we add the secret back on the message we know the secret annie knows the secret but the people in the middle who transported the mess do not know the secret so it's finally arrived in our location we see the digest we pull that off and hold on to it we add the secret to it we can take the concatenated message plus secret we run it through sha1 we get a digest locally and then we compare that digest to the receive digest and the only way to make the digest match is to know the secret now maybe somebody like made annie tell them the secret which means they could forge the messages but if the secret is not been compromised somehow their only way to create the digest is to know the 
secret right and so we can compare the received digest to the known digest that we compute on our end in a secure way because we and annie are the only ones in possession of the secret so here we go so you can play with this on sha1.php on dr chuck and so if the message that annie wants to send or annie wants us to get is eat more ovaltine and the secret is santa what you do is you take the message concatenate the secret and then run that through sha-1 this is all happening in annie's secure room and she comes up with a digest now it's longer than this but that's just the first six characters of it and then what she does is she removes the secret and then concatenates the digest okay and that's what gets sent across the insecure medium it could be many steps could be many people it could be on paper it could be morse code it could be a phone call who knows radio but this is the danger right we do not know if the message is harmed in any way as it moves across this medium okay so then what we do is we receive the message right we don't know if it's a good message or a bad message so we see that it's a message and it has a digest on the end of the message and we split the digest out and we hold on to the digest separately okay and now we have the message minus the digest and so then what we do is we add the secret back on because only annie and us know the secret right we all know how to do sha-1 and so we take this message plus the secret and we run it through sha-1 and we get a digest that we've computed locally this is the received digest that's the local digest and then we compare and we say this is great that must have come from annie and we know that it came from annie even if it came through a dangerous set of steps and we can't trust any of the people that transported the message they're all untrustworthy but we know that no matter what
happened that originally at that moment annie did this okay annie made this digest because without knowing the word santa there is nothing and this could be like megabytes of data and this is a real tiny you know 40 character thing here the digest is small the message is large so you can't go backwards to get it there's no backwards here now if you can steal the secret from annie then all bets are off of course so we we have to assume that annie's okay and that you know annie was not compromised like in james bond movies for example but i was trying to get the secret from the good guy okay so let's go and do this again right so here we go we want to send the eat more oval team and the secret santa and so we do the same thing and we end up with end up with that and then she concatenates it and this is all done you know in annie's bedroom and secret thing and then she sends it okay she sends it to us but a diabolical diabolical courier says i have a thing about ovaltine and i'm going to change them more to less so i'm going to change this message to be eat less ovaltine right so eat less ovaltine bad evil err i don't know how to draw evil i'm a terrible artist can't draw evil so some untrusted courier has changed the word more to less so we see the thing that says and and we don't know right we didn't see the courier they was carrying the box who knows what they did but they changed it we gave us new copy so we receive this message from a untrusted medium there's our untrusted medium and it is our job to decide if we really think it came from annie or not so what we do is just like we did before we break the message and the digest into pieces and then we add the known secret to the end of it right so we've added the known secret to the end of it right here then we run it through the sha-1 calculation and you can take there we go and go ahead and try this if you want you maybe you have this up in a separate window sha one dot php put less ovaltine santa in and you 
will get a different signature because you change even a single character and sha-1 will give us a different digest that's cryptographic hashes in action right even the tiniest change in megabytes of data will change the cryptographic hash that's the beauty of sha-1 and md5 and you know sha-256 and the others there is no match so we know that this message either did not come from annie or was modified in transit so we can tell the difference and we do not have to trust the medium right okay so let's see what we've got coming up next here okay so here is the signing technique and let's just stop and let you do one of these on your own and say that we've got two messages from annie and i want you to stop and i want you to calculate santa is the secret okay santa is the secret and i want you to tell me if free cookies or free candy actually came from annie or not one of them is a valid message from annie and the other is not a valid message from annie okay so i want you to take a moment use the sha-1 calculator and i want you to try to figure out which of these is valid and which of these is not valid okay give you a minute okay one last chance before we do the reveal okay here we go so here comes a message from an insecure medium free cookies with that as the message digest and we got the other one free candy with that as the message digest so what we do is we take off the digest and add the word santa to each one and then we run sha-1 on each one and then we get the two sha-1s here's the one and here's the two and then what we do most importantly is we compare them with the received sha-1 and when you compare them with the received sha-1 or the received message digest or the received message signature you see right away that one is good and one is bad it's as simple as that right one of these is good and one of these is bad so digital signatures are actually really surprisingly simple and surprisingly easy to do
without a lot of complex technology the only complex technology in here is really the clever mathematics that makes these cryptographic hashes work effectively it's the simple concatenation of a secret now you want your secrets to be kind of longer than this and more random than that but ultimately the notion of a digital signature is actually a simple and rather elegant and beautiful notion that really leverages this notion of cryptographic hashes in a really cool manner so that kind of sums up our first half lecture where we really talk more about the techniques of both integrity and confidentiality but we've done it all with a shared secret key right where we have the same key we have a moment where we're together in a secure manner and we exchange the code book whether it's annie or the caesar or whatever where we know what the shift is right so every pair of communicating people or systems needs a key now in the internet with everybody buying from everybody else and using credit cards it's just not practical you just could not have a secret key for amazon i guess it kind of works for the password we'll get to that in a second and the password doesn't solve everything because then you would have to actually visit amazon to get your password set up and so the problem is we have to use an insecure medium to establish the first secret as it were and so it just was never going to work so we need a different approach for the internet and that's what we're going to talk about in the next lecture okay see you then so welcome to our lecture on public key encryption where we're going to go back to confidentiality and so here we go if you recall we've been having these two topics that have been our theme throughout uh just some grizzle oh sorry i'm starting to talk in rot13. let's translate this back to non-rot13.
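As a recap of the two shared-secret hashing techniques from the first half of the lecture, here is a minimal Python sketch of the password check and the message signing. It is only an illustration under stated assumptions: the helper names (`sha1_hex`, `check_password`, `sign`, `verify`) are made up for this example, the secret is joined directly to the end of the message with no separator, and the capitalization of the message and secret is a guess, so the digests it produces may not match the ones shown on the lecture slides or in the SHA-1 calculator.

```python
import hashlib
import hmac


def sha1_hex(text: str) -> str:
    """Return the 40-character hexadecimal SHA-1 digest of text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# --- Hashed passwords -------------------------------------------------
def check_password(attempt: str, stored_hash: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    return hmac.compare_digest(sha1_hex(attempt), stored_hash)


# The site keeps only this digest, never the plain text "fluffy".
stored_hash = sha1_hex("fluffy")
print(check_password("pony", stored_hash))    # False: wrong password
print(check_password("fluffy", stored_hash))  # True: same plain text again


# --- Message signing with a shared secret -----------------------------
def sign(message: str, secret: str) -> str:
    """Digest of message + secret; the secret itself is never sent."""
    return sha1_hex(message + secret)


def verify(message: str, received_digest: str, secret: str) -> bool:
    """Recompute the digest locally and compare it with the received one."""
    return hmac.compare_digest(sign(message, secret), received_digest)


secret = "Santa"                 # assumed spelling; agreed on in person
message = "Eat More Ovaltine"    # assumed capitalization
digest = sign(message, secret)   # Annie sends the message plus this digest

print(verify(message, digest, secret))              # True: untouched
print(verify("Eat Less Ovaltine", digest, secret))  # False: altered in transit
```

In both cases the receiver never reverses the hash. It recomputes the digest from what it was given and compares, which is why changing a single word of the message, or typing the wrong password, produces a mismatch.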
uh the terminology the two kind of themes we've been following over the last couple of lectures in this lecture are confidentiality and integrity and confidentiality is hiding right shielding information not leaking information to people that you don't want to show it to and integrity is making sure that you know who you're dealing with and then the previous lecture we really talked about kind of real light approachable ways of ensuring confidentiality with things like caesar cipher and then integrity using a simple message digest that based on a shared secret so the problem with all of those things that we just saw is that they require a shared secret and the problem in the world of the internet is it's just really difficult for every one of us before we establish um we before we can make any uh purchases or whatever at amazon that we somehow have to drive to amazon headquarters and and get a shared secret from amazon they'll open up a book and say okay hi chuck i see who you are and here's our shared secret and you walk away and as long as you carry that shared secret away you go and if the shared secret is lost it's uh difficult to revoke so as the internet and and frankly in general as uh security needed to be able to work at arm's length meaning that you couldn't always bring everybody together and hand out shared secrets and then have them go to the far reaches of the world and communicate public key encryption was identified as a extremely elegant solution to this problem and so it was proposed by uh diffie and hellman in 1976 and it relies on two keys it's asymmetric meaning we're not using the same key to encrypt as decrypt the way we were in the previous lectures these are asymmetric there is a public key which is actually does not need any protection whatsoever in a private key and the idea is they're generated inside of a computer you generate the public key in the private key you send out the public key the public key is used to do the encryption and 
then the private key is used to do the decryption and they're related mathematically in a way that's well understood but difficult to compute for a key length that's large enough so there's a public key and a private key so i'd like you to take a look at this little video up on youtube of diffie hellman and merkle the inventors of this and i think it's a great video i would love it if this were my video but i didn't produce this video so take a quick look so one of the things about this public private key encryption is now that we know about it it's like whoa it's pretty obvious and frankly caesar and the germans and everybody could have used this idea they just hadn't thought of it yet and the other thing that's kind of interesting if you look into the story of this is that the first reaction people have when they start thinking about this is like it can't be this easy no it's sort of both easy and hard but the concept is real elegant and really beautiful and that is that we have this public key so the public key is part of a public private pair and it's used to do the encryption the beauty is it's computationally difficult to recover that private key from the public key and the encrypted text a key thing is it's not impossible and that's kind of one of the interesting philosophies of security that we started at the very beginning talking about security perfect security is kind of impossible to achieve unless you simply don't send anything and so public private key asymmetric keys is well understood as to how you would break it everyone knows how to break it the problem is that computers aren't fast enough to break it and when computers get faster we'll just make the keys bigger so the mathematics of this makes it impractical to break i mean literally impractical to break now i think we can safely assume that governments probably have enough computation to crack these once in a great while i mean they're not cracking every
transaction between you and target when you want to buy something but if they really have to they can record the encrypted transmissions and if they really had to and it took a long time i have no idea how long it would be they can break it so that's actually kind of a neat way to think about this by revealing it all frankly any computer scientist could make a name for themselves for their whole life if they proved that there was something wrong with this by revealing the algorithm revealing the cracking technique if someone can come up with a better cracking technique it is like fame and glory forever which means that we're pretty sure that there's no good way to crack this other than the brute force mechanism that requires a large amount of computation so if you're going to use public private key encryption you have to generate a pair and it starts by choosing two really large random numbers with hundreds if not thousands of digits that are prime so you kind of choose a random number really big and then you kind of look around for a nearby prime number and you choose two of those and then you multiply them okay getting an even larger number and then through some calculations you compute the public and the private keys from that large number the essence of this are those two prime numbers prime numbers of course are numbers that you can only divide by themselves and one which means they have no other factors which means finding them is kind of like looking for a needle in a haystack and so the public and private key is really based on these two prime numbers if you could figure out what the prime numbers were you'd be okay but the computational difficulty is finding the prime numbers that are extremely large and finding the right prime numbers that are extremely large so it's easy to do some calculations in one direction but not in the other so for example what are the factors of fifty five million one hundred twenty four thousand one hundred and fifty nine quick
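The question just posed can be tried out with a short Python sketch. It is a toy illustration only: the function name `smallest_factor` is made up for this example, trial division is just the simplest possible search (not how real attacks on large keys work), and an eight-digit number like this one is factored instantly by any computer. The point is the difference in the amount of work, which becomes enormous when the primes have hundreds of digits.

```python
def smallest_factor(n: int) -> int:
    """Trial division: return the smallest factor of n greater than 1."""
    if n % 2 == 0:
        return 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 2
    return n  # no smaller factor found, so n itself is prime


n = 55_124_159

# Hard direction: search for a factor with no hints (thousands of tries).
p = smallest_factor(n)
q = n // p
print(p, q)       # 6961 7919

# Easy direction: given one factor, a single division reveals the other.
known = 7919
print(n // known)  # 6961
```

The search loop has to test candidate after candidate, while the second calculation is one division. That gap between the two directions is what the rest of this lecture relies on.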
but if i simply ask you what do you multiply 7919 by to get that 55 million number that's easy you do a division and it turns out that you can find out 6961 really easily right so if i just say what are these two numbers that's hard if i say given this number what's the other number that's trivial so you can think of this as the decryption is where the receiver of the message knows kind of half of the calculation whereas the world doesn't know either half of the calculation so it has to figure out both halves whereas the receiver only has to figure out one half and so that's the difference between asking the question of what are the factors versus given one what's the other so it takes a problem that's easy and makes it computationally nearly impossible but again not impossible just nearly impossible okay so here's the notion so you're about to type your visa card a credit card into like amazon's web page and so what happens is that amazon has a public key and a private key that they retain and they will send you the public key across a medium the internet they're going to send this to you somehow but the bad guys eve or charlie or whoever they are this is alice and bob and eve and charlie are always looking so eve and charlie can intercept it and you assume that they can this is the key don't try to pretend they can't even though it's very difficult for them to do it you assume they can so the public key comes across it is simply sent to you as part of the beginning of establishing a secure connection and the bad guys see it too or gals they see it too so the public key comes to you and then what you do is you encrypt using that public key and create some encrypted text ciphertext which you then send back across the danger where eve and charlie are watching and it comes across they intercept the encrypted text they've intercepted the public key and they can try as hard as they like with super computers to derive this and
frankly like i said if they had months and months and months and really fast computers they could okay but because amazon is in sole possession the private key and it never left amazon servers it is a very simple matter for amazon to decrypt and get your plaintext it happens very quickly just like if you kind of know half of the prime number calculation figuring out the other prime number is really really easy okay so so again these people see all this information and yet is computationally virtually impossible for all practical purposes to do it and so it's beautiful because there was no need to protect the public key we never had to get in the same room and away it goes so you just amazon just blasts out its public key and we encrypt using amazon's public key we can't decrypt it but we don't need to decrypt it all we need to do is send it to amazon and voila it works so the beautiful thing is is the public keys can be distributed they can be intercepted and it does not matter so with this notion of public private key encryption in general we made a change to http a layer a mini layer is in the data model if you remember way back perhaps you even forgotten about the layered model remember that layered model application transport internet remember this is sort of one computer and this is the other computer these are the routers routers these are the hops there's like 15 of these remember remember all this so it comes back not on us okay so if you recall just sort of to briefly remember the the transport layer is responsible for the re-transmission it gives us the appearance of a reliable ordered connection between the our application and the far application uh http is one of the application protocols um and so there is a little mini layer that that is layered in sort of seamlessly on top of the transport layer that basically takes plain text and encrypts it and turns it into ciphertext and then ciphertext on the way out and turns it back into plain text okay and so 
what happens is these applications just send plain text and out comes plain text and there's a little bit of extra glue in the middle here that's sort of a secure transport secure sockets layer this thing here is often called a socket oop sock yeah it'd be good if i could spell socket so this is a socket and then the red part is a secure socket so the applications kind of don't encrypt the data at all there is a library that encrypts it and the other thing is that all the rest of the internet the internet layer the link layer the routers the ethernet the fiber they don't even know the difference between encrypted text and non-encrypted text because the encrypted text wanders around fully encrypted the addresses are not encrypted and so it stays encrypted all the way through the entire network if you then think about the fact that this is the moment that it leaves your computer so the plain text comes in here gets encrypted here encrypted comes down the only thing that leaves your computer is encrypted text and it makes it all the way across the encrypted text goes into amazon so this is amazon this is you the encrypted text finds its way through all these things and it comes in encrypted and it actually doesn't get decrypted until it's sort of right at the point where amazon's web server is going to actually charge a credit card so this is actually beautifully elegant in that the rest of the network is blissfully unaware that any encryption is happening it's just moving the data so this did not require any change again the beauty of a layered architecture did not require any change sort of below the transport layer and as a matter of fact all of the sequencing and re-transmission that happens in the tcp layer that happens with the encrypted stuff too because it's just text it's gibberish text it's not the original visa card number that you're sending you're sending one two three and out comes you know w x
y the w x y just goes it's retransmitted all this crap just works it's like beautiful it's a beautiful thing it's absolutely beautiful thing and it's just like this mini layer kind of between the app it's like the top slice of the transport layer that's how i'm drawing it right here it's like this little kind of top extra little thing says you know what we're going to transport but actually help me out and give me some encryption while we're at it and there's all kinds of cool stuff that goes back and forth the public and private keys get exchanged that's all kind of stuff we don't worry about we just send data and get data back pretty cool huh so this really solves the problem of the fact that we basically should assume that everything between our computer and the destination computer this is you this is amazon right everything here this is all dangerous there's some like terrifyingly scary individual that's watching everything doing packet sniffing this might be eve the eavesdropper this looks like a little he he looks pretty tough right and and so that even the wireless right this is the wi-fi connection the wi-fi is dangerous now the reality is is these things aren't all that dangerous the wi-fi is probably the weakest link of this whole thing but we have to assume that it's dangerous right we we want to assume that the only thing that's safe and unfortunately if you put viruses in your computer then they can get it the plain text if amazon loses its data somehow then they get the plain text right but but basically you know we want to distrust all of this okay so this concept is called transport layer security also called ssl also known as https http for secure and it's kind of like between the tcp layer and the application layer or the top half the tcp layers the way i like to think about it it's because it's based on public private key encryption it's difficult but not impossible normal people don't have the kind of equipment to break it and even governments 
if they can break it i don't even know i'm not an expert i don't hang out with the government so i don't really know but assume that if they really put their mind to it in a very narrow situation if you become really interesting they will find your credit cards probably there's easier ways to get your credit cards than by decrypting your text so it's hard to decrypt and as i mentioned because of the layered architecture the tcp layer ip and link layers are completely unaware so you'll you you see this in the form of urls that start with https right they start with https www.facebook.com versus http and there was a time a few years back where you know they were it used to be a little more expensive inside of the servers to do https that still is and so some sites would try to do some of their activity without using secure protocols and others would use uh use non-secure and secure and then flip you back and forth like if you're typing your password the problem was is that there was actually still sensitive data being sent even across the insecure and there was a quite a famous uh thing where people could install a firefox plug-in and watch facebook non-secure facebook things go back and forth across like a starbucks and it would just show you all the people's facebook accounts and you could log in as them and post as them and so you've seen a situation where companies are just starting to use https for everything you as a user have to be aware to see if you're typing anything sensitive never type it into a url that doesn't say https okay never do that you're typing password uh credit card number any kind of personal information make sure you're doing https and make sure that that you know what it is we'll talk about that uh in a bit we'll talk a little bit more about that in a bit so so if we take a look and we think about where the bad guys are at the bad guys are kind of everything and this secure tcp runs is the one part of the layer of architecture that runs 
from within your laptop to within the server and so then if we kind of assume the worst the backbone is pretty safe the wi-fi is probably the most dangerous right but when we do secure system to system tcp we are doing the encryption right here inside your computer before it leaves and we're only doing the decryption right when it comes back in the computer so the decryption and encryption are happening inside of amazon's computer and inside of your computer and nowhere else so secure sockets is pretty good now the place where you're still in danger is there might be a virus that's watching your keystrokes right this is why virus checking is so important because at some point you're typing it into your computer and the greatest danger you have of losing your data is really two things a that you've got a virus or b somebody has redirected you to talk to evil amazon instead of amazon and that's what we'll talk about in the next lecture how do these browsers really know they're talking to the real amazon and that is not confidentiality confidentiality is stopping the bad guy from seeing what you're sending as they're eavesdropping eve is eavesdropping okay now the next thing is the question of is this the real amazon or is this a fake amazon so we'll talk about that next [Music] so now we have a way to ensure the confidentiality using secure socket layer and public private key encryption and the only question now remaining is who are we talking to and are we talking to that server that we think we're talking to are we really talking to amazon are we talking to coursera how do we know now you'll notice if you take a look at the top of your browser perhaps right now even you can take a look at the top of your browser and usually when it's indicating that you have a secure connection you can click on this and see some information it's called the certificate information okay and so https has the notion of a public key we retrieve
the public key when we make the connection but there are two kinds of keys there are public keys that are just made up that are sent to us and then there are public keys that are signed and validated by a third-party certification authority so this is coursera and it is certified by the godaddy certification authority so it's not just that we're getting the certificate from coursera we're actually getting the certificate signed by godaddy godaddy has checked the id of coursera and said okay you must be the ceo of coursera or i'm not going to give you this signed public key so it's a process to get public keys signed and it's a way to make sure you are talking to who you think you're talking to so this is called digital certificates also known as sort of signed public keys now if we go back and we talk about the integrity right we want to know who we're talking to and so we had this notion of a signature a signature is a way that you know that you're talking to who you think you're talking to so for example like if a guy comes to your office and says hi i'm dr chuck got like a beard and some white hair you can say hey if you're really dr chuck show me your tattoo and now you'll know because very few people will look like this and also have this tattoo right so this is my public key and this is the signature on my public key this is like my message digest so people won't have this tattoo if they just claim to be dr chuck so this is dr chuck this is my message digest so there's a difference between a public key and a public key that's been certified by one of these designated third parties these designated third parties are called certificate authorities now you could say i'm a certificate authority well some certificate authorities are better certificate authorities than others okay so they're a trusted third party and so how did they start well some are more trusted than others and the more we trust them they kind of all work out so it's not
everybody you can't just become a trusted authority so one of the many trusted authorities and one of the oldest ones and one of the more popular ones and one of the more expensive ones it's pretty expensive to get your certificate signed it can be as inexpensive as a couple hundred dollars it can be thousands of dollars to get a certificate signed and verisign is one of the oldest and most well respected of these certificate authorities so the idea is that i have this website called online.drchuck.com where i teach python classes and do various other things and i wanted a secure certificate because i would be handling people's data and i wanted to be respectable and have a secure certificate so i made a public and a private key and then i sent the public key to a certificate authority i paid them money and then they sent me back a signed public key okay now you might say oh this is kind of evil or this is really expensive because all they really are doing is adding a few bits to my public key but they have a lot of responsibility and the good ones have a lot of credibility so they don't want to lose information they've got to spend some time validating identity saying okay are you really the owner of drchuck.com they're not going to hand a signed certificate for drchuck.com to anybody except the true owner and so they spend some time checking to make sure that it's the true owner and they do this by looking at the registration data on drchuck.com et cetera et cetera et cetera and so there is a cost of verifying all this identity kind of like the signature track on coursera and that's kind of what's going on on the signature track of coursera there's a difference between a certificate and a certificate where coursera is going to assert that we have verified the identity and the cost is in the verification of the identity so certificate authorities are charging amazon but then
ensuring that they don't mistakenly give the amazon.com certificate to a random bad guy because if they did that bad guy could pretend to be amazon.com so then you might ask who decides which of these certificate authorities to trust we use the certificate authorities to decide whether or not to trust amazon.com or coursera.org or drchuck.com or whatever how do we decide which of the certificate authorities we're going to trust well it turns out that apple microsoft and linux and other operating system vendors pre-install them at the moment that you're either purchasing your computer or installing your operating system part of that operating system is actually a list of the public keys of the chosen certificate authorities so if you look deep enough inside your computer this is my mac you can see the companies that have been included by apple as the manufacturer of the operating system and so you see that verisign is one of those companies that has been pre-included in apple macintosh which means that a certificate from verisign is going to be known right so let's look a little bit more so your browsers and your operating systems come with built-in public key certificates for certain certificate authorities like verisign now that's a lot of trust that apple microsoft and linux have placed in verisign and that's because over the years verisign has earned that trust that says verisign doesn't just give out certificates without checking right if verisign gave out an amazon.com certificate to somebody without checking they would lose a lot of credibility and then microsoft would like take them out right say well verisign seems to be kind of sleazy they seem not to be able to handle their security but you know they have and so they remain in there and so it's kind of an interesting thing where they are motivated to keep their security high they're motivated to do a good job because the moment that they sort of fail they lose a lot of
credibility and respect and the value of the verisign brand is all of the respect that we have for verisign so we mentioned public private key encryption we have the public key that goes across and i'm about to type my credit card in and so the problem now that we're going to solve is is this really amazon's key is it really amazon's public key i mean i got a public key from across this connection i made to a server and it claims that it's amazon.com but do i believe it when it says it's amazon.com and so that's the integrity thing that's the security that's the do i believe it has it really got the verisign tattoo in addition to the claim that it's the amazon.com public key so we can also use public keys to do signing and basically verisign has a public and private key for verisign the public key for verisign is sitting in your browser right now and they do an encryption much like the message digest they do an encryption of amazon's certificate and then sort of create a digest and then add that digest to it so a certificate says i'm amazon.com and later it says oh yes and verisign signed this with verisign's private key okay so verisign's private key is used to sign amazon's public key this is probably easiest if i just show you sort of a video wait ooh going the wrong way am i going the wrong way what's going on here how come i'm going the wrong way yeah i'm going the wrong way okay so here we go going backwards so here's how it works this is how amazon gets a public key signed by verisign right so in the beginning verisign makes a public and a private key somewhere in a bunker and they store the private key and you can sometimes read up on how much effort they go to storing the private key and then they hand the public key to apple microsoft and linux and then they bundle that in with your laptop so your laptop that you buy you walk out and you have a laptop and it's got public keys in it from the vendor now
amazon says you know what i'd like to do some commerce and i would like to be able to use ssl and have a certified public key so then what amazon does is amazon inside of its servers generates a pair a public private key pair so this private key is not leaving amazon's servers it takes a while it takes minutes sometimes to generate a sufficiently random public and private key by looking at all the large prime numbers and then picking one and then bang making a public and private key then what amazon does at that point is it transports its public key to verisign now during that transport eve might have seen it but it's okay because it's just a public key right so it's just the public key and so it actually can be sent across the internet and it's most commonly sent across the internet like when i got the online.drchuck.com certificates we just typed it in and sent it because if you get a hold of the public key all it means is you can encrypt it doesn't mean you can decrypt so then what happens is inside of verisign's servers verisign computes a message digest using its private key right and then it adds basically a signature that says oh here's amazon's public key that i received from amazon i verified the identity of the person and now i have signed it mr verisign i've signed it and that of course is just like message digest like information that is appended to the bits of the public key then that public key with signature is bundled together and sent back to amazon and now amazon has not just any old public key it has a public key that says i am amazon.com and verisign is now asserting that i am really who i say i am and again so eve saw that one who cares it's just a public key there's nothing about the verisign private key it never left the verisign servers the signature is public information you can use the verisign public key to verify that the signature is right but you can't forge the
signature so eve can look at that eve could look going this way eve can look that way eve gets nothing eve gets nothing so amazon now has a signed and certified public key then what happens is sooner or later many hours many days many months later you decide on your laptop remember this is you oops this is you you want to buy some shoes so you connect to amazon.com with your browser with an https connection and then what happens is amazon sends you its public key and eve of course is eavesdropping all the time eve sees it go by it's worthless right it's worthless because it's just the encryption key it's not the decryption key the fact that it's signed she sees that but she can't do anything with that information now within your laptop you have from the vendor the verisign public key from macintosh or apple or whatever so you can look with this public key at that signature and just like we did with the message digest before you can go yep that's good that really had to have been signed by verisign and if your computer's really conservative it can actually go check with verisign it can send it up and say hey did you verify this and then verisign can verify it too but you actually don't need to connect because you have the public key the only way that message digest could be right is if verisign's private key was used to generate the message digest just like we used to do with santa in the other one okay the santa one is just really simple but it's the same basic mechanism this is verifiable that it came from this private key now if somebody broke in and stole the private key that's a different story but if the private key is safe and secure and hasn't been compromised the only way to generate that message digest is to be in possession of the private key so now you are in a position where you are in a good mood right you see an https you can pop that little thing and say that was signed by
verisign you can be assured that verisign is asserting that that public key really came from amazon and now is the time to encrypt your visa card and send it over an encrypted connection to amazon okay because you won't send it unless you believe that the https is proper and otherwise your browser will pop up a little pop-up and say wait a sec this certificate looks a little funky it claims to be from amazon.com but it's not signed by one of the signatures that i believe in so you send your data it is encrypted and eve is watching right eve is always watching but because it's encrypted with a public key unless she has super computers and a couple of months there's nothing that eve can do and then of course amazon decrypts it using the private key so the private key comes in and let me redo that right so in it comes eve watches but is helpless because eve doesn't have enough computers and your key is large enough so amazon then takes its private key and uses that to decrypt it and ends up with your plain text again so if you think this whole thing through eve was watching the whole time we sent a public key we signed and returned a public key then we sent the public key to your laptop we verified the public key and the whole time eve is sort of watching all this information and she is powerless to break it pretty dang clever if you ask me and we can thank diffie hellman and merkle for that pretty darn clever because eve sees it all just think what would happen if like the germans had this in world war ii would have been pretty cool of course they didn't have computers so it would have been difficult i don't know too much to think about right now okay continuing on so what we have is we have this thing called the certificate authority which is a trusted third party that signs these certificates right and so it's the entity that issues digital signatures on public keys so that we the
public have a way of validating that an amazon.com certificate really came from amazon.com so if you then add this all together right we have basic public private key encryption that makes sure that this data can move across the internet out of your computer and into the next one all encrypted that's just public private key that does that and then we have this third-party certificate authority that your application can use to validate the certificate that comes in and so the combination of ssl or the secure sockets layer and the certificate authority gives us high confidence that when we're talking to something we know we're really talking to it so it's pretty non-intrusive security if your browser pops up with a little pop-up message that means it's got a certificate that it has no certificate authority to validate and that's not a good time to be typing in sensitive information unless you know exactly what's going on so that sort of brings us to the conclusion these last couple of lectures have been about message confidentiality and that is protecting the contents from being revealed we use encrypting and decrypting for that and then we have message digests and certificates to sign things we've signed messages we've signed certificates we've signed many things and those are important and we talked about both sort of shared key or secret key where you have to get together and agree on a key which is a symmetric key that's used for encrypting and decrypting and then you have the public private key which is asymmetric where one key is used for encrypting and the other key is used for decrypting and you can freely show the encrypting key because it gives very little information although it is mathematically possible but difficult to decrypt a public private key message so that kind of sums up our lecture on public private keys and i hope you find it valuable see you on the net [Music]
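The signing and checking steps described in this lecture (the certificate authority signs a digest of the site's certificate with its private key, and the browser checks that signature with the authority's pre-installed public key) can be sketched roughly as follows. This is only a toy illustration under stated assumptions, not how TLS or any real certificate authority is implemented: the tiny RSA primes, the certificate body string, and the names `digest`, `ca_sign`, and `browser_verify` are all hypothetical and chosen for readability. Real certificates use much larger keys, a padding scheme, and a standard certificate format.

```python
import hashlib

# Toy RSA key pair for a hypothetical certificate authority (CA).
# Assumption: these tiny primes are for illustration only; real keys use
# primes hundreds of digits long together with a padding scheme.
P, Q = 61, 53
CA_N = P * Q                               # public modulus
CA_E = 17                                  # public exponent, shipped with the OS or browser
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))    # private exponent, never leaves the CA


def digest(cert_body: bytes) -> int:
    """Hash the certificate body and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(cert_body).digest(), "big") % CA_N


def ca_sign(cert_body: bytes) -> int:
    """CA side: transform the digest with the CA's private exponent."""
    return pow(digest(cert_body), CA_D, CA_N)


def browser_verify(cert_body: bytes, signature: int) -> bool:
    """Browser side: undo the signature with the CA's public key and compare."""
    return pow(signature, CA_E, CA_N) == digest(cert_body)


# Hypothetical certificate body: a site name plus a made-up public key.
cert_body = b"subject=amazon.com;public_key=(n=2773,e=17)"
signature = ca_sign(cert_body)

print(browser_verify(cert_body, signature))               # genuine signature
print(browser_verify(cert_body, (signature + 1) % CA_N))  # altered signature
```

The first check prints True and the second prints False: only the holder of the private exponent can produce a value that the public key maps back onto the digest, which is the point made above about eve seeing everything and still being unable to forge the signature. With a modulus this small an attacker could simply try every possible signature, which is why the lecture stresses that the key has to be large enough.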
Info
Channel: freeCodeCamp.org
Views: 266,061
Id: 47NRaBVxgVM
Length: 568min 1sec (34081 seconds)
Published: Mon Dec 20 2021