We've talked about various aspects of bootstrapping a lot on computer file Going way back, I think to a ton robin video three or four years ago now which is doing very well Dilemma of the chicken and the egg. He talks very well about the essential nature of bootstrapping and so on but he very carefully avoids going into detail because it does get complicated the way I've always found best to teach it is to use a Tool called T diagrams, which you just draw for yourself I'm going to explain via diagrams what it means to bootstrap Bootstrap a system like an assembler or a compiler or whatever, but first of all What is a t diagram draw me a t diagram let us assume that you have written a program? In Bell lab C language, you are very very proud of this program You think is absolutely wonderful and will win an award and all that. This is the input To your program. This is the output From your program and at this level I can say what does that mean? It means that the program was written in I've chosen C deliberately because it's low level it's fairly close to the hardware something like this I hope you'll all recognize you're writing a simple program in a simple language and you're delighted when you're high level statements produce Ultimately this thing I will try not to fall back into a cheese usage and say it produces a core image They even call it that's about as no. It's an executable. Okay? well We all know that it's all very well saying our right? Agency is gonna have this input it will produce that output. The next question is but how do you convert the C? into a runnable executable binary Which will be your program. You have to get that C Translated and here you begin to see the power of T diagrams draw this one again Input/output it's written in C Now here's the clever that positioning another T block like that up against this one Shows you how this has to be processed What you need is something that will turn you'll see Program statements into a working binary on the machine you're working on you may say mmm, but where did this come from? somebody provided me with a thing that is capable of taking in C statements is is itself an Executable binary it's running on my arm chip on my Intel on my Apple or whatever This is my C compiler. Of course, there's got to have come from somewhere, but we'll get into that later So there's the C compiler and if you feed the C language for your master program through that compiler, you know what happens? It produces an a dot out on UNIX or an executable binary the net effect of slotting that T diagram Against here slightly downwards is to show you that The C you've written gets converted into binary and the net output from this process It produces out a program that you probably store in a file somewhere Which is like this left-hand side only a little bit different because it's now Input/output but look What has happened during the compilation process? Is that by compiling your C statements for your program? It drinks the minutes data. The compiler itself is a binary running on your architecture that you're perfectly happy with. It is squirting out executable binary So I can write in here Let's say that's my output binary code, but that output binary code I'll stick with a pink our right bin in here to show you that that is what the compiler has done for you You think of it as input? Producing output and I wrote it in C and you close your eyes as to what happens It just works but your binary your a dot out your ultimate executable will tell your input will produce perfect out put first time of course, but it is Essentially written in binary and has to be to execute directly the hardware that has been produced by AC Compiler and the T diagram for C compiler is this I accept as input C. I am a runnable binary Otherwise, I won't work at all, but I have code generation capabilities I Generate binary that will run and that is your user executable And here's the first point I want to make the quality of the binary that actually is your compiler running may be very different from the Quality of the binary that it spits out as a translation of your program they may be very similar or it could be that the C compiler is actually very very slow at Producing your output binary, but if he gets there in the end you don't care it could be ultra fast It could be that you've written a see compilers are so bad that the output that it Produces for you is worse than it itself is using. I hope you all get the hang here There's not necessarily any direct link between the quality of the binary You are running to make the C compiler work as opposed to the binary. It is spitting out for your program has its translation they may be related and may be close and all this what I think now I will try and do for you is to take the story one stage backwards And start Speculating about how the heck did I get the C compiler? Is it something that 10 Thompson handed over or Dennis Ritchie hand it over to me already made it could have been oh you might say Dennis how did you implement that to C? Compiler and he was so well the first version I wrote off the back of an assembler And of course this it goes on recursively forever, but Dennis who wrote the assembler You know and back and back and back right down to squirting ones and zeros in on hand kids No, let's presume for the next stage of the story that from somewhere your little machine did also come with an assembler for low-level programming for filling in the bits that may be The C compiler for one reason or another isn't suitable for well. How is that done in T diagram terms and By looking at how the assembler is done. We can then see how it's possible to Bootstrap off an assembler version of something to a higher-level version an assembler is like a Compiler except for those of you who've done it, you know in every sense an assembler is at a lower level Assembler some particular macro assemblers, they'll give you a vestigial capability to do an array but they may not have structures almost certainly won't they're a high level constructs you see So you you're down with simple add subtract? Statements you can move about memory. You've got to understand pointers We've done lots of those you do them at the assembler level. You're in charge You must keep track of what your pointers are in so on and so on and so on It's just a simpler view of the world and a more detailed and complicated one in several ways I'll call it AC and that stands for assembler input now You expect that assembler to produce you a runnable executable binary. Well, I'm going to write this in here as saying that the code generation from the Assembler is going to produce me what I'll call a binary of type a is produced from an assembler It's an assembler quality binary It depends on the Koho wrote the code generator for the assembler as to how good that being a is now Here's the thing again. You'll have to get your head round and get used to this is au Been you I don't know how this was developed this assembler All I know is I met Ken in the corridor David wheeler It came to the corridor and said hey, here's the assembler I use they handed over an executable and I did not ask questions about how that in itself may have been Bootstrapped out of nothing. They just said here's an assembler it works use it. So okay. I say, thank you I mean at the midpoint of a endless sequence of bootstrapping but I can build on top of an sm To build the C compiler shall we say so you put assembler codes in of course? The unknown provenance binary is worrying away. And that is executing the assembler code and out it comes as binary of type a Which you can put in a file BN a dot out file under you mix it could be whatever you can store it in a file you can invoke it you can Execute it and the only difference from what's gone before is that the assembler code is it's a lower Sophistication level than you'll see you would be a lot of you know again I keep harking back to Bell Labs history largely because it's very good for Illustrating this thing you can say look this really happened. This really is what Dennis inten did This is what you have to be aware of Dennis said you can and cancers to Dennis for the next version of the UNIX operating system We've gotta write it in something higher level It drives you mad writing it in assembler because there's no easy way of keeping track of pointers. There's no structures In fact in a recent video now out there in the wild somewhere ken Thompson admits He said our first three attempts to write UNIX in a higher-level language and Dennis was developing Say the first three attempts. I tried I'm Ken Thompson Failed. Why did they fail I found later the fourth attempt Dennis had put structures in C And that's what I needed because they automatically keep track of offsets and pointers. Wonderful. Okay So here we are then back in that kind of era. We are wanting to create a C language Compiler, but we are writing it as a simpler code level it is going to produce a binary But going back to what we said last time because it came out of an assembler era We're going to presume that the binary that this thing produces is what we'll call binary a it came via an assembler Did this execute them fine, but then you look at that say hey, come on. This is brushing over certain details You can't directly execute assembler and unless somebody's written an assembler interpreter for you, but they're again No, they haven't that's emblem coding. It's got to be converted into a binary How do you turn the assembler coding into a binary and so sort up another T diagram and visualize what goes on? There just happens to be hanging around because Ken gave it to us in the corridor a thing Called an assembler and the assembler accepts any old assembler code you like? produces what we'll call been a an executable binary but it came out of an assembler and that the assembler itself The binary is of unknown provenance candidate in absolute binary off the top of his head probably not but somebody labored long and hard to write a thing that really works as an assembler on this particular machine that you are working on what you need to do is to feed the assembler code that implements your C compiler Into the assembler itself to assemble it. You've got to assemble the assembly code That is the compiler and it goes in as assembler coding. It's running on its own binary. You're thinking a while' crash No, it won't Ken wrote it, you know this sort of thing that was around and it converts the assembler code into binary of quality a That binary of quality a when it's produced you can store in a file. It's an executable It's an a dot out file and then you have created a thing which takes in C produces binary a Assembler coding so your net output here then is the following It's C. It spits out an executable binary for you the C compiler So we now have a C compiler that is not just as it were implemented at the assembly code level But that assembly code has been Translated into a binary So that the C compiler can actually run on hardware and once again There's trace through the assembler code that implements The C compiler is fed into the assembler This thing was round and round like mad but the assembler code statements that are the compiler get translated into bin of quality a and that's the Implementation vehicle for your new C compiler you have produced a C compiler by building it out of an assembler by using the assembler as the next stage long, which you have to do the assembler converts the assembler code that is the C compiler into being a workable runnable binary of quality a so that's the first start of your compiler is to say we We doing bootstrapping we've come up off an assembler solution We wrote in our high-level language and here we are a very very first C compiler Is there a problem with it? Not really? Except that what we want to do here is to say well We want an AC compiler that will produce runnable binary of some sort and then we say hey But the only tool we've got to making a binary for this C compiler is that there is the assembler itself Right the assembler Produces Benet, so you're stuck with that so Can you see that by running the assembler there and making it squirt out binary a quality? equivalent to the assembler code that is the compiler you end up with AC compiler Marquand that Spits out Benet but is running on Binet We're at the mercy of bin of quality a the assembler quality binary. Is it good? Is it bad? Could we do better? Yeah We stop there but just as a marker for what's coming if your head isn't aching yet that thing is building a Compiler off an assembler when that is working what do you do to make it better you rewrite it and You make a new version of a C compiler that produces beam B which is so binary But how do you compile a new version of the C compiler answer with the old one? So You end up with a seat have been better, but it's still running on Binet next time I'd lost my Compilers notes and assembling those did you see diagrams? But thanks to one of my grad students rom nots bless you Ron who was a complete pack rat? He had rewritten all my notes in much better shape and had come on to them from 30-something years ago so I couldn't made this video without your own. Thank you very much