So you want to write an interpreter?

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

welcome everyone I'd like to please help me welcome Alex Gaynor it's going to talk about writing an interpreter so hi everyone thank you guys so much for coming out so we're here to talk about how to write interpreter and this is going to go pretty quickly writing interpreter is a big topic if you study this in university you might have spent a whole semester on it so we're trying to cover a lot of the basics level concepts you need to understand and get to the place where hopefully after this talk you'll be really excited and you'll be ready to go forward and write your own and learn some more so don't worry about copying every single thing on every slide down so first one thank my employer are do we screaming Internet music we're hiring you should come talk to us thank you very much for having me here so Who am I I been around Python for a while I work on pi PI and C Python both interpreters which is kind of where I got my love of this I also work on Django so in a sort of former present life I was a web developer and recently I created topaz which is a Ruby interpreter written in Python so really love this whole interpreter thing so sort of started one tell you guys a story a lot of people ask me how I got involved in writing interpreters compilers it's kind of an obscure branch of programming not a lot of people are involvement they think of it as a black art so four years ago at a PyCon in 2009 I believe unladen swallow was announced it was it was this new effort from Google to try to speed up Python and their target was five times faster in a year and I thought like this sounds incredible five times faster you know it sounds like a really cool thing that's going to be awesome Google's doing it so you know I signed up for the mailing list I got every commit that happened sent to my inbox and like at first I had no idea what I was reading every single thing I saw was basically just nonsense to me over time kind of a learned what was going on learned learned what all these commits were eventually even got to a point where I was able to help out myself then when online swallows sort of faded away and went into arrest and by that point I was totally enamored with interpreters and performance and all this stuff so I jumped up right over to pi PI because they were doing all this other awesome work so it is totally an approachable thing and you can totally learn it so start with up from the top what is an interpreter so you know what is a program even while a program is a thing that takes some input and it does some stuff and maybe it has some output or does some computations and maybe it runs forever maybe it runs for five seconds or all programs so an interpreter is just a program it's a program that is a pretty pretty cool behavior though its behavior is that its input is some program in some language and its output is what would happen if you ran that program so the Python interpreter it's a program its input is your Python files and its output is what happens according to the sort of magical Python machine if you ran it and it's an implementation of the Python machine you could imagine we could build a Python machine out of raw hardware if we wanted we could wire up wires and hook them all up and we could build a Python machine and so the Python interpreter is software that implements this hypothetical machine so sort of none of this is magic this is this is just a program that's got a really cool behavior and why would you want to write an interpreter well there's a couple good reasons first of all is maybe you want to build your own language you've got a design you thinking build something better than Python or better than maybe whatever you use at your day job and you want to build your own for fun because you think you can do better I personally I like implementing existing languages I don't really have any skill for language design I'm sort of in awe of what python is Grudge because I would have never come up with some of the the great ideas I love today so I like I like just looking at the engineering side of it here's a design for a thing can we find a better way to build it and of course I think the best reason to build interpreter is it's a ton of fun it's a super cool software engineering project and you know you can sort of scale the problem from as small as you want to say fit in the 45-minute talk or to a thing that you could spend your lifetime working on so an interpreter like most programs is comprised of a bunch of parts so an interpreter is basically six parts we start with a Luxor we have a parser we have an ast with a bytecode compiler and we bytecode interpreter and then a runtime and we're going to sort of walk through what each of these components look like what each of them does what you need to know about them and one of the cool things about an interpreter is that basically each stage feeds right into the next stage so you start with a program then you feed it to your lexer you feed the result of your Luxor to your parser your parser spits out an AST you feed the AST to the bytecode compiler each one of these stages just runs right into the next one so it's a very cool machine very cool pipeline so basically the rest of this talk we're going to look at what does it look like to build an interpreter from scratch for our own language which is basically a tiny tiny subset of JavaScript just a few small things and one small addition it's going to it's going to be pretty cool I think so this is our language we've got out we've got assignments we've got we've got comparisons we've got if statements we have a we have a print statement which you know it's not a JavaScript thing but it turns out IO is like useful and we don't want to emulate having an entire browser so yeah this is our language small little thing looks like a real language and it's totally an awesome building block so yeah small couple small pieces you know to sort of in terms of data types we have numbers and we have strings and sort of what we're working with so this our ply is a toolkit we're going to be working with if you used apply Python Lexy ak from David Beasley it's a very similar basically a new API for a lot of the same concepts so when you're working through these slides you can install it and use it so first step is writing a Luxor Luxor is basically the first step you might also know Luxor as a tokenizer if you've ever looked at this before so a Luxor is a prote eight is a basically a thing that takes your program as a big old string say something you read right out of file and it turns it into a list of tokens and what's a token a token is basically the name of some sort of symbol some type of thing that occurs in a program as well as it's bad what was the what did it look like in the program we'll have an example this in a second as well as maybe its position where did it occur which is you know it turns out really useful for debugging to know where things occurred so if we look at our example program we can see that these are 1 2 3 4 5 6 7 lines of code become tons and tons of tokens so if we sort of look through them the first token is a name which that's the the name of the symbol name of the token we have and its values a obviously a named token put a tons of different values you did write whatever variable name you wanted in this program and then we have other tokens like equal whose value is an equal sign an equal token would always have an equal sign there's only one way to represent equal so you can sort of look through it some of the key things to notice here is there's a if you look through the tokens you won't see any white space anywhere our language white space is not significant so it's just emitted emitted sorry we also there's also things like L brace R brace you know really just representing every symbol here except the white space everything that's important all the key words if else print they're all becoming their own token so yeah this is basically output so how do we build this well you know thankfully the kind folks at xkcd have explained this for us regular expressions it turns out each token can basically represented as a regular expression so here is how we build our entire Luxor so we have this Luxor generator object and first thing we tell we want to ignore all whitespace whitespace is not significant and then we tell it sort of the list of rules that we want to know about so we have the if token and to match it you just look for the string I app every I F in the program begins an if token every screen else becomes else every print becomes print every opening parenthesis because an opening parenthesis closed in parenthesis becomes a closing parenthesis braces become braces equals become equals greater equal becomes greater equal semicolons numbers we actually have an semi exciting regular expression here we have you know multiple numbers in a row it turns out make up a number and you know we could see how would we expand this to also match floating-point numbers for example well we might want a separate rule that included decimal points you think about what does our language look like in Python it turns out your if you're writing a decimal number you're allowed to omit either the leading 0 so you can straight point one to mean zero point one or one point to represent 1.0 does javascript have a honestly I don't know but if we were designing you know a more complete lecture we would think about those problems and try to express them here and maybe the most exciting a token name we can see it's something that starts with a letter uppercase number K uppercase lowercase or an underscore and then can be followed by any number of uppercase lowercase numbers or underscores and then we actually take a final step we call a LG our lexer generator build and we get a lexer a-- object out the back so how does this look like so we import our lexer and then we we passed a string and we get this token stream back and basically a token stream just emits tokens as we have them so we call the next method we get we get our first name token with my VAR string we call next again we get the equals token we call next again we get our 23 number we call next again we get none were out of tokens so it's just giving us tokens one at a time which turns out is exactly how our parser wants them so now we have our parser our parser at this point is going to know nothing about the original string for the program all it's going to know about is this token stream this thing that gives us tokens one at a time and so now we want to actually do something with that list of tokens what does the parser responsible for a parser takes this list of tokens and tries to figure out what is the structure of your program what are those if statements refer to how does this all work so we're taking this this big list of tokens we have and we're turning it into what's called an abstract syntax tree so it's this basically this big thing of nested objects that contains all the parts program so if we look at the topmost level we have a block what is the block a block is basically a list of other parts of the program so you can see that the first item in the block is an assignment we're assigning to the variable a we're assigning the number 3 which you know if we look at our original program is basically what we have now we look at the next item in our block next thing is an if statement so an if statement basically has three parts it has what is the condition when does this if statement occur what happens if the if statement is true and what happens if if the if statement is false so the first parameter if is a this condition what is the condition and that's turns out it's a comparison it's the greater than equals comparison and the left-hand side is the name a and the right-hand side is the number 2 the second part of an if statement is a what happens if it's true and this is another block because you can do multiple things you have multiple statements if an if statement evaluates to true so it's a block and in it we have a the only thing that happens if the if statement is true is we have a print and the puts the print printing it's printing the string a is big and then finally we have the else clause on the if what is that again it's another block because we can have multiple things that happen and it's print name a so you can sort of see every one of these nodes in this industry it's comprised of multiple child nodes or is a sort of a leaf node it has it has no children so number name these have no children they refer to a specific value or a specific sort of things so we don't know what value and name has but want to find it out when we run our program we know exactly what value a number has and it will never change when we say a equals 2 it will be sorry 3 it will be 3 for the rest of our program that that's what the code says so a becomes 3 there whereas the if statement for example what it does it depends it depends on what that condition evaluates to so what are these ast nodes look like there they're pretty unexcited so we have a node-based class that gives us some simple helpers for equals and not equals extraordinarily exciting and then we have we have a couple modes so we have a we have a block node which you know just takes a list of statements we have a statement node we'll talk more about why that exists in a bit this statement is basically an expression we have a number and numbers got a specific value and you know we can imagine what does the name node look like well it's got its got one attribute and it's the it's the name it is we can imagine what is a comparison H me look like it's got what is the operation what is the if clause and what is the else clause so we can imagine what is what a writing all those things look like so now we have to get to the question of how do we construct them we need actually build a thing so our ply is kind of a similar API as for building Luxor's is does for building parse so we can see there's a couple parts first we construct a a parser generator and the first argument is a list of tokens so we're going to start really simple we're just going to start writing about numbers and semicolons that you know the most simple program you could possibly imagine just basically just numbers and it also takes a cash ID which is entirely unacceptable the parser we would do a PG bill very very similar and so the first parses are basically built of what's called basically rules rules are generally called productions and parsers and basically how this works is you have the production which is a sort of this small little mini language for describing what the rules are and then you have an action you perform when you when you run this rule so the simplest rule is the main so the first rule that's registered with the parser generator becomes the top-level thing this is what is your entire program and so name it main here so the syntax is basically rule name colon set of things that that rule is so it turns out main is statements where does statements statement they elicit of is it's basically a block so in what rule do we perform when we actually evaluate this block we we just return s Sub Zero what is what is s here basically acts as a list where each item is something from the right-hand side of the colon so in this case we have one thing on the right hand side of our call and we'll see some examples where we have more and so we're just returning the statements so tell you now statements is one of those block objects and so we're evaluating an entire program the top-level thing is a block and to be honest I consider writing parsers and this whole sort of rule thing to be the single most complicated part of a writing a compiler whereas I know many people find it entirely intuitive and find the rest of this talk extraordinarily confusing so do not worry about what you find confusing I found that no two people find the same thing confusing so this is now sort of set of more complex rules so the the first rule is what is statements statements are statements followed by a statement so we sort of dive right in super confusing like massively recursive thing so maybe it's easier if we look at the second rule first a statements can also be a single statement and you can see what we do if it's that then we build a block that has a list with a single item pretty simple you know if our statements is a single statement while it's a list with up one item of course and so then if we look back to our first rule statements is statements followed by statement you know how does that work it turns out this parser generator thing knows how to read these all these rules and basically figure it out so if we look at the rule for what a statement is it becomes I think a little clearer so what happens if when we see a statements that has many statements we take the the list that we get from the first item and we combine it with the list or with a single item for that single statement from the second item and we we combine them to return a new block so basically this parser generator to all will know how to basically handle this recursion for us and do everything for us so our third rule is uh what is the statement well one example the statement is an expression so in Python for example you can write the expression 3 that's whole line that's a statement with the statement 3 a new value as an expression in JavaScript and in our little mini JavaScript we follow expressions with a semicolon we want them to become statements so you can see what do we do there we return a statement object and then what does this statement have in it it has an expression it has the value of that expression and we totally ignore the semicolon we're not required to do anything about it one thing you might have noticed we have a bit of a naming convention lowercase things refer to other rules and uppercase things refer to tokens so semicolon there gets its name from what is the token that our lexer emitted whereas X forgets its name from the rule we define write below so in our world simplest parser what is an axe per well it's a number and the number is that token and you can see in JavaScript it happens to be that all numbers are floats so what is the rule there we we create a number instance and its value is converting the number token to a float and we can imagine what do we need to do to add more operations our our thing as more expressions than just a number that would be an extraordinarily boring computer programming language I imagine so here is a slightly more exciting rule so what is an expression an expression is a number an expression is also an expression plus some other expression so again we have this sort of immensely confusing sort of recursive thing but if we just sort of imagine what is the simplest thing so we want to we want to run the program 1 plus 2 semicolon and I don't know about you guys I I write quite a bit of Python for my day job whenever I'm writing a tasks for these many languages I so I run my tests I write 1 plus 2 and I get a parse error I'm just going to tell you now you're going to forget I know I assume I forget you can add the semicolon you get a syntax error pretty cool languages so the program 1 plus 2 semicolon so what happened what is the parser see ok first it sees a one so it it goes through and basically it matches it figures out that's a number so we number becomes an expression so it puts that somewhere what is the next thing we see now we see a plus and a plus now it knows ok so I've got a number which is an expression followed by a plus so I know I want to I know I want to find some sort of plus thing and then it sees another number it sees number two and now we've got 1 plus 2 and it knows okay I know this matches the rule X / it this is an expression so I'll run this rule so now we've got the expression which is this addition thing and then it sees a semicolon and it knows okay I have an expression followed by a semicolon that's a statement and then it sees I have a statement okay that becomes a statements now I'm at the end of my program I have a statements statements becomes main that's my old thing I now have a program with a block that is one item which is addition so you know here's we have another note here obviously we didn't write it out we have a binary operator 1 it turns out that all the binary operators whether it's plus x division subtraction they all need basically the same node so they need what is their operator what is the left-hand side what is the right-hand side and they need to keep track of this if you look at sort of think of what are the problems about adding more operators most exciting one probably operator precedence it turns out if you write 1 plus 2 times 3 it doesn't just evaluate left to right it evaluates the 2 times 3 first and then the addition so we get 7 instead of 9 if we had more time we would go deeper into that so the next bit we have is we want to talk about the compiler so now we managed to build these abstract syntax trees we have this AST we have this block node we have these if nodes we have all these things we want to actually do something with them something for the interpreter so look at what is byte code if you've ever played around with the disk module in Python you've probably seen some byte code if you've ever seen a dot pyc file those are byte code for Python putting on disk so what is byte code byte code is a sequence of instructions for this machine we're building sometimes byte codes have arguments and for our language it's going to be stack based what does this mean so it means the program 1 plus 21 it's going to be these 4 byte codes it's going to be load Const load Const binary add pop top what on earth are these so first thing to notice load Const has an argument 0 or 1 binary Add and pop top do not so how exactly is this working so we have a we have a program and basically load can't see row refers to the first constant that we see so it's it's 1 so 0 is some sort of reference somewhere to an array of constants a list of constants we have so the zeroth constant is a 1 and then we have load constant force 2 21 it's the second or first index English has no good way of referring to index things so 21 is 1 here and then we have an addition we have this operation we want to do with them and then this program doesn't do anything with the result of 1 + 21 it just throws it away so we we pop that away we throw them together so let's sort of step through this so we're going to have a stack which is going to be basically what are the values the program is sort of working with right now so from the top at the beginning we have an empty stack no items on it so we're going to start we're going to run the first construction which load Kanzi row so that's gonna that's going to find the one it's going to find the 0th constant and it's going to put it on the stack we're not we're going to run forward now we're going to see a we run load cons 1 now we have two items on the stack so we have 1 21 and basically the the bottom most thing is referred to as a the top of stack or the bottom of stack depending some people think of the stack is growing top to bottom other people think of the growing bottom to top so pretty confusing a language unfortunately especially with how I wrote top to bottom and then said bottom of top of stack very confusing I'm sorry so the the next operator a binary ad what does it do well we didn't tell it where what things were adding how does it know where to find them it looks at what the two topmost things are on the stack and it pops them off so we see one in 21 are now gone from the stack and it adds them it computes the result and it puts the number 22 at the top of the stack so it basically replaces those two items with the result of doing computation if we wanted one or 21 again we would have to do load cons ero or load Const 1 again to get that value back and finally we're not doing anything with the result so we pop it off we get we just discard that result so how do we go about implementing these so this is sort of where wire statements important that I mentioned for where it comes in a statement is basically responsible for throwing the result away of the expression if it's unused so we're now adding sort of these compile methods to our nodes and what does compile think it takes a context and a basically context is responsible for keeping track of what we've compiled sorry we should be 45 minutes so we have time I think sorry about that yes so we have this context it's just keeping track of what we've compiled and we can see so the statement is really simple it compiles its expression it just calls the compile method and then throws away the result that adds this pop top method what does it mid do I mean is basically just depend on its internal data entirely unexcited what is a what is a numbers compilation do well it first thing you adds the load constant that's you know the constant we defined somewhere that refers to what is load Const and then what is the argument it's a this context new constant and this and that takes this j/s number object so somewhere in our program we need to decide how do we represent objects in our language so javascript has numbers are many fake JavaScript has numbers so we have we basically a number object somewhere and it contains all the behaviors for numbers as you might expect what is this context that new kant's thing basically our bytecode is basically made up of numbers only it doesn't have any direct referenced objects soloq on zero load Const one they know that somewhere there is a 0 and a 1 constant what is new constitu basically keeps track of what all the constants are and returns you the number you're at now so your first number new Const and puts it away somewhere your second number returns one puts it away somewhere for later so these are pretty simple what is uh what does by not do well by nup we know has can have a bunch of different operations so first it compiles the left-hand side then it compiles the right-hand side you know it's important which order we compile them in generally we expect things to run approximately left to right minus plus or minus the precedence issues so it compiles left-hand side then it compiles its right-hand side then it just figures out what operation was so we keep we have our OP as an attribute it is a you know just the original symbol plus minus times divided and now we map it to this binary ad and we don't need to tell binary ad where anything is because we know the left left compile is going to put something on the top and then right dot compile is going to put something right above it and so we'll end up with these two top things on the stack are our left and our right hand side and then we'll put binary ad there and it'll figure out what something is so for me the absolute most exciting part of implementing a language is the object model what do we deal with Python has an incredibly rich object model we have lists we have dictionaries we have user created objects we have type objects and we can create any of these on the fly we can add methods we have just a rich Richard rich set of object types and behaviors this language we're defining as a relatively boring set so we have a a je s object that's sort of the base everything in our program at this lowest level is a je s I do some sort and then we have a je s number and it's got a constructor and it takes a value so that's really boring it all that lets it do is pass stuff around so we can create the je s numbers we saw in a number dot compile let's add some behavior so let's add an ad to je s number numbers know how to add themselves exclusively to numbers in our program so we're going to a simple assert make sure the other side is a number what does that do it takes self dot value gets the value of itself gets the other sides value ABS them together and returns a new number you can imagine this is almost exactly what that under-under ad does so under under add implements the plus operator in Python you have three plus four it's basically in a call under under add combine them return a new integer object this is basically what our program is doing besides the fact that we have you know very terrible error handling so final part we started with this premise we want to write an interpreter now we need to actually you know right if that's the came in to do all these other stuff is just busy work so we have the basically score we have an interpreter runs the bytecode for forever until it knows to exit it runs the bytecode from start to finish so this is probably the world's simplest interpreter PC stands for program code and it basically tells you where in the bytecode are you what what's your position we've been we wrote all this bytecode out inside our compiler now we have to do stuff so while the program counter is less than the length of our bytecode we need to do stuff so first we find out what's the what is the opcode so we read it out of there we get what's what's our current position we figure out what opcode we're at then we get the the name of the opcode from some map we built somewhere and then we call the method we just call this as a method so let's look at a load Const it's fairly simple so what happens first we we sort of fish around in our bytecode we want to know what arguing we got we don't want to know are we are we load count zero or we load counts 1 are we load counts 10 so we fish around we just put that argument right after the current position so it's at a self up bytecodes a b c plus 1 and then we use or because bytecode is a a string so we use Ord to convert it to an integer and then we keep self constant which is that list of constants we had earlier and then we we find that we find that constant then we do a self dot push what is push it just adds a thing to the top of the US or bottom of the stack I should say so you know it's basically self set stack dot append and then what is our new position well it's PC plus 2 where did that come from we increase PC by one we read one thing to find out what opcode we were and then we read another thing to get our argument so our new position is where we were and two paces farther so that's that's how we get numbers on our stack now how do we do something with them so we have a binary add operation what does it do it pops the top two things off so what is pop it's basically self dot stack pop simplest operations so the first thing we pop is our right hand side because if you recall we compiled the left hand side then the right hand side so the left hand side thing one at the top the right hand side thing went on the bottom of it now we're binary ads so the bottom most thing is the right hand side first we pop that off and then we pop the left hand side offices we are getting the values and then we're removing them from the stack finally next or next I should say we do the addition we call left add right which you know probably calls Jas number or maybe if we have strings we know how to add strings or anything like that and then finally we push this result back onto our stack and binary ad didn't have an argument so we we only need to move one place forward so some things we didn't cover our we didn't cover how to actually implement a compilation or parsing or any other bits with if for a while we didn't actually put any other types in this program besides strings we didn't really we didn't do any operations besides addition there's tons of pieces left to our language that we'd like to add we didn't even add the print statement please please come and find me come and find any of the other pi PI folks we would love to talk to you about this help you get excited about writing interpreters because we have nefarious purposes we want to recruit you to help out with Wi-Fi and c python and topaz what we want you to come hack on all these things with us here are some useful references topaz's interpreter it sounds scary because you know Ruby giant language but topaz is actually I think an incredibly clean source code very easy to get started with there's the Kermit example interpreter that pi PI maintains please come by pound pi PI pound topes on freenode come chat with us and sort of recap we have a with the Luxor which takes our original program returns to us a list of tokens each of which says what it is what the value was and where it is in the program we have the parser which takes this list of tokens and it turns into a tree which says what is our what is our entire program what is it from top to bottom I never turns that as an ast the ast then knows how to compile itself to bytecode using the compiler and then we have this object model which says what are all the what are all the types of objects in my program and what can they do and finally we have the interpreter that runs this mini bytecode that we have and that's that's the interpreter from from top to bottom basically so please go forth have fun I really hope you enjoy hacking on these things as much as I have and thank you guys enough please I hope everyone isn't an amazing PyCon we'll be taking our questions please go up to the mic thank you guys so much all right thank you very much that my what I'm trying to understand is so everything you showed us thank you for us all was written in Python so you have a Python interpreter to run your interpreter how can you explain a little bit about how sort of the very first program that ever might have been written would be run is it that the compiler when you don't have an existing sort of set of software to write your interpreter in how does that get written and run on hardware sure so the other question was we wrote this interpreter in our Python which is a very nice language what how was the first interpreter written so when ghee dosa Bet said sat down to write a hika or a Python many many years ago he wrote in C so C Python is by and large the interpreter the compiler all these pieces are written in C so there C programs look kind of like this but you know uglier because it's C may be a more exciting questions how was the first C compiler written and it was not written in C or any other exciting language it was written in assembly language and it's actually really cool someone recently did some nice software archaeology and found one of the very earliest C compilers from Dave Ritchie I think and so if you go online and you search for like very early C compiler you can actually find this original source since it turns out writing assemblies a real pain so you try to write the smallest language possible so that you can start writing your compiler in in your own language because writing assemblies in kaneen so we have the small assembly program and we get a binary out of it and I know we can use that hopefully to compile our new C program and then for forever as long as you have the previous C compiler that you just built you can build a new one hi I'm I see that you are using bytecode and for the purposes of doing simple languages isn't it possible to skip that step and just interpret the AC directly sure so another strategy for writing interpreters is what's called an ast Walker and basically your nodes don't compile themselves to a bytecode they execute themselves directly one of the advantage this is it's generally slightly simpler one of the disadvantages it's generally slower the the reason I went with a compilation of bytecode for this talk was that's how C Python that's how pipe I that's how topaz that's how most interpreters work and so I think it's it's useful to get used to that idea because it will help you make it easier to read bigger more complete interpreters any other questions how would you go about writing a tour rather what were the differences between a highly functional language and highly liked sequential language how would you go about writing a lexer for something like that so the question was what are the differences between writing an interpreter for something like a highly functional language like say scheme versus a writing for something like JavaScript which is a more imperative language and uh most of these steps are actually very similar so scheme it turns out in other lisps turns out are incredibly easy to write Luxor's for it turns out nested parentheses for forever while in my opinion very annoying for humans to read are really easy for machines to read so it turns out if you look around the internet there's all sorts of posts about how to write a scheme interpreter in like 20 lines of Python so the lexer and the parser are very very similar if not a little bit simpler the interpreter often shares basically the exact same design so the design we were looking at for this this javascript interpreter it turns out is exactly the same as the design I used for the Topaz interpreter for Ruby and which was in turn the exact same as the design that pi PI and C Python used for Python so it turns out all these languages can have very similar interpreter designs and even though they look like incredibly different languages on the surface it can be very very similar under the hood just to tack something onto your explanation of my bytecode is good it also gives you another layer of abstraction there so you can go and optimize your bytecode executors and you don't to go mess with it necessarily if you had a new construct to your language yeah that's a great point sort of use saying that it gives you another level of abstraction personally I'm a giant testing nut so one of the things I love about writing an interpreter it turns out each layer is really easy to test so you test your lecture you test it you get you give it a string you test you get the right output you test your parser you test that you give it a token stream and you get the right ast out of it you can then test it you're a estie's compilation gives you the right bytecode and finally that your interpreter executes your bytecode and gives you the correct result so this is for me is a testing that a really nice thing I think we're just about out of time now so thank you guys so much go forth a knack

Info

Channel: Next Day Video

Views: 97,323

Rating: 4.833148 out of 5

Keywords: psf, pycon2013, talk, AlexGaynor

Id: LCslqgM48D4

Channel Id: undefined

Length: 40min 38sec (2438 seconds)

Published: Fri Oct 16 2015