Is this the Future of Programming Languages?

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
looks like we're live hello everyone and welcome to yet another recreational programming session with a Aus and let's make a little bit of an announcement and officially start the stream as usual live on Twitch and what are we doing today on Twitch at. television website today we're checking out QBE language backend right so I'm going to give the link to twitch.tv/ soing the place where we're doing all that and I'm going to Pink everyone who's interested in being pinked and there you go the Stream has officially studed the stream has officially studed so on the previous stream we checked out a programming language called ha all right so it's a very simple programming language like it's a c style programming language and one of the notable sort of like um feature of that language is that it uses QBE backend right for those who doesn't know like a languages are usually split into two parts like a front end and a back end all almost like web applications right so but in case of web applications front end is running on your machine and back end is running on somebody's else machine uh in in case of compilers it's like it's part of the same application that is running on your machine obviously right so and essentially the where lies the front end and where lies the back end this particular line can maybe shift around right so to speak uh but essentially by the front end we understand a thing that uh parts and understands the language right so you have a text file with a program written in a particular language uh and the front end just lexes it right splits it into tokens parses it constructs an as right abstract syntax tree out of that thing and then it traverses that as maybe does some Transformations maybe something something like uh syntax dis sugaring if there's a syntax sugar and stuff like that and then and uh Traverse all of that stuff and generates uh you know intermediate representation right intermediate representation that is usually handed to a backend and a backend takes that intermediate presentation uh applies some optimizations and generates the final machine code right so usually these kind of parts are developed separately quite often by different people right so there are people who specialize on uh like developing only backends and there you go here is the comp a back end right so essentially it accepts an intermediate representation of your program in the form of a text itself right so by itself it is a small programming language already right so and then it applies optimizations and generates assembly uh right and then what you do with that assembly you essentially take that assembly and construct an executable so and this is very small and very not that known back end that I discovered only because I checked out this programming language uh the most popular back end right like the one that everybody knows about is probably llvm you probably heard about this one um right so llvm compile infrastructure it's not even a back end it's an entire compiler infrastructure uh right so there's something with the website so I'm using basically like a dark buil in dark reader feature in Chrome uh right there's a special experiment of flag uh somebody told me on Discord and sometimes as you can see it fails right lvm like a lot of modern languages use LM like uh clang the C C++ compiler uses lvm rust is using lvm uh right a lot of newer languages like I think Zig uses LM maybe Ain I'm not sure about that so all of these modern languages uses lvm lvm is basically the react of compilers right it is basically the react of compilers minus the toxic Community um right but but the the downside of this particular back end is that it is very very huge it is extremely bloated it extremely bloated and it's a huge dependency in in my opinion it has to be justified right so it has to be justified and if you are developing uh like a hob small language like a hob small language depending on lvm is kind of too much it's kind of too much I developed hob small programming languages and not only I didn't use lvm not only I didn't use lvm I uh basically created my own small back end that doesn't even do any optimizations or anything like that right because you don't need optimizations to make your language work right if you're trying to make a working language if you're trying to make a work in language you don't need optimizations to make it work optimization is not a necessity it's a luxury right it's a luxury and that's one of the reasons why people program in Python right so because nobody really gives a right so that's why web is popular because nobody gives a right so it would be nice right so if all things were fast but um it is what it is and it isn't What It Isn't So I think QBE in that regard stands somewhere in the middle right so essentially they even stated themselves in their marketing right it's a compile backend that aims to provide 70% of performance of industrial optimizing compilers and we know what they mean industrial optimizing compile we know what they mean they mean they're talking about right in 10% of the code 10% of the code 10% of the code QB Fosters language Innovations by offering a compact userfriendly and perform back end the size limit constraints could be you to focus on essential and prevents embarking on a never ending path of diminish returns so yeah and essentially that's what I want you to do today I want you to check out this back end right maybe uh like directly programming it because like this back end is straight up its own programming language right it's straight up its own programming language you can programm in it right so I do not understand by looking at this text uh what the is all of that it's worse than PHP but we can try to program in these things and maybe if we'll have time maybe we can write a simple programming language that generates uh this back end right and see how it goes so essentially I want to put this thing on sort of like my tool belt right so essentially if I want to quickly develop like a simple language I know that there is such thing as QBE and I should be able to quickly pick it up and just prototype something so yeah that's basically the idea for today's stream that's basically the idea for today's stream sounds good sounds Gucci sounds at Gucci let's go um so let's acknowledge the subs uh thank you so much to Cube aot for two months let's go subscription thank you so much thank you thank you thank you uh corer thank you so much for subscription with a message season SE season in coping right now uh so yeah thank you so much ker for t one subscription thank you thank you thank you rest in peace VM why are we okay we're buing LM already I mean LM is not going anywhere uh right so all of that is just a clickbait title come on this is Twitch right I'm a streamer like you know what to expect from me uh so but I mean it's it's a it's a cool alternative nonetheless it's a cool alternative in my opinion um right especially if you don't want to deal with llvm and to be fair like recently a lot of people started to complain a lot about lvm right so about it documentation about how you know bloated it is and stuff like that and it's just like it's people are trying to slowly kind of move away like kind of distance themselves Fromm and yeah so this could be like a good alternative in my opinion uh at least for myself right at least for myself I know that not everybody masturbates to Simplicity as much as I do uh but it is it is and it is but it isn't okay good so we already have QB installed uh right so I'll already do in fact have QBE because I installed it from the previous stream so I I think I I'll need to put the description uh like uh link to the previous stream in the description right link uh to be done link to hair Stream So This is where we install QBE so I'm not going to spend time installing it and I think it would be fair to actually include l in the description as well right so as a comparison uh of the the popular um you know popular back end uh so what we're going to do in here so we already have a here folder where we have the source code of QBE uh but maybe I'm going to actually go and create QBE folder right so um let's try to maybe create some sort of a Hello World so essentially um we create a function like this right we create a function like this and we start the function with a word function uh so let's actually do something I remember that uh the extension is called SSA right so the extension is called SSA so this is a function and because of that let's actually set the highlighting to JavaScript right it it doesn't even work what is this where is my JavaScript yeah there we go so this is going to going to be JavaScript so we're programming on JavaScript now actually it's more like PHP right so can I do PHP mode uh yeah so PHP is a little bit better because we're going to be having like these percents and um not percent but dollars and percents and stuff like that I'm really curious what the is w what the is W is that the type it could be the type so my hypothesis as of right now that w means word right so essentially like whatever the side of the word of the current machine so in a 64bit machine it's probably going to be uh 64 bits on a 32bit machine it's going to be 32 right it's the current word sort of speak uh right so we're going to do word so the name starts with percents PHP mode actually works really great with this right you know your back hand is good when it can be easily highlighted with a PHP mode you know is good right so we accept two words uh the A and uh B um so and essentially it's kind of interesting I just had this thought like why in Old languages especially in Old languages especially why there's always a prefix in front of either a function or a variable and sometimes there are different prefixes because obviously just by observe in this language we can see that dollars probably mean function name and percents probably mean variable name right like what's the point of doing these prefixes exactly because a lot of modern languages just don't do that but all the languages like to do that and if you ever implemented a language by the way even in interpreted one you probably know the answer you probably know the answer it makes it easier to actually look up that specific name in a correct place right that specific name in the correct names parsing it dep depends probably not I wouldn't call that parting actually it's it's more like a semantical analysis right it makes it easier to do semantical what the is this okay disable loging off um so yeah it's more like a semantical analysis because right if the name starts with dollar I know that I need to look up this name in a table of functions and not in a table of variables in a table of functions not in a table of variables because obviously they're going to be in Separate Tables probably and if you don't have these prefixes what you have to do you have to check okay let's check uh that in a table of functions okay it does not exist in a table of function let's check it out in a table of variables okay so it does exist so that means it's a variable right and because of that you can actually have separate name spaces for different kinds of names in your language different kinds of name of the language right so if the kind is function uh you you look only in that table if it's a variable you look in another table and because of that you can have functions and variables with the same name and they are distinguished with this prefix so this is kind of stuff that kind of makes sense if you're developing the language but doesn't really make much sense when you're programming that language right you don't really see this internals right so it only justifies something internally so to speak but this is just like my opinion this is how I see that because I implemented like a few languages and every time I was implementing few languages I was kind of thinking that if I had some sort of a prefix it would be kind of easier to distinguish these kind of things super quickly if you know what I mean and of course you can actually keep all the names in a single table and do some sort of a polymorphism you can basically tag different kinds of objects with different tags uh right so you can have unions uh or if you are into oop you can have an abstract class right so you can have an abstract class and basically for different kinds of objects you can have different implementations uh so people in the chat say that uh that's exactly what they did so uh yeah basically table with attacked unions and stuff like that so yeah there they are both approaches I think I used both um right so to be fair maybe a single table with different kinds of projects is a little bit easier because if you have Separate Tables it's just like you have to check sequentially and if you add add a new kind of object right new kind of object you have to visit all the places where you sequentially check all of the kinds of objects and it's kind of a kind of like a mess anyway so yeah we created a function so the next thing we have to do right so we adding these two variables together I suppose this is sort of like an instruction right this is sort of like an instruction so you cannot really write a mathematical formulas in this language right you cannot just do things like uh a b a multiplied by uh b + 32 right you literally probably have to split it into sequences of these operations almost like an assembly because this language is slightly higher level than assembly right so and it is responsibility of your programming language to actually parse the expressions and generate all the sequen of instructions right um so and here what we do we actually assign to C it's interesting how I don't have to really declare anything in here so it does all that automatically and I simply then return that c I simply return that c so that's pretty much it right so that's pretty much it um and uh we also have to denote start in here and I'm not really sure what start means so far everything kind of makes sense except the start and I would assume that maybe it's an entry point of the program but but start exists um for each function and you may say oh it denotes the startat of the function okay but isn't this supposed to denote the startat of the function like like why U maybe you can have some code that is before the startat of the function um so I don't know so far it feels like an extra syntax but maybe I just don't understand something maybe I just don't understand something so the next thing we have to do we have to create the main function uh right so we have to create the main function we have to do expert function because this back end is also developed by JavaScript developers right so you have function you have expert function uh and this time we're doing a main right so and this is probably going to be the entry point right so this is probably going to be the entry point uh and apparently right so this is how we call uh this function right so we do add we say we pass word one and word one again then we just call it like so right and all these prefixes probably are needed to indicate the type of the sort of like a literal that we have in here right so we indicating the type of the literal uh sure so then we're assigning result to that and then we call to print F so we then use l FM l fmt right and as you can see fmt is the data right you have functions and you have datas right you have functions and you have datas so and they use l in here so which is probably an an analogous to W right it denotes the type that we're interpreting this thing as right it denotes the type that we're interpreting this thing as um so and essentially we Define the data like this this is actually a pretty cool syntax right so you have like functions as object so they hold code functions hold code and data holds data right so and B uh denotes that this is a sequence of bytes so you can use string literals to den note sequences of bytes and sometimes you can just do extra bites it feels like a fosm right it feels like fosm right because in fosm you can also do similar things so if I do something like f uh maybe uh like this right you can maybe have hello right and you can say DB uh and so that will allocate the bytes and you can have Hello World in here and then you can say zero right and all of that is just like a one sequence of btes with a zero at the end so I think it it it is kind of analogous to that right I think I feel like um so uh now we have to put this thing probably to indicate that this is a vartic function right so we're sort of like this is my hypothesis I don't really know I haven't read the documentation of this thing yet but I think we are indicating that this is a uh vartic function and because of that the things are passed like by the stack or something like that but as far as I know um even vartic functions in C follow the like you know X 664 Co convention but whatever so and we're passing the result of the add function in here and then just return zero okay so far I understand pretty much everything except the start right I don't really understand what's up with that and not fully understand how this particular syntax work but apart from that spread it straightforward right so you can even program in this language directly manually why not uh you may say oh it's it's the low level you're not supposed to program in a low level language I mean I program an assembly just fine skill issue dab dab dab right um yeah so anyway uh now we uh need to actually generate assembly out of that right so let's go ahead and generate assembly out of that we can do QBE and we're going to provide this uh thing there we go so it kind of worked but then uh it says invalid class specify okay so that's kind of weird oh this is because when I'm ass signing I supposed to say w right so what kind of assignment I'm doing right so I'm doing an assignment of the size of the word right if I try to do it one more time there we go it generated actually assembly uh right and we can clearly see so this is ADD function and it's kind of cool that it used registers in here right it's just yeah it directly used registers so um so we passed stuff from yeah so it probably accepted the first parameter y RDI right then moved it into ex and then just added to RSE actually it goes like this so right it added to that and then added to that and through racks it Returns the entire thing okay that's cool uh it even used leave right which automatically sort of restores the stack I suppose yeah I think that's what it does uh and then it uh generated the main and then it generated the data so that's pretty cool so you can actually specify like the name of the file uh into which it's going to save assembly so I think we can do something like this uh right so o main s right and if I open S uh that's basically the assembly so maybe we can even create uh the make file uh which is going to do all that so it's going to main s um main SSA uh QBE o main s main SSA uh and if I try to make this entire thing right as you can see it does in fact compile so and after that we can just generate a program out of that right so let's actually say that we depend on this thing and we're going to do uh this thing so CC I suppose will automatically use the assembl compiler when it sees the extension right so and let's actually try to do that okay so it generated the final thing and we've got the final executable uh we can take a look at what it does right so this is basically elephant stuff uh doesn't really have debug information I wonder if you can make QBE generate debug information but that debug information is probably going to be part of the assembly I don't really know how exactly it works but anyway and we can now try to run this entire thing one and one make two so yeah that that's basically it right so so far so good um so you can program in this thing right so you can actually program this thing and if you take a look at what kind of executable that is it is still Dynamic executable I don't really know how to make static executables with this kind of stuff uh right but yeah you can just use it as an alternative to C but it's actually rather low level again right so you don't have probably don't have mathematical expressions in here uh right because your language is expected to to actually do that uh your language is expected to uh parse all of the um you know mathematical expressions and then generate the the code out of that um okay so let's actually take a look at the docs uh so quick comparison to LM to be fair I don't really know LM so for me comparison to LM is not going to be that useful so I'm probably not going to read that so let's take a look at uh the actual documentation so you can have basic types um so what do they say the Intermediate Language is a higher level language than the machine Assembly Language we already noticed that it Smooths most of the irregularities of the underlined hardware and allows an infinite number of Temporaries to be used an infinite number of Temporaries as opposed like temporary variables because every time you assign something you kind of create like a temporary variable by the way so the the language is based on something called SSA right so it's it's based on something called SS United States Social Security adist yeah compiler uh so can I just do yeah maybe in Wikipedia Maybe Wikipedia also will go to security thingy so it's a static single assignment form right it's a form of intermediate representation that looks like this right so actually I I think it looks like this more more or less right so essentially it's a sequence of assignments of things it's a sequence of assignments of things and uh you are allowed only to make one single assignment and if you are modifying a variable you sort of like denote the version of that variable uh like with a number or something else right and from what I understood it simplifies some of the optimizations right so it simplifies some of the optimizations for instance like this is a I suppose classical example right this assignments is useless right this assignment is useless but for a computer it's kind of difficult to figure out that this assignment is useless but if you uh allow only one single assignment and basically denote different versions of the variable with like some sort of a number it is very obvious that one of the variables is never used in here like y1 is never used so we can easily eliminate that so it's uh yeah everything is like shadoow or something yeah um so yeah I suppose and as as far as I can could understand it has also like number of useful properties and that's why compiler developers really like this kind of thing I never work with that right by the way I never work with that all of that is very new for me and that's by the way why I'm checking it out because maybe it's going to be useful for me in the future uh right so maybe I'm going to develop my own compiler back end that uses this kind of stuff so I'm trying to learn something new if you know what I mean uh I'm trying to use something new John doesn't seem to like SSA well good for him that means he knows like what is SSA and how to use it and why it doesn't work for him so I'm really happy for him for for all of that right so I don't know what is SSA I don't know how to develop it and that's why I'm learning it right so maybe after learning I will come to the same conclusion right so because the way I usually uh I usually structure my intermediate representation uh I basically develop a stack based machine right my intermediate representations usually just basically stack based machines uh and it's kind of a dumb way to do do that right but it kind of works for me for now because I don't know any better so maybe I'll learn something better we'll see we'll see so anyway uh I'm going to put this thing in the description right I'm going to put this thing in the description and let's maybe a continue let's maybe a continue uh right so input files okay that's pretty cool the um Intermediate Language is provided to QB is text we already figured it out so here is the complete hello world uh right in hello world we have a data with hello world we just call to function puts and that's about it BNF notation I'm not really interested in BNF notation right I'm not a machine I'm human so I would rather prefer like examples rather than like a formal description of the uh of the syntax right because it's kind of useless for me and it it is useless for the majority of the people except maybe they you know language developers themselves uh sometimes it is useful right when you want to check out like specific concrete situation but then when you're just learning thing uh the examples are usually better so we have SIMPLE types right so we have base type word is probably long short double word probably I don't freaking know extended type uh so extended type are one of these things and also B and H uh okay so there's no description in here what they mean but there is a description in here so the Intermediate Language makes minimal use of this types by Design the types used are restricted to what is necessary for an ambiguous compilation to machine code and C interfacing okay so making more complex types like structures and arrays and stuff like that is actually kind of up to you up to the language so what they provide they provide basic types and an ability to maybe like allocate like memory and within that memory it is your responsibility to actually sort of part the layout and have more complex types if I understand that correctly which is good right so which is good so they provide the bare minimum to smooth out the differences between the hardware right so w is a word L is long s is single right so D is double I suppose these are floats right yeah this must be floats by the way single and double so they stand for Effective 32 integers and floating points yeah go there are no pointer types available pointers are typed by an integer this is exactly what I'm talking about chat pointer is just a number there you go look underneath the language look it's crazy no thank you so it's it's just a freaking number memory is just an array of bytes and pointer is just an um so anyway um there are no pointer types available pointers are typed by integer type sufficiently wide to represent all memory addresses uh right Temporaries in Intermediate Language can only have have a base type so in Temporaries I suppose are these things um so extended types contain base type plus B B and H half of a Bo respectively for 8 Bits and 16 bits integers they're used in aggregate types and data definition so aggregate oh wait it does have structure wait it allow it just like has structures like why do you need a higher length language that like you can program in this already like [Music] why what so how is that different than just generating C it has structures what's next it also has unions it's a full-fledged language like what's my job then um I freaking know uh they're used in agregate types and data definitions okay for interfacing the intermed language also provides user defined aggregate types as well as signed and unsigned Varian variants uh of the subword extended uh types uh read more aggregate types and function section though maybe Intermediate Language being aware of sort of like structures and everything is very useful for you know X8 664 call convention right because um xx6 64 call convention on Linux is kind of anal it uses like a very specific algorithm to figure out what goes into registers and what goes into stack right so essentially if you have a structure so and we actually encounter that we actually encounter that when we were interfacing to rip through assembly right uh so uh let me actually go to the C mode if you have a structure uh that looks like vector 2 right and it has two components like float X and Y uh and then you pass this thing to a function right so you have a function that basically accepts this Vector so this Vector is going to be passed via a single register this entire thing so both of the flows are going to be put into the xmm like zero right both of them so an entire structure if it fits into the register it's going to be passed bya the register so for intermediate representation if it's supports this call convention it probably needs to know like a little bit about how structures are organized and stuff like that so it can make a decision on how to pass this kind of stuff via the register or via the stack and and everything because as far as I know this Intermediate Language kind of does all of that job for you right so it figures out the call convention for you so you don't really have to think about it right so you just pass things around and that's about it um right so subtyping this even has subtyping okay oh for for primitive types right so inate language has a minimal subtypes in fature for integers types only any value of Type L can be used in W context okay so uh in that case only 32 least significant so L can be used in a w context but isn't L bigger isn't L bigger wait a second uh I'm pretty sure I I saw long yeah so long long can be used in word context so it's automatically converted so it's cutting them all right so that's a bit weird but I mean okay uh so subtyping um so it can be used in Long cont text all right so you have constant values you can have constants in here and there we go you can do subtraction you can do add I would like to have like a reference of all of the separations that it supports so you can also have um you know linkage right so you can link with things so section init array or something definition aggregate types um so yeah here are the aggregate types you can have like tuples or something um so this is a union types right you can have functions uh so far so good low W you're reading oh this is a very interesting synta okay so this okay okay okay okay okay okay so we Define a type we pass this type like this and then we essentially just use it as a pointer so we're loading and uh yeah okay so add bite this is very cool they're using different interesting instructions and I think there should be a reference guide of all of the instructions that you can use because okay so instructions instructions are a small piece of code in the intermediate petion to form a body of the blocks okay so yeah because it's cool that you use all of them what's the least of all of them so I can see uh so there's also jumps and everything for control flow uh so so you can jump to identify um so this is blocks uh so I suppose uhuh uhuh okay so I suppose that's basically the list of the instructions right so they documented in these sections okay so you have a section of memory and in the section of memory here are all of the instructions related to memory right store this thing then load these things okay so there's also blit instruction copies and memory data from its first bite section and oh this is actually pretty cool right so it's like a M Copy right so all the instructions for copying so comparison conversion uh casts uh so this is a call so you can work inside with VAR adic right inside of the F body of the function um right so so this is C I suppose the instructions are specific specific to SSA form uh as some sort of specific instruction uh okay so the instruction index here it is all right all right all right all right all right so okay that looks good that's that's basically the entirety of the intermediate representation isn't it so that means I know everything roughly I I kind of skipped the part about the compound type and like their specific and stuff like that but I mean is is it really that important is it really that important so uh yeah I guess that's that's basically it so I really like that just like you can basically scheme through that and get an idea of how this entire thing works and also it also gives me ideas for some of my uh you know languages in the future so uh there there are users right there are users so oh they actually actually featured ha they actually featured ha in here COC is a comprehensive C11 front end for QBE right so somebody actually implemented a c compiler using QBE uh damn right this entire Community like of people who develop these things so it's just like very freaking active so there's secc damn damn I can want to maybe write like a simple language simple programming language uh that uses this thing um but I need to Define like a scope for this language if you know what I mean like a scope because if I start implementing language it's probably not going to be like a single stream right so what kind of a simple language can we Implement um right to that will fit within the Stream So when Newport uh I don't know like we can try to do do that but we can Implement a simple P why not right so it's just like uh but the thing about p is that it kind of assumes like the stack based machine this is not particularly stack based machine but uh essentially if you have the following language right so let's actually do the text M uh right you do one two plus and then you like print this entire thing how would you even create like what can you even generate in here right so you generate a function W uh main uh something like this right and you're supposed to put start in here and every time you push like an element on the stack every time you push an element on the stack what effectively you're doing you're creating a new temporary variable right so we can say okay so S1 uh is going to be be one right so this is is this how we do that I'm I'm not really sure so um we pushed the first thing on the stack so the stack will consist of Temporaries right so the stack consists of Temporaries so then we push two onto the stack right we push two onto the stack and then we do plus so effectively what we're going to be doing so by the way I think this has to be like like this thing uh we do S1 uh add uh S1 S2 right and uh after that we are just doing call Print f um so maybe something like fmt uh fmt integer right so it's specifically for the integers and uh what we're doing in here we're passing this thing on the top and obviously then we just return like zero or whatnot uh I think I think that's how we do that so return um let me find documentation because I'm starting to forget the syntax but I mean I just learned the syntax today right so I think it's uh yeah you can just do return zero you don't really have to do that uh and then if I'm assigning something uh right if I'm assigning something can I just assign like um like do I have to provide this thing like that I don't think I have to right so because I already say that it's going to be word assignment so I can just do it like that right and think it is necessary uh right and essentially like this is what you would generate out of this code right so this is what you would generate um which sound reasonable right so which is pretty reasonable we can try to implement something like that uh right so that's pretty cool that's pretty epic but um something even more interesting would be to uh write um a compiler that understands something like this right so that takes this expression and then generates this out of this expression right so and that will require implementing something like a PR P if you know what I mean right so PR a Passa is that how it's called Uh PR Passa yeah Wikipedia so this is something that yeah this is something that John recently talked about right so PR pass we can go into that uh implementing Pro is not really that difficult so we can spend some time doing that uh all right so uh I can put that in the description for anyone who's interested and uh yeah so let's actually Implement a compiler that understands this kind of thing right uh that understands this kind of thing I think it's it's going to be kind of interesting uh so let me kill that uh and so here is just like a hello example I would like to maybe call that hello example slightly differently uh right so one + one right so let's actually call it 1 plus one because literally what this thing does it just adds together 1 + one so I wonder which move mode is better I think JavaScript mode actually highlights this thing way better I think it highlights way better so we're going to be using a JavaScript uh mode for highlighting the QB intermediate presentation uh right and in here uh where is the make file I'm going to query replace main with one uh + one boom and I'm going to rebuild this entire thing a boom there we go so we have 1 plus one example I don't want to remove it because I think it's a pretty good example that we can refer to and and um the next thing we can do we can just try to uh we can just try to uh implement the compiler right so let's actually call compiler so what we're going to be doing we're going to be using C right so uh let's actually go ahead and uh do that um 3 month's best inspiration to True enjoyment of programming thank you I'm really glad that people uh find you know um find the enjoyment of programming Again by watching my streams because that's kind of the goal uh the the reason why I even started to stream in the in the first place is because um I got so demotivated with the office job it sucks out all of the enjoyment uh of programming from you it literally kills programming for you uh so at the time right so I was working my office job and I discovered streaming and I realized that when I stream and do whatever the I want I remember those times when I only discovered programming and actually enjoyed it and I really want to preserve that I think it is very important to not forget why programming is enjoyable because if everyone forgets that programming is enjoyable well the industry might die the entire industry might die so um I believe my purpose in world is to be the keeper the preserver of that enjoyment that people forget over time uh right and people forget over time not because they're getting old or anything like that but because of the of the office job right it sucks out all of the enjoyment uh of programming and it is very dangerous right it is very dangerous because again if we don't enjoy programming we may stop doing it and if we stop doing it well all of INF structure depends on programming on software so it is important it is important um I do actually believe that whatever I'm doing is important I work in a startup so it's pretty fun creating new things uh Liberty is given in startups I actually never worked as startup to be fair I just realized that I never really worked as startup maybe I should give it a try maybe I should give it a try because I worked a lot in like this huge corporate Enterprise you know uh companies and that's kind of the perception of uh you know real programming that I've got right so and that's why I absolutely hate it so maybe I should have tried different kind of programming right so already tried recreational programming and streams maybe I should give it a try to startups at some point maybe all right so uh let's go let's go so what we probably want to do we need an ability to uh read the file right we need an ability to read the file and uh as always we do that through Noob right so which is a you know a special library and a build system that I use all the time um right so let me find the knob Doh and since I'm going to be using Noob I might as well actually like use Noob instead of make file uh but we'll see how it goes we actually see how it goes I'm not really sure uh I just don't want to break my flow just want to don't want to break my flow uh and just do that so implementation right so the thing about no is that it has a crossplatform function for reading entire file uh right so there was a read entire file so there we go so here is the function that we need and uh so essentially we can have the uh the input right so this is going to be input and how should we call um you know what's going to be the extension of that language what's going to be the extension of that language because uh the language looks like this right so we need to somehow compile these things so let me actually come up with the with the expression that we want to compile 35 plus um+ 17 * 2 right so to be fair this looks like python so let's actually we're going to be making a python compiler holy yeah so let's actually do it like that so we're going to be making a python compiler we're going to compile subset of python how about that how about that I mean it's a we can actually make a legit python compiler but for a subset of python it's it's a subset of python right so you can take an official python compiler uh interpreter I mean run it and it will work right so and then compile with our compiler so it's subset subset subset right uh so it's actually pretty cool such a dumb idea but I mean uh so is my entire Channel um so main. Pi so we take that [Music] and it's going to be Pat and uh let's actually put this stuff in here we're going to be allocating the builda and we're going to be reading all of that into the builda and if this entire operation fails uh we're going to return with non zero exit codes right so with non zero exit code so let's do no blog no info and we can say uh so contains uh this amount of bytes right right so we can say path contains uh SB count amount of bytes all right so let me go to make file and we're going to have a compiler compiler compiler C uh and let's just you know compile it we going to be compiling the compiler compiling the compiler and uh let's actually make it okay so that seems to be working that seems to be working and if I try to run that it says then main. Pi contains 17 bytes and it is in fact true uh it contains 17 bytes that's pretty cool so the next thing we need to do we need to uh tokenize this need to tokenize this I have written so many goddamn tokenizers um do I have to write it again usually writing a tokenize it takes a lot of time but maybe it uh let's actually take the C tokenize uh nothing says to be right uh I'm reaching the levels of efficiencies never seen before on sing channel right I'm literally reusing all of these components uh right so Lector the back end and stuff like that but the here is the thing um my the difference between me and those web devs is that I understand how these components works and if I need to I can Implement them right it's just like I don't want to so yeah so it's it's the same like with the rayap people think that I'm such a fan of r i I'm a r fun boy but the reason why I like Ray is because I can implement it so the thing is like if I were to implement like you know the library for graphics that I can reuse I would pretty much end up with a similar thing maybe a little bit more complex API but I'm gonna end up with pretty much the same thing so the only reason why I don't do that is because I can just use radio because it's like almost exactly the thing that I would do so you know what I mean right um so that that's the reason that's the reason I like to use things that I understand to be fair granted I do not fully understand QBE but I'm in the middle of learning it so maybe uh I'll just learn how you know how to use it and uh try to implement something similar okay so um what we need we need to have a Lexa right so there's a cxa and we're going to reuse that thing hopefully I already used it one time uh so but I forgot how to use it [Music] um I for how to use it so let's actually grab that and uh there we go so w get and let's include the uh stbc Lexa stbc Alexa play dispositor so implementation right so you have to include this kind of stuff right so let's define implementation for those who doesn't know it basically instructs the header to act like a c right so if you never encountered like header only libraries uh so this Library this single file acts like an entirety of the library if you just include it it acts like a header file that means it doesn't contain any implementations if you do Define it also includes implementations the reason why it is done like that is so you have control where to put implementations so you can actually include the header as many times as you want right so every time you include it it will um you know only include the uh declarations but then through this macro you can also control where you put the definitions so during the Linker linking you don't have the linking conflicts because of the several implementations of the same thing right so that's basically what it is that is basically what it is anyway uh so I forgot how to use this goddamn Lexa uh so there is a huge structure I do remember that there is a huge structure called stb Lexa right so here it is so let's go ahead and create Dela uh so I'm going to call L and I'm going to zero initialize this entire mother flipper right so there we go we allocated it on a stack and it is in fact initialized right so the next thing we have to do we have to call the initializer we have to call Le Constructor Le Constructor so we have to pass the pointer to the Lexa in here right so here is the pointer to the Lexa so then we have to pass the input stream the input stream is located in string Builder so we can say here it is then we have to supply the end of the stream where it ends we don't really have the end of the stream but we do have the size of the stream so we can turn it into an end by adding to the beginning right and there we go so this is where the end is located another interesting thing that this U lexer accepts is the string storage right it is a string storage so essentially when it parses string literals or stuff like that it may allocate a little bit of memory but it's going to be allocating it in a temporary buffer and you are the one who allocates that temporary buffer for the library so uh let's actually do that I don't know uh so I'm going to actually put it like this in here and we're going to just allocate so how much we want to allocate how much we want to allocate uh so that should be enough for everybody right so that should be enough for everybody I think so we might as well even uh put something like uh store in here so this is going to be store and that's essentially the size store length so we also want to assert that the store uh is not n right so Malo didn't fail because Malo does not fail right it didn't fail because it doesn't fail uh so and that's why we are searching it and there we go we initialized the Lexa oh it's actually 40 I'm sorry I forgot the mem thank you so much for correcting me uh thank you so much for correcting me SB items count minus one and usually okay so the common sense tell me right quite often in Computing you have two ends that denote sort of like a sequence in a memory begin usually points begin usually points at the first element and end usually usually points at the element after the last one so that's the usual sort of convention that uh we have in here the reason why it is like that the reason why it is like that it is super easy to calculate the size of the sequence by just doing this because otherwise you would have to do plus one but that plus one is baked in the fact that the end points are the last one right so that's the usual convention I don't know for a fact if selector in it follows that convention but because it's so common I just automatically naturally assume that that's what it is we can check it out actually we can check it out right so uh here it accepts input stream end uh right so and what does it say points to the end of the file or null if uh you use zero for end of file oh okay so you can actually put a null in there I didn't know that right so that simplifies everything but we want to probably figure out how exactly it is used so let's actually go through the usages then it basically assigned to the end of file right it's assigned to the end of file so we need to check how end of file is used uh right so as you can see you have the current parse point and you subtract the par point from the end of file and that thus getting the size get the size of the uh of the remain sequence which actually confirms my hypothesis yes it must be uh you know the the pointer to the element after the last one because otherwise this formula won't work it's literally the formula that just just showed like it's literally the one that I just demonstrated in here right end minus begin end minus begin so I I just guessed it right so because that's usually how it goes uh we can go even further but I think by the way uh since no we don't really use uh n termination right so we don't really have a n termination in here because the string Builder doesn't really append n Terminator but if it was n terminated we could have actually put a null in here and that would work as well uh right thank you for asking this question by the way thank you for asking this question because uh without it I would not look into the uh source code and I wouldn't discover that you could just put null in here if you use n Terminator we don't use n Terminator but that's a very important thing to actually consider because that allows you to maybe put string literals in here right so you can put code in here like that uh right so 35 + 2 multip by 17 right and then you can put uh n in here so and then you can lacks that specific string visual but uh we actually accepting all of that from a file so anyway um right we have initialized Lexa we have initialized Lexa so the next thing we want to do is to iterate through the lexum uh but how do you do that I don't remember so we see um let me find a function that looks for the next Lexus so here it is this function returns not zero if a token is pared or zero if it's end of file okay uh so while we get the token what do we do we're going to be actually printing the token so we have a token ID uh right which is a Unicode code point for a single character token less than zero for multi character or Ile or error uh so code point for a single character points okay that makes sense actually that makes sense so we can do Lexa token uh and if it's greater than zero if it is greater than zero we should treat it as a single character right so we should treat it as a single character so this is going to be that and uh so we can do just that so it also uses like a unic code uh but I don't really care about it so if it's a multi character right if it's a multi character it probably uses a very specific ID uh right for example one of those things um maybe like an ident and whatnot uh so identifiers so yeah uh ident how do we know if something is identify but maybe it doesn't really matter so well well I mean it does matter because we do have identifiers in here so um yeah we probably going to have this this this and this as characters and then we're going to have this as an identifier print is an identifier and this is a number so we need to be able to distinguish between identifiers and numbers right so that's one of the things we'll have to do uh so there's a select uh select SP server so that means ahuh here they are found them found them mother found them so we're going to be distinguishing between uh clex integer and there must be identifier somewhere here um maybe name is there something like name uh plus equal character literal ID okay found it freaking found it okay that's cool so and in here I suppose we can do something like uh switch Lexa token uh switch and we're going to have case for Collex ID right so we're going to do something in here and also clex uh integer right so this is going to be case clex integer boom uh and in case of an integer uh how we're going to be doing that so this is the Lexa so this is the number all right so in case of the integer we can just do uh d uh like so mhm Lexa uh not really talking int what it called int number so this is a int number and for the ID I'm actually really curious how we're going to be doing that probably through um string L right so if I do clex ID is there any examples on how to do that so Collex token Collex ID yeah there's no really yeah it's probably stored in a string I see I see it's probably stored in a string so and uh let's go and just do something like this and as far as I know you probably want to do something like this right or maybe like this Lexa string length and then uh Lexa uh string so that should be basically it that should be basically let's try to go to the compilation errors uh store length uh I actually accidentally deleted apparently uh so let's undelete that what's next we have uh so none of those things by the way are Pointers so I don't know why the am I doing that but Lexa uh Lexa dot so we're going to do something like this uh boom okay that seems to be compiler so let's actually try to run the compiler and that kind of worked except um yeah so for the numbers it definitely didn't work right so that was kind of weird so and for the yeah but it managed to parse these things right so it managed to pars the special characters and stuff like that but for the identifier that was kind of sus so let's actually say ident uh int uh and see how it's going to go so I'm going to rebuild this entire thing uh it did not it didn't even print the information like it didn't print did I something up chat did I something up um so because it doesn't make any sense um right I would at least expect it to see to have this prefix if you know what I mean right so let's actually remove all of that uh yeah this is dumb this is absolutely dumb um because yeah what um if I don't provide any of that um if I provide this oh yeah I'm an I'm an idiot apparently it's still bigger than okay okay so I suppose we are kind of expected to do some something like this then all right so if it is like uh we expecting plus uh right we're expecting plus we expecting this uh we expecting this and we also expecting uh this the rest of the stuff we can kind of ignore or maybe even uh you know complain about if you know what I mean right so this one is going to be more like uh more like this Lea token so that's basically what yeah yeah there we go that makes sense and there're these things are still considered identifiers so that's what's weird about it right so they are still kind of well I mean this is because I made them so I'm going yeah go so this is c um maybe even we can call them punctuation signs right so that's what they are um and in here for the identifier right so we have to do something like this so Lexa string Lex um yeah string Len uh string and this one is going to be uh Lexa uh in int number all right we almost did that uh except for the string and the problem with the string I can never remember is this SC yeah finally there we go for some reason this is a cool Library it saves a lot of time but using it I keep forgetting how the to use this this right it's just like ah right token IDs and it's just like ah why is it so bizarre anyway I mean it kind of makes sense right so it kind of makes sense but it's not particularly intuitive another thing we can do we can always say um you know unexpected token right so assert zero uh unexpected dokum uh we can also provide the location where this entire thing is located right so if I'm not mistaken in the Lexa we do have the position location uh stb location so okay so you can get the location um so this inefficient function Returns the line number blah okay but what is where what is where um [Music] I don't understand what is where but uh so input stream you okay okay you start with the input stream while the input stream is less than where is that the beginning so this input stream must be less than where it must be less than where I hate this function I hate this function honestly it's just like okay so let me let me see so we have a token but I mean it's a very inefficient function as they say we have to provide the Lexa we have to provide Lexa we can allocate the location on the stack that is fine um uh honestly sometimes like writing the lecture from scratch is easier than figuring out how to use this Library over and over again um I think I already used this thing one more time in cougle yeah I think I can just look up uh how to do this thing cougle um yeah where is the uhhuh location mhm no I didn't use it and there's no even example on how to maybe we can find something on the like because the library itself doesn't contain any examples on how to use because you um so maybe something somebody used it on the internet and nobody on the entire internet use this right few words would have saved so much time um but I'm not supposed to complain this library is free I didn't pay anything for it um where first Char obviously you dummy dum dum obviously uh thank you whoever like wrote this thank you right thank you thank you thank you thank you you you you saved me sometime all right so um so this is where so this is what you're supposed to put here which is so bizarre I already passed Lexa in there just use it maybe it's like you can customize like maybe there's a last character or something I don't know anyway um so yeah so here we're going to have a path and then look uh right so where is the location uh it's a number uh line number and line upsite okay so let's actually do um line um line number so in here we're going to do a similar thing the thing is like I already used this library before but I keep forgetting how to use it you keep fing to use it it's just like uh Lexa is supposed to be pointer damn bro like [Music] um wrong argument order what the are you talking about uh get location you're trolling uh ah no you are literally trolling who said wrong we need to ban this person uh so oh yeah that's what you mean okay mod unban them unban them all right no no no keep them banned keep them banned there we go finally and they all on on the same line which is perfect so uh what I want to do in here uh I want to do something like this uh right so and here we're going to be doing the offset uh offset ah too long uh number finally oh this is horri by the way because yeah we probably need to so the line is already one based but the offset is kind of zero based uh right but I wonder if I can just do something like this yeah it seems to be working so we probably need to do offset slightly different lock line offset plus one I'm going to just like add one in here so hopefully that will work a little better there we go we are iterating through the tokens as you can see yeah there we go we part this mother flipper we part this mother flipper uh so we probably don't want that look at that so the only thing that is left to do is to build like a syntax tree out of that shis out of that shis but I'm already streaming for like 1 hour and I ran out of te I ran out of te and you know what that means you know what that means let's make a small break um all right so let's continue what we want to do um I suppose we want to start generating some stuff um so essentially essentially uh I'm going to maybe just open a file right the output file maybe I'm going to even have uh like out path which is going to be main SSA right so the input is main. pi the output is going to main SSA all of that is hardcoded as for now but that's besides the point so this is going to be out we're going to open out path and we're going to open it for writing in binary right so and if it fails of course right so equal to null uh we're probably going to just say bruh uh you up Noob error um could uh not open file s for writing uh right and we're going to also Prov S as an explanation so out path and uh strr error error no and in here we're going to just exit with non zero exit code so at the end we're also going to do F close all right so we're going to close the out uh so and I suppose one of the things we want to do we want to print some sort of like um what is it called prula for our program and prula is going to be essentially like this right right so the beginning of the function right so this is going to be just the beginning of the function uh right so let's actually copy paste this stuff in here so this is that and this is that and we can simply print F print F all of that stuff in there uh so this is going to be new line and this is going to be FR print F out uh like this so this is the prula of the function that we are generating and then afterwards we want to kind of close the function so it's going to be f print F uh out and we're going to just close it like that we're going to just close it like that for we can just test how it works so file requires STD lip right didn't I have where is the where's the compilation uh right so semicolon there we go so main. SSA there we go right so this is what we have in here so we generated like an empty function though it probably would make sense to also return something uh right so it's going 1 2 3 four return zero uh we we don't really have to put the semicolon in here right so let's go ahead and just do that and open this for good new line that happens that happens I'm sure uh let's revert boom so and we're going to be generating the code in here somewhere we're going to be generating the code in here somewhere um so I suppose we need to make start uh parsing things um right so we're going to have two kinds of maybe Expressions right so so let's create enumeration that is sort of like a tag for an expression right a tag for an expression and that is going to be essentially maybe expression kind and uh so if we look closely to what we have in here we have maybe three kinds of Expressions um maybe actually Four the first kind of expression is a number maybe an integer right so we can call it an integer then plus operation multiply operation and function call right and function call so essentially like all of these things so this is going to be an integer uh then plus um then multiply and then fun call right so these are different kinds of Expressions that we're going to have so and then we're going to have some sort of like a tagged Union right so but I like to have them as structures so essentially here we're going to have kind and then the kind in here and then the second field that I like to do for the expressions in C is uh something that I call uh expert as as and this specific expert as is already a union it is already a union and it could be one of those things right it could be an integer right so so um I don't know it would be better to actually call it maybe number so let's actually call it like a number uh right so then you could have maybe expression uh struct expression plus right struct expression plus uh so it's going to be implemented later right so to do Implement uh and we're going to put it like this so it's going to be plus plus and it is a union so the reason why I like to do it like that is because it allows me to do something very cool right so I can have an expression right so and I can check the Expressions kind I can check the Expressions kind is that expression is a number if it is a number I can say expression as number and I can view that thing as a number so it's sort of like a tagged Union right so this is the tag and this is the union and it's a two separate so sort of like fields and syntactically in see it looks kind of nice right so expression as number or if this is a plus I like to do expression AS Plus so this is sort of like the pattern that I developed for myself as I was programming these kind of things and see it's a little bit cuc cumbersome when you define all of these things but once you have them they kind of make sense um right so they do kind of make sense right so but this is a C programming language you kind of have to do this kind of stuff so yeah maybe we can even have like um you know a single expression kind in here which is called bin op right and uh this one is going to be expression bin op uh and in here we can have Bop kind right and Bop kind could be two things expression Bean uh like Bop kind so B binary operation uh Bop Plus multiply so and in here we can have like left and right things expression uh left maybe left hand side and right hand side right so this is Bob essentially this is Bob um so and here we're going to have bop bop uh and the last one we want to have is expression fun call right so fun call and fun call what is it going to be uh expression fun call H expression fun call so it needs to have some sort of a name write some sort of a name but to keep things simple to keep things simple maybe uh we need to yeah let's actually be very very much specific instead of fun call let's actually call it print uh I think it's going to be a little bit easier to work with because if you're going to do fun call right if you going to do fun call you need to have a name which is going to be probably some sort of a string maybe a string view right it must be a name then you have to have arguments and arguments are probably going to be expressions and that means you need to Define Expressions as the uh you know the dynamic array right so that's going to be another thing that you have to think about but maybe we can do that right so it's not that big of a deal right so and also it has to be a capacity because we're going to be reallocating that all the time um so and um yeah if we're going to have fun call okay so if we're going to have fun call um so how difficult it is to have a string view because string view is basically going to be identifier right so it's going to be identifier so we'll have to copy that string view somewhere uh right so we have to put it somewhere so maybe string duplicate and what not we can try to do that but yeah uh so that's basically the syntax right so you have Expressions expression kinds Bob kinds uh and that's like literally as the entirety of the as right so that's how we can kind of Define that in C uh right so obviously it's not going to compile right so because there's some uh definition missing stuff like that so expression for example here so probably want to forward declare this specific thing uh right we want to forward declare and uh let's actually find exper like this and uh let's put it somewhere here so let's actually put it somewhere here and uh let's recompile this in Stu so this has to be the kind the string view uh right so but again be using string view from Noob maybe right so okay so all of these definitions do in fact compile right all of them do in fact compile and uh so yeah we we can start with defining like um uh recursive descent pars right and the idea of recursive descent par is going to be the following right so essentially parse expression where just par in a certain expression we're going to be accepting stb C Lexa as the Lexa it's actually just stb Lexa as an input and I suppose we can accept the pointer to the expression as the out output right because I want to reserve a result for indication and error right so for indication and error so then it will be easier to actually unwind the entire recursive process if you know what I mean uh right so here we Pars in the expression so here we par in the expression and to be fair we can actually hard code uh precedence as of right now we can hard code the Precedence as of right now so and for each individual sort of like um thing that we have in here we're going to have a separate parsing sort of speak uh so for instance we're going to probably have separate like parse plus right so it's going be parse plus uh maybe exper plus uh for the consistency uh and this thing is going to maybe accept sort of like a plus in here right so here we're going to return false and here we're going to have X for uh Mt right expert mol and this is going to be also Mt uh and mold so and we could have like separate function for parsing integer and parsing function calls but integers and function calls are so-called uh primary Expressions if you take a look at the prod algorithm there's a BNF uh in a PR algorithm I think um so this is not not particularly great way to actually demonstrate this thing um so there was a I think it was a precedence climbing climbing uh Operator Operator presidence climing okay so was it yeah there we go so that's how we usually do that right so expression right so you have like a root expression you have a root expression then root expression is equal to the expression with the lowest uh precedence for example equality expression equality expression is additive expression with a higher presidence on the left equal not equal an additive on the right an additive is multip multi multiplicative uh which is uh you know higher priority and multiplicative is primary multiply by primary primary is sort of like an expression with the highest precedence if that makes any sense so as you can see sort of Rec Ely go down and sort of climb the Precedence so in that case usually people clamp together things like uh numbers and function calls as the primary Expressions right uh as the primary expressions and because of that we're can to have something like um I don't know parse primary right parse primary and it is going to essentially accept uh maybe just a regular expression right so maybe going to call primary so there we go there we go all right so when you parse an expression the first thing we want to parse is in fact um you know the plus because it has the lowest presidence um so to be fair par and a plus may actually result not in the plus itself but rather in its own left hand side so it's better to actually accept just an expression in here yeah so let's actually accept just an expression in here and for the consistency let's call this thing expression as well so here we just return parse expression plus Lexa expression Okay cool so in the plus we're going to be parsing multiplication as left hand side all right so we provide the Lexa that um actually not that but we're going to have expression left hand side right so this is going to be left hand side and we just pass left hand side in here right so that's what we do then I suppose we need to pick into the next token right we need to pick into the next token but how can we do that can you pick uh you can't pick in this really get token it doesn't have look ahead is it a sick what has okay wasting my time all right uh listen to chat again so uh it doesn't have look ahead it me up so badly it me up so badly because like look ahead is a special feature that you have to implement and we do need look ahead like at least one token ahead uh so and that is so bad but essentially what we have to do uh we need to somehow check the next token right so check token uh and if that token is equal to plus right somehow so I think it's a clex maybe just like plus right so maybe just plus and uh so if it is we have to introduce right hand side right we have to parse uh right hand side and effectively uh put those things together right so we have to put them uh together we can create a new expression right so we can create a new expression um right so let's call it plus so and it's kind is going to be expression plus and um so we actually introduced expression Bop right didn't we so it's going to be bop so then Bop equal to that and the kind of the Bob is going to be expression plus and here you have left hand side and right hand side but the left hand side and right hand side has to be poins and this is not going to work like that you you can't just do it like that uh right because left hand side and right hand side are allocated on the stack but as you can see we allocated them on the stack um which kind of means that we probably want to allocate them dynamically right so it's going to be Malo uh size of um expression right and of course you want to probably assert that this thing is not equal to n uh and um if since we are allocating the expression that we plan to return uh on the stack since we're doing that we might as well return it like that and indicate um and indicate an error with a null right if uh and some sort of an error has happened if some sort of error has happened we just return null right we just return null uh but that leads to a very uh Troublesome memory management right because essentially you allocated this thing and then an error has happened while while you parsed a sub expression now you have to delicate this thing right but you know uh after developing this kind of application for quite some time I realized that let it lick let it lick let it leak oh let it leak memory cost pennies let it leak just buy my Ram just buy my Ram essentially one of the um one of the things we can do we can use Arenas we can use Arenas so uh arena allocator is actually pretty cool idea so let me actually show you arena. H I think that's what it is oh it's not H it's without the the ha so uh and um yeah so I'm going to actually copy paste it in here uh so let's put it uh like this and in a description so the idea is that you're allocating the memory not on the Heap but in a special place that you control right a special place that you control called Arena so you allocate everything into the arena and after you done doing your thing after don't do thing you allocate everything that you put into that Arena at once so that way you never have to think about when to allocate when to deallocate just just let it leak just allocate just you know uh fire and forget as much as possible and then when you're done regardless of whether there was an error or there was no error uh you just delicate everything at once right so and that's what we can use in here so we can just like go there right so so this is the arena that I implemented quite some time ago for myself for my uh needs and I might as well actually use it so because why not and as far as I know uh Arena allocators are really like popular among the language Developers for that exact reason so then you can just recursively Traverse and allocate things and don't think about them and just allocate everything at once um right so let me now uh introduce another thing in here so this is going to be Arena and this is going to be Arena okay so here we can allocate maybe Global Arena right so Arena parsa uh Arena and we're going to just like do it like this so it's going to be static thing so that's it you don't have to initialize anything uh this thing being zero initialized is already uh you know initialized and you can right away use it right so Arena aloc right so you basically uh provide the pointer to the arena into which you want allocate things and you provide the size of the object that you want to allocate and it will return you the pointer to that thing um right so where is Malo so we can do Arena aloc Arena aloc uh Passa pass Arena pass Arena going provide the pointer so we don't even have to do aert remember Arena alog does assert for us I think it yeah yeah I think it does assert for us so there's a new region right so it actually linked list of regions uh and yeah there we go so it calls Malo underneath right and then underneath yeah this is how we say it and then if it fails it just asserts it right so we don't have to even worry about that so that's actually kind of cool so you can just do it like that and in here you can just do it like that so that's it that's pretty cool so and then once we're done once we're done we can just do uh actually we can even do something like Arena reset and that way it doesn't deallocate allocated memory but it allows you to reuse whatever it already allocated so there's two functions Arena reset which just resets all of the regions preallocated so the next time you allocate within Arena it will just reuse them very quickly without actually calling to Malo right or you can do free and it will actually free them it will just call free for them uh and in here we can actually have several backend implementations right so you can use leap sealoc uh as far as I know you can also use Virtual aloc I don't remember if we implemented M map uh right so Linux M map I think we didn't and it was also an idea to actually Implement a backend for was for web assembly yeah we also didn't implement it so there's two back ends for this Arena that are not implemented uh uh but somebody actually submitted virtual aloc right so uh yeah we do have do we have well wait current platform yeah yeah yeah we do have a virtual Alo so on Windows you can directly use Virtual Al it has a Windows it has a Windows support but it doesn't have a Linux support so it's actually kind of funny uh but anyway so that way we don't have to think about it too much right so just like don't freaking worry about it and we can just do it like that and at the end in here we can just return LHS uh all right so we can just return LHS so funny enough we can check the token and if it is equal to that we have to yeah we have to sort of like uh get it get the next token so and funny thing in here is that for the multiplication for the multiplication we kind of have to do the same thing except with the uh multiplication operator except with a multiplication operator and uh here uh we going to return expert uh like so right expert like so and since we're sted to do this very much dynamically you we might as well actually continue doing of that so you see how I don't write like a final signatures right so I just write the code and as the requirements and problems uh arrive I modify things and I adjust things so initially I want you to do return Boolean but now I see that it would be better to do it like that right so the code constantly changing it's adapting and everything and for the multiplication so here we're going to be actually checking for multiply but instead of calling to uh like a lower like actually higher presidence we're going to be calling directly to primary right so we're just calling directly to primary uh and within the primary we're going to be already parsing the either function calls either function calls or uh the numbers right so all of that is not going to compile because we cannot check a token in a Lexa without actually um how we say that uh without uh actually taking it out right without actually taking it out and that's a kind of a problem that's a kind of a problem so what I'm think think is that maybe we're going to PRX everything uh right maybe we're going to PRX everything just iterate through all of them and collect them into like a single thing um then um just use that when you parse things right so we can have maybe a notion of a Lex lexm right so or token of some sort so this is going to be uh lexm right go it like that and uh we're going to have it as a kind and maybe we're going to use the same kind as in here right so where is the token so yeah it uses long right so this is going to be the the token right so maybe we can also call the token uh right so they have the same name as in the library so this is going to be that but on top of that on top of that we need to also have um like an identifier right so so some sort of identifier um so we can have string and string length string and string length so definitely here we probably want to have um int number right so this is going to be int number and uh the string stuff so for the string it's a little bit annoying because um on the next call to get the next token the pointer to the string constantly being invalidated it is constantly being invalidated so that means as we get a new token uh we have to um so essentially copy it but we can copy all of that uh you know all of the strings into this Arena yeah we can actually copy strings into that Arena uh so I wonder if Arena has convenient functions to do that it has quite a few interesting functions uh actually um I think it uh no we didn't really have Str Str dup or something like dup n it doesn't have that um so but we can have uh we can create some we can create some functions to to sort of help uh anyway so as we iterate through all of that stuff as we iterate so we can actually keep track uh of other things we can keep track of the path right of the file path and uh row and column right so we can have row and column so and all of that is very useful because it's a location that we can use in the future and stuff like that so and uh since it's going to be a collection of those lexm right so it's going to be a collection of those lexm uh we might as well do something like this right so we can create a dynamic array of them so this is going to be basically item uh then we're going to have a count uh so this is going to be count uh and then it's going to be a capacity o all right so this is a little bit annoying but it is what it is and it isn't well it isn't so let's collect all of that into into the array of tokens uh lexm lexm so so we initialize it like that we just initialize it like that just initialize it like that so if we encounter um one of these things what do we do uh we create maybe a lexm that is um that is going to be like this all right so it is going to we going to assign the token regardless of anything so the token is going to be equal to that right the token is going to be equal to that uh if this is one of the tokens we might as well also right away Define the path uh then row as the line number right so this is the line number and column uh as the offset so we basically Define majority of the stuff here already right so basically defined majority of the stuff here already and uh now we need to yeah so in case of these things we don't have to do anything that's basically done right that's basically done in case of an ID we have to copy this string Len right so essentially we have to do lexim uh string uh Arena a Lo um pass Arena pass Arena Arena uh there we go B Arena and we need to allocate this amount Lexa string Len um so after that we're going to copy into this thing right so we copy in this can you see shed by the way you can't see shed right and how much we want to copy we want to copy this amount of stuff and then uh we are effectively assigned string L to that so that's how we're going to be doing that so we're allocating all of that into that Arena specifically uh and in here lexm int number is going to be equal to Lexa int number there we go so in here aert unknown token so it's we we should probably report like reported this thing but I don't want to do it right now anyway so after we uh create the lexum we have to do no da append and we're appending all of that into lexum uh that specific lexum we're just like copying it into that and we uh basically preed the entire file right so we have all of the lexm and in fact that allows us to look ahead as much as we want uh so that's actually pretty cool uh right so and we can maybe even check that we can even check that so I'm going to go and try to compile that it's not going to comp compile because it's going to have quite a few compilation erors so let's actually go through them so par primary uh right it's defined I don't really know why oh yeah so so basically that specific function does not accept any pointers anymore right so doesn't accept anything the next one yeah so there's no such thing in here there's no such thing in here so as of right now I would like to maybe maybe put false in here so I just want to make this code compile all right I don't really care about this code that much but I just wanted to compile so bin op aha so this one is really interesting because you have to do as uh right as bin op so I'm programming in JavaScript by the way chat I'm programming in JavaScript do I look like a JavaScript programmer uh so this is Bob plus uh so now now we parsing that so here we don't have to accept that here we're going to be essentially something like this so this is false uhhuh so I'm just going to the compilation errors uh you see uh I'm not afraid of compilation errors one of the things one of the skills that you have to develop as a software developer is to stop being afraid of compilation errors right so what I did I essentially laid down the idea the high level idea that I had and only then I started to go through the compilation errors in here uh right so there is a semicolon and there we go it finally compiles right so I sort of like roughly outlined the idea that I want without carrying too much of the compilation errors and then I just went through the compilation errors and fixed them up and sort of stitched them together and clean everything up so this is why by the way I don't like the LSP P or the systems that constantly remind me that I have some sort of Errors because um I have this pH of the development that I don't care about compilation errors or even syntactical errors right I just lay out lay down the idea that I have see what kind of works in my head and only then I uh focus on the actual compilation errors and why it is important for me because sometimes I make up so here I made up some sort of a function that doesn't make any sense and just doesn't exist and it's just like this is important for me uh because it's just a prototype code it's a prototype code so sometimes I may write in even a pseudo code right and pseudo code may not even compile and that's good because the places where sud code didn't compile are the to-do points right you're basically using compiler as an assistant to go through through the to-dos so you have a syntactical error somewhere that's a to-do to actually go through right so you potentially put some sort of syntactical error in there so the compiler will tell you to go back there at some point uh so you want your Ed to let me cook mode yeah exactly so uh let me cook phase of the software development and to be fair there is a very weird Trend in soft W development tools that is present like everywhere you can see that in all of these LSPs and linters and checkers and borrow Checkers especially in Rust right especially in Rust they all treat the code that you write as a production code as the final code all code is production is very toxic idea honestly it is extremely toxic idea like the code that I write like I didn't finish writing my code your code already breaks the build we need to punish you we need to punish you we need to punish you we need to put like all red uh lines all over the SP like dude let me finish my code that that's a very toxic mentality that a lot of development tools adapted all code even the one that you didn't even finish writing is a production code and if it doesn't compile we're going to punish him punish you this is really bad uh in my opinion it's probably one of the reasons why people hate programming right because they're constantly being punished for what they write uh even if the code is not is not finished so I don't know I don't know [Music] man so uh how we're going to be approaching all that so essentially all the passes are not going to be accepting the Lexa anymore anymore they're going to be accepting lexm right so they're going to be accepting lexm so an interesting thing we're going to be taking this um how to say that because I also need to keep track of what lexm I consumed and what I didn't consume yet so but here is an interesting thing uh since this thing is my own structure right so the reason why I added those specific field here this is very cool thing is that then I can use it with this di append macro it's a special macro it's a special macro that expects these specific Fields right so if we go into Noob di append uh right so where is the DI appended somewhere um nope di I pend yeah so this is how it works so it automatically searches for all these fields count capacity and items and reallocates the items if the capacity was exceeded so it's a very convenient way to do Dynamic Aras in C but the advantage of that is that you have your own custom structure and you can add more fields in here and this macro is not going going to break because it doesn't give a about these things so because of that I can add additional Fields like POS that indicate what's the current lexum that we are currently handling so this thing acts simultaneously as Dynamic array and a simple Lexa that just iterates through the lexm so that's pretty cool isn't it so but uh I want you to actually check um if this entire thing even works right so let me maybe add a little bit of the uh you know debug information right so uh let's actually go back to stb Lexa and we're going to accept the Lexa uh right so this is something that I really wanted to quickly check so I'm going to break on Main I'm going to actually run and let's just see so I'm going to go here right a boom so and let's take a look at some of this stuff right so we have uh lexm right so here they are we've got eight lexm we've got eight lexm we can even take a look at what they are so it's going to be items uh we're going to D reference those items and we're going to take exactly lexm count of those items and that didn't freaking work because I made a fuy walkie oopsy doopsy o all right so what do we have in here so uh we have token 260 right so we can even take a look at what it means mate um 260 so there's no such thing but I mean it's um it's a identifier right so I know that it it is identifier so then you have 40 which is probably like a open pattern uh then you have 35 right so we have in number 35 in number 17 in number and so on and so forth so basically we preached all of the tokens we preached all of the tokens so we can work with them uh right in a more eager manner uh right not lazily like extracting them one by one but we just preach them and now we can look ahead as far as we want right so we can now look ahead as far as we want not just one symbol ahead but as many symbols ahead as we want which is kind of cool which is kind of Po so any anyway uh so how we going to be approaching all of that so if I want to check uh the current symbol right so here I'm parsing expression plus I'm parsing expression plus uh and uh what I can do I can just do I can accept lexm lexm lexm so first thing I need to do uh I need to check whether we're not at the end if lexm pose is less than uh lexm uh lexm count that means we have some lexum in there and uh lexim item lexum items at lexum POS token equal to plus that means we have to continue so that's basically what with them it's basically what do we need to change Lexa to lexum and we go right so we check whether we have something in there right so and then we check what at that specific position if uh that thing is plus we Gucci where tamaguchi we can continue so this is for plus so it allows us to look ahead as you can see all right and what's interesting is that once we notice that it's plus what we can do we can actually increment by one so we we kind of consumed that token you see what I'm talking about you see what I'm talking about ah so we just consume that token so post plus one so and we need to repeat that for the multiplication might as well by the way uh right so just copy paste it and every time we encounter plus we can do malt right so this is maltt this is malt uh here this is the multip and as we go down right we actually don't go into the multiplication go into the primary there we go so uh this is one layer lower one layer lower and uh the most interesting thing here right is probably uh the primary right so this is going to be the primary and in the primary we only care about two lexm right so first of all we can do the following thing if for some reason we ran out of tokens we probably need to indicate that we need to probably indicate that so we can say something like uh error expected uh expected number or identifier but got the end of the file and then we just return null out of that and here we also return null for now right so that's very interesting so we checked that we at least have some lexm in here now we can switch upon different kind of lexm right so this is going to be items uh right and then this is going to be the position and we taking the current token and what kind of situations do we have we have a situation clex uh number right so this is a clex number uh so it's going to be one thing identify or others so and in the case of others we just have to throw an error we just have to throw an error but it's pretty straightforward um so we can actually do the following thing we can do lexm uh lexm and just take a pointer so it's not that huge of a thing to look into so then later we can quite easily just get the file path the file path and the row and the column saying that uh unexpected unexpected token right unexpected token and where we can get all of these things we can get all of these things from path uh right sry row column row column and we can even return null in here there we go we're almost finished parsing this entire thing uh so we only need to parse the numbers right so if it is a number uh what we have to do we have to just allocate the expression node so we already know how to do that uh so we can just do num uh here we can do by the way this one is interesting uhhuh speaking of in these things we have to also allocate all of that in the dynamic memory I didn't think about that right so expression Plus uh let's allocate that everything is going to be in dynamic memory uh but then we probably can do something like cliteral let's actually yeah we can do cliteral situation in here furthermore yeah something like this and then we can just return uh plus maybe even can I do nah so we can just do return plus yeah something like this something like this so this is going to be plus uh Bop blah blah blah so this is a right hand site yeah I think that's how it has to be done right I think that's how it has to be done so in here we'll just do that uhuh and then we can do another cital situation in here and then we can just return M that should be all right that should be all right it's pretty Dynamic but I mean that's fine that's C for you mind uh so kind is going to be expression num number uh num is going to be as number please thank you uh lexum int number and we just return that just return that and by the way so we already copied everything okay so I'm I'm really happy with that so expression um so identify okay identify is rather interesting so it has to be a function called then uh Arena allocator so yeah for the arena allocator for the function call um so what we have to do we have to do function called uh kind expression fun call right so this is going to be expression fun call and then as uh fun call we're going to have args so do we have expr yeah so here they are so that means we need to basically parse arguments maybe we're going to introduce a separate function parse args yeah parse ARS uh by the way every time we sort of consume the token every time we sort of consume the token I think we have to do plus one in here and definitely plus one in here right so we consumed that token might as well even do plus one in here one time and don't care about it anymore uh so parse ARS so par the args for the lexm um and that should not well I mean yeah we're going to be appending the arcs in here so it would be nice to actually take it by the point take it by the point if it's not equal just return null uh and that's about it so this is probably the last thing that we'll have to implement parse arguments uh so yeah so let's go ahead and just go to the compilers so clex uh UND declared is it Collex number so Collex integer int lit okay so it's actually called int lit uh what's next it's ID what's next pars AR all uh so we're going to do um Boolean and we accept La Sams in here plus the pointer to the arguments which we're going to be modifying uh what else do we have le what else uhuh plus ah and in here we have to accept lexm if we par in yeah so if we par in the expression we have to return it like that LE pass the Le okay so that seems to be compiling so the last thing we need to do we need to pause the arguments to maybe I'm going to par only one argument for now uh right because don't really want to deal with the um with the things so it would be nice to have a function uh that basically expects a certain kind of token right so if expect token Le SS uh collects maybe just something like this if this token was not encountered right something else was encountered we should probably return false and something like this right and after that we can just take uh one ARG um and by the way if we're going to be having like a list of Expressions we are allocating them as pointers so it makes sense to actually treat them as pointers so the thing we can do args no append um args pars expression Le right though this thing may fail so it's going to be expression a single ARG if uh ARG is equal to null we're going to return false otherwise we're going to append it to the list of ARs and then we're going to just return true so that's how I want to part it so we expect open Parn then we parse a single argument we check that the argument is not bogus then we append it this thing to this list of arguments that we passed in there then we expect the closing thing and then we return true so the last thing that we need to implement here is expect token uh which returns Boolean accepts Le SS uh and the token itself maybe more like yeah expected token so it's pretty easy uh first of all we need to check if the lexm even have anything uh so basically the count uh if it's actually greater or equal we have to throw some sort of error we already thrown an error uh right so we can say expected um I don't know we can say expected token L and we can just do token but got the end of the file so it's not going to be human readable it's going to be just indices and numbers but I don't want to spend too much time on human readable uh error messages right now so um lexum items uh right so let's take the position let take the current thing and let's take the token and if it's not equal to the Token uh actually here we have to have false uh we also going to do the following thing expected token that but got that and again it's not going to be human readable but that's fine for now uh right we can always look them up and after that we can just return true right we can just return true um so that's about it honestly but if we expect the token uh if it um yeah we should probably directly modify the pose so one of the problems here we may encounter is that we may forget to increment pose at some point right so we checked for that so we check for this we consumed the token we pared these things and we appended to the arcs and we checked for that so we support only one argument right now uh it's easy to add more but I just don't want to spend time on that so in the primary we check the first token right so we we consume it so here we just edit and that's fine so for the function calls so we just do that then we par the arguments and arguments uh do everything so here everything's fine so for the multiplication we consume the token everything's fine for the addition we consume the token all right and that's about it I think that's the entirety of the Passa actually I think that's basically the entirety of the pass oh and by the way here we may want to maybe uh print the location right so let's actually do lexm lexm like this let's print the location uh lexm token M looks some token and here we're going to have the S the D and the Double D uhm path lexm row lexm uh column there we go so let's go through the compers what do we have uh lexm uhhuh what else uh Lex uhhuh it's usually this kind of is not really that important uh par expression we need to forward declare it by the way right so it needs to be forward declared here somewhere uh parse Expressions uh actually it's not that it's that uhuh no d a append what else do we want uh boom cool so we basically learned how to par everything so let's actually try to parse the expression uh expression uh root and this is going to be the expression um parse expression lexm uh and see how it goes if root equal to n uh which is going to return one uh let's see how it goes so it doesn't like that because we have to pass it by a point there we go so let's go into the debug um so did it uh one more time yeah that was fine so let's break it m uh and run this entire thing so I'm going to stop somewhere here and uh I'm going to continue so let's take a look at the lexm and see how many of them we've got uh Lex Ms right so we got eight of them uh cool let's take a look at the root uh root is not known yet because we have not run this entire thing so I'm going to step once and it second folded right so that's actually kind of cool so it did in fact SE fold so it SE fold at strr Len uh so let me see it was trying to print something so expect tokens uh yeah and it didn't like something it's it's a wrong yeah I I did the same mistake again uh God damn it okay so let me restructure this entire stuffff it's actually kind of cool that I did all that in the debugger so I think it's it was very beneficial uh break main run uhuh so let's go here then I'm going to step once uh okay so expected but gut all right none of that makes any freaking sense I should have actually expect M mhm [Music] uh L yeah is that it has to be LD yeah all of these things has to be LD I don't know why I decided to make them like that all right so the next one uh can I kill and refresh a couple of times um break at Main run uh do I still have these things no okay yeah so expected but got 42 this is too much this is some sort of a I'm telling you actually I'm actually telling you brother uh how did we end up here I don't know uh so expected 41 um so maybe we can uh actually break at expect tokens maybe we can break at expect tokens uh right and maybe we can just go in here so let's continue so that didn't really work uh no uh let me see gf2 uh okay so run it what we've got what we've got um so we can take a look at the lexm uh and token so we paring the arguments which does in fact make sense right does in fact make sense um all right so I'm going to continue so 41 it expects 41 that's fine um I might as well observe what we got in here right so there is a position um yeah after five uh something bogus has happened right so it jumped very quickly to five which doesn't make much sense to me because if we take a look at the main PI right one two yeah it jumped very quickly there it jumped very quickly there um so definitely something with the expression and I think I know what exactly going on I forgot to return Fone call afterwards that's what I forgot to do so it was BAS basically yeah it was returning n and it thought that it yeah it couldn't par this thing even though it could par this thing uh it was reported as it it as it couldn't um so that's pretty cool okay so let's try to recompile this entire thing and maybe even restart so gf2 break main run uh and uh so let's actually continue and can I step once it still actually causes that that doesn't make any sense all right so I I think the time has come to actually go through the entirety of the compilation right so because that would yeah so that error should have went away but it didn't so let's go through the whole compilation right so let's break main let's run um so and let's just continue and step step uh boom that doesn't make any sense like why did I do it like that yeah I'm an idiot so it's funny how the compiler always forces you to go through your stupid code and just think about it more thoroughly uh right so essentially I didn't know why I was doing it like that because that's what I wanted to write instead uh right so that's what I wanted to write instead and in here because I changed the way I report errors and stuff uh right I might as well um maybe do something like this uhhuh so then here we allocating new stuff so this is multiplication uhuh like this MH like this so yeah this is only one allocation so this is fine LHS rhs then the own location here here everything's fine here everything's fine I hope I caught the last one I hope that was the last one uh okay gf2 break main all right so then go here uh continue finally all right we're good to go we're actually good to go and let's take a look at the roots okay so we have a function call so that is pretty cool so kind function call so if you take a look at the function call uh it doesn't have any name which is easy to fix but I mean you then you have arguments then you have arguments and it doesn't allow me to go further in the arguments maybe it's a limitation of uh GF right so but this is something that we can fix phone call uh name so Arena yeah this is a Fone call ID and I don't really copy but didn't I uh do strings I I could swear to God I was copying things somewhere okay so the only thing I have to do I have to get that stuff from the lexm uh right and in a fun call expression fun call okay so fun call name um and it's a data fun call name data lexum string uh count LM there we go so if I try to compile that fun call uhhuh as fun call All right we fixed that so that should be fine now uh I'll right gf2 break main run so let's continue everything seems to be okie dokie C Yuki so root so this is a fun call uh fun call name count Z did I up something right so this is a fun cool name so I take that from the lexm from the current lexm and when I'm Lexing all of these things this is the Lexa right I'm allocating string length into the lexum then I'm copying into lexum this stuff like as much as I put in there and I'm pretty sure I didn't I'm pretty sure I didn't it up like I mean I didn't see mistaken here unless say uh um yeah so that should be cool because if we take a look at the lexm the lexm did I did I forget to refresh something I could have actually forgotten refreshing something uh lexm items all right so this is going to be that at Le Sims count it's going to be that and the first thing yeah here it is string string Len zero yeah it it's already up in here as it is I can see that it's already up in here all right so that means the the mistake is in here interesting because that is kind of unexpected all right so that is kind of unexpected um okay so let me let me try to actually return one in here so um I don't really know what the is going on but print F clex ID was it was it string because I'm pretty sure it was so string Len and then Lexus string it's probably something super dumb I'm I'm like I feel it I feel it uh I feel it it is something super dumb yeah yeah yeah okay so it stopped working it stopped working it stopped working but why did it stop working why did it stop [Music] working did I forget so it's the Lexa Lexa told us so maybe I just do that no man we are just Lexing we are just Lexing we're not doing any okay so this is yeah it's a ID but it tells us that it's zero the Lexus stopped working for whatever reason I don't know why why did it stop working is that because of I'm trying to mess with locations um is that is that why no it's not because of that but I was this is the error of Lexing and we I thought we figured out Lexing I thought we we were done with it right weren't three I'm pretty sure we were done with lexon and it worked fine right and it's just like now it is not working and I don't understand what the is going on like what okay um it is something absolutely bizarre because I literally just wrote that code it worked fine it worked fine it stopped working um I have no idea way so maybe I'm corrupting memory somewhere maybe that's what's going but I mean I I don't didn't even start doing anything right so with the lexum we can go literally back to before we had any of that uh but maybe I I can't come do that it is something super dumb and just like very demotivating because like I know that I can fix it with like one yeah so do I not allocate enough memory for the all of that is stored within this thing it's a string store so all of that is stored there um I really need to start from scratch basically right so that's what's going on essentially um um so let's not do that um print F let's start from scratch I suppose right so maybe we'll figure it out oh boy uh Lea this lexer I swear I should stop using this lexer it's bad it's honestly bad every time I use it I kind of waste time so I should actually just start using my own Lexa or something um so this is going to be that this is going to be that uh and this is going to be the string umh so print F character and we can just do character Lexa token all right so let me try to see that um am I just doing that incorrectly like maybe this is not how you get an identifier so there is a pointer okay pointer does make sense did I accidentally modify what is going on how do we do the identifiers right so Collex ID so when I we do Collex ID we create um okay maybe I just do that incorrectly but somehow I have a false memory of doing it correctly this is lexer I I swear to God I don't know why I keep using it h I never had any problems with my personal lexer that I write right so this one it just kills me um line [Music] 10 what the it is is it n terminated how am I wait is it n terminated don't tell me it's n terminated man I'm I'm going to stop using this Lexa from now on it it kills me dude what what what's the point of having string Glenn if it's not terminated this is so bad like there's so many confusing things about this Lexa like dude okay maybe this thing is unfinished okay I'm I should probably I'm very unjustly complaining but I mean I wasted time oh my God so the thing like if it's not it would be nice to have some like a little bit of a comment and say here the here that says ignore this field if it's identifier like how like this is so bad like it's I know that this is I'm I'm complaining completely unjustified I understand that but I just wasted time because of this strange design decision that implies that this thing is always sized and not n terminated like who on Earth would think that okay in case of identifier just ignore this thing and use like am I complaining a lot because like this is really bad design like it's really bad it's just like holy I'm sorry I'm going to stop using this Lex like the next time I need to Le something I'm not using this one not using this one it's just like second or third time I'm wasting time with it because it's like something that is not properly documented or something confusing that like basically traps you into wasting time I just can't I I just can't um I'm sorry um so anyway um so here we can kind of substitute for that we can essentially maybe do something like this uh right EST L it's like it's intentionally it's designed intentionally to you up like it's like almost intentional because you that's what it feels like honestly it's just like an intentional you maybe that's that's meant to be like that I I don't know again the the author doesn't owe me anything uh the author doesn't owe me anything so but I'm not using it from from now on just like that's it I'm I'm done with it but the thing is that you want to have a ready to use Lexa like I I want to have ready to use Le it's like one of the things I want to have uh right and it's just like this one is very appealing option because it's just like yeah just like use it anyway I'm sorry so uh let's recompile this entire thing that seems to be fine um now I can go to gf2 break main run uh and let's go down um okay leims uh yeah doesn't matter so what I'm care about in here is root uh and what do we have in here is the function call uh this is a fun call the name all right so you can can you see that so you can see it now okay that's cool that lexer I swear okay I'm going to stop like I just like don't use it next time just simply don't use it next time um now um what do we want to do we want to now compile that right so we want to now compile that so let's actually go ahead and introduce uh compile expression right so we're going to accept the expression and we're going just say a compile expression simple as that so and in here we going to essentially have uh expression right we're going to dispatch on all of the possible uh kinds of Expressions right all of the possible kinds of Expressions uh so let me try to see I want to know what kind of kinds of Expressions we can have so we can have numbers fun calls and everything um so essentially here is the interesting part uh here's the interesting part so usually if I'm dealing if I'm dealing with a stack based machine if I encounter a number if I encounter a number I would basically push that number onto the stack but this is not a stack based machine but we can kind of simulate the stack with the variables and the numbers right so we can introduce something like a stack size uh stack size which is going to be zero and essentially what you can say uh you can say print f uh so let's actually ENT it like that um so s uh Zu uh equal W and the number that you're trying to push like so um so this is going to be stack size and expression as number there we go and then you increment this size of the stack right so essentially you can think of s as the stack where where the number after s tells you the um the element on that stack and you keep track of the size so you can kind of simulate the stack based machine with Temporaries you see what I'm talking about right you simulating a stack based machine with Temporaries which is rather interesting by the way we we have to print that into the uh into the output right so it would be nice to also accept the output in here right so we printed that into that specific file uh and uh now let's actually do the Bops right so let's do the Bops so here um we also need to switch upon what kind of Bop as uh Bop kind and we have several Bops in here Bop multiplication uh I might as well just do it like that and plus right so this is what we got uh break uh break uh all right so this is a function call and this is another one okay so this is a plus so how we're going to be doing that so essentially um we have to take two elements at the top of the stack and add the top one to the one below that's what we have to do the top one to the one below so that means that the stack size has to be um like at least two arguments it needs to have at least two arguments so and how we're going to be doing all of that how we're going to be doing all of that we're going to do an assignment of add s zuu s zuu so we're adding the top one the top one to the one below uh right the one below the one below uh so here the one below is a stack size minus two because we inding the elements of the stack from zero right so essentially if you think about the stack like this like 0 1 2 3 four you have five elements so the stack size equal to five so that means the top one is the stack size minus one the one below the stack size minus two right so we're adding to the one below and what we're adding we adding the top of this stack right we're adding um the top of this stack um and to be fair right it doesn't really matter in which order we're going to be doing that we can just like do it like that uh right so to the one below we're adding them like this so and since we consumed two elements and pushed a new one so we effectively consumed one element so we're subtracting one from that we're subtracting one from that and you can repeat that for multiplication as well you repeat that for multiplication as well so we're done with these things and the last thing that is left is a phone call right so the only thing that is left is a phone call so fun calls right now fun calls right now support only one argument right and furthermore we can even check that maybe uh expression as uh um name right so we have to use no SV equal uh SV per so it has to be equal to that um so we're going to just H code this thing and essentially what do we do we have to perform a call to print F that's how we do that right so that's how you do that so you just do might as well do something like this do print F fmt integer right and we're actually um you know calling the top of the stack right so we're calling the top of the stack and the top of the stack is a stack size uh stack size minus one and we consumed the thing on top of the stack so we can just uh subtract one afterwards and all of these things have to be at the end so that also means that at the end in here and uh in like after the code we need to have the data uh that looks something like this so it's going to be out uh like so replace [Music] M and then new line so in here we can just do D but with a new line all right but that new line has to have extra escaping so that's it's about it I think so it's actually called fmt int all right and here we're going to be doing the compilation of this thing so we compile an expression we compile an expression root expression into the out so that's what we are doing let's go through the compilation errors um so fun call name what else do you want s v um implicit declaration didn't I have uh SV no I don't think so so there is from uh sister that's probably what one have all right and if I now try to run compiler it created main SSA and that is so funny uh all right so first of all we have to put it in here right so first of all we have to put it in here uh and uh one important thing we forgot is that we also need to compile we also need to compile all the previous things right so if we are compiling Bop right so we might as well actually first compile uh left hand side and right hand side so we have to do compile expert out expert as Bop uh left hand side and then right hand side right so we compiled both of them um so yeah left hand side and right hand side both of them and uh in case of an argument in here we also have to compile uh compile expert out expert as fun call um fun call args items zero and another thing we can ass certain here is that uh args count is equal to one right so we just had coded all sorts of things in here because we're have a limited time right so and is it going to work um so that didn't work and this is because uh the stack size has to be preserved across the calls right so this is going to be stack size like so so that will require to go to the compiltion errors but I think we can afford that uh so here we have to do stack size stack size uh so here has to be like that uhhuh so uhhuh stack size uhuh so stack size so that means at the end we'll be able to even check if we even have anything in stack size and another thing that uh I think compiler didn't catch is this one I think this is very important and since it's a pointer for the compiler it was very difficult to catch actually right so yeah now another thing that we also need to yeah there's a lot of things there's a lot of things the compiler literally didn't check uh and that's kind of bad freaking c i swear to get freaking C uh so let's actually go manually through all of them so this is correct this has to be the reference all of these things have to be the reference all of that stuff has to be the reference so this one yeah this one is fine all right so are we going to finally do the trick uh boom so yeah it's a little bit of an Overkill it's a little bit of an over kill but we are using backand and we're using backend which means that maybe it will optimize everything out so yeah we forgot the dollars and everything uh so let let me quickly do that it's not even like dollars um it's percents right if I take a look at uh MSA 1 + 1 uh SSA yeah it has to be percents it has to be percents um so let's do quickly like this mhm so and In Here Also percents so um let's actually do make minus B okay and then there we go so that's what we got that's what we got pretty cool um so now uh the what the is this sh by the way uh this is not b I'm telling you um yeah I see what's going on it has to be yeah escaping freaking escaping Madness uh yeah there we go so now we're talking now we are freaking talking so the next thing we can do essentially uh we can maybe create like a main um main SSA right so essentially we do compiler we depend on the compiler we just run the compiler that will create a main SSA and we can then do main s main SSA uh QBE uh Qbo um yeah main s main SSA so let's actually try to generate assembly out of that uh invalid instruction o so you can't just just assign this okay so let's actually go to into the documentation uh I suppose you don't have to put W in there I'm confused can you just like assign an integer like I'm pretty sure you could maybe what if I just don't do that like yeah it didn't like the instruction which is bizarre invited class specifier huh so this was [Music] fine you can't really you you just can't assign numbers to those things you just can't do that um that is like I I don't believe that honestly so is there am I missing something by the way am I missing something because if I do something very freaking dumb all right like add like I'm pretty sure or and um examples examples okay I'm confused like what the do you want from me okay so here it wants that uh really really if I want to just assign a number to a temporary how the do I do that like I mean I understand that I'm probably using that incorrectly like I I do understand that I'm probably using that incorrectly but if I just want to assign a number to attemp like how do I do that that would have been that would have make so much sense so okay there is a copy uh what is copy instruction um custom copy copy instruction Returns the bits of their argument verm okay we can try copy okay um we can try copy and when we do this that uh okay so this is a copy and also I forgot the uh the commas uh okay I I do understand that I just do that incorrectly right so I do understand that but I I want to do it this way all right so we've got this stuff um so this is a main um it actually pre competed that at compile time it literally pre competed that at compile time that's actually so cool so it did optimization and now um all right so I want to try to do the actual compilation so we're going to depend on this and we're going to do CCO main main uh mainy and let's just make main all right so as you can see what we did in here uh we compil the compiler We Run The compiler which generated main do main. SSA then we passed main. SSA into QBE that generated assembler and we compil the assembler and we got the main executable which we then run and it prints 69 it in fact prints 69 and in fact so this thing is modifiable I can change it to 420 right I can change it to 420 and I can recompile the whole thing and if I run main it is 420 we developed a very simple programming language it was a little bit hard because because uh I wasted too much time on Lexing so I'm not using this Library again stb libraries are great don't get me wrong it's that this specific library is confusing and it's probably not really maintained that well anyway that's probably why uh it is so confusing uh right so that's probably why it is so confusing so and I suppose maybe it's not meant to be even used at all just like kind of abandoned and everything so maybe at some point I will just develop like a library like a Lexa that I can reuse uh myself because like having a Lexa apparently for me is a very common thing because very often I just want to quickly throw up like some Lexa to parse something like quickly and not really think about it too much um so yeah uh so let me let me see now we but what's interesting is that we are only having like one thing in here you know what would be cool it would be cool to actually have several statements in here like this like several statements I could say like 69 and 420 so can we add that can we add that that's a very interesting question uh we might as well maybe um try to do something like this this uh if I try to compile this entire stuff so it does not complain and if I do something like this it prints only 69 right so here's our program it prints only 69 so what we can do what we can do we can actually do that in some sort of a loop I think um right so we can put this stuff in here and say while lexm position is less than the lexm count we parse a root and we compile it we might as well keep the same stack throughout the execution I'm sure if that's useful but I mean we could do that so and now if I try to compile that uh if I try to compile that so um it's not a point per se and let's take a look at the SSA yeah so that's the original code that we have and this is the one so we might as well actually do 35 + 17 * by 2 and this one is maybe 2002 + um you know 10 to uh something like this and we can try to run this entire thing and then revert and that's the generated code right so this is the first print this is basically the computation of the first print and this is the computation of the second print so we got that uh and then uh if we try to run this entire thing it will print 69 for 20 right and it was compiled from this code we already have uh a simple compiler so how do you do jumps by the way how do you do jumps uh is there any jumps so there's a GMP so you have to provide the identifier is there any examples though uh unconditional simply jump to another block of the same function uh doesn't really explain but I would like to have some examples but yeah so that's pretty cool uh we can even take a look at the yeah so in the comp and the back end actually competed all these things at compile time as well right so it automatically provides all these optimizations for us so we don't have to think about that [Music] um that's pretty POG that's pretty POG I think all right so uh that was interesting um so I think QBE is kind of a cool thing uh this particular Intermediate Language G gives me a lot of ideas for my own Intermediate Language because from time to time I'm writing my own languages all in programming languages and I do have intermediate presentation and this particular thing gives me some interesting idea that I would like to personally like explore myself so yeah that's basically it for today thanks everyone who's watching me right now I really appreciate that have a good one and I see you all on the next Recreation programming session with am Aus love you
Info
Channel: Tsoding Daily
Views: 63,953
Rating: undefined out of 5
Keywords:
Id: JTjNoejn4iA
Channel Id: undefined
Length: 173min 17sec (10397 seconds)
Published: Mon Feb 19 2024
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.