Elvis Adomnica - Memory Management and Garbage Collection in V8

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
so my Elvis I think I need to click on that one and test this yeah it works so few facts about me I've been working in the industry for like ten years about ten years I think like any of you I still have no idea what the heck I'm doing let me try to Greg drag these to get my slides in front of me bigger yeah there we go yeah I've been working with the electron which is a kind of a chromium framework actually it is a chromium on top of which you can just make web applications and that's like a cross browser cross-platform thing so basically you write the code once and then it runs on Linux and Mac and Windows of course the promise is that there's a bit harder I started with C and I was actually working in networking for industrial printers and then somehow life took me to JavaScript so I really have no idea what the heck happened but yeah I really enjoy it so because of that I actually want to do a bit of a combination to talk with you about the internals of the browser and how the memory works so how the memory is managed how the heap it's organized what is garbage in terms of memory and then how garbage collector algorithm works and then we can look a bit at v8 so if I go back to my old days I used to allocate memory using a malloc usually or C a lock and out free it's at the end and then I will get you know all those bugs segmentation fault and all those things then in later on I worked a bit with C++ as well so you'd new things and then you will delete or you'll write a destructor that will do it itself and then I went to JavaScript and then JavaScript you feel like gods because you just great put the curly braces and you just created an object right out of nothing no constructor nothing is just amazing you didn't knew anything we used two new things in es5 using construction functions later on we introduced class I don't know what that is about but like yes we did it and then why were like hey but what about delete where did that go right well it turns out it's automatic actually so there is something called a garbage collector which is a part of the runtime system which actually tries to free up memory when it realizes that there's not enough memory to allocate memory or sometimes at a certain threshold let's say that the memory consumption it's 80% is going to try to collect to make space for the new objects and it's kind of nice that's the program itself it's called a mutator because imitates memory right so they are kind of concurrent you can think right one tries to clear the memory the other one tries to create more things right so there's a bit of a game between the two the way you can organize the heap it's quite simple right we all did three lists in computer science 101 you can just have a free list first you have of course you have entire heap and one node and as you allocate chunks you can keep track of the free space and then the occupied space you know of course it's going to be in between but as you this process goes and goes actually you're gonna be suffering from fragmentation right so the memory is gonna get fragmented over time and then it can get so fragmented that you want to let's say you allocate one megabyte but there's no space for one megabyte you have like actually one megabyte of available memory but it's just in chunks of one kilobyte let's say right so you cannot actually allocate anything right and it can become very expensive depending on the heap size so if you're like on a server and then for some reason you have like a 4 gigabyte heap size and that's very fragmented to find that free space it can actually take awhile so write it so n where n is the size of the free space list right and then the solution is download more ram right we all do it the server doesn't scale yeah and I just put more RAM there but actually the smart people I did garbage collection they came with some solutions and then that would be periodically tried to compact so try to move objects around and move them on the on a size of the like let's say the left of the heap so you basically have bigger chunks or in order to avoid this oh when you can actually keep your memory like some ranges in multiple lists so instead of having only one list with all the free space you can actually have multiple lists one let's say has objects like chunks over one megabyte one and so on you know like so you can just have some certain ranges so you can look up faster were the free spaces but then there's a second way of organizing the heap which is you just put it in two halves and you start allocating in one half and then you still allocating that half right continuously and then let's say that you want to allocate that little guy what you'll do you're gonna start hoping what's alive we'll get to that in a minute into this half then you've now have the space you're gonna put your object there and then you just swap the two things right now this becomes the the left one one it's more like right for you guys the right one is now the from space the water one is the two space right this is actually kind of like even with how the process scheduler works that it has an active and a passive queue and then they also do the Flip actually the guy that did it I think if I'm not mistaken is that guy that did it he's actually not a computer scientist he's an ass desist from Australia and he wrote the most performant scheduler in Linux just another fun fact and then yeah there was a bit of a spoiler alert with this thing because I showed you a bit like how the things get copied and this is actually called called a copied copying garbage collector we're gonna get to this you know in a minute so if we look here you'll ask yourself what is garbage right how does the system know that a certain object is not it's garbage now and can be collected right and then if some of you guys did C++ and smart pointers you'll know that there is something called reference counting it's a very naive but very simple way of keeping track of garbage all you have to do is just keep track of how many references you have there right so the this route here it's actually the global object right is the window object in the browser and is the global in no js' and then from that one you can actually track the pointers and then you'll know like references and then you'll know like how many references they are and how they reference to each other and then you keep track of these and let's say that object goes out of scope now you remove that reference so you decrease the counter now you know that that's went to zero so you can just delete it right oh you got rid of the garbage right and then you decrease the other one of course and this is what is nice about it is that you have incremental garbage collection right the moment it reaches zero you just take it away you take that space and put it back into your free list right it's very simple well there's an issue here right and the issue is that guy goes out of the scope that gets collected but now you just ended up with one one and one right so oops that's a memory leak so reference counting was works very nice but if you have a circular reference it doesn't matter how huge it is it can be just too object it can be 10 you cannot collect it because the counter will never reach zero right so that's kind of the limitation of it it's very nice it's very simple Internet Explorer 6 and 7 actually used to do this so yeah we all know how good they worked but there's another way and that is called mark sweep or mark and sweep it depends on what the kind of documentation you look at and that is this object goes out of let's say scope and then what is gonna do is gonna start from the root and it's gonna go for every single reference and then it's gonna start marking so it's just gonna explore this and then basically put a mark like imagine that this object has like some beats somewhere that you just put it from 0 to 1 and that means that yes this is a life so it's reachable from the root I need to keep it in memory and then you just go and go and then of course when you go in the circular one you already know that the bit is set so you're like okay I mean the circular one I'm not gonna go in on infinite loop right so you're like just and then let's say that guy goes out or like brother he's out and then you scan and then you saw that they think right like this one there was no reference to it so afterward after you did all these basically traversing you just take the heap from one side to another one and then you just look for objects that haven't been marked and then you just clear them right and yeah let's take another example with a circular one so that goes out of scope now we are scanning the route we mark that one right and then that's not marked so we can remove it that's not mark we remove it I kept that thing there too just to illustrate the fact that now this pointer of course they're still a pointer there if it actually points to something that was collected but it doesn't matter anymore because it's not reachable right so that goes away this goes away and then you're all safe right so that's kind of like how this mark-and-sweep you can see that is nice it doesn't leak memory but the problem is you saw like in the first cycle I actually went and I scan all this circular loop for no reason right so the objects were alive and I had to go and do all the traversing and imagine how much traversing you can do so this mark part can be very very inefficient in some ways right or you can take a longer time so a bit of recap yeah using free list fragmentation semi spaces is nice it actually compacts you so by definition in compacts because you're just copying from one side to another one so it's all good yeah they say don't put words on the slides people will read them so I shouldn't have put anything the problem with semi spaces is that yeah you basically have the area where you can allocate right so let's say you have 2 gigabytes to allocate memory now you have only one gigabyte and one after one gigabyte you need to collect so that means more often collections and if you've tried to make some kind of a game oops maybe that's not ideal for you reference counting it's nice but it leaks memory mark-and-sweep spend some time in just traversing live objects and that kind of a copy thing that you saw and then the marks with what usually in traditional or back the 90s early 2000 you actually need to stop the program from executing when you collect memory because otherwise the program will try to allocate more memory and is gonna start up with your stuff right so it's it's not okay so you need to stop it's actually in garbage collection technologies called stub the world so you need to stop the world the program and then collect and then you resume and then you collect and another fun fact I actually did my master thesis on this to try to calculate how much of the execution of a program in Java hello have Oracle it's spent on collecting and what's nice about Java is that it has different policies so you can actually tune the garbage collector to use all kind of algorithms I think they were like eight back then in Java 7 which is nice because then you have on the same DM different implementation so it's very nice field to compare them right and then it's about 20% that a program in Java spends in collecting even though they have this amazing single thing which is called G for GC so generation for garbage collector and it kind of usually takes advantage of the pagination and things like that to try to almost like hardware-accelerated garbage collector and that one was a killer but I could that one was like you know where's more my reference point so that one is it's a bit different because it knows how over trash all how to do it so it's very performant because it knows to collect in parallel will get there so yeah you can do some optimizations so even using free lists from time to time you can try to move things around maybe it's expensive to do it every time but from time to time you can try to do that I mean why not use reference counting it's incremental you get rid of objects when they are 0 yes you do like some memory but you can actually measure that maybe you know like and then do a mark sweep from time to time right why not so that's a nice optimization and Firefox used to do that at some point nowadays they moved but like back in the day so you actually used to do that and maybe you can do this marking in parallel right so maybe you can mark things while the program still executes and then you're gonna try to interrupt it only for a very short period and I'm gonna show you in the next light how you actually do that and then there's a observation which is not rocket science one we use lots of temporary objects right everywhere especially in JavaScript you do one map you do a reduce boom you've done it right why not split a memory and then try to handle these temporary objects in a more efficient way than the long-lived ones right so you can split these in something called generations which is actually the more than way of doing garbage collection and then we you can actually combine things right young generation can be collected using the copy one because it's very efficient and then the long one that you know anyways the objects are mostly going to be alive from time to time you can just go into a mark sweep on them and this copy collector could also be implemented on the fly so why would you first you're in these two semi spaces right why would you stop the execution and just mark and then go and copy when you can maybe as you mark copy them right and that you can do that using a forwarding pointer so what you'll do you'll copy something but in your address you'll say like where's the new one and then when you have a reference so this way you can keep the references by having this kind of a forwarding pointers and now I will try to show you this is gonna be a gif and it's quite fast actually it's from Wikipedia but this is how three color marking works so basically what you do you consider all the objects white which means that all of them are prone to be garbage and then you go from the route set and then from the routes that the first objects that you meet you mark them as gray and then you stop from you don't look at the route set anymore now you take the heap and then you go from the gray ones and then when you reach another object is gray but when you finish all the references of this object you move it to the black right because you know that is reachable for sure and then you go continue doing this and at some point all the black objects are the ones that are alive and why is this interesting is because you can kind of do it in parallel with the execution of the program because you can go from the route sets you mark some things as black you continue doing all these things but if something is written like your object black object is modified you just make it gray and you continue right so that's the advantage of it and object is black you know for sure that the life you modify it it might it might not be a life you make it gray you continue yes maybe sometimes in the next cycle you still have an object that was garbage before but it's not that big of a deal so in this way you can actually have kind of parallel or concurrent garbage collector that works concurrently with the mutator right Jesus that's fast am I talking too fast okay so now we go into the v8 world right so as I told you you can actually split this the heap in some generations and basically the guys at Google what they do is that they split it in something which is called a nursery that's where your locates in Java it's called Eden that's a very nice way of calling it so basically what you do this gets full you take alive objects you put them in an intermediate one so it's one copy this gets full you promote them into the other one so it doesn't go back and forth too many times and I think in Java you can configure it actually how many times you want to do this dance but in a v8 what they do your allocate here this gets big you move them this gets big too big you move them here right because you know like if an object survives one copy and then the second one it kind of gonna stay there for awhile right and so you obviously use here copying collector right and then here you use mark-and-sweep so the whole generation is marks way port marks with compact so from time to time you compact it maybe not all the time and then in this one you use in the young generation you use semi spaces that I showed you like the way the two semi spaces so basically in the slides and also mentioning that before v6 point two they were using this primitive way of copying as I told you that they were stopping the world and then they were like marking and then starting to copy but then later on they just combined all these steps like the mark and the copy and the updating of the pointers and I breach this which is called parallel scavenger which is like they are doing it on the fly so basically is just an optimization to the algorithm as I told you before and then what another interesting thing that Google actually does they try to preserve 60 frame per seconds in the browser so actually they have when the process is idle they know then you know like you of course we have all request animation frames and all that kind of stuff but it can measure when it's idle I mean it's idle and poof they go and they invoke the garbage collector for a while and then they stop it then walking for a while and they stop it invoke it for a while and they stop it right and because of these three color marking you can do it right you just go like mark mark mark mark mark open stop right so it's a very very efficient way of actually maintaining this 60 frames per second and that's why nothing beats Google Chrome I'm a Firefox users user you saw that when I drag the tab but yeah nothing beats Google Chrome and I work with electron which is chromium right so is the grant the kind of the father of Chrome or the open source part and then you can actually visualize all this if you want there's some exposed flags that you can that I exposed to no js' so in no js' you can actually create even a program that has a closure that has a memory leak and see how it works you know like just and you can actually make it in real time this is just a snippet so you can actually this is going to be in real time showing you how the memory works and you can I don't know if you can see in the back but in the first part 60% rate was survived and later on it was only 3% that survived and so on so you can actually play around with this there's even a no js' project which is called I think nodejs garbage visualizer I don't remember but like just search for it and you can actually get some nice graphs of how your program works so if you are doing something I don't know like try to do a fancy thing you can actually take it and see hey am i suffering from garbage collection because if you want to have some real time thing right and then you can actually understand if you are consuming too much memory maybe you're allocating too many temporary objects maybe you should just reuse the objects so you just create a pool of objects and instead of destroying them you just modify them you destroy the properties but not the objects themselves right to avoid garbage collection and so on so you can actually tune in different kind of ways so yeah that's it that's me and thank you very much any questions [Applause] you
Info
Channel: BudapestJS
Views: 1,668
Rating: 4.9354839 out of 5
Keywords: JavaScript, Garbage Collection, V8
Id: -NpUG8OJSf8
Channel Id: undefined
Length: 21min 15sec (1275 seconds)
Published: Mon Feb 24 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.