Let’s get right to the point, shall we? Is JavaScript’s garbage collector as bad
as some people think it is? I have no idea, so let’s poke at this a
bit and see if we can learn something. Basically, when you allocate memory in a language like C++, you have to explicitly free it, otherwise you have a memory leak. Languages like JavaScript, C#, Java, and Go are garbage collected: they handle a good chunk of the memory management for you. When you allocate some memory, at some point JavaScript has to decide you're done with it. So if I allocate an array, here's my array somewhere in memory, and then I fill that array with stuff, a number, a string, etc., it really doesn't matter. If I now null out one of the entries in the array, that little blob of memory gets left stranded; it has no connection to anything anymore, nothing refers to it. In other words, it's not reachable: if we start at a root object, we can't reach that little blob, and thus it's garbage. It's safe to reclaim that memory because nothing can possibly be using it anymore.
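Here's the concept in miniature (a toy snippet just to illustrate, not the test we're about to run):

```js
// root is reachable from our code, and so is everything it points at.
const root = {
  items: [42, "hello", { label: "blob" }],
};

// Drop the only reference to that { label: "blob" } object...
root.items[2] = null;

// ...and now nothing reachable from any root refers to it, so the
// garbage collector is free to reclaim it whenever it next runs.
```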
That's the general idea anyway; let's see it in action with some code. What I've done here is written a really dumb test that just spam-fills an object with random strings and other objects. This is called via requestAnimationFrame, kinda like you would in a game loop. So see here: I create data, set it to empty to start, then I loop over and create a bunch of junk in there, basically a linked list of objects with random names. These all have to allocate memory; the strings take up memory, as do the objects themselves. This data variable isn't used anymore at the end of each loop, so we're achieving our goal of wasting memory and forcing the garbage collector to act.
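The test looks roughly like this (a reconstruction from the description above, with made-up counts; only the DoSomethingStupid name comes from the actual test):

```js
// Rough sketch of the "spam allocations" test described above.
function randomName() {
  return Math.random().toString(36).slice(2);
}

function DoSomethingStupid() {
  let data = {};
  // Build a throwaway linked list of objects with random names.
  for (let i = 0; i < 10000; i++) {
    data = { name: randomName(), next: data };
  }
  // data goes out of scope here, so all of it becomes garbage.
}

function frame() {
  DoSomethingStupid();
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```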
So, in order to actually test this, you need to be able to measure how much time JavaScript spends in garbage collection. You could pull up DevTools and hit the Performance tab and dig around there, but there's actually another way that's slightly less well known: Chrome tracing, the backbone of Chrome's performance testing framework; hundreds of benchmarks measuring god knows how many thousands of metrics are built on it. In another life, I was an owner here and even
led the infra team monitoring and processing all this data. It’s pretty cool, you can see and attribute
time to various calls, see ipc’s between processes, in general it just gives way more
info. For our purposes, I think it’ll just be
easier to focus on what we want to know, garbage collection events. So from here, you can start a trace, choose
which events you’re interested in, in our case, we only care about v8 stuff, and then
start recording. You just need to wait a bit to capture enough of a trace; capture however long you want, you don't need to wait for it to hit 100%. The trace viewer uses the Doom keys, WASD, to move around, so let's just zoom in and look a bit more closely at all these garbage collection events. And we can see we were successful at provoking the garbage collector into action: all these minorgc calls are v8 reclaiming
memory. If we’re lucky, maybe we see some majorgc
calls in here too. These are the two modes of v8's garbage
collector in action. So that brings up an important question before
we move on: what is a minorgc vs a majorgc? That's a good question, Simon, thanks for
asking that. V8 has a great blog post that talks in depth
about Orinoco, their garbage collector, and goes through some of the high level details
on their collection strategy. There's this idea, the generational hypothesis
in garbage collection, that most items die really young, also called infant mortality,
because why not just use super morbid terms for memory management. But as a super handwavy explanation, v8 basically
divides allocations into new and old. You have the new generation, and the old generation,
and in essence, allocations start in the new generation. So if I allocate a string here, it starts out in the new generation, specifically in the nursery part of the new generation. Then a garbage collection pass happens, and let's assume that some objects are reclaimed and others aren't. If our string happens to survive that first garbage collection attempt, it gets copied out, or evacuated, into a sort of intermediate area. It's not old yet, but it's not brand new either. Another garbage collection event occurs, and a bunch of other stuff gets reclaimed. If at that point we find that our object survives, that it's still in use, it gets evacuated out to the old generation. V8 makes this distinction so that it can apply separate strategies to the new and old generations. A minorgc, the scavenger, is optimised to collect from the new generation, while a majorgc, which doesn't have a cool name like the mutilator or anything, does the heavy lifting of sifting through the old generation. Maybe you could call it the reaper or something.
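As a rough illustration of the pattern the generational hypothesis is betting on (a toy example of mine, nothing to do with v8 internals):

```js
const cache = []; // long-lived storage: anything kept here sticks around

function update() {
  // Short-lived temporary: almost certainly dies while still in the
  // nursery and gets reclaimed by a cheap minorgc (scavenge).
  const tmp = { x: Math.random(), y: Math.random() };

  // The rare object we hang on to survives a couple of scavenges,
  // gets promoted to the old generation, and from then on only a
  // majorgc will ever reclaim it.
  if (tmp.x > 0.999) cache.push(tmp);
}
```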
So by looking at the trace of our first test, we can select the first 10s or so of it, and we can see there were 384 occurrences of minorgc, averaging about 1.079ms each, and 56 occurrences of majorgc at around 0.838ms each, so our test is getting v8 to do both minor and major gc's. We can change this test around just a tiny bit to try to poke the majorgc into action. What I've done here is take a random subset of the data and stuff it into this.stuff, overwriting whatever happened to be there. this.stuff is permanent, never deleted for the duration of the test, so this should in theory keep a bunch of random things alive for a lot longer than a single call of DoSomethingStupid() and promote them to the old generation.
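Sketched out, the change is roughly this (again a reconstruction; the class name and the kept fraction are made up, only DoSomethingStupid and this.stuff come from the test):

```js
class StupidTest {
  constructor() {
    this.stuff = []; // lives for the whole test, never deleted
  }

  DoSomethingStupid() {
    let data = {};
    const keep = [];
    for (let i = 0; i < 10000; i++) {
      data = { name: Math.random().toString(36).slice(2), next: data };
      if (Math.random() < 0.01) keep.push(data); // grab a random ~1% subset
    }
    // Overwrite whatever was there before; the kept objects now outlive
    // a single call and are candidates for promotion to the old generation.
    this.stuff = keep;
  }
}
```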
we’re just going to select the first 10s or so of the trace, and we can see that this
is indeed what happens, we manage to force the garbage collector to work a bit harder,
and we’ve added some time to both minorgc and majorgc. To try and understand the relationship between
allocation patterns and minorgc and majorgc, I fiddled with this a bit more. We’ll try creating fewer objects, but make
them more complex. This is the trace here, we’ll select the
first 10s again, and looking at it, minorgc isn’t changing a whole lot, but majorgc
is changing quite substantially, nearly doubling the time taken vs the simpler cases. I poke at this a bit more by adding a setup
step to the test, where I just allocate a whole buncha random crap and store it. It’s never referenced in the test, but it’s
there. When we look at a trace of THAT, we see our
When we look at a trace of THAT, we see our majorgc calls getting much more expensive. Although there are fewer calls to majorgc, each invocation is taking 26ms, which for a game would basically be a skipped frame, something that'd be inexcusable. So that suggests there's a relationship between just the amount of crap you have allocated and how long the gc takes, which, I mean, is hardly surprising: it has to figure out what you're using and not using, it's not magic. If you were using another language where memory
management is explicit, such as C++, it wouldn't matter how much or how little you have allocated; there's no ongoing cost for having it allocated, you only spend time allocating or freeing. Whereas in JavaScript there's definitely a tax on what's currently allocated. I tried a couple more experiments to see if I could make minorgc's more expensive, so what I did here was allocate just random stuff in a loop.
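Roughly this shape (a sketch; the counts are arbitrary):

```js
// Pure allocation churn: lots of small, short-lived strings, arrays and
// objects every frame, none of which survive to the next one.
function churn() {
  const junk = [];
  for (let i = 0; i < 50000; i++) {
    junk.push({
      label: "item-" + Math.random().toString(36).slice(2),
      values: [Math.random(), Math.random(), Math.random()],
    });
  }
  // junk goes out of scope here; all of it is garbage by the next frame.
}
```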
bit of total time in minorgc’s, wall time is almost 10% of the time in fact, although
each singular call to minorgc was just shy of 2ms, so not quite the crazy long time majorgc
took. I took another stab at it by allocating sparse
arrays, and just randomly setting values in there, and v8 kinda laughed that attempt off. Nearly 800 minorgc calls, but they ran in
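That attempt looked something like this (sizes pulled out of thin air):

```js
// Allocate a big, mostly-empty array each call and randomly poke a
// handful of values into it; the whole thing is garbage right after.
function sparseChurn() {
  const arr = new Array(1_000_000);
  for (let i = 0; i < 1000; i++) {
    arr[Math.floor(Math.random() * arr.length)] = Math.random();
  }
}
```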
Nearly 800 minorgc calls, but they ran in a pretty negligible amount of time. I consider 10% to be pretty bad, but at the same time it's a ridiculous setup, so maybe instead let's build a somewhat more plausible game setup to see the effects of gc in action. Here's a quick and dirty game setup: what I've got here is a super basic, bare-bones "game" in quotation marks. I don't bother actually rendering any of
this, but we do simulate the entities. So you’ve got this list of entities, we
call update on each one, when they die we destroy them, then filter them out. And we keep a stable population by creating
new ones in their stead. The entities themselves do some legit work,
so they have transform data associated with each one, along with values like energy, health,
and they have a spatial hash grid client to track their position in the world and query
other entities. They also have this update function, and then
I force them to do some random stuff. I didn’t feel like making the logic extensive
so I just repeat it a few times. When they’re fully charged, they shoot off
a bullet, so they create a bullet, and that’s added to the world. And bullets are their own thing, they have
a full transform, their position is updated according to velocity, and eventually they
explode and kill nearby things. So it does some stuff.
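The whole thing is roughly this shape (a heavily condensed sketch; the real version also has the spatial hash grid, the bullet explosions, and so on):

```js
// Condensed sketch of the "game": no rendering, just simulation.
class Entity {
  constructor() {
    this.position = { x: Math.random() * 1000, y: Math.random() * 1000 };
    this.health = 100;
    this.energy = 0;
    this.dead = false;
  }

  update(world, dt) {
    // Busywork: wander around, take a bit of damage, charge up.
    this.position.x += (Math.random() - 0.5) * dt;
    this.position.y += (Math.random() - 0.5) * dt;
    this.health -= Math.random() * dt;
    if (this.health <= 0) this.dead = true;

    // Fully charged: shoot off a bullet into the world.
    this.energy += dt;
    if (this.energy > 1) {
      this.energy = 0;
      world.bullets.push({
        position: { x: this.position.x, y: this.position.y },
        velocity: { x: Math.random() - 0.5, y: Math.random() - 0.5 },
        life: 2,
      });
    }
  }
}

const world = { entities: [], bullets: [] };
for (let i = 0; i < 1000; i++) world.entities.push(new Entity());

// step() gets called once per requestAnimationFrame tick, like the earlier tests.
function step(dt) {
  for (const e of world.entities) e.update(world, dt);

  // Destroy the dead, filter them out, then keep the population stable
  // by creating new entities in their stead.
  world.entities = world.entities.filter((e) => !e.dead);
  while (world.entities.length < 1000) world.entities.push(new Entity());

  // Bullets are their own thing: move by velocity, expire eventually.
  for (const b of world.bullets) {
    b.position.x += b.velocity.x * dt;
    b.position.y += b.velocity.y * dt;
    b.life -= dt;
  }
  world.bullets = world.bullets.filter((b) => b.life > 0);
}
```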
I took a few traces of this running, chose a bad one, and we can poke around in here a bit. If I select the first 10s of the trace, we can see some numbers for minorgc and majorgc. We got around 500 or so minorgc calls, taking on average less than half a ms, but adding up to around 240ms or so. There aren't many majorgc calls, barely any actually, and the ones that do fire take a little over 1.1ms each. There was always going to be some overhead for garbage collection, but at the same time, this is written super poorly; realistically, an experienced game developer will code with memory allocations in mind. So let's make it a tiny bit better. I've got a slightly improved version of these entities that attempts to minimize allocations. In the main loop, we actually reuse entities instead of allocating new ones, and the entities themselves use pre-allocated matrices and vectors to perform their calculations. Nothing super crazy here.
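The reuse pattern looks roughly like this (a sketch of the idea, re-using the entity from before; the real version applies the same trick to its matrix and vector math):

```js
// Reuse-oriented version: entities are recycled through a pool, and
// per-update temporaries are pre-allocated once.
class Entity {
  constructor() {
    this.position = { x: 0, y: 0 };
    this.velocity = { x: 0, y: 0 }; // pre-allocated scratch, reused every update
    this.reset();
  }

  reset() {
    this.position.x = Math.random() * 1000;
    this.position.y = Math.random() * 1000;
    this.health = 100;
    this.energy = 0;
    this.dead = false;
  }

  update(dt) {
    // Write into the pre-allocated vector instead of creating a new object.
    this.velocity.x = (Math.random() - 0.5) * dt;
    this.velocity.y = (Math.random() - 0.5) * dt;
    this.position.x += this.velocity.x;
    this.position.y += this.velocity.y;
  }
}

const pool = [];

function destroyEntity(e) {
  // The main loop still filters dead entities out of the live list;
  // here we just park the object for reuse instead of letting it become garbage.
  e.dead = true;
  pool.push(e);
}

function spawnEntity(world) {
  const e = pool.pop() || new Entity();
  e.reset(); // re-initialise in place, no fresh allocation in the common case
  world.entities.push(e);
  return e;
}
```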
Looking at a trace of that version, though, we can see a monster of a difference. Whereas before we had around 500 minorgc calls, this version gets 110 or so, and a touch over 50ms of total time spent there. There's one majorgc call accounting for another ms of time. One really interesting thing stands out here: in the first example, if we zoom in and look at these minorgc calls, they're happening DURING the frame, sometimes multiple times, meaning they're contributing to longer frames. In the second example, they're happening in the idle time between frames, which is totally by design. V8 specifically built this into Chrome: when the browser detects an idle period, that's when v8 has the go-ahead to quickly run gc. What I kinda got from all of this was that
short-lived objects are often pretty cheap from a gc perspective. They're not free, but if you can keep them low enough, the collections may just run in the idle time between frames and you may not see much overhead from them. The more crap you keep in memory, the more long-lived cached stuff, the more expensive your majorgcs are, as we saw with my stupid synthetic test. I know my own code though, usually, so this could probably be at least partially mitigated in the future by some sort of api that lets me give hints to the garbage collector. Something like that, I dunno, I've given
this a full 10 seconds of thought, you’d need to actually plan this out more. One major problem with javascript though is
that, although for your own app you might be able to control your allocations, with
3rd party stuff, you have very little control, as far as I know. I mean, in something like C++, its the sorta
same deal, but at least you can override memory allocation at a global level, and depending
on your application, extend this to 3rd party libraries as well. In JS, no such luck, I can’t isolate a third
party library to its own memory pool, so that's just something you have to live with. Another point is that, yeah, of course there's a cost to having gc; direct management will be more performant when done right. But in reality, decently made games don't
just allocate and free on a whim either, most do it in a very controlled way. I’ve spent enough time moving systems to
different allocation patterns: linear allocators, pool allocators, LIFO and scratchpad-style
frame allocators in order to wring out extra performance. C++ programmers may love to toot that horn,
but even among gamedevs, that was usually handled by just a few people at the studio,
not everyone. In JS, you have some control over memory,
but it’s pretty tenuous. Mostly it’s down to your pattern of allocation. Minimizing allocations by reusing objects,
that sort of thing. Maybe you could go crazy using typed arrays,
but let’s save that for another time. And lastly, how much memory overhead does
js incur for doing all of this? How's performance on Firefox? Etc., etc. Great questions! No idea, didn't test those. Maybe next time. Throw some comments about what you'd like
to see tested in the future. Until next time,
Cheers