Thank you very much, Rachel, and welcome to
my talk, JavaScript engines, the good parts. Initially, I was going to give this talk together
with Brian Terlson from Microsoft, you know, nice guy, blond hair, working on a JavaScript
VM. Then life happened, and now I'm standing here
alone. >> What are you talking about? I'm right here, man! [Applause]. [Laughter]. >> This is weird. BENEDIKT: Now we have two Brian lookalikes. I can't do a talk with both of you. >> Thank you for the cameo. >> Thank you. [Laughter]. >> Applause for Peter! [Applause]. With that out of the way, let's get started
with the actual presentation. JavaScript engines, the good parts. Nowadays, JavaScript runs not only in web
browsers but also in Node.js, in React Native, in Electron, and on IoT devices, and I've
even been told it runs in space. It is everywhere. As a developer, learning JavaScript, or improving
your JavaScript skills, is an excellent time investment, so let's take a look today behind
the scenes of how JavaScript engines work. This is V8. It is the JavaScript engine used in Google
Chrome, Node.js, and Electron, for example. Recently, we've been speaking about how V8
works behind the scenes. However, V8 is not the only engine
out there. We are going to do something a little bit different
today. >> Yes, actually, we are going to look at
some of the fundamentals that are common to all the JavaScript engines out there. These fundamentals make it possible to write
these amazing applications we see today and of which we've heard a lot at this conference
already. Starting with that, it's not just V8: there
is an engine called SpiderMonkey, which powers Firefox, and fun fact: there is even a Node
fork using SpiderMonkey called SpiderNode. >> Then there is Microsoft's engine in Edge, called
Chakra, and the main part is open source. There is a Node-ChakraCore project powered
by that rather than V8. >> There is also JSC, or JavaScriptCore,
which is the engine that was originally built for WebKit, and is nowadays powering Safari
but also React Native applications. These are the major JavaScript engines
out there. If you want to play around with any of these
engines directly, that is, without going through an embedder such as Node.js, then you can use ... you
install it globally, follow the readme instructions, and after that you can run
V8, SpiderMonkey, and Chakra straight from the command line. The JavaScript you run in there will run directly
in the engine itself. Now that you have these JavaScript engines
installed on your system, let's look at what they have in common. It starts with the JavaScript code that you
write. The engine parses the source into an abstract syntax tree, the AST, and feeds that
into the interpreter, which can do its thing and produce byte code. At this point, the engine is running your
JavaScript code. However, to make it run more smoothly and
efficiently, there is also an optimising compiler pipeline, so the byte code gets fed into the
optimising compiler, alongside profiling data that we collect while the code is running,
and that way, the optimising compiler can make some assumptions based on the profiling data
and generate highly optimised machine code that runs more efficiently. If one of those assumptions turns out to be
incorrect later, it's no big deal because we can de-optimise and fall back to the
byte code in the interpreter. So let's focus on the important part here,
which is the part where the code actually gets run. Each JavaScript engine has some kind of pipeline
with an interpreter and an optimising compiler. The interpreter generates byte code and the
optimising compiler generates highly optimised machine code. >> You just stole my punch line. This was the V8 slide. So, actually, that's exactly how V8 works. Our interpreter is named Ignition, and the
interpreter is responsible for generating and executing the byte code. And, as it is executing the byte code, it
collects profiling data, and, when a function gets hot, so let's say you call it a couple
of times, then we feed it to our optimising compiler, and that uses the profiling data
to generate highly optimised machine code. >> I love hot functions. SpiderMonkey does it differently. They have two optimising compilers, so the
interpreter optimises into the baseline compiler, which produces somewhat optimised code. While that is running, it gets fed to
the IonMonkey optimising compiler, which produces even more highly optimised machine
code. >> You did it again. This is exactly ChakraCore, except they have
different names. In ChakraCore, the first optimising compiler
is called SimpleJIT, where JIT stands for just-in-time compiler, and it generates somewhat optimised code. When a function gets really hot, then it feeds
it to FullJIT, and this generates awesome code. >> Then JSC uses three optimising compilers. It starts with LLInt, the low-level interpreter,
which produces the byte code, and, from there, they can optimise into the baseline compiler,
which produces somewhat optimised code. From this baseline compiler, based on some
heuristics they can optimise into the DFG compiler or the FTL compiler. So, let's look at what else all
these JavaScript engines have in common by zooming in on some aspects of how they are
implemented, because, besides the differences that we just discussed, at a very high level,
all engines follow the same architecture: they have some kind of parser, and an interpreter
and compiler pipeline. For example, how do JavaScript engines implement
the JavaScript object model? And which tricks do they use to speed up accessing
properties on objects? As it turns out, all major engines implement
this more or less similarly. >> But, wait, isn't it the case that all objects
in JavaScript are just dictionaries? >> That is true. If you look at the JavaScript spec, objects
are dictionaries with string keys, and the string keys map to not just the value but
to something the spec calls property attributes. In this example, the X and the Y are string
keys in a dictionary, according to the spec. The values 5 and 6 are just values within
the property attributes for each property. Other than the value, the attributes store
whether the property is writable, enumerable, and configurable. This is not something that JavaScript engines made
up; it is part of the spec. Writable determines whether the property can
be reassigned to, enumerable means that the property can show up in for-in loops, and configurable
means the property can be deleted. You can get to these property attributes in
JavaScript for any object and any property by using the Object.getOwnPropertyDescriptor
API. Interesting. >> That's how JavaScript sees objects. What if we take arrays? >> You can think of them as special objects
with one difference: they have special handling for array indices. >> What is an array index? >> It's a spec term. Arrays are limited to 2 to the power of
32 minus 1 items, so that is the maximum array length you can have. An array index is any valid integer within
that limit, so any integer from 0 up to 2 to the power of 32 minus 2. >> So not every integer is a valid array index? >> Exactly. >> You mentioned more differences? >> Another difference is the magical length
property that arrays have. >> Magic? >> It is pure magic. If you look at this example, the array has
a length of two in the beginning. Then we add another item to the array, and
automatically, the length property is updated. The JavaScript spec defines that engines
have to do this automatically in the background. As a JavaScript developer, you don't have to manually
update it. So let's take a look at how JavaScript defines
arrays. They are stored similarly to objects. For example, all the keys, including the array
indices, are strings, so the first [sound feed distorted]. >> Okay, that looks really similar to
the object. [Sound feed distorted]. The most common operation in these programs is
the property ... I guess we had better make that fast. And, in fact, what we see in the wild is that
most objects in the same program tend to have the same property keys, or at least there
are sets of objects that have the same keys, so you could say that all of them have the same
shape. >> Right. That makes sense. It is also very common to access the same
property on different objects that have the same shape. So, with that in mind, JavaScript engines
can optimise object property access based on these shapes. So let's take a look at how that works. >> Okay, let's assume we have this one object
here which has properties X and Y. It's represented using this dictionary data
structure that we saw before. So the X and Y keys are stored as strings in
the dictionary, and point to the property attributes for these individual properties. If you then write something like object.y
in your program, the engine has to reach out to the JS object, find the key inside
of it, reach out to the property attributes, and eventually load the value from it. >> Okay, so where are these property attributes
stored? Do we store them as part of the JS object data
structure itself? That seems wasteful. If we expect more objects to have the same
shape, you would end up duplicating that information for every single object. That seems wasteful memory-wise. >> That's a good observation. I think some people made this observation
already. At least this is kind of how we represent
it in JavaScript engines nowadays. What we do is that the engine stores the shape
separately from the values. In this case, the shape describes which properties
you have, and which property attributes they have, except for the value. Instead, the property information contains
the offset where you find the value inside of the JS object. It totally makes sense when you start to look
at multiple objects that have the same shape. Now you only need one instance of this shape,
and each object only contains the information that is unique to this object, so you don't
repeat this information that is common anyway. >> So even if we have a million objects, there
will only be one shape, as they all have the same shape and point to the same one? >> That's true. >> It seems like it would save a lot of memory. >> I hope so, yes. This is not even something that JavaScript
engines made up, this happened before. So, like, there's been a lot of research on
this. If you look at academic papers, they're not
called shapes but hidden classes. >> That would be a confusing name in a JavaScript
engine because classes are already a thing in JavaScript. >> V8 calls them maps. >> It is the same problem. It is a terrible name! >> How about types for ChakraCore? >> That's not confusing at all. >> I can offer structure for JSC. >> I like the name "structure"; that
is one that makes sense. I will keep calling them shapes because that
is what SpiderMonkey calls them. I like the name. It seems like a common thing if you have an
object to add a property to it. So what if you have an object with a certain
shape, you add a property to it, how does the JavaScript engine find the new shape? >> So, what JavaScript engines do is that
the shapes inside of the engine form so-called transition chains. Let me run you through the example quickly. Let's assume we start with the empty object. This object initially points to the empty
shape that doesn't have any properties on it. Now you start adding a property to it, like
in our case, we add X. It means we transition to a new shape that
has the property X on it, with the property information, and we append the value for
this property to the object, and then we record inside of the shape, inside of the property
information, that this value can be found at offset zero now. So let's say we add yet another property to
it, like in this case, Y. Then we do the same as before: we introduce
a new shape that contains this new property, in addition to the X property, and we append
the value to the object, and we record that the value can be found at offset 1. However, if we do this, then we might waste
a lot of space, because we repeat the full table all the time. >> Repeating X, like, all the information for
X is duplicated. >> All the information for X is duplicated. That's not really what engines do. Instead, what you do is you just remember
the information about the new property that was added. >> Right. >> We don't have to repeat the information
for X because we can just find it earlier in the chain. >> That's the trick, so, we introduce a back
link to the previous shape so you can walk the transition chain backwards until you find
the shape that introduces the property, and then you know where to look in the object. Just looking at this example, if you now need
to find X on it, you would start at the last shape, and you see, okay, this is not the one
I'm looking for because it introduces Y. You walk back once, and there is X. Awesome. I know where to find it. >> That's what happens when you type o.x
in JavaScript. What if you have two objects with the same
shape, and you add a different property to each of them?
Then there is no way to chain the shapes? >> Then we have transition trees instead, and we
branch off in various ways from shapes. Let's look at this example. This is what we learned before already. We have the empty object. We add X to it, which means we transition
from the empty shape to the shape that contains X, so that is two shapes in a single chain. If we run the second line of code, we start
again with an empty object, and we add a property Y to it, so we branch off the empty shape with
the property Y, and we end up with a tree with a total of three shapes
in two chains. >> Right. So does that mean if we walk up the shape
tree, we always end up at the empty shape? >> No. Not necessarily. There is always an exception to the rule. And the reason is that JavaScript engines
have special treatment for object literals that already introduce properties from the
get-go. Like in this case, the first thing is what
we saw before, we start with the empty object and then add X to it, and the second line
is we start with an object that contains an X. And you can imagine that it might be a bit
faster just to construct objects with the property on it already. So, what we end up with, for example in V8 and
in SpiderMonkey, is this: the first case is the one that we know, so here we start from the empty
object, and we add X to it later. That is exactly what we had before. But now, in the second example, we start with
an object that already contains X from the beginning, so we introduce a new root shape
that already contains this X. And we don't branch it off the empty shape. >> We can skip over that empty shape altogether
in this case. >> That's true. The reason why we do this is to keep the transition
chains short, because otherwise, it is a lot of metadata that we waste, and also because
it is more efficient to construct objects this way. >> It sounds super familiar. Didn't you write a blog post about this? >> Shameless plug โ yes, I did! I actually published a blog post last year
about this where I tried to highlight how these subtleties with the shapes can have
effects on real-world performance, especially to be โ for common applications. >> I read that blog post. It talks about these things called ICs. What are those? >> That is actually the magic inside of the
engine. No, it's the not! It is more magic. It stands for "inline cache" which are the
key ingredients which is necessary to make JavaScript run fast, and also the main motivation
for actually having shapes. >> So how do these ICs work exactly? >> So JavaScript engines use ICs to memorise
information where to find properties on objects, so that we don't need to repeat expensive
property look-up on each property access. >> Okay, so how does that work? >> Okay, let me run you through this example
we have a function get.x which takes an object o and logs the property x from it. >> That seems a common thing to do, get a
property somewhere? >> Yes, I think I've seen it in the wild. If we feed this to JavaScriptCore, then
it generates the following byte code, which contains two instructions: the get_by_id instruction,
which does the property look-up by ID, loads x from arg1, the first argument, which is o,
and stores the result into loc0, and the next instruction just returns whatever
is in loc0. >> This makes sense, but how do ICs come into
play here? >> It's not just that. JSC also embeds an inline cache into the byte
code, in this case into the get_by_id instruction. This IC contains two uninitialised slots
initially. When we call this function, let's say we call
it with an object { x: 'a' }. In this case, the property value for x is
found at offset zero. Now we invoke the function with this. We need to reach out to the shape of this
object, search for X inside, load the property information from it, determine the offset,
then go back to the object, and load whatever is set at offset zero. >> That sounds like a lot of work. >> It is. Since the engine already did this work, it
makes a lot of sense to memorise the information so we can reuse it on the next call of this function. What we do is we memorise the shape that we
have seen, and also the offset at which we found x inside of this shape. When we then call this function again with
an object that has the same shape, we only need to check: "Oh, it's the same shape. I know it is at offset zero already." I don't need to reach out to the property
information at all. >> Wow. We can get rid of that expensive lookup altogether. It sounds great. >> Yes, and it's significantly faster now. >> How would this work for arrays, where you
can expect most properties to be array indices? You wouldn't want to store property attributes
for each and every array index in your codebase. You know they're going to be writable, enumerable, configurable. >> Totally, that would be a total waste of
space. >> What happens instead? >> All engines make use of the fact that all
array-index properties are writable, configurable, enumerable data properties by default. Let's look at this array. The array has a property, length. Let's say the length is stored inside
the array. For all the elements in the array, indexed
by array index, we store them in a separate elements backing store, and it only stores the values,
and it has this implicit tag attached to it which says whatever you find in here is writable,
enumerable, and configurable. >> So we don't have to store property attributes
for array elements because they match the default. >> They're default values anyway. >> What if someone overrides the attributes? This is JavaScript. JavaScript is wild. >> I know you're wild. >> You can totally do this, though. What if I use Object.defineProperty on
an array element, and I set one of its attributes to the non-default value of false? >> You know, Mathias, whenever you do something
like this, you kill a kitten! So, like, just look at this one. >> Aww. >> You cannot kill it. You want to kill this one? >> So cute! >> Come on. >> My heart melts! A whole bunch of them. >> It's a family of kittens. Remember, don't do this! Kittens aside, for these edge cases, the engine
represents the entire backing store for the elements as a dictionary which maps from indices
to full-fledged property attributes like in the JavaScript specification. >> If I use this define property on one array
index, the whole array gets stored like this? >> Yes, because the basic assumption of the
engine is that you don't do this. Also, remember the kittens. Come on! >> Got it. I think you're saying I should avoid using
Object.defineProperty on array indices, which is a weird thing to do anyway? >> Yes. >> Okay. I won't do that again. We've learned a lot about JavaScript engine
internals today and we got coding advice out of it as well. Let's recap the main takeaways. First, always initialise your objects in exactly the
same way so they don't end up having different shapes. And second, don't mess with property attributes
of array elements so that they can be stored and operated on efficiently. Think of the kittens! That's it. Thanks for listening. [Cheering and Applause].
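
The two takeaways might look roughly like this in code. This is a sketch, not the code from the slides: makePoint and the variable names are made up for illustration. Shapes and elements backing stores are engine internals that JavaScript cannot observe, so the example only shows the spec-level behaviour the speakers describe (key order, property attributes, the automatic length update).

```javascript
// Illustrative sketch only. Shapes and backing stores are engine internals,
// so this shows just the spec-level behaviour around them.
// makePoint is a hypothetical helper, not something from the talk.

// Takeaway 1: initialise objects the same way, with the same property order.
function makePoint(x, y) {
  return { x, y }; // every point gets x first, then y
}
const a = makePoint(5, 6);
const b = makePoint(7, 8);
// Same keys in the same order, so an engine can let both share one shape.
console.log(Object.keys(a).join() === Object.keys(b).join()); // true

// Adding the properties in a different order gives a different key order,
// which, per the talk, means a different transition chain and shape.
const c = {};
c.y = 6;
c.x = 5;
console.log(Object.keys(c)); // [ 'y', 'x' ]

// Property attributes as the spec sees them.
const descriptor = Object.getOwnPropertyDescriptor(a, 'x');
console.log(descriptor);
// { value: 5, writable: true, enumerable: true, configurable: true }

// Takeaway 2: leave the attributes of array elements alone.
const array = ['a', 'b'];
array.push('c');
console.log(array.length); // 3, updated automatically

// Legal, but it sets a non-default attribute on an array index, the edge
// case for which the talk says engines fall back to a dictionary store.
const slow = ['a', 'b'];
Object.defineProperty(slow, '0', { writable: false });
const slowDescriptor = Object.getOwnPropertyDescriptor(slow, '0');
console.log(slowDescriptor.writable); // false
```

Whether a given engine really shares a shape or switches an array to dictionary mode cannot be checked from plain JavaScript; that part rests on what the speakers say about engine internals.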