The secrets behind Angular’s lightning speed | Max Koretskyi | #AngularConnect

Captions
My name is Max Koretskyi. I'm known as the Wizard because I like to dive deep into the internals of frameworks and other stuff, and share my findings with an audience like yourselves. I will be talking about what makes Angular fast. It's similar to what Miško talked about, but in a more conceptual way. I hope this talk will be valuable for you. I started Angular In Depth. How many of you know of Angular In Depth? I work for ag-Grid. Our mission is to build the best data grid in the world. As far as I can tell, we are doing a great job following our mission. The next slide is usually the one about me, where I discuss my writing activities, speaking activities and community engagements. It's funny: I went on YouTube to check the comments for one of my talks, and the first comment was, "It's interesting that this guy has never written a line of code, and yet he teaches us how to write code and how the code works." I realised that, although I talk a lot about my writing and speaking activities, I've never said that I worked as a software engineer. So I changed my slide, and I can assure you I've been programming for about six years. I know what JavaScript is! Besides that, I run the Angular In Depth publication; we get over 8,000k views a month. Recently, I won the international IT award in Ukraine in the education category. So, let's talk about what makes Angular fast. These are the three components that I distilled. You need to write optimised code, code that is optimised for execution by the virtual machine, usually V8 in Chrome. You need to use the correct data structures; in particular, Angular uses a very interesting kind of data structure known as probabilistic data structures. And you need to have a compiler, right? As we all know, Angular does have a compiler. So let's start with optimal code: what kind of code you can find if you explore the Angular sources.
There is a document, actually, in the repository called PERF_NOTES, and you can see all the interesting stuff that you can find in Angular. For example, you can find there that they don't use recursive functions, because recursive functions can't be inlined, and Miško talked about how the VM optimises code. So they avoid these kinds of functions. Sometimes they use regular for loops instead of forEach methods on the array. They use linked lists instead of arrays, for short arrays. They don't use top-level variables, because usually minifiers cannot rename them, so it results in a bigger bundle. They use bit fields and, importantly, they ensure monomorphic practices. This is something that Miško talked about. The perspective I will give you on monomorphism will complement what Miško was talking about. If you will be rewatching these talks on YouTube, I recommend that you first watch my talk, where I give you a conceptual understanding of what monomorphism is, and then watch Miško's talk, where he shows you how it manifests itself in the code. Okay, so monomorphic access is very important, like Miško said. This is what I found in the docs: it's about 100 times faster than megamorphic property access. What is this access? It sounds like a magical phrase, but it's just a property access. Take this function. It takes an object, and inside the function we access the property x. This is the kind of function we write every day. Interestingly, these kinds of functions are part of the code inside the framework that is executed very often. Usually they're part of the change detection cycle and can be executed several times a second. Is this property access monomorphic or not?
Well, it's impossible to tell by looking at this function, because whether the property access is monomorphic, polymorphic or megamorphic is determined at run time by the VM. I will show you now and explain to you how it's determined, but first we need to understand that property access in the VM is pretty complicated, right? Miško talked about shapes and hidden classes and how the VM stores objects, so if you're looking for more in-depth coverage, you can check out any of my previous talks on YouTube. I went pretty in depth into how monomorphism works and how the VM stores shapes. This is a complicated topic, actually. There is one article I came back to three times to re-read, and I still couldn't figure out some parts. Luckily, I had some help. Benedikt from the V8 core team helped me a little bit to figure out some parts of it. If you don't understand all of it, or some part of it, don't worry, because I was in the same boat. It's not easy, really. So, what we need to take from this is that property access is pretty complicated. As we all would, the VM maintainers want to somehow optimise that process, and inline caching, which again Miško talked about, is the process of optimising this property access. The key word here is "cache". Instead of trying each time to figure out where exactly the offset is stored and to resolve the shape that corresponds to an object, we can go through that long, long process just once and then cache it all, right? So, let me show you an example. Here we have this function, getX, that just returns a property, okay? Somewhere inside the virtual machine, particularly inside V8, every function is represented as a closure object. So there is a closure for getX, and every function has a cache; it's called the feedback vector. This is the terminology that V8 uses. You can think of it as just an array, a very, very big array.
For each property access inside the function, and this is important, the VM creates a cache. Every single property access has a cache there, and this is called an inline cache. So, what happens? Suppose now the function starts running. Your code is being executed by the VM. The virtual machine collects information. It knows how many times your function has been executed, and all this information is stored in this cache, but besides that, there is something else that the virtual machine stores there. So, we call this function getX and we pass in an object. If we call it for the first time, the virtual machine needs to go through the process of trying to resolve the value. First, it needs to find the corresponding shape. This is where the metadata is stored, particularly the offsets of the properties. Once it has gone through that process, it has information that it can now cache, right? The first thing it can cache is the shape, so it stores the shape that corresponds to that object, basically the description of that object. It can also store the offset. This is the information that indicates where exactly in memory you can find that property. Okay? So, the next time we go through that process, we first need to compare the shapes. Miško showed that code with the "if"; he called it an ID of some kind. Yes, the hidden class ID. You compare the hidden class ID, and if it matches, there is no need to go through that lengthy process again. You can just pick the information from the cache. So we stored that, and there is also an interesting thing called state. The VM determines the state of the cache. An inline cache is first and foremost a cache, with the usual properties, so every cache has entries, right? Length and capacity. If there is only one entry in the cache, the state is defined as mono, monomorphic. The name comes from here: "mono" means one, and "morphic" means of a form; it refers to the shape, or hidden class, right? It's a general notion.
This is where the phrase "monomorphic property access" comes from, right? Monomorphic property access is what you get when the function has been called, it doesn't matter how many times, but as the parameter you're passing an object, or many objects, that share the same hidden class, or the same shape. It means that, in the cache, there is only one entry. Okay? So we know now about monomorphic property access, and it means that there's only one entry in the cache. Here I'm calling the same function three times, and I'm passing objects with different values, but they all have the same shape because the set of keys is the same. Now, what happens if you pass in another object with a different shape? The state transitions from monomorphic to polymorphic. It means that there is more than one entry in the cache. There's more than one shape in the closure. It goes to the polymorphic state and, as I told you, every cache has a capacity, so in V8 the capacity of this inline cache is four. It can cache not more than four different shapes. Okay. What is interesting here is something I didn't know, but I was running some experiments, and it appears that if an object has the same set of keys but a different prototype, the VM considers it a different shape. So here I'm calling the function a third time, and I'm using Object.create to set a different prototype. Now our cache is polymorphic. And if you continue calling the function and passing in different shapes, the state of the cache will transition to megamorphic. In the megamorphic state the virtual machine still uses a cache, but it's one cache for all object accesses in the whole application. As Miško showed, the problem with sharing one big cache is that new values will overwrite old ones. We saw what happened when we called the function too many times, like 10,000 times: the time increased dramatically. Okay.
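The state transitions described above can be modelled with a small toy cache. This is only a sketch of the idea, not V8's implementation: `InlineCache`, `shapeOf` and `POLYMORPHIC_CAPACITY` are hypothetical names, and a "shape" is approximated here as the list of own keys plus the identity of the prototype. The capacity of four is the figure quoted in the talk.

```typescript
// Toy model of one inline cache. Hypothetical names; V8 works differently inside.
type CacheState = "uninitialized" | "monomorphic" | "polymorphic" | "megamorphic";

const POLYMORPHIC_CAPACITY = 4; // capacity quoted in the talk

// Stand-in for a hidden class: own keys in insertion order plus prototype identity.
const prototypeIds = new Map<object | null, number>();
function shapeOf(obj: object): string {
  const proto = Object.getPrototypeOf(obj);
  if (!prototypeIds.has(proto)) prototypeIds.set(proto, prototypeIds.size);
  return `${Object.keys(obj).join(",")}|proto#${prototypeIds.get(proto)}`;
}

class InlineCache {
  private shapes = new Set<string>();
  private megamorphic = false;

  // Remember the shape seen at this property access.
  record(obj: object): void {
    if (this.megamorphic) return;
    this.shapes.add(shapeOf(obj));
    if (this.shapes.size > POLYMORPHIC_CAPACITY) {
      // Too many shapes: give up the per-access cache.
      this.megamorphic = true;
      this.shapes.clear();
    }
  }

  get state(): CacheState {
    if (this.megamorphic) return "megamorphic";
    if (this.shapes.size === 0) return "uninitialized";
    return this.shapes.size === 1 ? "monomorphic" : "polymorphic";
  }
}

interface HasX { x: number; [key: string]: unknown }

const getXCache = new InlineCache();
function getX(obj: HasX): number {
  getXCache.record(obj); // a real VM does this bookkeeping internally
  return obj.x;
}

// Same set of keys, different values: one shape.
getX({ x: 1 });
getX({ x: 2 });
getX({ x: 3 });
const afterSameShape = getXCache.state;

// Same keys, different prototype: counted as a second shape.
const withProto = Object.create({ inherited: true });
withProto.x = 4;
getX(withProto);
const afterNewPrototype = getXCache.state;

// Three more distinct shapes push the cache past its capacity.
getX({ x: 5, a: 1 });
getX({ x: 6, b: 1 });
const atCapacity = getXCache.state;
getX({ x: 7, c: 1 });
const afterFifthShape = getXCache.state;

console.log(afterSameShape, afterNewPrototype, atCapacity, afterFifthShape);
```

The real state of an inline cache cannot be read from ordinary JavaScript, which is why the bookkeeping is done by hand in this model.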
So, how does Angular benefit from it? I'm going to quickly go over that. We have different types of view nodes and, potentially, if we followed object-oriented programming principles, we would need to create a different class for each type of view node, because they all have different fields. But in order to ensure monomorphism, Angular creates a single class for all these data structures and just uses a field, for example a type field, to distinguish what kind of node it is at run time. Okay, that is monomorphism. Now, let's talk about data structures, another interesting topic. Before I started working on this talk, I didn't know that such data structures existed, because I don't have a proper computer science background. I learn as I go, and my hope is that I touch on it a little bit here and you all will also become interested in what kinds of data structures exist beyond regular data structures like arrays and objects. Probabilistic data structures: why are they needed? They are usually used as an optimisation layer. They can't be used by themselves because they have some limitations. I will present some of the limitations later. But they have very small memory requirements, and they're very, very fast. These are two characteristics of all probabilistic data structures. Now, the bloom filter is the data structure that Angular uses, and it seems to be the most iconic data structure in the world, at least that's what the article I read said, so I believed it! Yes. They're used in databases; that's the hard-core usage. They're used in networking. They're even used in cryptocurrencies. The bloom filter is designed to answer one question: is there an element in the set or not? If you have an array, to find out if there is an item in the array you would need to go one by one and do the comparison, so with one million elements it's a long way to figure out if there is an element or not. A bloom filter allows us to do it much quicker.
It naturally can have two answers: no and yes. It's tricky, because probabilistic data structures don't give you the answer yes. The answer is maybe; that's why they are called probabilistic. And you can do some tuning. From what I saw in Angular, they ensure it's always yes. I won't have time to cover how they do it. It's interesting. So, it's maybe or no. I hope that, by now, you're wondering how it is possible that you don't know for sure whether the element is in the set or not. Yes, this slide. I forgot about it, right. All probabilistic data structures share two common characteristics, and bloom filters do as well: they use hashing functions to encode items. You feed the hashing function an item and you get a result which is just a number. And they don't store entire items, so you cannot use probabilistic data structures as a replacement for regular data structures like arrays. They usually go together, one with the other. So we take a set of three elements; I'm going to show you how bloom filters work, a quick overview. Bloom filters are just bit fields, so you run some value through the hashing function. In our case this is a very, very primitive hashing function: it takes the value and produces a number, and what you do is set the corresponding bit using bitwise operators. "John" is the whole value, right? We got just the number 2, so we set bit number 2, the second bit in the bit field. Then, if we need to know whether John is in the set or not, what do we need to do? We take John, we pass it through the same hashing function, we get the same number, and what do we need to do now? Who can tell me? Nobody? We need to check if this bit is set or not, because we know that, in the beginning, we set this bit. Now we need to check it. If this bit is set, then John must be in the array, or in this set. Right, but I told you that it's never yes, right? It's always maybe.
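The set-a-bit, check-a-bit procedure just described can be sketched in a few lines. This is a deliberately minimal illustration under stated assumptions: `TinyBloomFilter` and `primitiveHash` are hypothetical names, the hash (sum of character codes modulo 32) is invented for the example and is not the one on the slide, and only one hash function is used, whereas production bloom filters typically combine several.

```typescript
// Hypothetical, very primitive hash: maps a string to a bit index in 0..31.
function primitiveHash(value: string): number {
  let sum = 0;
  for (let i = 0; i < value.length; i++) sum += value.charCodeAt(i);
  return sum % 32;
}

class TinyBloomFilter {
  private bits = 0; // a 32-bit bit field; no items are stored, only bits

  add(value: string): void {
    this.bits |= 1 << primitiveHash(value); // set the bit for this value
  }

  // false means "definitely not in the set"; true only means "maybe".
  mightContain(value: string): boolean {
    return (this.bits & (1 << primitiveHash(value))) !== 0;
  }
}

const names = new TinyBloomFilter();
for (const name of ["John", "Anna", "Mike"]) names.add(name);

const johnMaybe = names.mightContain("John"); // was added, so its bit is set
const kateMaybe = names.mightContain("Kate"); // hashes to a bit nobody set
// "Jonh" was never added, but it has the same letters as "John",
// so this hash gives it the same bit: a false "maybe".
const jonhMaybe = names.mightContain("Jonh");

console.log({ johnMaybe, kateMaybe, jonhMaybe });
```

Because the filter keeps only bits, a "maybe" always has to be confirmed against the regular data structure that holds the actual items.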
The problem is that your hashing function can produce the same number for different values, right? Collisions are possible. So it may be that some other value produced number 2, and it is that other value that is in the set. So you always check the bloom filter first and then, if you get the answer maybe, you go and check the data structure that is paired with it. This is exactly what Angular is doing, and it's using this technique in its dependency injection system. We all know that, in Angular, we have hierarchical injectors, right? They follow the hierarchical pattern of components. Okay? So, for each component, Angular creates an injector. And now suppose we need to resolve a value; for example, the bottom-most component requires a widget manager that is registered somewhere at the top. In order to get that widget manager, the run time has to go through multiple injectors until it reaches the top-most one. In each of those injectors, it needs to check if the item is in the set or not: whether this widget manager is declared inside the site analytics component, the widget component, and so on. And, again, if the component injectors contain a large number of services and you have quite a lengthy hierarchy, it takes some time. Here we're using just regular data structures, and we learned that we can complement regular data structures with probabilistic data structures to perform that check. This is exactly what Angular is doing. It adds a bloom filter to each injector. And we know that, if an injector's bloom filter gives us the answer no, it is 100 per cent correct. We can now go and check the next injector. And it's very fast. It's just one bitwise operation. It's extremely fast. What Angular does is first check the bloom filter, the first bloom filter. If it gives the answer no, that's okay, we believe it's 100 per cent correct, so we go up. And we go and check the second bloom filter.
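The lookup being described here, a bloom filter in front of each injector's regular storage, can be sketched as follows. Everything in this block is an assumption made for illustration: `SketchInjector`, `tokenBit` and the string tokens are hypothetical and do not reflect Angular's internal data layout or API. The `mapLookups` counter exists only to show how often the slower check actually runs.

```typescript
// Hypothetical hash: maps a token name to a single bit in a 32-bit field.
function tokenBit(token: string): number {
  let sum = 0;
  for (let i = 0; i < token.length; i++) sum += token.charCodeAt(i);
  return 1 << (sum % 32);
}

class SketchInjector {
  private bloom = 0; // probabilistic layer
  private providers = new Map<string, unknown>(); // regular data structure
  private parent: SketchInjector | null;
  mapLookups = 0; // how many times the regular structure was consulted

  constructor(parent: SketchInjector | null = null) {
    this.parent = parent;
  }

  provide(token: string, value: unknown): void {
    this.providers.set(token, value);
    this.bloom |= tokenBit(token);
  }

  get(token: string): unknown {
    // A "no" from the bloom filter is always correct, so this injector is skipped.
    if ((this.bloom & tokenBit(token)) !== 0) {
      // "Maybe": confirm against the real storage.
      this.mapLookups++;
      if (this.providers.has(token)) return this.providers.get(token);
    }
    if (this.parent === null) throw new Error(`No provider for ${token}`);
    return this.parent.get(token); // go up the hierarchy
  }
}

// Three levels: the service is registered only at the top.
const rootInjector = new SketchInjector();
rootInjector.provide("WidgetManager", { kind: "widget-manager" });
const middleInjector = new SketchInjector(rootInjector);
middleInjector.provide("Analytics", { kind: "analytics" });
const leafInjector = new SketchInjector(middleInjector);
leafInjector.provide("Logger", { kind: "logger" });

const resolved = leafInjector.get("WidgetManager") as { kind: string };
console.log(resolved.kind, leafInjector.mapLookups, middleInjector.mapLookups, rootInjector.mapLookups);
```

With these three token names the toy hash produces no collisions, so only the top injector's map is consulted; with a collision, a lower injector would do one extra map check and then continue upwards.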
If the answer's no again, we keep going, and only when we reach the last, the top-most injector's bloom filter, do we get the answer maybe. We now know that, okay, maybe it's there, so we will just go and check the injector, and that's what Angular does. It checks the injector, finds the service that is required and then injects it down. Okay, so that is probabilistic data structures, and particularly the bloom filter. The last piece I want to talk about is the Angular compiler. So Angular, as it appears, has a compiler. And why does Angular have a compiler? That is actually a valid question. Why does a framework need a compiler? I asked this question on Twitter. What do you guys think? Why does Angular have a compiler? The most common answer was: to translate the HTML-like syntax that we use to define templates into run time code, because the browser doesn't understand interpolations, for example, or directives. And that is a correct answer. Here we have an Angular template; we run the compiler over it, and it produces some JavaScript. This is the output of the new compiler. However, other frameworks have compilers, too. For example, in React, we have the JSX compiler. Does it perform the same role as in Angular? This was the question that I wondered about and tried to answer. This is a similar template-like view definition in React; we run the JSX compiler over it, and it also produces JavaScript. Now, if you compare the output, however, you will see the real difference between the JSX compiler and the Angular compiler. The JSX compiler defines a view using JavaScript. There is no run time code produced by the compiler. It's a view definition, a definition of an h1 element using JavaScript objects. Whereas the Angular compiler produces run time code and, more importantly, it produces direct render instructions. These instructions are very fast, and they allow the framework to process changes in a very short amount of time.
For example, in React, you would need to go through every single element and check if it changed or not, but with direct render instructions, each instruction knows which element it corresponds to. For example, in our case, the instructions produced by the Angular compiler handle the interpolation, so they can check just this particular node. And, as you can imagine, it's a lot faster than checking every single element that you have in the definition. This is of course enabled by the static nature of HTML templates in Angular, because, when you define them, the Angular compiler has a lot of information to work with, so it can produce this kind of code, right? We use directives to signify which parts of the template are dynamic and need to be checked during a change detection run. Okay, so direct render instructions are the output of the Angular compiler, and that is the biggest difference from the JSX compiler in React. Svelte would be the other framework that follows this pattern; it also produces run time code that is optimised for execution. But the Angular compiler is very sophisticated, and what is particularly important is that it produces the code that is run during the change detection cycle, very, very often, so this code must be the most optimised. There is another benefit to having a compiler, and the Angular team has talked a lot about it: a smaller run time. If you have just one text binding, the compiler needs to produce only a single instruction to process that binding. Right? As soon as you start adding more stuff to the template, for example you start adding directives like ngStyle, it needs to produce more elaborate JavaScript. The same goes for other directives like *ngFor. If you don't include them or use them in your template, you won't see those instructions in the run time.
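The contrast drawn above, between a view definition that has to be walked and instructions that go straight to the dynamic node, can be illustrated with a toy example. This is a sketch under stated assumptions only: `VNode`, `renderView`, `diff`, `createView` and `updateView` are hypothetical names, and the code is neither React's reconciler nor the output of the Angular compiler.

```typescript
// Approach A: the view is described as plain objects and compared node by node.
interface VNode { tag: string; text: string; children: VNode[] }

function renderView(name: string): VNode {
  return {
    tag: "div", text: "", children: [
      { tag: "h1", text: `Hello ${name}`, children: [] }, // dynamic
      { tag: "p", text: "Static paragraph", children: [] }, // never changes
    ],
  };
}

// Walks every node of two trees with the same structure; returns how many were visited.
function diff(prev: VNode, next: VNode, changed: string[]): number {
  let visited = 1;
  if (prev.text !== next.text) changed.push(next.tag);
  for (let i = 0; i < next.children.length; i++) {
    visited += diff(prev.children[i], next.children[i], changed);
  }
  return visited;
}

// Approach B: "compiled" code that knows there is exactly one binding in the template.
interface CompiledView { h1Text: string; lastName: string | null }

function createView(): CompiledView {
  return { h1Text: "", lastName: null };
}

// Checks only the bound value; returns how many bindings were checked.
function updateView(view: CompiledView, ctx: { name: string }): number {
  if (view.lastName !== ctx.name) {
    view.lastName = ctx.name;
    view.h1Text = `Hello ${ctx.name}`;
  }
  return 1; // static nodes are never visited
}

const changedTags: string[] = [];
const nodesVisited = diff(renderView("Max"), renderView("Miško"), changedTags);

const compiledView = createView();
updateView(compiledView, { name: "Max" });
const bindingsChecked = updateView(compiledView, { name: "Miško" });

console.log(nodesVisited, changedTags, bindingsChecked, compiledView.h1Text);
```

In this toy, the amount of work in approach A grows with the number of nodes in the view, while in approach B it grows only with the number of bindings, which is the property the talk attributes to direct render instructions.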
The code that is responsible for them is actually function bodies, because each of those instructions is just a function. So Angular doesn't include the logic for these functions in the resulting build, and that results in a smaller bundle. Okay. So, this is a mountain, my mountain. I always give these talks, these deep technical talks, and maybe one or two people in the audience sit and nod. The others just look at it, and I understand, because this is difficult to get. It's very difficult to try to elaborate and explain these technical details, but my hope is that you don't just watch this talk today and think, "This is something that I want to do, or I want to look at." My hope is that you will go home and re-watch my talk and Miško's talk, where he showed exactly how to track these monomorphic property accesses and the effects they can have, and then maybe you will buy a book about data structures, because we as regular JavaScript developers use arrays and objects. We don't need to know what hash maps and trees are, and all these other kinds of data structures that server-side developers use. My hope is that this talk will inspire you to go and buy that book, go and read that article, re-watch that talk and, by doing so, you will all become extraordinary developers. Thank you for your attention!
Info
Channel: AngularConnect
Views: 4,352
Rating: 4.8490567 out of 5
Keywords: angularconnect, javascript, angular, angularconnect 2019
Id: nQ8oJ1rpwIc
Length: 30min 26sec (1826 seconds)
Published: Fri Sep 27 2019