Lin Clark - A Cartoon Intro to WebAssembly

>> All right. We are going to wait for a few people to get settled and continue on with our next talk. How many people use different ways to get their information about how things work on the internet? I know I use comics pretty often; I find them accessible, and I've read Lin Clark's comics. Let's hear it for Lin! [Applause].

LIN: Thank you, and hi, everyone. I'm Lin Clark and I make code cartoons. I work at Mozilla, on things like the Rust programming language and Servo and WebAssembly, which is what I'm going to be talking about today. Since this is JSConf, I'm guessing most of you are JavaScript developers, so you know that, in JavaScript circles today, there's a lot of hype about WebAssembly. People are talking about how blazingly fast it is and how it is going to completely change the way we do web development. But a lot of these conversations don't go into detail about exactly what it is about WebAssembly that makes it fast. When I hear this kind of rhetoric without the details to back it up, the inner sceptic in me comes out. In this talk, I don't want to tell you how fast WebAssembly is going to be; I want to help you understand what it is about WebAssembly that makes it fast, and in what circumstances it is fast.

But first, what is WebAssembly? WebAssembly is a way to run programming languages other than JavaScript in your web pages. In the past, when you wanted to run code on a web page, you had to use JavaScript. If you wanted to change the DOM in response to an event or run a calculation, you were using JavaScript. With WebAssembly, it will be possible to do these things with other languages besides JavaScript. So when people say that WebAssembly is fast, what they're comparing it to is JavaScript - that's the apples-to-apples comparison. Now, I don't want to imply it is an either/or decision, that you're either going to be using WebAssembly or you're going to be using JavaScript. We think that people will be using these two hand in hand in their applications. But it is useful to compare the two, so that you understand what this improved performance of code running on the web could mean.
In order to understand this, let's look at a little bit of the performance history of code running on the web. JavaScript was created in 1995, and it wasn't designed to be fast. There are a number of features in JavaScript that make it hard to make fast - for example, dynamic types: you don't know whether a variable holds a string or an integer, and even at runtime, that variable's type could change. But these features also make it easy for developers to get up and running with JavaScript really quickly, so JavaScript developers accepted this trade-off. They accepted that their code was going to run a little bit slower in exchange for that ease of use.
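To make that concrete, here's a tiny sketch (the variable and function names are mine, just for illustration) of the kind of flexibility the engine has to cope with:

```js
let value = 42;        // starts life as a number
value = "forty-two";   // now it's a string
value = { n: 42 };     // now it's an object

function double(x) {
  // Is this numeric addition? String concatenation? The engine can't
  // know ahead of time; it depends on what x happens to be at runtime.
  return x + x;
}
```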
And for the first decade of JavaScript, that was true: JavaScript was pretty slow. Then the browsers started to get more competitive. Around 2008, a period started that's called the performance wars, where the browser vendors started improving their JavaScript engines to make things faster. The technique they used was introducing JIT compilers to the JavaScript engine, and I will explain more about that later. Let's look at the impact that the JIT compilers had. With the introduction of the JITs, you see an inflection point in the performance of JavaScript. All of a sudden, JavaScript code was running about ten times faster than it had previously, and these performance improvements continued over the next decade. With this improved performance, you start seeing JavaScript being used for things that you never expected, like Node and Electron.
These new applications are possible because of this improvement in performance. It's because of this inflection point ten years ago that we have the applications that we do today. That's why it's interesting that we may be approaching another one of these inflection points in the speed of code running on the web, with WebAssembly. To explain this, I need to talk a little bit about where JavaScript spends its time today. Here's a diagram of where the JS engine spends its time for a hypothetical app. Any particular app will be different, but we can use this to build up a mental model. You may have seen diagrams like this one before and be confused about why there are fewer categories in this one. I've condensed the categories so that it's easier to talk about them. These categories are: parsing; compiling and optimising; re-optimising; executing the code; and garbage collection.
Now, let's look at what this diagram would look like for WebAssembly. You will notice that some of the bars are shorter, and some are missing entirely. In this talk, I want to explain what WebAssembly changes - how it makes the amount of time that the engine spends on these tasks shorter, or gets rid of some of them altogether.

But first, let's look at where JS engines would be if we had not introduced the JIT. In the early days of JavaScript, this diagram would have looked more like this: there was parsing, running the code, and garbage collection, though the execution bar would have been much longer. What made that run faster was the introduction of the JIT, even with the overhead it added for compiling and optimising. Now, with WebAssembly, we want to make these bars even shorter, and in order to see how we can do that, we need to dive into the work that the JIT does.

So I'm going to do a quick crash course on Just-In-Time compilers. This is an overview: different engines have different architectures, and those architectures have changed over time, but most of this applies to most of them right now. This will be review for some of you, but I will be quick - I want to make sure we are all up to speed on this.
When you're developing, you have a goal and a problem. Your goal is that you want to tell the computer what to do. The problem is that you speak a human language and the machine speaks a machine language. Even if you don't think of JavaScript as a human language, it really is, because it was designed for human cognition, not machine cognition. I think of this like the movie Arrival, where you have aliens and humans trying to communicate with one another. It's not as easy as translating word-for-word from one language to the other, because the two groups have different ways of seeing the world - and that's true of humans and machines, too. I will explain more about the differences in the way we think later, but first let's look at the process of translating.
In programming, there are generally two ways of translating: you can either use an interpreter or a compiler. With an interpreter, the translation happens pretty much on the fly, line by line. A compiler, on the other hand, doesn't translate on the fly: it takes time ahead of time to create that translation and then hands it off. There are pros and cons to each of these ways of handling the translation. For an interpreter, one of the pros is that it is quick to get up and running; you get that immediate feedback loop. So an interpreter seems like a natural fit for something like JavaScript, where you want the developer to see their progress really quickly. And that's why, in the beginning, browsers used JavaScript interpreters. But the trade-off is that, when you're doing something like a loop, where you have to run the same code over and over again, you're doing that translation over and over again. The compiler has the opposite trade-offs. It takes a little bit more time to start up, because it has to go through that compilation step ahead of time, but then you don't incur that translation cost in loops where you're running the code over and over again.
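A loop like this hypothetical one is exactly the case I mean:

```js
// The worst case for a pure interpreter: the loop body gets translated
// again on every single pass, even though it never changes.
const arr = [1, 2, 3, 4, 5];
let sum = 0;
for (let i = 0; i < arr.length; i++) {
  sum += arr[i]; // re-translated on each iteration by an interpreter
}
```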
Another difference is that interpreters are doing their work during the execution of the code, so they can't take too much time to think about how the machine thinks and what the optimal way to communicate with the machine is. Since compilers are working ahead of time, they can take that little bit of extra time and think about how best to communicate with the machine. You will hear that referred to as optimisation.

To get the best of both worlds, browsers mixed compilers in. They added a new part to the JavaScript engine, called a monitor, or a profiler. The monitor watches the code as it runs and keeps track of things like how often a function has been executed. At first, the monitor just runs everything through the interpreter. If the same function is run a few times, that function is called "warm". As a function warms up, the monitor sends it off to the baseline compiler to create a compiled version of it.
The baseline compiler compiles the function in chunks: each operation in the function is compiled to one or more stubs. So, for example, the plus-equals sign here would be one operation. The compiler creates a stub for it, and the stub is specific to whatever types are being used on either side of that operator. So, if the sum and the array element here were integers, it would compile to integer addition. If the monitor sees that operation again with the same variable types - so with integers again - the engine just pulls out the stub it already has and uses that. If it runs into the operation with different variable types, it will create another stub and store that one as well. As the code runs, more baseline stubs for more operations get filled in, and this saves on translation time and helps speed things up.
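Here's a sketch of how those stubs pile up (the function is hypothetical, and the stubs themselves live inside the engine, not in your code):

```js
function addAll(arr) {
  let sum = 0;
  for (let i = 0; i < arr.length; i++) {
    sum += arr[i]; // this += gets its own stub per type combination
  }
  return sum;
}

addAll([1, 2, 3]);       // warms up an integer-addition stub for +=
addAll([1.5, 2.5]);      // floats: a second stub gets created and stored
addAll(["a", "b", "c"]); // strings: a third stub, for concatenation
```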
Like I mentioned, there is more that a compiler can do: it can take some time to think about how the machine thinks and how best to communicate with it. The baseline compiler will make some optimisations, but it doesn't want to take up too much time, because the code is executing at the same time. But if the code is really hot - if it has been run a whole bunch - then it can be worthwhile to take the time to make those optimisations. So, when a part of the code is very hot, the monitor will send it to the optimising compiler, and this will create another, even faster version of that function.
In order to make that faster version of the function, the optimising compiler has to make some assumptions. For example, if it can assume that all of the objects created by a particular constructor have the same shape - the objects have the same property names, added in the same order - then it can cut some corners based on that. The optimising compiler uses the information that the monitor has been gathering to make these judgments: if something has been true for all previous passes through the code, it assumes it's going to continue to be true. Of course, with JavaScript there are never any guarantees. You could have 99 objects that all have the same shape, but then the 100th object has an extra property, or a property has been deleted from it. So the compiled code needs to check before it runs whether the assumptions are valid. If they are, the compiled code runs. But if not, the JIT knows it made the wrong assumptions and trashes the optimised code, and execution goes back to the baseline compiled version. This is called de-optimisation, or bailing out.
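That 99-out-of-100 situation looks something like this sketch (a hypothetical constructor, just to illustrate the shape assumption):

```js
function Point(x, y) {
  this.x = x;
  this.y = y;
}

function magnitude(p) {
  return Math.sqrt(p.x * p.x + p.y * p.y);
}

// 99 objects with the same shape: the optimising compiler can assume
// x and y always live at the same offsets and cut corners accordingly.
for (let i = 0; i < 99; i++) {
  magnitude(new Point(i, i));
}

// The 100th object has an extra property, so its shape is different.
// The check guarding the optimised code fails, and the JIT bails out.
const odd = new Point(1, 2);
odd.z = 3;
magnitude(odd);
```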
Usually the optimising compiler will save you time - it will actually make the code run faster. But if you have code that keeps getting optimised and then bailed out on and then optimised again, if you get into these cycles, it can actually take more time than it would have to just run the baseline compiled version of the code. So a lot of JITs will keep track of how many times they've tried to optimise a function, and if it keeps not working out, they will mark that function as "don't even try optimising this again".

So that is the JIT in a nutshell: code starts off running in an interpreter, the monitor collects information about it, and then it sends code off to be compiled depending on how often that part of the code is being run.
Now that we understand more about the work that the JavaScript engine is doing, let's look at ways we could maybe make this execution go a little faster. One way would be to get rid of some of that overhead by moving some of the work ahead of time. But in order to do that, we would need to get rid of the dynamic types: if we are going to be optimising ahead of time, we need the types to be explicit in the code, because we aren't going to be monitoring it at runtime to see what types are flowing through it. These dynamic types that can change at runtime are the problem. But I already suggested that this is part of what made JavaScript successful: the dynamic types help developers get up and running quickly. So why would we want to change something that made JavaScript successful? I want to be clear here that we don't have to change anything in JavaScript to take advantage of the benefits of WebAssembly. But there is a change that's already happening which we can take advantage of, and that is the move towards modularity.
Over the past few years, both with npm and the ES2015 module syntax, JavaScript has become a more modular ecosystem, and the nice thing about modules is that they provide boundaries. You don't really need to know about the inner details of a module that you're depending on. So these modules could be compiled ahead of time, using a language that doesn't have the flexible types that JavaScript does, and it wouldn't affect how you code. Take, for example, React, which has a lot of different consumers. The React core team has already been working on making their reconciliation algorithm faster. An option for them would be to write the new reconciliation algorithm in something like C and compile it ahead of time. As long as they kept the API the same, consumers of React wouldn't notice; when they updated the code, the only thing they would notice is any performance improvements.
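Here's a sketch of what that module boundary makes possible (the module and its one-line implementation are made up for illustration):

```js
// reconciler.js - consumers only ever see this exported API.
export function reconcile(oldTree, newTree) {
  // Today: a plain JavaScript implementation.
  return oldTree === newTree ? [] : [{ replace: newTree }];
}

// Tomorrow, this same export could delegate to a module compiled
// ahead of time from a language like C, and code that does
//   import { reconcile } from "./reconciler.js";
// would never know the difference.
```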
So this is what WebAssembly does: it makes it possible for library authors and application developers to code in languages that are more consistently performant, but then to have that code run on the web like JavaScript does, and to integrate with existing JavaScript. This means that you will be able to benefit from WebAssembly without having to understand it or why it's fast. But I always find it more rewarding when I do understand that stuff, so I'm going to go ahead and walk you through how WebAssembly works. In order to do that, I'm going to take you through another crash course, this time in assembly and compilers. I talked about how communicating with the machine is like communicating with an alien. I want to take a look now at how that alien brain works: how the communication coming into it gets parsed and understood.
There is a part of this alien brain that is dedicated to the thinking - things like adding, subtracting, and logic. There is also a part of the brain near that which is the short-term memory; those two parts are pretty close together. Then there is some longer-term memory. These different parts have names. The part that does the thinking is the arithmetic and logic unit, the ALU. The short-term memory, those are called registers, and they are encapsulated together with the ALU in the central processing unit, or CPU. The longer-term memory is random access memory, or RAM. Each part of the short-term memory has a name, and this makes it easy for the brain to understand what it should be working on at any given time. The sentences this brain understands are machine instructions. When a sentence gets into the brain, it gets split up into parts that mean different things. The way the sentence gets split up is very specific to the wiring of this particular brain. For example, this brain might take the fourth through tenth bits and pipe them to the ALU, and based on where the ones and zeroes are, the ALU will figure out what it is supposed to do for this instruction. Then the brain would take the next two chunks to figure out what it needs to do that operation on, and these will be the addresses of registers.
You will see I've been adding annotations above the machine code here, which makes it easier for us as humans to know what is going on. That is what assembly is - symbolic machine code. It is a way for human beings to be able to read and understand machine code. You can see here there is a one-to-one relationship between the assembly and the machine code for this machine. Something you might have figured out from that is that you actually have a different kind of assembly for each kind of wiring inside a machine. Any time there's a different architecture, a different kind of brain in the machine, there's a good chance it will have its own assembly. So the target of this translation isn't just one thing, one kind of machine code - it is many different kinds of machine code. Just as we speak different languages as humans, machines speak different languages. So, if we are talking human-to-alien translation, you may be going from English, Russian, or Mandarin to alien language A or alien language B. In programming terms, this is like going from C, C++, or Rust to x86 or ARM.
If you wanted to go from each of the high-level programming languages down to each of the assembly languages, you'd have to create a whole bunch of different translators. That would be pretty inefficient. To solve this, most compilers put at least one layer in between. The compiler will take the high-level programming language and translate it down to something that's not quite as high level, but not as low level as machine code. This is called an intermediate representation. The compiler can take any one of the higher-level programming languages down to that single intermediate representation, and then go from the intermediate representation to any one of the assembly languages. The part that goes from the higher-level programming language to the intermediate representation is called the front-end; the part that goes from the intermediate representation down to the assembly is called the back-end. Now, where does WebAssembly fit into this picture?
You might think that it is one of those target assembly languages, which is kind of true, except that each one of those languages corresponds to a particular architecture, and when you're delivering code across the web, you don't actually know what architecture your code is going to be running on. So WebAssembly is a little bit different from normal assembly: it is a machine language for a conceptual machine, not an actual physical machine. Once the browser downloads the WebAssembly, it can make the short hop from the WebAssembly code to the actual assembly code for that particular architecture.
Let's walk through the tools that the developer of a library like React would use to compile their code to WebAssembly. The compiler toolchain that currently has had the most work put into it for WebAssembly is called LLVM. There are a number of different front-ends and back-ends that can be plugged into it. If we wanted to go from C to WebAssembly, we might use the clang front-end to take us down to the intermediate representation, and once the code is in the intermediate representation, LLVM can do some optimisation for us, because it understands the code at that point. Then we want to go from the intermediate representation down to WebAssembly. LLVM has a back-end that will go all the way to WebAssembly, but you might not want to use it until it's fully finished. There is another tool, Emscripten, which has a fully working WebAssembly back-end, using a fork of LLVM under the hood. And even when the LLVM back-end is done, you might still want to use Emscripten to compile your code, because it can be useful for packing in helpful libraries, things like a file system that works on top of IndexedDB.
Regardless of whether you're using LLVM directly or Emscripten, the end result is a .wasm file, and this can be loaded from JavaScript. Right now, the way that you load it in JavaScript is a little bit complicated, but we're making that easier: webpack has plans to work on it, and other module bundlers and loaders plan to as well. Once browsers have built-in module support, WebAssembly should be able to use that too. It should be as easy as loading a JavaScript module.
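Today, loading one by hand looks something like this (a minimal sketch; the file name and the empty import object are placeholders):

```js
// Fetch the compiled .wasm file, then compile and instantiate it.
fetch("library.wasm")
  .then(response => response.arrayBuffer())
  .then(bytes => WebAssembly.instantiate(bytes, { /* imports */ }))
  .then(({ instance }) => {
    // instance.exports holds the functions the module makes available.
    console.log(Object.keys(instance.exports));
  });
```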
Now, I should add a caveat to that: loading a WebAssembly module should be as easy as loading a JavaScript one, but working with it is going to be a little bit different. Let's say you're calling a WebAssembly function from JavaScript. Functions in WebAssembly can only take WebAssembly types as parameters, and at the moment that means numbers - integers and floats are what you're working with. That's different from regular JavaScript modules. And the same restriction applies to return values as well.
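Continuing the sketch above, and assuming instance is the instantiated module with a hypothetical exported add() function, the boundary looks like this:

```js
// Numbers cross the boundary just fine:
const result = instance.exports.add(2, 3); // 5

// But there's no way to do this yet - strings aren't a WebAssembly type:
// instance.exports.greet("world");
```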
But what if you want to be able to return a string? You can't do it directly. For any data types that are more complex, you need to put them in the WebAssembly module's memory. This memory is an ArrayBuffer - just a JavaScript object that simulates a heap. The integers that get passed back and forth can be used kind of like pointers into this heap. So the C code can use one of these integers to write to the memory as if it were an address, and then the JavaScript can use that same number to figure out the array index that it needs to pull the value from.
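From the JavaScript side, that round trip could look like this (a sketch: the getGreeting export and the NUL-terminated string layout are assumptions I'm making for illustration):

```js
const { memory, getGreeting } = instance.exports;

// The module writes a string into its heap and hands back an integer,
// which we treat as a byte offset into that heap.
const ptr = getGreeting();

// View the module's memory as raw bytes and copy the string out,
// reading up to the C-style NUL terminator.
const bytes = new Uint8Array(memory.buffer);
let end = ptr;
while (bytes[end] !== 0) end++;
const text = new TextDecoder("utf-8").decode(bytes.slice(ptr, end));
```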
It's likely that anybody who's developing a WebAssembly module for other developers to use is going to create a wrapper around it, so you won't actually need to deal with that yourself. But I think it helps to understand the performance characteristics if you understand how the memory works.

What I want to do now is go back to the diagram and look at what it is about WebAssembly that can make things run faster. First off, this isn't actually shown in the diagram, but it can take less time to download WebAssembly than JavaScript, because it is more compact. It was designed specifically to be compact, and it is expressed in a binary form. Even though JavaScript is pretty small, if you have equivalent code in WebAssembly, it is likely to be smaller. Parsing takes less time than with JavaScript, too. JavaScript needs to be parsed from the source into an abstract syntax tree and then usually converted into an intermediate representation called bytecode. WebAssembly is already a bytecode: it just needs to be decoded from its binary form, and decoding is faster than parsing. Compiling takes less time, because a lot of the compilation has been done ahead of time, before the file was even put up on the server. Plus, the compiler doesn't have to compile those multiple baseline stubs for the dynamic types the way it did before.
And as for re-optimising: you don't get into the optimisation and de-optimisation cycles that you had with the JIT. Running your code is faster, because many of the optimisations that the JIT makes to JavaScript just aren't necessary with WebAssembly. Plus, WebAssembly itself provides many instructions that are just faster. Human programmers don't need to write WebAssembly directly, so that means its designers could create something closer to how machines think, and depending on what kind of work your code is doing, these instructions can run anywhere up to 800 per cent faster. As for garbage collection: for now, WebAssembly uses manual memory management. This is likely to change - I will explain more about that later - but for now, you don't need to worry about garbage collection.
So what is the status of WebAssembly right now? In late February, the browser vendors announced that WebAssembly was ready to ship on by default in browsers. We shipped it on by default in Firefox the next week, and then Chrome did the week after that, and it's in preview versions of Edge and Safari. With this, developers can start shipping WebAssembly code. For earlier versions of browsers that don't support WebAssembly, you can ship down an asm.js version. asm.js is the precursor to WebAssembly, and it is fully JavaScript.
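A minimal sketch of that feature detection (the file names are placeholders for the two builds):

```js
// Pick the right build based on whether this browser has WebAssembly.
const script = document.createElement("script");
script.src = typeof WebAssembly === "object"
  ? "app-wasm.js"  // loader that fetches and instantiates the .wasm file
  : "app-asm.js";  // the asm.js fallback, which is plain JavaScript
document.head.appendChild(script);
```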
What is in browsers right now is the MVP - the minimum viable product. The MVP doesn't contain all the features that the community group wants, but with it, WebAssembly is already reasonably fast and usable. However, it should get even faster in the future, through a combination of fixes in the engines and new features in the spec. For example, a fix that needs to happen in Firefox specifically is that currently, calling a WebAssembly function from JS code is slower than it needs to be, because of something called trampolining. Instead of the JIT knowing how to deal with WebAssembly code directly, it has to go through an in-between function that transfers control from JavaScript to WebAssembly. This is a lot slower than it would be if the JIT knew how to handle the call itself. Now, slower is relative - we're only talking nanoseconds here - but if you have lots of back-and-forth communication between WebAssembly and JavaScript, you can notice it.
So that's the kind of fix that you can expect in the engines. As for the spec, there are a number of features coming soon. One that is expected reasonably soon is threading. One way to speed up code is to make it possible for different parts of the code to run at the same time, in parallel, but this can sometimes backfire, since the overhead of communication between threads can take up more time than it would have taken to just run everything sequentially. But if you can share memory between the threads, it reduces this overhead. To do this, WebAssembly will use the new SharedArrayBuffer that's being shipped in browsers shortly. Once that is in place in browsers, the community group can start specifying how WebAssembly will use it.
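To give a sense of what shared memory looks like on the JavaScript side, here's a small sketch (the worker file name is a placeholder):

```js
// Both the main thread and the worker see the same underlying memory,
// so nothing gets copied when it is posted across.
const shared = new SharedArrayBuffer(1024);
const view = new Int32Array(shared);

const worker = new Worker("worker.js"); // "worker.js" is a placeholder
worker.postMessage(shared);             // shared, not cloned

// Atomics gives us race-free reads and writes on the shared memory.
Atomics.store(view, 0, 42);
```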
One feature that needs to be standardised is direct DOM access. Currently, there's no way for WebAssembly to interact with the DOM - you can't do something like element.innerHTML. Instead, you have to go through JS to set that value.
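In practice, that glue can look something like this (a sketch: the setText import and the readString helper are made up, and bytes is the fetched .wasm binary from earlier):

```js
// The module can't touch the DOM itself, so we hand it a JavaScript
// function to call whenever it wants to update the page.
const imports = {
  env: {
    setText(ptr) {
      // readString is a hypothetical helper that copies a string out
      // of the module's memory, as sketched earlier.
      document.querySelector("#output").textContent = readString(ptr);
    }
  }
};
WebAssembly.instantiate(bytes, imports); // bytes: the .wasm binary
```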
The community group is currently working on adding DOM support, though. One last feature that has a lot of folks excited is integration with the browser's garbage collection. Today, you can ship down your own garbage collector with your code if you want to, but it is slow for a few reasons. The community group is working on making it possible for WebAssembly code to just use the built-in GC, which is a highly optimised one that the browsers have already been working on, so it will run fast and you will have that integration.
Unfortunately, that's all I have time to talk about today, so I'm going to have to wrap it up. I had a fantastic technical review on this from Luke Wagner. He is the person who came up with the way to add types in asm.js, and he did a lot of the work to push WebAssembly forward. He is with us today and will be doing a Q and A about WebAssembly in the Mozilla space around lunch. Feel free to ask us questions there, or you can ask on Twitter, or we will both be at the party tonight. Thank you to him, and thank you all for listening. [Cheering]. [Applause].

>> Thank you for the fantastic talk. We're going to continue right on to the next talk. If anybody wants to move to the side track, there is going to be a great talk about sharing is caring: patterns for JavaScript library design. Give us a minute to set up and for other people to move in, and we will get started with the next talk.