SURMA: What better way to
start a recording after, I don't know, two months since
the last time we did this and change up everything? What could possibly go wrong? [MUSIC PLAYING] Should I talk a bit about
WebAssembly Threads, Jake? What do you think? JAKE: Well, have you
written slides about that? Because if that's what you've
written all the slides about, then I think that is what
you should talk about. Otherwise, it'll
get very confusing. SURMA: I wrote the title,
and I'm like, well, now I wrote the title, I got
to write the rest around this. And so I did. Well, so the thing I really
like about WebAssembly-- and this is very much,
I think, a Surma thing that doesn't necessarily
apply to everyone-- is that WebAssembly has very
little surface in itself. WebAssembly can't do a
lot of things by itself. It can do only very tiny things. And I'm going to talk
about this, what it can do. And yet we end up
with capabilities like Threads and
SIMD, all these things that JavaScript can't provide. And so I wanted to talk about
that, talk about WebAssembly, what it can and
cannot do by itself, and how interactions with
JavaScript can open up all these possibilities. And for that, I
thought I'd start a little bit with a
small introduction to WebAssembly at the low level. Before I start, it's
probably important to say, this is not what you need
to know to use WebAssembly. This is a bit like
looking under the hood. JAKE: So before we continue,
you said JavaScript doesn't have threads. And JavaScript doesn't
have threads, correct. But the web has workers. Node has workers. SURMA: I'll talk about that. JAKE: Oh, OK. Fair enough. SURMA: Yeah, that's exact-- because that's exactly
the interesting bit, because they're,
to an extent, yes, but from another perspective,
you would say no. And then you combine these two,
and suddenly magic happens. So that's what I
want to talk about. So I'm going to give a bit
of an incomplete overview, enough so that you hopefully
understand what WebAssembly can and cannot do, but probably not
enough to cover every variant in which it can do these things,
but just a peek under the hood of WebAssembly, something you
don't necessarily need to know to use WebAssembly, but that
could be interesting or useful to know every now and then. So this is WebAssembly, kind of. This is the human readable
assembly language, the human readable as-- JAKE: I'm going to have to
take issue with human readable. SURMA: [LAUGHS] JAKE: Because this
human can't read it. SURMA: It's-- yes. I hear you. It's Assembly. I mean, people who have seen
Assembly or any form of machine code, the human readable
versions are not readable in, oh, I understand
what's happening. But at least you can
decipher individual words compared to the binary
representation of this file. So this text
representation is called WAT, or "what," WebAssembly
Text. Period, I guess. This language is literally
a text representation of what is in the
file in binary. So you can define a module,
and one module will basically end up as one WebAssembly file. A Wasm module can contain
multiple functions, and functions can take
numbers as parameters. And numbers in WebAssembly
can be 32 and 64-bit integers and 32 and 64-bit floats. And you can do math
with these numbers, and then you can return
a number as a result. And you can export-- JAKE: What more could
you possibly want? SURMA: Exactly. It's all you need, right? And you can export
some of these functions to be callable
from "the outside." And we're going to talk
about outside in a bit. So JavaScript is one of the
host systems of WebAssembly. There is now actually multiple
host systems out there, some for PHP, some just
standalone like Wasmtime, which runs as a standalone
app on your desktop machine. But we're going to
talk about JavaScript, because we are a web show. And so in JavaScript Land, you
would fetch the .wasm module and instantiate it, which means
that it will compile the module and instantiate it, and you can
call these exported functions, which is now-- this is what the outside is. And here I declare
a function that takes two parameters
and a return type. 32-bit integer is the return
type of this function. And then I use these two
parameters to add them, and that is also implicitly the
return value of this function. And at that point, it
returns to JavaScript. The JavaScript
environment knows how to convert between
JavaScript types and WebAssembly types. And pretty much
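That round trip can be sketched end to end. Below is a minimal hand-assembled .wasm binary (the equivalent WAT is in the comment) that exports an add function, instantiated and called from JavaScript. In practice you would fetch and compile a real .wasm file; the inline bytes are just to keep the sketch self-contained:

```javascript
// A minimal hand-assembled .wasm binary, equivalent to:
//   (module
//     (func (export "add") (param i32 i32) (result i32)
//       local.get 0
//       local.get 1
//       i32.add))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: func 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

const module = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(module);
// JavaScript numbers are converted to i32 on the way in and back on the way out.
console.log(instance.exports.add(2, 40)); // 42
```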
all the Wasm types just turn into 64-bit floats,
because that's the only number type JavaScript has. Recently, there's been an
addition to WebAssembly where 64-bit integers are
now going to map to BigInts, because a 64-bit float can't
represent all the numbers a 64-bit integer can assume. So that will address
that problem. JAKE: But does that mean
then JavaScript can-- JavaScript has a number
type that WebAssembly doesn't support as well? Because if you get an
arbitrarily big number, it will get to a point
where WebAssembly can't-- once it's beyond the 64 bits. SURMA: Yeah. That can happen currently. So yeah, if the BigInt
grows too big, it cannot be represented in 64 bits. I actually don't quite remember
how it is handled, if it throws or if it gets clamped
or something like that. I would have to look into that. [ELEVATOR MUSIC] You see here that JavaScript
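For what it's worth, the JS BigInt integration pins this down: an i64 parameter goes through ToBigInt64, so an oversized BigInt wraps modulo 2^64 rather than throwing or getting clamped. A quick check with a hand-assembled i64 identity function (the export name id64 is just for this sketch):

```javascript
// Equivalent to: (module (func (export "id64") (param i64) (result i64) local.get 0))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7e, 0x01, 0x7e,             // type: (i64) -> i64
  0x03, 0x02, 0x01, 0x00,                                     // func 0 uses type 0
  0x07, 0x08, 0x01, 0x04, 0x69, 0x64, 0x36, 0x34, 0x00, 0x00, // export "id64"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x20, 0x00, 0x0b,             // body: local.get 0, end
]);
const { exports } = new WebAssembly.Instance(new WebAssembly.Module(bytes));

console.log(exports.id64(7n));             // 7n -- i64 maps to BigInt, not Number
console.log(exports.id64(2n ** 64n + 5n)); // 5n -- oversized values wrap modulo 2^64
```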
can call WebAssembly, but WebAssembly has access
to absolutely nothing. You can pass in
the numbers and use these numbers to do
arithmetic within WebAssembly, but you don't have
access to any of the APIs you might be used to. The WebAssembly is
completely isolated. And that alone is actually
surprisingly powerful, but we need a bit more to
make it actually useful. So the next step is that
you can declare imports. And here I am saying
that I'm expecting an import in the
surmas_imports namespace and that the import
is called alert. And I expect it to
be a function that takes one 32-bit
integer as a parameter. Later, I call that function with
a result of our computation, returning the result
of that computation. The instantiation for
the WebAssembly module remains largely
unchanged, except now we have to provide these imports. And that has to happen
at instantiation. And instantiation will fail if
I don't provide all the imports the module requires. So this here is the
so-called imports object, and the alert is obviously
good old alert function that I hope we all remember
from our start of JavaScript debugging. I certainly use that
a lot for debugging. So this shows now
that you can not only export WebAssembly
functions to JavaScript, but also you can expose
individual JavaScript functions to WebAssembly. But still, only
number types will be able to be passed back and
forth, because WebAssembly, for example, has no built-in
understanding of strings. Now so far, these
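A sketch of that import mechanism, again with a hand-assembled module. It declares an import alert in the surmas_imports namespace, as in the episode; the exported function name run and the stand-in for the real alert are made up for this example:

```javascript
// Equivalent to:
//   (module
//     (import "surmas_imports" "alert" (func $alert (param i32)))
//     (func (export "run") (param i32)
//       local.get 0
//       call $alert))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x05, 0x01, 0x60, 0x01, 0x7f, 0x00,             // type: (i32) -> ()
  0x02, 0x18, 0x01,                                     // import section, 1 import
  0x0e, 0x73, 0x75, 0x72, 0x6d, 0x61, 0x73, 0x5f,      // "surmas_"
  0x69, 0x6d, 0x70, 0x6f, 0x72, 0x74, 0x73,            // "imports"
  0x05, 0x61, 0x6c, 0x65, 0x72, 0x74, 0x00, 0x00,      // "alert", kind func, type 0
  0x03, 0x02, 0x01, 0x00,                               // defined func (index 1) uses type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x20, 0x00, 0x10, 0x00, 0x0b, // local.get 0, call 0, end
]);

const received = [];
const importsObject = {
  surmas_imports: { alert: (value) => received.push(value) }, // stand-in for window.alert
};
// Instantiation fails if any declared import is missing.
const { exports } = new WebAssembly.Instance(new WebAssembly.Module(bytes), importsObject);
exports.run(123); // Wasm calls back into the imported JavaScript function
console.log(received); // [123]
```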
WebAssembly modules have worked with just
parameters and that arithmetic on these parameters. For any more complex
kind of work, you actually need
a bit of memory. And that's why
WebAssembly also has a way to handle chunks of memory
that you might already know as array buffers. So here we declare that this
WebAssembly module expects a memory in our import
object and that it needs to be at least one page big. WebAssembly measures
memory in pages, each page being 64 kilobytes. That has security and operating
system integration reasons. It doesn't really matter, but
basically, the smallest unit of memory is 64 kilobytes,
and every memory has to be a multiple of that size. And now we can
use load and store to manipulate the
values in that memory. So instead of adding the
two values from the function parameters, we are
now adding two values that we find in memory. WebAssembly memories
are, as I mentioned, a lot like array buffers,
but not quite the same. They have their own
type, because they grow, they can grow. They have a different
unit of measurement, and they need a [? little ?]
of special setup for security under the hood. But in a way, they
behave exactly the same. So here I create a new
WebAssembly memory. So you can create a typed
array view on that memory, just like with
normal array buffers. And then we can use this DataView
to put these values into memory and then use our Wasm
module to add them up. This is obviously
not a hugely useful example, just put two values in
memory and add them up. But it just shows how the
interaction with the memory works. JAKE: So the memory
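A runnable sketch of that memory interaction, with a hand-assembled module that imports a memory and adds the two i32 values at offsets 0 and 4. The import names env and mem are invented for this example:

```javascript
// Equivalent to:
//   (module
//     (import "env" "mem" (memory 1))
//     (func (export "add") (result i32)
//       (i32.add (i32.load (i32.const 0)) (i32.load (i32.const 4)))))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,             // type: () -> i32
  0x02, 0x0c, 0x01, 0x03, 0x65, 0x6e, 0x76,             // import "env"
  0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00, 0x01,             // "mem", kind memory, min 1 page
  0x03, 0x02, 0x01, 0x00,                               // func 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x0f, 0x01, 0x0d, 0x00,                         // code section, one body, no locals
  0x41, 0x00, 0x28, 0x02, 0x00,                         // i32.const 0, i32.load
  0x41, 0x04, 0x28, 0x02, 0x00,                         // i32.const 4, i32.load
  0x6a, 0x0b,                                           // i32.add, end
]);

// One page = 64 KiB; the memory's buffer is an ArrayBuffer we can view.
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new DataView(memory.buffer);
view.setInt32(0, 40, true); // Wasm memory is little-endian
view.setInt32(4, 2, true);

const { exports } = new WebAssembly.Instance(
  new WebAssembly.Module(bytes),
  { env: { mem: memory } }
);
console.log(exports.add()); // 42
```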
in Wasm, it's just like-- or sending a
function into Wasm, like you did with alert, and
sending memory into Wasm, is just-- is exactly the same. SURMA: It's pretty
much the same. You have to declare what an
import is supposed to be, because WebAssembly
is strongly typed. So at compile time, it is known
that this import needs to be a function and this
import needs to be a memory. But it's the same way. You as the host system have
to make the conscious choice to give something
to WebAssembly. WebAssembly cannot just
grab anything by itself, which is one of the security-- JAKE: Because that talks to
what you said before, of how WebAssembly is so lightweight. I always assume there
was a deep integration of how the memory in
WebAssembly works. But it is just chucking a
JavaScript object in there, and then you're performing
operations on it in WebAssembly Land. SURMA: So for
complete transparency, a WebAssembly
module can actually declare its own memory
and export it instead, but it still functions the same. It's just like, here's-- because the WebAssembly module
declares its own memory, you'll also know it doesn't
get access to anything it shouldn't get
access to, because it's created at instantiation time,
so it doesn't get random access to unknown data. So it's all about the
security and the primitives that are being exposed here. That covers all the
things that WebAssembly can do that we need to know
about to talk about it. This is pretty much
what is called the Wasm MVP, the Minimum
Viable Product, which was the synchronized launch
between all the browsers. There are proposals,
obviously, in WebAssembly Land to augment what WebAssembly
can do, but almost all of them are just almost syntactic
sugar on top of these things. Very few of the
proposals actually expose new capabilities,
and if they are, they're often limited
to arithmetic, which I think is very interesting. So let's talk about
threads, because JavaScript is a bit weird on this topic,
because it is, by design, single-threaded. JavaScript, however,
supports parallelism, at least on the web and
in Node recently, with workers, which
runs a JavaScript file in a truly parallel
fashion to your main thread. However, you can only send
messages back and forth with the worker and the main
thread with postMessage. And there is no way to share
a variable between those two threads, like you might be
used to from other languages that support threads or any
form of threading primitive. And since the-- JAKE: Oh, well, except
shared array buffer, right? That's-- SURMA: Well, Jake, that's
what I'm getting to. JAKE: Oh, OK, I'm sor-- [LAUGHS] SURMA: So if we just added
shared memory to JavaScript, things would break, because
many of the primitives are not designed around the fact
that they can get interrupted or that there could
be race conditions. So instead, the
shared memory concept has to be isolated to
a specific type, which, as you already spoiled, Jake,
is the shared array buffer. And that's-- JAKE: I'm sorry. I'll just shut up and
sit here and let you talk, because it's clearly-- everything that I'm thinking of,
you've already got covered, so. SURMA: I'm just glad you asked
this question exactly here, because that was my next slide. So I did something
right, at least. So yeah, shared array
buffer is pretty much just like an array buffer in the end. But you can get shared access
with the same array buffer from across the threads. And both of these threads will
see the memory manipulation under the hood in real time. So here, what I'm doing is
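The sharing itself can be seen without any Wasm at all. For brevity both views live in one thread here; in the setup Surma describes, one of them would live in a worker, created from the same SharedArrayBuffer passed over postMessage:

```javascript
const sab = new SharedArrayBuffer(4);

// In real code these two views would be created in different threads,
// each from the same SharedArrayBuffer.
const mainView = new Int32Array(sab);
const workerView = new Int32Array(sab);

Atomics.add(workerView, 0, 1);          // the "worker" increments the first cell...
console.log(Atomics.load(mainView, 0)); // 1 -- ...and the "main thread" sees it immediately
```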
I'm running the main thread and basically have
a "while true" loop to wait until the first
cell of the memory is greater than or equal to 100. So this basically means
the main thread is blocked. And then the worker, I get
access to the same shared array buffer, and I increment
the first cell value. So even though the main thread
is blocked, at some point it's going to get unblocked,
because the worker is running in parallel for real and can
work on the same exact memory chunk as the main thread. So this is called a
spinlock, where you just keep spinning in an endless
loop and keep checking your condition to continue. They're a thing, but
they're obviously quite bad, because you're
just locking your CPU at 100% for this thread,
because that's all the processor's doing. And-- JAKE: Are you going to
mention atomics in-- SURMA: I am now going
to mention atomics. Aren't you on top of things. JAKE: [LAUGHS] SURMA: Because I actually
think that atomics are not that well known,
because very few people probably work with
shared array buffers, and they only work with
shared array buffers. And basically, they have
just a couple operations with atomics to make operating
on these shared array buffers more reliable
and predictable. So for example, in
a worker here, we can block on a memory
cell and wait for other threads to notify us that this memory
cell is now ready to use. It's a form of a mutex. So in the worker, we
can use Atomics.wait to wait on a certain cell. The first value
here is the index, and the second parameter
is the ex-- the first parameter is the view. I should read my own code. The first parameter is
the actual memory view. The second parameter
is the index, and the third parameter
is the expected value that needs to be in the cell
before we start blocking. That is a typical mutex
programming pattern, that you check if the
value had been changed before you started waiting. And then in the main thread,
we can basically just wait on the user to click
a button, and once they do, we use Atomics.notify,
and that will wake up all the threads that are
waiting on this memory cell. And so this way, the CPU
is not locked at 100%. This is not a spinlock. It's actually in cooperation
with the operating system and will put the thread to
sleep and save system resources. Now that was basically
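The Atomics.wait contract can be poked at even without a second thread. Note this runs on Node's main thread, which is allowed to block; browser main threads are not allowed to call Atomics.wait at all:

```javascript
const view = new Int32Array(new SharedArrayBuffer(4)); // one i32 cell, initially 0

// If the cell doesn't hold the expected value, wait() returns immediately.
console.log(Atomics.wait(view, 0, 123)); // "not-equal"

// If it does match, wait() blocks until notified or until the timeout expires.
console.log(Atomics.wait(view, 0, 0, 50)); // "timed-out" (after ~50 ms)

// notify() returns how many waiters it woke -- nobody is waiting here.
console.log(Atomics.notify(view, 0)); // 0
```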
all the prelude, all the building
blocks that we need to know to understand the
WebAssembly threads proposal that is now in Chrome stable. So the WebAssembly threads
proposal is actually much less than I thought it was, because
when you think about threads in C or Rust, you think
about calling a function and having it run in
a separate thread. WebAssembly Threads is not that. What it really is,
it's just it allows you to declare a memory as
shared, which basically makes this WebAssembly
memory behave exactly like a normal memory,
but also like a shared array buffer in that it can have
multiple views in real time onto the same memory
from different threads. And additionally, it
exposes these atomics as WebAssembly instructions. Now this is interesting,
because it doesn't actually allow you to spawn a thread. It just gives you these atomics. And this is actually
solved by the language compiler or the runtime that you
use to write your WebAssembly. So I did a diagram. And I know people find these
kind of UML diagrams scary. But that's actually what
I find interesting, that it is more complicated
outside of WebAssembly than it is inside
of WebAssembly. So basically, you
go through JavaScript, and whenever you compile
something from C to WebAssembly with Emscripten, Emscripten will
not only generate a .wasm file, but also a JavaScript
glue code, as it's called. And that piece of
JavaScript takes care of loading the
WebAssembly module for you, populating the memory
with all the values it needs to be in there, and
it provides the integration with the host system
that the C language expects to be in place. So for example, when in C
you call pthread_create, which is the function
to create a thread, that is actually a
JavaScript function that is imported
into the WebAssembly. So when you call pthread_create,
the call goes into JavaScript. It will spawn a worker. It will send the module
and the shared array buffer over to the worker. The worker will also
load the glue code and instantiate the
same module on top of the exact same memory. And now we have main
thread and worker running on the same memory with
the same WebAssembly module, and they can now use the
atomics to synchronize. So from here on in, it actually
behaves like a real C program. But all the magic
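A non-runnable pseudocode sketch of what that glue does -- the worker file name, message shape, and import names here are made up for illustration and do not match real Emscripten output:

```js
// main.js (glue) -- pseudocode sketch, not real Emscripten output
const memory = new WebAssembly.Memory({ initial: 256, maximum: 256, shared: true });
const module = await WebAssembly.compileStreaming(fetch("program.wasm"));

// pthread_create is imported into the Wasm module and lands in JavaScript here:
function pthreadCreate(startRoutine, arg) {
  const worker = new Worker("pthread-worker.js");
  // Modules and shared memories are postMessage-able; the memory is NOT copied.
  worker.postMessage({ module, memory, startRoutine, arg });
}

// pthread-worker.js -- instantiates the SAME module on the SAME memory:
// onmessage = async ({ data }) => {
//   const instance = await WebAssembly.instantiate(data.module, {
//     env: { memory: data.memory },
//   });
//   instance.exports.runThread(data.startRoutine, data.arg);
// };
```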
really happens in-- JAKE: So what's
the new bit then? Because we could have, like-- today, before this Wasm
feature became a thing, we could still give
WebAssembly a shared array buffer as memory. So there'd be nothing
to stop us instantiating two bits of Wasm
that are actually using the same bit of
memory in different workers, and they could be instructed
to work on separate things at separate times. So is that the case? And if so, what's the new bit? SURMA: Before, you couldn't
instantiate WebAssembly on a shared array buffer. That just wasn't possible. And you didn't have the
atomics instructions inside WebAssembly. So that's what I mean. Those are really just
the new additions, and they already
exist in JavaScript. They are not new capabilities
on the web really. The difference is
that in JavaScript, we don't have any way of expressing
complex objects on top of a shared array buffer. But any normal
compiled language does exactly that, where you
can build complex classes and structs, and
they all somehow get represented
in linear memory. And so that is this
combination of actually kind of old JavaScript
features, shared array buffers and atomics, and
WebAssembly's capability to bring the high level
constructs to a low level virtual machine that,
in combination, we now have real threads on the
web, which kind of was possible with workers
and shared array buffers, but not in a comfortable way. Yeah, we've been using
it with Squoosh now, and it worked surprisingly
well for some of it. Yeah, and so I looked
into it, and I was like, I was surprised at how
small, to an extent, the proposal really is. And yet the combination
of these things really turns into
something very powerful. JAKE: So I'll put you on
the spot a little bit, because I haven't-- SURMA: Yes, please. JAKE: --looked in
detail at what we've done with Squoosh, because
it wasn't me doing that work. So it's going to be
spinning up these workers. When does it get rid
of those workers? Does it generate them
per thread and then destroy them per thread? Or does it have a thread pool? SURMA: I think it has to. JAKE: Mm. SURMA: So I-- actually,
no, that's not true. I think different compilers
will handle it differently. I know that Emscripten takes
a worker pool parameter. JAKE: Hmm. SURMA: So I wonder if that is,
to an extent, how it works. Because I mean, a computer has-- you can spawn as many
threads as you like, technically, but
you have limited amount of cores
that can actually run any of these threads
at any given time. My hunch is that,
naively, it will probably spin up as many workers
as you create threads, and it will kill the worker
when the thread is done. JAKE: Mm. SURMA: But there's probably
smarter things out there when you created
worker pool, that they get recycled and reused. I mean, workers-- not workers. I think
WebAssembly threads are still so new, to
an extent, that there's so many things to measure and
to optimize and to see how, with actual usage patterns,
how that affects performance. I know that Google Earth has
been using them for a while, so I'm guessing they
had-- and I know they've been
talking about it, so I wonder how much feedback
has been flowing back between the Emscripten engineering
team and the Google Earth engineering team. But so far, Threads has been,
quote unquote, "good enough" to run all these use cases
with good performance results in the wild. And that makes me kind
of hopeful at least. JAKE: So is there any proposal
to put in genuine threads? Like-- I don't know,
I say genuine threads, and that's kind of like what-- SURMA: You want to spawn
something without having to write the glue code, right? That's what I thought, that
that would be the capability to somehow spawn a thread. No, I don't think so. I think what this
would be, that would be WASI, where you have a
standardized systems interface. So instead of,
for each language, you reinvent how to mock
out a thread creation call, there would be a standardized
interface to the host system, which is what WASI
is, where you can say, open the file, create a network
connection, but probably also, spawn a thread. And then the WASI
implementation, may that be in a desktop
environment runtime or maybe a JavaScript layer on
the web or wherever, would just have this
generic implementation. But that is still, I think,
a bit out, being worked on-- there are experiments, and
that's really, really good, but nothing that I would say you
could settle on for production right now. But yeah, that was basically the
WebAssembly handwritten version speedrun with Surma and threads. JAKE: Did you mention-- you
mentioned something about SIMD along the way as well. What's happening there? SURMA: SIMD is an even smaller
proposal, because it just-- well, it adds one new type,
which is 128 bit something. And you can interpret
these 128 bits either as four 32-bit
integers, as two 64-bit integers, or as eight 16-bit integers,
and just see them as vectors, and add them and multiply them. And they settled on
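In WAT, that looks roughly like this (SIMD proposal syntax; a sketch, with made-up names):

```wat
;; add two vectors of four 32-bit integer lanes each, in one instruction
(func $add4 (param $a v128) (param $b v128) (result v128)
  local.get $a
  local.get $b
  i32x4.add)
```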
that, because that seems to be what will
compile on most actual CPUs. Because WebAssembly by itself
is just an intermediate format, right? When you download
WebAssembly, it will get compiled to
real machine code that runs on your actual processor. So it needs to find a
SIMD equivalent that will compile on as many
processors as possible. And then all it does is
just add this new type and add a couple of
new instructions. And then the compiler can
decide whether the CPU supports it or not,
and either actually run those instructions
in one instruction cycle, or just do it in
series and pretend that it did it in one cycle. And yeah, that's the
other thing that we need to look into but we
haven't done yet, have we? That's still one
of the things we want to look into for Squoosh. JAKE: Actually, so the status of
Threads, that's shipped, right? SURMA: Mm-hmm. Well, we've-- so
Ingvar, our colleague, has done some experiments and
has got it to work with both Rust and C++, and we have seen
some pretty good performance improvement on those. SIMD is harder, because often
SIMD needs to be handwritten. The compiler often can only
figure out very clear cases to auto-vectorize, as it's
called, to automatically turn a loop into a SIMD instruction. That only works
in very few cases. The problem is that many codecs
that have SIMD instructions use handwritten assembly, but
not handwritten WebAssembly, but for other CPUs. So those don't compile
to WebAssembly. So we're in a niche
where we don't know how we can make use of
the SIMD from these codecs to make use of WebAssembly SIMD. But I think we'll
play around a bit, and maybe he'll find something. JAKE: But in terms of the
thread stuff, the new thread stuff, what browsers
has that landed in? SURMA: I know it's
in desktop Chrome. I know it's in Firefox
desktop, if you have COOP and COEP enabled. And I think we even have it
in Android's Chrome soon, if you have COOP and
COEP enabled, which we have on Squoosh. So we also have
feature detection. So if you have support
on your browser, Squoosh will just use threads. If not, you will get the old
codec version without threads. JAKE: So I think
the reason for this is because on desktop Chrome,
you get process isolation out of the box, whereas you
don't get that in Firefox, and you don't get
that on Android, because there's more
of a memory concern. And that's when you need to-- SURMA: Exactly. JAKE: --yeah,
close all the doors to outside content, which
is the COOP and COEP stuff. We've got another
episode on that. We can link to that. SURMA: Oh, do we? JAKE: Yeah. I think we did. SURMA: We lost shared
array buffer for-- oh, we did that one, you're
right, with all the acronyms. Yeah, I mean we lost shared
array buffer for a while due to Spectre and Meltdown. And now we have found mechanisms
how we can bring them back without them being a risk,
which is COOP and COEP. And we should link that
episode, because it was good. JAKE: Thank you. [LAUGHS] SURMA: [LAUGHS] JAKE: Well done, us. SURMA: [LAUGHS] All
right, that's it. That was my
WebAssembly speedrun. JAKE: Cool. Buh-bye. SURMA: Bye. [MUSIC PLAYING] JAKE: All right, [INAUDIBLE] SURMA: Oh no.