SURMA: (BEATBOXING)
Let's do some 203. Are we good? JAKE: We are good. [THEME MUSIC PLAYING] SURMA: I don't know
if you've noticed, but we've built a thing. JAKE: Are we going to
talk about Squoosh again? SURMA: A little bit. JAKE: OK. SURMA: Maybe, but another
aspect of Squoosh. JAKE: All right. SURMA: So that's
kind of interesting. JAKE: OK. SURMA: So this might be a
long one, so bear with me. I'm going to start
at where we started, and then we kind of fell
down into this rabbit hole. And I want the audience to fall
into this rabbit hole with us. JAKE: Yes, and I'm really
looking forward to this one. Because sometimes when
we do these, one of us is maybe slightly pretending
to know less about the subject than we do. Whereas in this
one, there's a lot that I really don't understand. So-- SURMA: And I'm really worried
that I might not actually be able to explain everything
as much as you would like me to. JAKE: OK. Well, I'll-- SURMA: So let's see
where we end up. JAKE: Yes, I'll let
you know honestly. SURMA: Let's start with,
what are images on the web if we manipulate
them with JavaScript? So let's talk about image
data, which is a data structure that we use in Squoosh. So once we get an image
in and we decode it, and we turn it
into an image data object, which is a
data structure that exists on a platform. Basically, it has three
properties-- the width, the height, and data. And data is a Uint8ClampedArray. And in there, you have just
four bytes for each pixel. JAKE: Yes. SURMA: And it's the first row,
and the second row, and so on. JAKE: And then each one is like
red, green, blue, alpha, right? SURMA: Exactly. So what you see
here is like it's a red pixel, then a green
pixel, and a blue pixel, and a white pixel. And because the image is 2 by
2, that is what the image would look like, right? JAKE: Huh, nice. SURMA: So it's
basically just a series of numbers with no concept
of rows or columns. But because of that
information, we can rearrange them
and interpret them as a proper
two-dimensional image. JAKE: Brilliant. SURMA: That's kind
of how it works. All right.
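Here is a minimal sketch of that ImageData layout, using the 2-by-2 red, green, blue, white example (the exact pixel values are just illustrative):

    // A 2-by-2 image: four bytes (r, g, b, a) per pixel, rows laid out one after another.
    const data = new Uint8ClampedArray([
      255, 0, 0, 255,     // pixel (0, 0): red
      0, 255, 0, 255,     // pixel (1, 0): green
      0, 0, 255, 255,     // pixel (0, 1): blue
      255, 255, 255, 255, // pixel (1, 1): white
    ]);
    const image = new ImageData(data, 2, 2);
    // image.width === 2, image.height === 2, image.data === data

SURMA: So now, in Squoosh,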
we had the goal to rotate an image
by 90 degrees. JAKE: Sounds like
a simple thing. Probably only take 10 minutes. SURMA: I mean, you wrote it-- the first version, right? And so let's talk
about how you wrote it. Your rotate-an-image-by-90-degrees function gets an input image, which
is this image data object. JAKE: Yes, it is. SURMA: And what we
do, we figure out by 90 degrees, what is the
width and the new height, which is pretty much just
height and width swapped. JAKE: You're doing fancy
Surma code already. SURMA: It's a little bit because
otherwise it wouldn't fit. So I'm compressing things down. JAKE: Right, OK. SURMA: This is
actually kind of two-- JAKE: So here,
you're essentially assigning the
height to the width, and the width to the height,
because it's 90 degrees. Right, OK, I'm following. SURMA: And I'm
creating a new output image, which has this new
width and the new height. JAKE: Yes. SURMA: So now the goal is
to go through the pixels and put them in the right
spot in the output image. JAKE: Ba ba ba ba ba. SURMA: So what we do,
we for loop over all the pixels in the input
image, and we figure out where they would have to
land in the output image. So basically the
new x-coordinate is that kind of formula,
the new y-coordinate's that. Then we figure out
which input pixel it is, which output pixel,
and just copy it over. JAKE: More fancy
Surma code here-- wouldn't get through review. SURMA: I know. You don't like it. JAKE: OK, that's fine. SURMA: And then because we
have four bytes per pixel, we just loop over four times,
and just do another thing. We copy the r value, the
g value, the b value, and the a value. JAKE: Yep. And off we go.
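Roughly what that first, byte-by-byte version looks like -- a sketch rather than the exact Squoosh code, assuming a clockwise rotation:

    function rotate90(input: ImageData): ImageData {
      const { width, height, data } = input;
      // The output has width and height swapped.
      const output = new ImageData(height, width);

      for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
          // Input pixel (x, y) lands at (height - 1 - y, x) in the output.
          const newX = height - 1 - y;
          const newY = x;
          const inIndex = (y * width + x) * 4;         // 4 bytes per pixel
          const outIndex = (newY * height + newX) * 4; // output rows are `height` pixels wide
          for (let i = 0; i < 4; i++) {
            // Copy r, g, b and a, one byte at a time.
            output.data[outIndex + i] = data[inIndex + i];
          }
        }
      }
      return output;
    }

SURMA: And this works. And this was actually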
decently fast. We shipped it this way. JAKE: We should say the reason
we did this rather than canvas is because we wanted
to run it in a worker. SURMA: That's an entirely different story. But yes, we did a lot of tests where the fancier [INAUDIBLE] technology didn't seem to work. So we ended up writing our
own piece of JavaScript just for this problem. JAKE: Yes, because
offscreen canvas, only in a couple of browsers,
whereas this is just basic-- SURMA: JavaScript. JAKE: --JavaScript, so
that works everywhere. SURMA: And it can
run in the worker because it only needs the image data. So this-- we shipped this. This worked. JAKE: Yes. Yes, it did. SURMA: And then I looked at
some point, and was like, hmm, there's actually kind of
an obvious optimization that you missed. And so I basically
added a little patch. This all stays the
same, same as before. But now I'm creating
a u32 array. JAKE: Yes, yes. SURMA: But basically we
have the same underlying chunk of memory, but instead of
seeing it as a series of bytes, we see it as a
series of 32-bit numbers-- because every pixel
consists of a 32-bit number, right, for r, g, b, and a. And so this way, we can
simplify or actually remove the inner loop. JAKE: So it's this bit
that was here that-- doing something four
times every time, we're now just doing it once. SURMA: It's now one copy
operation which actually maps to a machine instruction. Most of the time, V8 will be, like, super smart and go, like, whoa, fast.
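A sketch of that patch, with Uint32Array views over the same buffers (again illustrative, not the exact Squoosh code):

    function rotate90(input: ImageData): ImageData {
      const { width, height } = input;
      const output = new ImageData(height, width);
      // Same memory, viewed as one 32-bit number per pixel instead of four bytes.
      const inView = new Uint32Array(input.data.buffer);
      const outView = new Uint32Array(output.data.buffer);

      for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
          const newX = height - 1 - y;
          const newY = x;
          // The inner 4-byte loop becomes a single copy.
          outView[newY * height + newX] = inView[y * width + x];
        }
      }
      return output;
    }

SURMA: So this was actually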
quite a bit faster. So, cool. And then we ship
this-- still fine. And then it turns out that for
some reason, in one browser, this was super slow. JAKE: Right. And we've been
advised by legal-- SURMA: By our legal department
to not name the browser. JAKE: Apparently it's a
Chrome policy not to-- SURMA: I've never heard
that before, but-- JAKE: No, we're not allowed
to talk about other browsers. So we can't mention
which browser it is. SURMA: But it's one that
didn't run in our machines. JAKE: It didn't run on
your machine, did it? You had to use a VM to run
this different browser. OK. SURMA: Either way-- like, most
browsers were fine, good enough at least, and then
for some reason, this one browser just ended
up being extremely slow, like unreasonably slow. So we must have hit
some weird corner case. JAKE: Yes. SURMA: Because this
browser isn't slow usually. It's a very good browser. JAKE: Yes, and different
JavaScript engines optimize with different things. So the fact that one
browser was slower here isn't saying that that
browser is terrible. It's just saying V8 is very good
with this kind of tight loop code, over engines that
have optimized for, like, more dumb bindings stuff. SURMA: Exactly. JAKE: So it wasn't
that surprising that one browser was completely
different in terms of problems with this piece of code. SURMA: So we thought,
well, what do we do? Maybe we throw WebAssembly
at the problem, right? JAKE: Aye. SURMA: So we looked into that. And the first
problem we had was that, when you write WebAssembly
and you load it, it turns into a module
that has functions, the functions that you
wrote in whatever language you were using. JAKE: Yes. SURMA: Right? JAKE: This is different
to an ECMAScript module. It's a Wasm module. It's a different thing. SURMA: It's a different thing. And these functions can only
take in and return numbers. So there is no easy
way, straight up, to pass in an image. So what do you do, right? So what we ended up doing-- I'm going to reuse the video
I made for my article-- JAKE: Oh, brilliant. SURMA: --basically,
the JavaScript was going to load the image, put
it into the WebAssembly memory, and then we're going to
use WebAssembly to just do the reordering within that
WebAssembly memory buffer and use JavaScript to
read it back afterwards. JAKE: Right. SURMA: So that means
the WebAssembly really is completely isolated from
all of the outer world, really, so to speak. It just has its chunk
of memory to work on, and will read in the
image, do the reordering that was shown before,
and then JavaScript comes back, takes over, and
reads back the resulting image. JAKE: So JavaScript
and WebAssembly, the thing they share is memory. That they-- SURMA: Pretty much. JAKE: WebAssembly,
it's its memory. SURMA: So this
WebAssembly.memory is WebAssembly-specific memory. But it is also exposed
as an array buffer that we can use as a u32
array or whatever we need in that very instance, right? JAKE: So the amount of memory
we need for WebAssembly is essentially double
the size of the image, because it's going to-- SURMA: Yeah. JAKE: --have the main image in
memory and then the next bit. OK. SURMA: So how do we create Wasm? We've done it before
with Emscripten and C. There's also Rust. But we actually found a
very interesting project we stumbled over
called AssemblyScript. JAKE: Yes. SURMA: Which is a-- they call themselves
a TypeScript-to-WebAssembly
compiler, which is true. But might be a little
bit misleading. Because you can't just
take any TypeScript and compile it to WebAssembly. It is using the
TypeScript syntax and the TypeScript
standard library things, but with their own version of the standard library that is specifically tailored to WebAssembly. So what you can see here is the function signature. Now we have types, as you
know, from TypeScript. But there's the i32 type, which
is the type WebAssembly has, but JavaScript doesn't. JAKE: And that's the
32-bit integer, right? SURMA: Yes, the
signed 32-bit integer. JAKE: Signed. SURMA: There's also u32, which is the unsigned one. JAKE: Why are we using signed? SURMA: For reasons. JAKE: Four reasons? OK. Let's gloss over it. This is good. Because I can recognize this. It looks a lot like JavaScript. It looks a lot like TypeScript. SURMA: And so will the
rest, except for two lines. So this looks the same. So we switch height and width. JAKE: Yep. SURMA: Now this is
a bit interesting. Because we have this
chunk of memory, we need to know where
our input image starts and where our
output image starts. That's what these
two variables are. So our input image starts at
0, at address 0 in this memory. JAKE: Which it always does. SURMA: Index 0, you can say,
and the output image is right after the input image ends. And the input image consists of
width times height times four-- JAKE: Four bits per pixel. SURMA: Bytes. JAKE: Bytes per pixel. [LAUGHTER] See the thing
about this, and I'm sorry to interrupt the flow. I should say that
I came to the web as a CSS person, CSS Front-End,
and I learned JavaScript. Whereas you came to
the web from being a programmer-- well, and then
you went to the web, right? SURMA: [INAUDIBLE] I
did embedded systems. Like I was literally
writing kernel code and low level
memory management. And I had no idea
about CSS and how to do UI and anything like that. JAKE: Right. SURMA: It's just two
completely different angles. JAKE: But I would say that
if anyone is watching this, thinking what is going on? SURMA: Yeah. This is-- JAKE: I am feeling
exactly the same. So don't worry
too much about it. All right. Come on. Let's-- Let's go. SURMA: But for now, these are
basic indices in the array. Where does the
input image start? Where does the
output image start? And then this looks familiar--
looping over all the pixels. JAKE: Yep. SURMA: And figuring out where
the new coordinates are. We did all this before. And now there's these two
AssemblyScript specific functions. The first one is
load, which allows me to load a u32 from the
memory at a given address. JAKE: Right. SURMA: And so in this
case, what I'm doing is I'm using the input image
space, where the image starts, plus the pixel I want to read. JAKE: So this is very similar
to what we were doing before with the uint32array. But it's where-- but there's
a special command to get it straight from
memory rather than-- SURMA: Yeah. Because it's a
WebAssembly memory. And that's, like-- JAKE: Right. SURMA: --implicit. It's not something you
get handed as a reference. It's almost like a global. JAKE: But it's the same thing. We're passing the
same indices into it. Yes. SURMA: Exactly. JAKE: OK. SURMA: So we're
loading our pixel and then all we have
to do is write it back to the output image. And it's the same thing: storing the value v, which we just read, back as a u32 into the output image space. JAKE: OK. SURMA: And now we have written AssemblyScript.
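A sketch of what that AssemblyScript could look like, along the lines described here (not the exact Squoosh code; the address arithmetic is kept in usize so the types stay consistent):

    // The input image sits at byte 0 of the WebAssembly memory,
    // and the output starts right after it.
    export function rotate90(width: i32, height: i32): void {
      const w = <usize>width;
      const h = <usize>height;
      const inputStart: usize = 0;
      const outputStart: usize = w * h * 4; // 4 bytes per pixel

      for (let y: usize = 0; y < h; y++) {
        for (let x: usize = 0; x < w; x++) {
          const newX = h - 1 - y;
          const newY = x;
          // load() and store() are AssemblyScript built-ins that read and
          // write raw WebAssembly memory at a given byte address.
          const v = load<u32>(inputStart + (y * w + x) * 4);
          store<u32>(outputStart + (newY * h + newX) * 4, v);
        }
      }
    }

JAKE: And then this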
converts to WebAssembly. And what really
struck me with this is that if I wanted
to write WebAssembly, this is the tool I would use. SURMA: Yeah. JAKE: Because this looks
really familiar to me. SURMA: You don't have to
learn a new language, right? Because-- JAKE: Yeah. SURMA: --I think you've learned
a bit of C because of Squoosh. JAKE: Yes. SURMA: But that's pretty
much it, as far as I know. You've not written
Rust, I think. You kind of-- JAKE: I know PHP. [LAUGHTER] SURMA: [INAUDIBLE] PHP to a
WebAssembly compiler, then. JAKE: I would love it. It was the first
language I learned. SURMA: So we have
this function now. And now I want to compile
it to WebAssembly. And luckily, AssemblyScript
makes it very easy. So we just installed the
AssemblyScript package. And then we have an
asc command, which we give our TypeScript file to. And it will give us
back a WebAssembly file, with no additional
glue or JavaScript, which I think is
quite interesting. Because most other
implementations for WebAssembly give you glue code,
which is the initial-- JAKE: A huge JavaScript file,
otherwise, is really difficult to deal with and work with. But this is just-- yeah, just Wasm, right? SURMA: So we did this. We got a rotate.wasm file. And now the interesting bit
might be how to load it. Because usually, glue
code loads it for you. But now you don't
have glue code. How does this work? It's actually not
that difficult. What you do is you take
the instantiateStreaming function from the
WebAssembly object and put a fetch in there. Because the
WebAssembly compiler, at least the
non-optimizing one, can compile while the Wasm
file is still downloading.
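A sketch of that load, assuming the compiled file is called rotate.wasm (depending on the AssemblyScript build options, an import object may also be needed):

    // Inside a JS module or an async function.
    const { instance } = await WebAssembly.instantiateStreaming(
      fetch("rotate.wasm")
    );
    // instance.exports now holds the rotate90 function and the memory it works on.

JAKE: So this instantiate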
streaming takes a promise. SURMA: A promise, or
response, or an array buffer-- JAKE: That's a weird API. It's, like, why does
it take a promise? SURMA: Because they want to
make this simple-- so you don't have to await the fetch, right? JAKE: OK. I don't agree with it. But that's fine. SURMA: Sure, fine. JAKE: You should just
put an array in there. SURMA: Either way-- I find it really interesting. It starts compiling while
it's still downloading. So it's not like
download, then compile. It's actually
almost in parallel, which, for WebAssembly modules that can be quite big-- you know, I think that the Unreal Engine
one is, like, 40 megabytes. That will make
quite a difference. JAKE: Yes, absolutely. Not so much here. SURMA: So no, absolutely not-- so yeah, the Wasm
module is, by the way, it's, like, 500
bytes or something. So it's really small. It's smaller than the compressed
version of the JavaScript code that we had. JAKE: Nice. SURMA: That was
actually quite cool. So now we get an instance
back from this one. And on that instance,
we can have exports. And exports is all the
functions, but also the memory that we are going to work on. JAKE: Right. SURMA: So we can
grow our memory. Because we didn't
know what size it has. But we have to grow it to a size that fits our image two times, right? Which we would
have to calculate. We'll skip this here. But I would-- JAKE: So that would just be
that the size of the [INAUDIBLE] array data times 2. SURMA: Yeah. JAKE: OK. OK. SURMA: And then I will
somehow load this image into the buffer,
which is really just-- memory has a dot buffer property
which is a normal array buffer. Plus we can use
all the [INAUDIBLE] to put data in there. JAKE: Right. SURMA: Just put it in. JAKE: Yep. SURMA: And then you call rotate90, and we read the image back, and you're done.
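A rough sketch of the whole dance, assuming the instance from the previous snippet, an imageData to rotate, the exports described here (rotate90 and memory), and WebAssembly's 64 KiB page size:

    const { width, height, data } = imageData;
    const wasmExports = instance.exports as any;
    const memory: WebAssembly.Memory = wasmExports.memory;
    const rotate90: (w: number, h: number) => void = wasmExports.rotate90;

    // Grow the memory until the input and the output image fit side by side.
    // WebAssembly memory grows in 64 KiB pages.
    const bytesPerImage = width * height * 4;
    const pagesNeeded = Math.ceil((bytesPerImage * 2) / (64 * 1024));
    const currentPages = memory.buffer.byteLength / (64 * 1024);
    if (pagesNeeded > currentPages) {
      memory.grow(pagesNeeded - currentPages);
    }

    // Copy the input image to address 0 of the WebAssembly memory.
    new Uint8ClampedArray(memory.buffer).set(data);

    // Let WebAssembly do the reordering, then read the result back from
    // right after where the input image ends.
    rotate90(width, height);
    const resultBytes = new Uint8ClampedArray(
      memory.buffer, bytesPerImage, bytesPerImage
    ).slice();
    const result = new ImageData(resultBytes, height, width);

JAKE: Ah. So exports has all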
of the methods. SURMA: So this is the method. This is the magic, where
you call into WebAssembly. And you can also see
it's synchronous. So WebAssembly is something
that will actually take the control away from
JavaScript and do its thing, and then return the
control back to JavaScript. It's just like an
actual function. JAKE: OK. OK. SURMA: Which I
think is super nice. And so this was fast, and we
were super happy about this. JAKE: Yes. This was much faster than-- SURMA: It wasn't faster
in Chrome in the sense that it didn't
outperform JavaScript. It was as fast,
or almost as fast. But it was consistently
fast across all browsers. JAKE: Yes. It had taken the
browser that doesn't run on a Mac from seven
seconds down to, like, 500 or something, 500 milliseconds. SURMA: It was very,
very acceptable. JAKE: Yes. It was really nice to
see that similar value-- SURMA: --across all browsers. So we were super
happy about this. So we opened a PR on
Squoosh and you reviewed it. And we wrote an article. And then "Hacker News" happened. JAKE: "Hacker News" happened. SURMA: And that's something
I would never say. Because usually the
comments on our articles are quite annoying. [LAUGHTER] JAKE: "Hacker News" can
sometimes be quite pedantic, I find. But in this instance,
there was some pedantry. But the pedantry was
really interesting. SURMA: It was
really interesting. JAKE: Some fascinating results-- and just a lot of it
I didn't understand. And I hope you're going
to explain it to me. SURMA: Yeah. So someone said, why
aren't they using tiling? Tiling would make
this so much faster. Let me quickly try it. And yeah, they totally did it, at something like 20 milliseconds. I was, like, what? JAKE: Yeah. So they had taken it
from-- what was it, sort of four to five hundred milliseconds down to-- what was it? SURMA: I think 40. JAKE: 40, which is such
a huge improvement. And that was even
faster than we were seeing from a canvas element. Yeah. SURMA: So I had to
obviously sit down and actually understand what's happening. So let's talk about
what tiling actually is. JAKE: Yes. Please do. Because I have no idea. SURMA: So I'm going
to explain tiling. But there was also
another suggestion for performance optimization. I'm going to talk
about both of these. But I'm going to get
the other one first to get it out of the way. Basically, some
people were saying, oh, if you look at
this y times width, it's completely independent
of the inner loops. If I move it out between the
outer and the inner loop, that would make it faster. Because that calculation
can happen only once per outer loop. It doesn't need to happen
every time in the inner loop.
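A sketch of what that suggestion amounts to in the JavaScript version: y * width only depends on the outer loop, so it can be computed once per row instead of once per pixel.

    function rotate90Hoisted(input: ImageData): ImageData {
      const { width, height } = input;
      const output = new ImageData(height, width);
      const inView = new Uint32Array(input.data.buffer);
      const outView = new Uint32Array(output.data.buffer);

      for (let y = 0; y < height; y++) {
        const rowStart = y * width; // hoisted: computed once per outer iteration
        for (let x = 0; x < width; x++) {
          outView[x * height + (height - 1 - y)] = inView[rowStart + x];
        }
      }
      return output;
    }

JAKE: Yes. And I thought this was going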
to be the kind of thing that the optimizer thingy doo
dah would take care of for me. SURMA: And it is. JAKE: Ah. SURMA: So this is
the kind of advice where you don't have to worry
about these kind of things. Like moving constants
out of a loop is something that not only most
compilers can do-- so, like, the [INAUDIBLE] compiler could
do this or the Rust compiler-- but even the V8 compilers that go from JavaScript to machine code or
from WebAssembly byte code to machine
code, will do this. So this is an optimization
that we don't have to do. And where we can say let's
keep it readable and obvious and don't introduce
another variable where people that
read the code would have to have even more
state in their head to understand what's going on. JAKE: Yes. OK. SURMA: But the other
thing is tiling. And tiling is something
that I hadn't heard of. I actually had heard of it. But I also was
under the impression that compilers
would do it for us. And in this case, it is not. JAKE: What is it? Tell me. SURMA: So what is tiling? So this is an image-- JAKE: Correct. SURMA: --it's actually the
album cover of our podcast. I don't know. Did you know that we
do a podcast, Jake? JAKE: We do a podcast, as well. SURMA: We should link to it
in the description, Jake. JAKE: Yes. We should. SURMA: So we have been reading
this image so far like this. We've been going row by row. JAKE: Yeah. SURMA: And just--
what is this pixel? Where does it belong? OK. Copy. And look at the next
pixel in the same row. That's kind of what we
did, and we thought fine. Tiling is a different
approach where you tile the image into tiles. JAKE: That's good. Yeah, those are tiles. Excellent. SURMA: And then do
whatever you're trying to do within a tile first. So instead of going row by
row, you just go tile by tile. And within the tile,
you go row by row. JAKE: This is legitimately-- SURMA: It's the same thing. JAKE: This is legitimately
a different way of doing the same thing. SURMA: I know. Now the interesting thing
is that this turned out to be so much faster. JAKE: Yeah. Like, a tenth of the time. I still don't understand yet. SURMA: So let's implement
this real quick. Which it's not actually-- JAKE: Can I just say, in one of our recent episodes, we talked about the dangers of over-optimization. SURMA: Yeah. JAKE: And why are we doing this? SURMA: Because it ends
up being so much faster. JAKE: OK. OK. OK. SURMA: With this optimization, we actually end up going well below
100 milliseconds, which within the RAIL
guidelines makes it feel like an instantaneous
response to the button. JAKE: It is an
optimization that matters. Cool. SURMA: And before that, we
were at, like, 300 to 500. Which was fine, but if
you can go under 100, we should go under 100. JAKE: Especially
for bigger images. OK. SURMA: So basically, I just do
an additional two outer loops. Which usually sounds
wrong, but in this case is very, very right,
where we iterate over all the tiles that we have. And then, in there, we basically
have the same old loop, where we loop over
each individual tile. JAKE: I'm starting
to hyperventilate. Why? OK. SURMA: So this is
tiling implemented.
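A sketch of the tiled version (illustrative, not the exact Squoosh code; the tile size is a tuning knob, and something around 16 worked well in the benchmarks discussed later):

    const TILE_SIZE = 16;

    function rotate90Tiled(input: ImageData): ImageData {
      const { width, height } = input;
      const output = new ImageData(height, width);
      const inView = new Uint32Array(input.data.buffer);
      const outView = new Uint32Array(output.data.buffer);

      // Two extra outer loops walk the image tile by tile...
      for (let tileY = 0; tileY < height; tileY += TILE_SIZE) {
        for (let tileX = 0; tileX < width; tileX += TILE_SIZE) {
          const maxY = Math.min(tileY + TILE_SIZE, height);
          const maxX = Math.min(tileX + TILE_SIZE, width);
          // ...and inside each tile, it's the same old loop as before.
          for (let y = tileY; y < maxY; y++) {
            for (let x = tileX; x < maxX; x++) {
              outView[x * height + (height - 1 - y)] = inView[y * width + x];
            }
          }
        }
      }
      return output;
    }

JAKE: So I get it. SURMA: Let's talk about why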
this might make things faster. JAKE: OK. That is the bit I
don't understand. SURMA: So originally, tiling-- when I Googled tiling
and researched it, it was mostly the use case
for matrix multiplication, which is a different use case. Because input values
are used multiple times. If you multiply
two matrices, you have to read the cell
at 1, 1 multiple times-- for each column of the output matrix that you're calculating, I think. JAKE: OK. SURMA: So it makes sense
that, if you do tiling, you have a better chance
of having that value still in the cache. We're talking now processor
level 1 cache, by the way. JAKE: So hang on. OK. We will need to explain what
that is at some point, as well. But my feeling is, by
reading memory sequentially, you're more likely
to hit caches. Because you're dealing
with a little bit of memory that was very close to
the last bit of memory. SURMA: So if I have
these two really big matrices, and I go through the
first row of the input matrix, by the time I come-- I end up at the end, the
values from the start might have been kicked
out of the cache. Because level 1 cache in the
processor is really small. We're talking like 200 kilobytes
of cache, maybe, or less. JAKE: Right. So the processor
has an L1 cache-- SURMA: Which is,
like, super fast. JAKE: --so that there's
this set of caches that gets bigger and slower. SURMA: Yeah. JAKE: Until you get to memory-- SURMA: Actually, memory is really slow. JAKE: Memory is really slow-- in relative terms. SURMA: Yeah. JAKE: OK. SURMA: And so what
the tiling does is, by shortening
the amount of time you spend going away
from the initial value, you have a better chance of
having the initial value still in the-- JAKE: Buh, buh, buh,
buh, buh, buh, but-- SURMA: For matrix
multiplication. So with this one-- JAKE: Yeah. SURMA: This is
going to make sense why this would make it faster. JAKE: Because the second row is
a massive jump from the first-- SURMA: Yeah. The rotation-- we
read every value once, and we write it once. So why would caching
make things better? JAKE: That is roughly the
question I have in my head. SURMA: So there's two theories. And I don't know which one
of them is actually true. JAKE: Are you telling
me you don't even know? SURMA: Well, I even talked to
Benedict, our V8 VM engineer. And he's, like, I
have two theories. But it's really hard to test. JAKE: OK. OK. SURMA: So one theory
is that [INAUDIBLE] are really smart at
predicting what memory you are going to grab next. So by basically
seeing the tiles, it can make better
predictions what-- JAKE: Oh. SURMA: --cells to grab,
already put into the cache for you, even though you
haven't executed that code yet. And the other thing is that,
because the cache is so small, that there's a certain
pattern of which memory address can be cached in which cache cell. So this gets a
little bit confusing. But basically, if
you think about it, if you have like
three cache cells, just three individual cells-- JAKE: What can go in a cell? SURMA: One value. JAKE: One value, OK. OK. SURMA: OK. So memory address zero can only go in cache cell zero. Memory address one can go in cache cell one. Memory address two can go in cache cell two. Memory address three can only go in cell zero again. You wrap around, right? So you assign those. JAKE: Yep.
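A sketch of that mapping -- a simplified direct-mapped cache, assuming the three cells from the example (real caches map whole cache lines rather than single values):

    const CACHE_CELLS = 3;
    // Each memory address can only ever live in one particular cache cell.
    const cacheCellFor = (address: number) => address % CACHE_CELLS;

    cacheCellFor(0); // 0
    cacheCellFor(1); // 1
    cacheCellFor(2); // 2
    cacheCellFor(3); // 0 again -- it wraps around, so address 3 evicts whatever address 0 put there

SURMA: And then again,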
by keeping it smaller, you have a better chance of
not overwriting the old value that you have put into
your level one cache. So basically, all this is
about is making the area of memory you access smaller, so that you don't evict the things that you already have in the cache-- JAKE: No, this is
basically, it's working because our
inner loops are smaller. SURMA: Yeah. JAKE: Right. SURMA: So it makes the processor
make better predictions and also make the processor
not evict the cache. Because the area we
work on is smaller. JAKE: So then does
the tile size-- yeah, what's the tile size? SURMA: So that's what
I thought, right. And so I did some benchmarks. On a MacBook, on an
iMac, and a Pixel 3-- because the bigger the machine
or the bigger the processor, the bigger the level
one cache usually is. JAKE: Right. SURMA: So the iMac that I have
is an 18 core massive processor thing. It has massive [INAUDIBLE],
while the Pixel 3, obviously, has a very, very
tiny level 1 cache. JAKE: All this code is
single core, anyway, right? SURMA: Yeah. JAKE: Yeah. SURMA: So basically, at
zero is the relative time it took for no tiling. JAKE: So that's the original-- SURMA: That's the baseline, no tiling. JAKE: --wasn't it, yes. SURMA: So what you
can see here is how the time shifted
relatively to that base time, depending on
what the tile size is. JAKE: Interesting. SURMA: So if I have
a tile size of two, a two-by-two pixel grid,
it makes the code slower. Which is not very
surprising, because you have so much more looping
going on and more jumps. JAKE: OK. SURMA: It gets faster
really, really quick. At some point, over here, you
kind of hit level one cache boundaries where it
then gets slower again. JAKE: Right. I see. OK. SURMA: To be honest, there's
one weird thing where the Pixel 3 is slow, even with
the massive grid, which I'm not quite
sure why that is. I think-- JAKE: You mean it's fast even-- SURMA: Yeah. JAKE: It's faster. SURMA: I expected the
Pixel 3 was going to, like, go up somewhere around here. JAKE: You would assume
the level one cache is smaller than any MacBook's. SURMA: It probably is. And there's probably
another effect here that I don't
quite understand. JAKE: OK. SURMA: But what I felt-- JAKE: There's different
architecture, as well, in that processor. OK. SURMA: But it seems to be a
sweet spot between, like, 16 and, I don't know, 64,
depending on what you want. I think 16 looks really
promising in this graph. Which means you have, like,
a 256 pixel grid that you work with. JAKE: I thought I was going
to come into this episode and I was going to
go away understanding why the tiling works. No. It just does. SURMA: I spent the
last week on this. Right? JAKE: Right. SURMA: You've been kind
of sitting across from me and hearing me talking to people
and trying to figure this out. This is as close as I've gotten to understanding it. In that there is
this interaction between the processor predicting
what values to put in the cache. And then, not forcing the
processor to evict that cache, because you read too far ahead. JAKE: But this is a massive case
for tools, not rules, right? Don't go away and
rewrite all your code-- SURMA: With tiling. JAKE: --with tiling. SURMA: No. Right. JAKE: This is
something you would have to very carefully profile
on a wide range of machines with different processor
architectures to see if this is actually
working across-- SURMA: And also I
find it interesting because we started with, let's
rotate an image, a very high level use case. And we fell down. And ended up with,
like, let's talk about processor architecture
and level one caches. JAKE: Yes. SURMA: So thanks
to "Hacker News", I guess, for ruining my week. But it's been actually
been very educational, even though I still don't
fully understand it. JAKE: But I feel like-- SURMA: I'm OK with that. JAKE: Yeah, and I feel that my
understanding of lower level stuff is-- like I say, there's
that confusion element. But I feel like I've got
an appreciation for-- SURMA: The smarts, right-- JAKE: Yeah. SURMA: --that go into that. JAKE: It's incredible. SURMA: So let's take a breather. And we'll see our poor audience next time. [LAUGHTER] JAKE: But this is
going into Squoosh in-- SURMA: Yeah. It's going to be-- [INTERPOSING VOICES] [MUSIC PLAYING] JAKE: Yes. SURMA: Oh. That's the one thing-- ah. [GRUNTS] I have to
write this down. How do I fix this? JAKE: [LAUGHS] (SINGING)
Something for the edit. Something for the edit. SURMA: OK. Let's go from here.