Next-Generation 3D Graphics on the Web (Google I/O ’19)

Video Statistics and Information

Reddit Comments

So many "You can't do this in CSS!" examples.

Why doesn't the CSSWG fix this?

πŸ‘οΈŽ︎ 1 πŸ‘€οΈŽ︎ u/Baryn πŸ“…οΈŽ︎ May 10 2019 πŸ—«︎ replies

Maybe because it shouldn't be the job of CSS to do heavy operations when it comes to rendering a page. Also, I think it would overly bloat the language and make it harder for browser makers to maintain parity when implementing the spec.

πŸ‘οΈŽ︎ 1 πŸ‘€οΈŽ︎ u/Jon723 πŸ“…οΈŽ︎ May 11 2019 πŸ—«︎ replies
Captions
[MUSIC PLAYING]

RICARDO CABELLO: All right. Hello, everyone. Thanks for coming to this session. I hope you're having a good conference. Before we begin: we'll be sharing a lot of links, but don't worry, you'll find them on the YouTube page whenever this recording goes up. OK, let's start then. My name is Ricardo, and together with Corentin, I will be talking about the future of 3D graphics on the web. But before we do that, let's have a quick look at the past and present.

WebGL landed in browsers in February 2011, in Chrome 9 and Firefox 4. Those were the first browsers to implement it. Back then, with the Google Creative Lab, we created an interactive music video that aimed to showcase the new powers the technology was bringing to the web. It was a pretty big project between creators, directors, concept artists, and animators. Around 100 people worked on the project for half a year, and 10 of us were JavaScript graphics developers. We knew the workflow and tools were very different compared to traditional web development, so we also made the project open source so others could use it as a reference.

Some years later, Internet Explorer, Edge, and Safari implemented WebGL too, which means that today the same experience works in all major browsers, on desktops, tablets, and phones. What I find most remarkable is that we didn't have to modify the code for that to happen. Anyone with experience doing graphics programming knows that this is rarely the case; usually you have to recompile the project every couple of years when operating systems update or new devices appear.

So here's a quick recap, just double-checking. WebGL is a JavaScript API that provides bindings to OpenGL. It allows web developers to utilize the user's graphics card in order to create efficient and performant graphics on the web. It is a low-level API, which means that it is very powerful, but it's also very verbose. For example, a graphics card's main primitive is the triangle; everything is done with triangles. Here's the code we need to write in order to display just one triangle. First, we need to create a canvas element. Then, with JavaScript, we get the context for that canvas. And then things get pretty complicated pretty fast. After defining positions for each vertex, we have to add them to a buffer, send it to the GPU, then link the vertex and fragment shaders and compile a program that will inform the graphics card how to fill those pixels.
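The slide's code isn't reproduced in this transcript, but a minimal sketch of that setup might look like this; the shader sources and setup details are illustrative, not the slide's exact code:

// A canvas, a WebGL context, one vertex buffer, two shaders, one program:
// roughly the minimum to put a single triangle on screen.
const canvas = document.createElement('canvas');
document.body.appendChild(canvas);
const gl = canvas.getContext('webgl');

// Three 2D vertex positions, uploaded into a GPU buffer.
const positions = new Float32Array([0.0, 0.5, -0.5, -0.5, 0.5, -0.5]);
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW);

// Helper: compile a shader of the given type from source.
function compile(type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  return shader;
}

// Link the vertex and fragment shaders into a program.
const program = gl.createProgram();
gl.attachShader(program, compile(gl.VERTEX_SHADER,
  'attribute vec2 position;' +
  'void main() { gl_Position = vec4(position, 0.0, 1.0); }'));
gl.attachShader(program, compile(gl.FRAGMENT_SHADER,
  'precision mediump float;' +
  'void main() { gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); }'));
gl.linkProgram(program);
gl.useProgram(program);

// Point the position attribute at the buffer and draw the triangle.
const location = gl.getAttribLocation(program, 'position');
gl.enableVertexAttribArray(location);
gl.vertexAttribPointer(location, 2, gl.FLOAT, false, 0, 0);
gl.drawArrays(gl.TRIANGLES, 0, 3);

Even this stripped-down version omits error checking and canvas sizing, which a real page would also need.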
So that's why a bunch of us back then started creating libraries and frameworks that abstract all that complexity, so developers, and ourselves, could stay productive and focused. Those libraries take care of placing objects in 3D space, material configuration, loading 2D and 3D assets, interaction, sounds, et cetera -- anything needed for any sort of game or application. Designing those libraries takes time, but over the years people have been doing pretty amazing projects with them. So let's have a look at what people are doing today.

People are still doing interactive music videos, which is good. In this example, "Track" by Little Workshop works not only on desktop and mobile, but also on VR devices, letting you look around while traveling through glowing tunnels. Another clear use of the technology is gaming. "Plume" is a beautiful game developed by a surprisingly small team and released as one of last year's Christmas Experiments. Another one is web experiences. In this case, "Oat the Goat" is an interactive animated storybook designed to teach children about bullying. The folks at Assembly used Maya to model and animate the characters and then exported them to glTF via Blender. For the rendering, they used Three.js and wrote some 13,000 lines of TypeScript to make the whole thing work. And another very common use is product configurators. The guys at Little Workshop, again, show how good those can look in this demo. But the use cases do not end there. People are doing data visualizations, enhancing newspaper articles, virtual tours, documentaries, movie promotions, and more. You can check the three.js website and the Babylon.js website to see more of those examples.

However, we don't want to end up in a world where the only HTML elements in your page are just a canvas tag and a script tag. Instead, we must find ways of combining WebGL and HTML. The good news is that lately we have been seeing more and more projects and examples of web designers utilizing bits of WebGL to enhance their HTML pages. Here's a site that welcomes the user with a beautiful, immersive image. We're able to interact with the 3D scene by moving the mouse around the image. But after scrolling the page, we reach a traditional static layout with all the information about the product, as traditional websites usually look.

The personal portfolio of Bertrand Candas shows a set of div elements affecting a dynamic background. It's a little bit dark, but OK. With JavaScript, we can figure out the position of those div elements, and then we can use that information to affect the physics simulation that happens in the 3D scene in the background. But for underpowered devices, we can just replace that WebGL scene with a static image, and the website is still functional.

Another interesting trend we have been seeing is websites that use distortion effects. The website for Japanese director Tao Tajima has a very impressive use of them. However, the content is actually plain and selectable HTML. So it is surprising because, as you know, we cannot do these kinds of effects with CSS. If we look at it again, what I believe they are doing is copying the pixels of the DOM elements into a WebGL canvas in the background. Then they hide the DOM element and apply the distortion, and when the transition finishes, they put the next DOM element on top. It's still something you can enable and disable depending on whether it works on mobile and other factors -- something you can progressively enhance, basically. One more example is this site, which applies the distortion effect on top of the HTML, basically making the layout look truly fluid. Then again, this is surprising because it wouldn't be possible with CSS.

So I think those are all great examples of the kind of results you can get by mixing HTML and WebGL. But it still requires the developer to dive into JavaScript, and that, as we know, can be a little bit tedious when connecting all the parts. If you're more used to React, this new library by Paul Henschel can be a great option for you. react-three-fiber mixes React concepts on top of the previous abstraction. So here's the code for the animation that we just saw. Notice how the previously defined Effect and Content components are composed into the Canvas.
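The slide's code isn't reproduced in this transcript, but a minimal sketch against the 2019-era react-three-fiber API might look like this. The Content component below is a simplified stand-in, and the Effect component from the slide is omitted:

import React from 'react';
import ReactDOM from 'react-dom';
import { Canvas } from 'react-three-fiber';

// A stand-in scene component: a single tilted pink box.
function Content() {
  return (
    <mesh rotation={[0.4, 0.2, 0]}>
      <boxBufferGeometry attach="geometry" args={[1, 1, 1]} />
      <meshStandardMaterial attach="material" color="hotpink" />
    </mesh>
  );
}

// Components compose into the Canvas like any other React children.
ReactDOM.render(
  <Canvas>
    <ambientLight intensity={0.5} />
    <pointLight position={[10, 10, 10]} />
    <Content />
  </Canvas>,
  document.getElementById('root')
);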
It makes the code much more reusable and much easier to maintain. However, I think we can still make it even simpler. I believe Web Components will allow us to finally bring all the power of WebGL right into the HTML layer. We can now encapsulate all those effects in composable custom elements and hide all the code complexity.

For example, here is another project that we did for the WebGL launch eight years ago. It was kind of a globe platform, a project that allowed JavaScript developers to visualize different data sets on top of a globe. You would have the library and your data, and then you would have to manage the different parts of the data to display. But even if we tried to hide the WebGL code, developers still had to write custom JavaScript to load the data, configure the globe, and append it to the DOM. And the worst part was that developers still had to handle the positioning and resizing of the DOM object. It was just difficult to mix with a normal HTML page. Today, with Web Components, we can simplify all that code down to these two lines. The developer only has to include the JavaScript library on their website, and a powerful custom element is available to place wherever they need in the DOM. Not only that, but at that point, by duplicating the line, they can have multiple globes. Before, they would have had to duplicate all the code, and it would have been, again, harder -- more code to read and parse.

That previous component is not ready yet, but this next one, model-viewer, is ready to use today. The problem it addresses is that displaying 3D models on the web is still pretty hard. We really wanted to make it as simple as embedding an image in your page -- as simple as adding an image tag. That's the main goal. Again, the developer only has to include a JavaScript library, and then a powerful custom element is ready to display any 3D model using the glTF open standard.

An important feature of HTML is accessibility. For low-vision and blind users, we're trying to communicate both what the 3D model is and the orientation of the model. Here you can see that the view angle is being communicated verbally to the user so they can stay oriented with what's going on. It also prompts for how to control the model with the keyboard, and how to exit back to the rest of the page. model-viewer also supports AR, Augmented Reality, and you can see how it's already being used on the NASA website. Adding the ar attribute shows an icon that can launch the AR viewer on both Android and iOS. For iOS, you have to include a USDZ file.
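A minimal usage sketch; the script URL and model file names here are illustrative placeholders, not from the talk:

<!-- Include the library once. -->
<script type="module"
        src="https://unpkg.com/@google/model-viewer/dist/model-viewer.min.js"></script>

<!-- Then a 3D model is as easy to embed as an image. The ar attribute
     enables the AR viewer on Android; ios-src points at the USDZ file
     that iOS requires. -->
<model-viewer src="astronaut.glb"
              ios-src="astronaut.usdz"
              alt="A 3D model of an astronaut"
              ar
              camera-controls></model-viewer>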
And lastly, while building the component, we realized that depending on the device you can only have up to eight WebGL contexts at once; if you create a new one, the first one disappears. It is actually a well-known limitation of WebGL. But it's also good practice to only have one context, to keep memory in one place. The best solution we found was creating a single WebGL context offscreen, and then using that one to render all the model-viewer elements on the page. We also utilized an IntersectionObserver to make sure that we are not rendering objects that are not in view, and a ResizeObserver to detect whenever the developer modifies the size, so we re-render if we have to.

But we all know how the web is. Sooner or later someone will want to display hundreds of those components at once. And that is great; we want to allow for that. But for that, we'll need to make sure that the underlying APIs are as efficient as possible. So now Corentin is going to share with us what's coming up in the future. Thank you. [APPLAUSE]

CORENTIN WALLEZ: OK. Thank you, Ricardo. This was an amazing display of what's possible on the web using GPUs today. Now I'll give a sneak peek of what's coming up next, a future where you'll be able to extract even more computational power from GPUs on the web. So hey, everyone. I'm Corentin Wallez, and for the last two years at Google, I've been working on an emerging web standard called WebGPU, in collaboration with all the major browsers at the W3C. WebGPU is a new API that's the successor to WebGL, and it will unlock the potential of GPUs on the web.

Now you'll be asking: Corentin, we already have WebGL, so why are you making a new API? The high-level reason is that WebGL is based on an understanding of GPUs as they were 12 years ago. In 12 years, GPU hardware has evolved, but also the way we use GPU hardware has evolved. There is a new generation of GPU APIs in native -- for example, Vulkan -- that help do more with GPUs, and WebGPU is built to close the gap with what's possible in native today. It will improve what's possible on the web for game developers, but not only them: it will also improve what you can do in visualization, in heavy design applications, for machine learning practitioners, and much more.

For the rest of the session, I'll be going through specific advantages, things that WebGPU improves over WebGL, and show how it will help build better experiences. First, WebGPU is still a low-level and verbose API, so that you can tailor usage of the GPU to exactly what your application needs. This is the triangle Ricardo just showed, and as a reminder, that was the code to render it in WebGL. Now, this is the minimum WebGPU code to render the same triangle.
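The slide's code isn't reproduced in this transcript; as a substitute, here is a sketch of a WebGPU triangle against the API as it was later standardized (the 2019 draft shown in the talk differed in details). It assumes a canvas element on the page, a browser with WebGPU enabled, and a JavaScript module context for the top-level await:

// Get an adapter (a physical GPU) and a device (a logical connection to it).
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

const canvas = document.querySelector('canvas');
const context = canvas.getContext('webgpu');
const format = navigator.gpu.getPreferredCanvasFormat();
context.configure({ device, format });

// Vertex and fragment stages in WGSL; positions are hardcoded in the shader.
const module = device.createShaderModule({ code: `
  @vertex fn vs(@builtin(vertex_index) i : u32) -> @builtin(position) vec4f {
    var pos = array<vec2f, 3>(
      vec2f(0.0, 0.5), vec2f(-0.5, -0.5), vec2f(0.5, -0.5));
    return vec4f(pos[i], 0.0, 1.0);
  }
  @fragment fn fs() -> @location(0) vec4f {
    return vec4f(1.0, 0.0, 0.0, 1.0);
  }
`});

const pipeline = device.createRenderPipeline({
  layout: 'auto',
  vertex: { module, entryPoint: 'vs' },
  fragment: { module, entryPoint: 'fs', targets: [{ format }] },
});

// Record and submit one render pass that draws the three vertices.
const encoder = device.createCommandEncoder();
const pass = encoder.beginRenderPass({
  colorAttachments: [{
    view: context.getCurrentTexture().createView(),
    loadOp: 'clear',
    clearValue: { r: 0, g: 0, b: 0, a: 1 },
    storeOp: 'store',
  }],
});
pass.setPipeline(pipeline);
pass.draw(3);
pass.end();
device.queue.submit([encoder.finish()]);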
As we can see, the complexity is similar to WebGL. But you don't need to worry about it, because if you're using a framework like Three or Babylon, you'll get the benefits transparently, for free, when the framework updates to support WebGPU.

The first limitation that WebGL frameworks run into is the number of elements or objects they can draw each frame, because each drawing command has a fixed cost and needs to be issued individually each frame. With WebGL, an optimized application can draw a maximum of around 1,000 objects per frame, and that's already pushing it, because if you want to target a variety of mobile and desktop devices, you might need to go even lower. This is a photo of a living room. It's not rendered; it's an actual photo. The idea is that it's super stylish, but it feels empty and cold. Nobody lives there. And this is sometimes what it feels like looking at WebGL experiences, because they can lack complexity. In comparison, game developers in native or on consoles are used to maybe 10,000 objects per frame if they need them, so they can build richer, more complex, more lifelike experiences. That is a huge difference. Even with the limitation on the number of objects, WebGL developers have been able to build incredible things. Imagine what they could do if they could render that many objects.

Babylon.js is another very popular 3D JavaScript framework. And just last month, when they heard we were starting to implement WebGPU, they were like, hey, can we get some WebGPU now? And we were like, no, it's not ready, it's not in Chrome, but here's a custom build. The demo I'm going to show is what they came back to us with just two days ago. So can we switch to the demo, please?

All right, this is a complex scene rendered with WebGL, and it tries to replicate what a more complete game would do if every object were drawn independently and a bit differently. It doesn't look like it, but all the trees and rocks and so on are independent objects and could be different objects. In the top right corner are the performance numbers, and we can see that as we zoom out and see more objects, the performance starts dropping heavily. That's because of the relatively high fixed cost of sending the command to draw each object. The bottleneck here is not the power of the GPU on this machine or anything like that; it's just JavaScript iterating through every object and sending the commands. Now let's look at an initial version of the same demo in WebGPU, and keep in mind this was done in just two weeks. As the scene zooms out, we can see that the performance stays exactly the same, even as there are more objects to draw. What's more, the CPU time in JavaScript is basically nothing. So we are able to use more of the GPU's power because we're not bottlenecked on JavaScript, and we also have more time on the CPU to run our application's logic. Let's go back to the slides.

What we have seen is that for this specific and early demo, WebGPU is able to submit three times more drawing commands than WebGL while leaving room for your application's logic. A major new version of Babylon.js, Babylon.js 4.0, was released just last week. And today, the Babylon.js developers are so excited about WebGPU that they are going to implement full support for the initial version of WebGPU in the next version, Babylon.js 4.1.

But WebGPU is not just about drawing more complex scenes with more objects. A common operation done on GPUs is post-processing image filters -- for example, depth-of-field simulation. We see this all the time in cinema and photography. In this photo of a fish, the fish is in focus while the background is out of focus, and this is really important because it gives us the feeling that the fish is lost in a giant environment. This type of effect matters in all kinds of rendering for a better cinematic experience, but it's also used in other places, like camera applications. And of course, this is just one type of post-processing filter; there are many others, like color grading, image sharpening, and a bunch more. All of them can be accelerated using the GPU.

For example, the image on the left could be the background behind the fish before we apply the depth of field, and on the right, we see the resulting color of the pixel. What's interesting is that the color of each output pixel depends only on the colors of a small neighborhood of that pixel in the original image. So imagine the grid on the left is a neighborhood of original pixels; we're going to number them in 2D, and the resulting color will be essentially a weighted average of all these pixels.
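In equation form, for a 5 by 5 neighborhood with weights normalized so the result is a true weighted average, this is:

\[
\text{output}(x, y) \;=\; \sum_{i=-2}^{2} \sum_{j=-2}^{2} w(i, j)\, \text{input}(x + i,\; y + j),
\qquad \sum_{i,j} w(i, j) = 1.
\]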
Another way to look at it is that on top we have the output image, and the color of each output pixel depends only on a 5 by 5 stencil of the input image on the bottom.

The killer feature of WebGPU, in my mind, is what we call GPU compute. One use case of GPU compute is to speed up local image filters like the one we just saw. This is going to be pretty far from DOM manipulation like React, or amazing web features like CORS headers, so please bear with me. We're going to go through it in three steps. First, we'll look at how GPUs are architected and how an image filter in WebGL uses that architecture. Then we'll see how WebGPU takes better advantage of the architecture to do the same image filter, but faster.

So let's look at how a GPU works, and I have one here. This is a package you can buy in stores, with a huge heat sink. Can you see it? But if we look inside, there's this small chip here, and that is the actual GPU. Back to the slides: this is what we call a die shot, a transistor-level picture of the GPU. We see a bunch of repeating patterns in it, which we're going to call execution units. These execution units are a bit like cores in CPUs, in that they can run in parallel and process different workloads independently.

If we zoom in even more on one of these execution units, this is what we see. In the middle, we have a control unit, which is responsible for choosing the next instruction -- for example, add two registers, or load something from main memory. Once it has chosen an instruction, it sends it to all the ALUs. The ALUs are the Arithmetic and Logic Units, and when they receive an instruction, they perform it. For example, if they need to add two registers, they each look at their respective registers and add them together. What's important to see is that a single instruction from the control unit is executed at the same time by all the ALUs, just on different data, because they all have their own registers. This is single-instruction, multiple-data processing. This is the part of the execution unit that is accessible from WebGL, and what we see is that it's not possible for ALUs to talk to one another; they have no way to communicate.

But in practice, GPUs look more like this today. There is a shared memory region in each of the execution units where ALUs can share data with one another. It's a bit like a memory cache, in that it's much cheaper to access than the main GPU memory, but you can program it directly and explicitly and have ALUs share data there. A big benefit of GPU compute is giving developers access to that shared memory region.

That was the architecture of GPUs and their execution units. Now we're going to look at how the image filter in WebGL maps to that architecture. As a reminder, this is the algorithm we're going to look at. In our example, since our execution unit has 16 ALUs, we're going to compute a 4 by 4 block, which is 16 pixels, of the output in parallel, and each ALU will take care of computing the value of one output pixel. This is GPU pseudo-code for the filter in WebGL: essentially, it's just a 2D loop on x and y that fetches from the input and computes the weighted average of the input pixels. What's interesting here is that the coordinates argument to the function is a bit special, because it's going to be pre-populated for each of the ALUs.
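The slide's pseudo-code isn't reproduced in this transcript, but a WebGL fragment shader implementing this filter might look like the following sketch; the weights array, the texelSize uniform, and the exact structure are illustrative:

precision mediump float;

uniform sampler2D inputImage; // the image being filtered
uniform vec2 texelSize;       // 1.0 / image resolution
uniform float weights[25];    // the 5 x 5 filter weights

void main() {
  vec4 sum = vec4(0.0);
  // 2D loop over the 5 x 5 neighborhood; every iteration is a separate
  // texture (main memory) load, including loads that neighboring pixels
  // will redundantly repeat.
  for (int x = -2; x <= 2; x++) {
    for (int y = -2; y <= 2; y++) {
      vec2 offset = vec2(float(x), float(y)) * texelSize;
      sum += weights[(x + 2) * 5 + (y + 2)]
           * texture2D(inputImage, gl_FragCoord.xy * texelSize + offset);
    }
  }
  gl_FragColor = sum; // the weighted average becomes the output pixel
}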
That pre-populated coordinates argument is what makes the ALUs each do an execution on different data: they start populated with different values. So this is a table for the execution of the program, and likewise, we can see the coordinates are pre-populated. Each column holds the registers for one of the ALUs, and we have 16 columns for the 16 ALUs. The first thing that happens is that the control unit says, hey, initialize sum to 0, so all of them initialize the sum to 0. Then we get to the first iteration of the loop in x, and each ALU gets its own value for x. Likewise, each ALU gets its own value for y. Now we get to the line that does the memory load of the value of the input. Each ALU has a different value of x and y in its registers, so each of them will be doing a memory load from a different location of the input. Let's look at this register in this ALU: it's going to do a memory load at position minus 2, minus 1. We're going to get back to this one. If we go and do another iteration of the loop in y, likewise, we update the y register and do a memory load. What's interesting here is that the first ALU will do a memory load at minus 2, minus 1. That's a redundant load, because we already did it in the last iteration. Anyway, the loop keeps on looping, and there's more loading and summing and all that. In the end, we get to the return, which means the sum gets written to the output pixel, and the computation for our 4 by 4 block is finished. Overall, the execution of the algorithm in WebGL for a 4 by 4 block did 400 memory loads: we have 16 pixels, and each of them did 25.

So that was how the filter executes in WebGL. Now we're going to look at how WebGPU uses shared memory to make it more efficient. We take the same program as before, the exact same code, and we're going to optimize it with shared memory. We introduce a cache that's going to contain all the pixels of the input that we need to do the computation. This cache is going to live in shared memory, so it's cheaper to access than the actual input; it's like a global variable that's inside the execution unit. Of course, we need to modify the shader to use that input tile. And because the input tile needs to contain values at the beginning, we can't just start as before. So this function becomes a helper function that computes the value of the pixel, and we have a real main function that first populates the cache and then calls the computation. Like the previous version of the shader, the coordinates are pre-populated, so each of the ALUs does a different execution. Then all the ALUs work together to populate the cache. There's a bunch of loops and whatnot there, but it's not really important, so I'll spare you the details. What's interesting to see is that only 64 pixels of the input are loaded and put in the cache; there are no redundant memory loads. Then we go through the main computation of the value, and likewise, this is very similar to what happened before. But on this line, the memory load is now from shared memory instead of main memory, and that is cheaper. Overall, thanks to caching the tile of the input, the WebGPU version didn't do any redundant main memory loads. For our 4 by 4 block, it did 64 memory loads; as we saw before, WebGL had to do 400.
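The shader from the slides isn't reproduced here, but a compute-shader version in WGSL (the shading language WebGPU later standardized; the talk predates it) might look like this sketch. The tile layout and the equal box-blur weights are simplifying assumptions; the weighted average generalizes directly:

// An 8 x 8 input tile in shared (workgroup) memory: the 4 x 4 output
// block plus a 2-pixel border on each side, 64 pixels in total.
var<workgroup> tile : array<array<vec4f, 8>, 8>;

@group(0) @binding(0) var inputImage : texture_2d<f32>;
@group(0) @binding(1) var outputImage : texture_storage_2d<rgba8unorm, write>;

@compute @workgroup_size(4, 4)
fn main(@builtin(workgroup_id) wid : vec3u,
        @builtin(local_invocation_id) lid : vec3u,
        @builtin(local_invocation_index) idx : u32,
        @builtin(global_invocation_id) gid : vec3u) {
  // All 16 ALUs cooperate to fill the 64 tile entries: 4 loads each,
  // with no redundant main memory loads.
  let origin = vec2i(wid.xy) * 4 - 2;
  let maxCoord = vec2i(textureDimensions(inputImage)) - 1;
  for (var i = 0u; i < 4u; i++) {
    let t = idx * 4u + i;                 // linear index into the 8 x 8 tile
    let xy = vec2u(t % 8u, t / 8u);
    let src = clamp(origin + vec2i(xy), vec2i(0), maxCoord);
    tile[xy.y][xy.x] = textureLoad(inputImage, src, 0);
  }
  workgroupBarrier();  // wait until the whole cache is populated

  // The 5 x 5 loop now reads from cheap shared memory, not main memory.
  var sum = vec4f(0.0);
  for (var x = 0u; x < 5u; x++) {
    for (var y = 0u; y < 5u; y++) {
      sum += tile[lid.y + y][lid.x + x];
    }
  }
  textureStore(outputImage, gid.xy, sum / 25.0);
}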
Now, this looks very biased in favor of WebGPU, but in practice, things are a bit more mixed. WebGPU didn't do redundant main memory loads, but it did a bunch of shared memory loads, and those are still not free. Also, WebGL is a bit more efficient than this in reality, because GPUs have a memory cache hierarchy, so some of those memory loads will have hit the cache inside the execution unit. But the point is that, overall, WebGPU will be more efficient because we are able to cache input data explicitly.

The code we just talked about is called an image filter in the graphics world. But in the machine learning world, it's called a convolution operator, and all the optimizations we talked about also apply to convolutional neural networks, also known as CNNs. The basic ideas behind CNNs were introduced in the late '80s, but back then it was just too expensive to train and run the models to produce the results we have today. The ML boom of the last decade became possible because CNNs and other types of models could run efficiently on GPUs, in part thanks to the optimization we just saw. So we are confident that machine learning web frameworks such as TensorFlow.js will be able to take advantage of GPUs to significantly improve the speed of their algorithms.

Finally, some algorithms can be really difficult to write on GPUs using WebGL, and sometimes they're just not possible to write at all. The problem is that in WebGL, where the output of a computation goes is really, really constrained. On the other hand, the GPU compute that WebGPU has is much more flexible, because each ALU can read and write memory at any place in GPU memory. This unlocks a whole new class of GPU algorithms, from physics and particle-based fluid simulation, like we see here, to parallel sorting on the GPU, mesh skinning, and many, many more algorithms that can be offloaded from JavaScript to the GPU.

To summarize, the key benefits of WebGPU are that you can have increased complexity for better and more engaging experiences, as we saw with Babylon.js; it provides performance improvements for scientific computing, like machine learning; and it unlocks a whole new class of algorithms that you can offload from JavaScript CPU time to run on the GPU in parallel.

So now you're like, hey, I want to try this API. You're in luck. WebGPU is a group effort, and everyone is on board: Chrome, Firefox, Edge, and Safari are all starting to implement the API. Today, we're making an initial version of WebGPU available in Chrome Canary on macOS, and other operating systems will follow shortly. To try it, you just need to download Chrome Canary on macOS and enable the experimental flag "Unsafe WebGPU". And again, this is an unsafe flag, so please don't use it for your daily browsing of the internet. More information about WebGPU is available on webgpu.io: the status of implementations, a link to some samples and demos, and a link to a forum where you can discuss WebGPU. We're going to add more there, with articles to get started and so on. What we'd love is for you to try the API and give us feedback on what the pain points are, what you'd like it to do for you, but also what's going great and what you like about it.

So thank you, everyone, for coming to this session. Ricardo and I will be at the Web Sandbox for the next hour or so if you want to discuss more. Thank you. [MUSIC PLAYING]
Info
Channel: Google Chrome Developers
Views: 129,305
Rating: 4.9349151 out of 5
Keywords: type: Conference Talk (Full production);
Id: K2JzIUIHIhc
Length: 34min 15sec (2055 seconds)
Published: Thu May 09 2019