[JSConfUS 2013] Steven Wittens: Making WebGL Dance

Video Statistics and Information

Reddit Comments

"How to wow a bunch of web developers with conventional 3D graphics" ;)

πŸ‘οΈŽ︎ 16 πŸ‘€οΈŽ︎ u/mflux πŸ“…οΈŽ︎ Nov 08 2017 πŸ—«︎ replies

What a fantastic video! The animations really really helped clear up so much for me. I feel like I understand graphics a lot better now, even after having taken a university course on graphics a few months ago.

πŸ‘οΈŽ︎ 1 πŸ‘€οΈŽ︎ u/Zofren πŸ“…οΈŽ︎ Nov 08 2017 πŸ—«︎ replies

This guy has a fantastic blog at http://acko.net.

πŸ‘οΈŽ︎ 1 πŸ‘€οΈŽ︎ u/Flopster0 πŸ“…οΈŽ︎ Nov 08 2017 πŸ—«︎ replies
Captions
[Music] So the title of this talk is "Making WebGL Dance", and that's because "linear algebra" isn't as sexy on a schedule. The talk isn't so much about WebGL directly as it is about graphics hardware and how it works, because this could just as well be "Making OpenGL Dance", or Direct3D; all these principles apply.

I'd like to talk about three things: how to draw, where to draw, and what to draw. That's the mental model of how you should think about what a graphics card does and where it comes from, as opposed to yet another WebGL 101 on how to draw a cube and make it move. That teaches you how to use the API, but it ends up being a dead end, because you're not quite sure what's going on, and once you get into more advanced stuff, if there's a bug or an issue, you don't know what to do. The good news is that everything I'm going to talk about is stuff you should understand but not necessarily know how to implement yourself, because the graphics hardware does it for you. So even though there are going to be a lot of techniques, don't worry: you just type the command, or set the flag to true, to get the effect. The bad news is that there are going to be four dimensions, and I'm not going to apologize for that, because it's cool. This is mathematics, and we shouldn't be afraid of it; we should embrace it. In this case it just means a fourth number where before there used to be only three, and it has some interesting consequences.

So let's get started. In the beginning there were pixels, and pixels were fun. Well, that's not true: there were vector graphics first, but we'll forget about that. There were Bresenham lines, and those were fun, because using a simple for loop you could color in pixels and draw lines. Once we figured out how to do that, we could even draw solid shapes, using something like scanline rendering to draw a triangle.

Unfortunately, if you try to make a spinning triangle with that technique, it doesn't so much rotate as tick. Why does it tick? Why does it look bad? Because we're snapping to pixels: we've defined our graphics operations in terms of the pixel grid, which means we can only move the corners in grid steps. How do we solve this? If you know a little bit about graphics, you may be thinking of the word "anti-aliasing". No, that's something else: look at this beautifully anti-aliased triangle that's still jerking around in an ugly fashion. What we actually want is sub-pixel accuracy, which looks like this: even though it's aliased, with only black and white and jaggies, it's moving smoothly, and I can put the corners of the shape anywhere I want.

How exactly does that work? It's not obvious anymore where the edges start or end on the pixel grid, or whether a partially covered pixel in the corner should be black. The key is something called sampling: whether a pixel is black or white is defined by one thing, the point right in the middle. If that point is inside the triangle, we color the pixel black; otherwise we color it white. This might not seem like an important concept, but it really is, because it defines everything that follows, and that's why I'm talking about it first.
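To make that sampling rule concrete, here is a minimal JavaScript sketch of it (my own illustration, not code from the talk; the edge-function test and all names are assumptions):

```js
// Point-sampled rasterization: a pixel is black if and only if its
// CENTER lies inside the triangle. The corners can sit anywhere,
// not just on the grid -- that is the sub-pixel accuracy above.
function edge(ax, ay, bx, by, px, py) {
  // Signed area test: which side of edge A->B is point P on?
  return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
}

function rasterize(tri, width, height) {
  const [a, b, c] = tri; // e.g. { x: 10.25, y: 3.7 } -- not snapped
  const pixels = new Uint8Array(width * height);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const cx = x + 0.5, cy = y + 0.5; // sample at the pixel center
      const w0 = edge(a.x, a.y, b.x, b.y, cx, cy);
      const w1 = edge(b.x, b.y, c.x, c.y, cx, cy);
      const w2 = edge(c.x, c.y, a.x, a.y, cx, cy);
      // Inside if the center is on the same side of all three edges.
      if ((w0 >= 0 && w1 >= 0 && w2 >= 0) ||
          (w0 <= 0 && w1 <= 0 && w2 <= 0)) {
        pixels[y * width + x] = 1;
      }
    }
  }
  return pixels;
}
```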
What this means is that there are really two worlds. On the left hand you have the world of vectors, where everything is mathematical and beautiful and your shapes are exactly defined; then you use sampling to transfer it over to the world on the right, the raster world, the world of pixels. You really have to think of this as a final step. When you look at the diagram, the samples sit in the middle of every pixel, which means the edges of the pixel grid don't actually exist in vector world; they're an artifact of the rasterization, if you will. In fact, we describe this as a nearest-neighbor filter: when we try to turn the grid of samples back into something continuous, we just assign the color of the nearest sample to each area, and the fact that the pixels come out square is just an artifact of where the samples were placed. This looks ugly and pixelated; a better way is, for example, a bilinear filter, where you blend gradients between neighboring samples to even things out. What's important about that diagram is that the color information is, again, not on the edges of the pixel grid but in the middle; the pixel edges are a distraction and don't factor in.

So what is anti-aliasing actually for? It's this. Here I'm taking a texture of gray and black bars and looking at it in perspective, and as it's sampled you can see all this noise and shimmering in the distance. That happens because when you sample, you can only pick one value or the other, either gray or black. What it should actually look like is this: the further you go into the distance, the more blurred out and gray it gets, because far away, a single pixel on the screen now covers a wide area of the texture.

Looking at aliasing a bit closer, it has to do with the sampling theorem and the Nyquist frequency. If you take this pattern of bars and compress it, you reach the point where it alternates white, black, white, black; try to compress it further, and it gets messed up; in fact it folds in on itself and does something weird. Running that backwards, you can see it approach the Nyquist frequency (white, black, white, black) and then expand smoothly again. What that means is that you can't fit more variation onto the pixel grid than that particular frequency allows, because there's no room for it. The jaggies you get when drawing shapes are really an artifact of aliasing, and instead of thinking about pixel edges, you need to look at slopes: because of the pixel grid, there's a maximum slope between white and black that we can represent, and that's really where the problem lies. It has very little to do with jaggies as such, which is what most people think of when they talk about anti-aliasing.

So is anti-aliasing just blurring all the things? Sort of; it's blurring in a specific way. What you're actually doing is determining how much of your shape covers every pixel, and then shading that pixel proportionately. You can do that mathematically, using the rules of geometry, to get an exact result, but that's annoying, especially when you have not one triangle but, say, a million, which is not all that uncommon these days.
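As a sketch of that coverage idea: rather than computing coverage exactly, you can estimate it with a small grid of sub-samples per pixel and shade by the fraction that lands inside, which is precisely the supersampling described next. The code and names are my own illustration, not the talk's:

```js
// Coverage-based anti-aliasing, approximated with a 4x4 sub-sample
// grid: 0..16 hits per pixel, shaded proportionately.
function coverage(inside, px, py) {
  // `inside(x, y)` is assumed to test a point against the shape,
  // e.g. the edge-function test from the earlier sketch.
  let hits = 0;
  for (let sy = 0; sy < 4; sy++) {
    for (let sx = 0; sx < 4; sx++) {
      // Sub-sample centers at 1/8, 3/8, 5/8, 7/8 within the pixel.
      if (inside(px + (sx + 0.5) / 4, py + (sy + 0.5) / 4)) hits++;
    }
  }
  return hits / 16; // 0.0 = fully outside, 1.0 = fully covered
}
```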
So instead we use something called supersampling, where you put lots of samples inside each pixel; in this case you get 16 possible shades of gray per pixel, and that looks okay. Unfortunately, now you're doing 16 times more work. So they invented multisampling, where you only apply the dense sampling to the pixels that need it, and it looks like this. You'll notice that in the middle of the shape we're still doing traditional single sampling, which means there's no anti-aliasing there, and textures would still come out noisy and distorted like before; for that we use other things, like the anisotropic filtering I showed earlier. What you end up approximating and sampling is something more like this, which looks like a spinning triangle rather than the jagged one that just skipped around. If you picture this in the pure vector world, without sampling, it's a triangle rotating smoothly, surrounded by a one-pixel-wide gradient, and you can see that this is where the stair-stepped shades of gray come from, as those gradients pass underneath the sampling grid. Whenever you use any filtering technique, whether it's a Gaussian blur, mipmapping, or an anisotropic filter, you're basically blurring your information before sampling it, to make sure the slope of your color information doesn't exceed the limit you can represent.

Just for completeness, I'll mention WebKit font smoothing. A lot of people have issues with this and turn it off, because Apple did something a bit odd and uses a different filter that makes fonts look darker; but it's just sampling on a sub-pixel grid, handling red, green, and blue separately. The thing is, pixels are dying, or are dead, but they've been reborn: pixels are dead as a design unit, because everything is scalable now. And even though I'm supposed to be talking about WebGL, all of this also applies to CSS, because your iPhone, your zoomable browser, is a GL scene; it's defined in vector world, not in raster world.
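To illustrate that "blur before you sample" point, here is a sketch (my own, not from the talk) of building one mipmap level of a grayscale image: averaging each 2×2 block of samples halves the maximum frequency the smaller image can contain.

```js
// Build the next mipmap level by box-filtering 2x2 blocks.
// Assumes even dimensions for brevity.
function mipLevel(src, width, height) {
  const w = width >> 1, h = height >> 1;
  const dst = new Float32Array(w * h);
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      const i = (2 * y) * width + 2 * x;
      // Average the four parent samples into one child sample.
      dst[y * w + x] =
        (src[i] + src[i + 1] + src[i + width] + src[i + width + 1]) / 4;
    }
  }
  return dst;
}
```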
And it's not just pixels: any sort of information can be sampled, geometry for example. This height map, which I'm generating procedurally (that's a different talk), is defined on a grid, and you have to think of it as sampled information that you're processing. You can see here I'm tuning the knobs on this generator to make it oscillate in an interesting way, and you can see that I've added water. You may be wondering how it knows which parts should be blue and which should be green. Is it, for example, taking the surface I've defined and cutting out the parts that are underwater? Is it doing something else? It's actually quite simple, and it turns out that samples are both the curse and the blessing, the problem and the solution, because a sample isn't just a color: it can have additional information associated with it, for example depth. We use something called a Z-buffer, where for every pixel in your image you record the depth as a number, here represented as a shade of gray, and whenever you draw anything you update the depth map along with it, so you know whether the thing you're drawing is closer or further away than what's already there. So you can do per-pixel cutting out of shapes that intersect, without actually knowing anything about the geometry involved. It's a very dumb but very effective way of doing it, and that turns out to be what GPUs are great at, because it's massively parallelizable. Aside from depth, you might also record orientation in the form of a normal, which is a vector; I'll go into that more later. That's used, for example, in something called deferred lighting, where you generate an image like this and then paint the light in afterwards using only this information. At that point you've stopped knowing what is outside the frame of the picture; you don't even know what it represents. All you know is that you have a grid of samples with depth and normal and color, which you then use to do the lighting.
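A minimal sketch of that depth test, under assumed conventions (smaller depth means closer; the names are mine, not the talk's):

```js
// Z-buffered writes: keep a sample only if it is closer than what is
// already stored at that pixel. No knowledge of the geometry needed.
function drawSample(color, depth, colorBuf, depthBuf, index) {
  if (depth < depthBuf[index]) {
    depthBuf[index] = depth; // update the depth map along with the color
    colorBuf[index] = color;
  }
}

// Usage sketch: clear depth to "infinitely far" before drawing.
const W = 640, H = 480;
const colorBuf = new Uint32Array(W * H);
const depthBuf = new Float32Array(W * H).fill(Infinity);
drawSample(0xff0000ff, 0.5, colorBuf, depthBuf, 100 * W + 320);
```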
The next step is the part where you learn linear algebra, and that might send heart rates spiking a little, but don't worry, because we're going to do fun stuff with images. Let's start with the affine transforms, which you know as the transform tool from Photoshop or Illustrator. Those transforms include rotation, scaling, and skewing, and they have something in common: no matter how you combine them, they all preserve parallel lines. That's a very useful property, because it means you can describe the entire transform just by saying what it does to the grid. You can describe the grid using something called a vector basis, which is just your X unit and Y unit, represented here as arrows. And because vectors are arrows, we can do math with them; that's essentially what linear algebra is. But let's forget about the numbers for a second and do it on paper. Suppose I define a new vector basis by giving you two arbitrary arrows, and I say: tell me what the smiley face looks like after it's been transformed. Or let's start with just that one point in purple and find where it should go. That's easy, because you decompose it into its X and Y coordinates and then reassemble it on the other side using the new grid. And because vectors are arrows, you can also scale them, not just add them together, which means that with this procedure of disassembling and reassembling you can find where any point lands after the transform, just by using your vector basis.

If you get this, then congratulations: you get matrices, because those mysterious grids of numbers they threw at you in school and never really wanted to explain are really simple. They're the coordinates of the blue and the green dot, describing where the grid has moved to when you're done transforming. The column on the left is just the X and Y coordinates of the blue point, and the green numbers on the right are the X and Y coordinates of the green point. The reason you put them in a matrix is that it lets you do the computations efficiently, in this case a matrix-vector multiplication, which is just transforming a point to conform to a new grid. Written out, it's the same procedure of disassembling: you literally cleave your matrix in two and multiply the columns by the two coordinates of the point you're interested in, transforming the red point into the purple point, transforming a circle into an oval.

If we then want to apply another transformation, say a rotation, we find that our vector basis has simply moved to a new location, which means there's just a new matrix. And this is where matrix math gets really interesting, because the naive solution is to apply the two transforms one after the other, as at the bottom, so that every transformation you pile on costs another matrix multiplication. But because of the properties of matrices, you can condense them all together, optimizing them, in a way, into a single transform. That's the secret of why computer graphics can be fast: you can condense the relationships and placements of objects into a single mathematical object of a constant size, namely a matrix.
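Here is that 2D story as a JavaScript sketch (my own illustration): the matrix's columns are the transformed basis vectors, a matrix-vector multiply is the disassemble-and-reassemble procedure, and two transforms condense into one matrix.

```js
// m = [xBasis.x, xBasis.y, yBasis.x, yBasis.y], column-major.
function applyMat2(m, v) {
  // Decompose v into x and y parts, reassemble on the new grid.
  return [m[0] * v[0] + m[2] * v[1],
          m[1] * v[0] + m[3] * v[1]];
}

// Condense two transforms: transform each column of A by B.
function mulMat2(b, a) {
  const c0 = applyMat2(b, [a[0], a[1]]); // where A's x basis ends up
  const c1 = applyMat2(b, [a[2], a[3]]); // where A's y basis ends up
  return [c0[0], c0[1], c1[0], c1[1]];
}

const rotate90 = [0, 1, -1, 0]; // x basis -> (0,1), y basis -> (-1,0)
const scale2 = [2, 0, 0, 2];
const combined = mulMat2(rotate90, scale2); // one matrix, two transforms
console.log(applyMat2(combined, [1, 0]));   // [0, 2]
```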
That works in 2D, but it also works in 3D: a 3D basis is just a set of three vectors, here red, green, and blue, that define an orientation in space, and in the matrix you end up with three columns of three coordinates. Again you do a matrix multiplication to transform, in this case, the flat smiley face into one that's positioned in 3D. And it also works in four dimensions, and that can be a little weird. Why on earth would you do this? What is the fourth dimension? The key is actually in the image, which is four dimensions projected back into 3D, where suddenly parallel lines are no longer parallel. That's interesting, because that's what perspective does, and when you're rendering in 3D you usually want perspective, you want it to look natural; that is why we use four dimensions. I can't really show you four-dimensional math, because picturing that is kind of kooky, but I'll show you how it works for two-dimensional images, where you apply the third dimension to do the same kind of trick: the reason we use four-dimensional matrices to do 3D math is the same reason we'd use three-dimensional matrices to do 2D math.

It works like this. You have ordinary two-dimensional space, a grid, and you do a sort of bat-signal thing, where you project it outward, like this. So this picture is 3D, and yet we call it two-dimensional space. Why? Because the image we're projecting is two-dimensional: just like a bat-signal projected onto a cloud, the light along any particular ray is constant, which means you only have two degrees of freedom instead of three. Why do we do this? Because of what happens when you start changing the vector basis. We've gained a couple of new things. For one, we've gained a whole new Z vector, the red one, which didn't exist before; we just made it up. And that's the interesting one, because if you move it, you aim the bat-signal, which means that all of a sudden, instead of just being able to change shapes in place, we can move them around as well, which is something affine transforms couldn't do on their own. So you get translation, and scaling effects: if you pull the red vector out or push it in, changing its length, you tighten or expand the beam. We've gained something else, too: our X and Y vectors now also have a Z coordinate, which until now has just been zero, and it controls the alignment of the image. If I change those, for example tilting them all the way back, you see the image tilt, so that part of it is now closer to the projection point and part of it is further away, which is what perspective does. Project this back down to two dimensions, and your parallel lines are no longer parallel: you've done a perspective transform.

That has an interesting implication, because it means that if I'm drawing something like this cube in 3D space, I can position, move, and orient it using a single four-dimensional matrix. It has a structure: in the upper-left portion there's a three-by-three matrix that controls rotation, scaling, and skewing; there's a column for translation, which is aiming the beam differently in four-dimensional space; and then there's the fourth coordinate, W, which controls perspective and creates the narrowing lines, the vanishing points, that sort of thing. And like I said, matrices can be condensed: applying five matrices in a row can be expressed as a single combined matrix, which is very handy. That means the mathematical relationship between that little black cube as it's defined in 3D space and its actual location on the screen as it's projected is captured by a single matrix, and I can show you how that works.

We start in object space, which is what you're looking at: the cube on its own, defined with the origin at its center of mass, or, if it's a character, usually under their feet; it's just somewhere in the middle, so you have a point of reference. You transform object space into world space by placing objects with a matrix: you can scale them, rotate them, move them around, and so on. World space is where everything else lives; in this case I added a camera, and a ground plane so the cube has something to sit on, and world space is again defined relative to some reference point, the frame that you do all your placement and simulation in. Then we have another matrix transformation that takes you to view space, which is a coordinate system centered on your camera. In this case Z points forward, but that's a convention, and you have to be careful which way your axes point when you do these things. And finally, from view space we go into screen space, which is simply pixels and pixel units. Because these are all matrix transformations, the entire chain ends up being represented as a single transformation. You can go backwards as well, but that's a little trickier.
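As a sketch of that chain (my own code, using column-major 4×4 matrices the way GLSL and three.js do): a single combined matrix carries a point from object space all the way to clip space, and the divide by w is where the perspective foreshortening happens.

```js
// Column-major 4x4 matrix times a 4-component vector.
function mulMat4Vec4(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

// One condensed matrix (projection * view * model) does the whole trip.
function project(modelViewProjection, point3) {
  const [x, y, z, w] = mulMat4Vec4(modelViewProjection, [...point3, 1]);
  // Perspective divide: parallel lines stop being parallel here.
  return [x / w, y / w, z / w];
}
```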
Finally, I want to talk about shaders, which is how these things are actually implemented on a graphics card. Say you have a 3D model, which is defined as vertices, points in space, and triangles that span those points. In terms of data structures, you're looking at a list of vectors that define the points, and then a set of indices saying, you know, triangle 1 goes from point 1 to 2 to 3, triangle 2 goes from 2 to 3 to 4, and so on, and that makes up the whole object in object space. Then we have a program called a vertex shader, which takes one vertex as input and produces another vertex as output. You can see that those are four-dimensional vectors coming out, and that's because we're dealing with 3D projective space, which has four coordinates. This transforms into something called clip space, which I mention for completeness; you can just think of it as screen space (there's a difference, but it's not very interesting). Then the second part: we take these clip-space coordinates, which say where the thing is on the screen, turn them into samples, and to every one of those samples we apply another program, the fragment shader, which determines the color of every pixel. On the left you can see that the light is changing while the positions of the vertices aren't, so only the parameters of the fragment shader are being tuned here; the vertex shader is just constant at this point.

So what does this look like? Shaders are defined in a language called GLSL, the GL Shading Language, and this is a very simple program: the main function consists of one line, which applies three separate matrices to the position, after taking the position and making it four-dimensional. There's some intimidating terminology here, but it's really not that hard. Uniforms are just global variables: if you're drawing an object and the object is red, the fact that it's red is constant everywhere, so that's a uniform property that you set. Then, per vertex, we have attributes, like the specific position of every point. And at the end we set gl_Position, a standard variable that determines where on the screen you're rendering. This is your vanilla GL pipeline, but the point is that it's just one possible way of doing it. You can do tons of things in vertex shaders, and that's why shaders have been the main driving force behind what's been going on in games, and in offline 3D rendering as well: transforming shapes on the fly, point by point, is a very powerful approach, because you can do it in parallel very efficiently.

A fragment shader then looks like this. Again we have uniforms, say a color and a light direction in this case, and then we have something called varyings, which can be a little tricky to grasp. Basically, our information at this point is only defined at the corners of the mesh, so when you're trying to fill in the pixels in the middle, you need a way to transfer that information. You can assume it's constant across the entire triangle, which makes it a uniform, or you can have a property that varies: you define it at the corners, and the graphics hardware interpolates it for you, filling out the part in the middle. What this fragment shader does is determine how much the surface should be lit, using something called a vector dot product (you can look up what that does), and then multiply that intensity by the color of the surface. This is again the simplest possible fragment shader; in fact, what's going on on the left is a little more complicated, because it has specular light, that glossiness, which I didn't even bother to implement in the code on the right.
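A reconstruction of what such a one-line vertex shader can look like, written as a GLSL source string the way WebGL consumes it; this is not the speaker's exact code, and the uniform names are assumptions in the style of three.js:

```js
const vertexShaderSource = `
  uniform mat4 modelMatrix;       // object space -> world space
  uniform mat4 viewMatrix;        // world space  -> view space
  uniform mat4 projectionMatrix;  // view space   -> clip space
  attribute vec3 position;        // per-vertex input, in object space

  void main() {
    // Make the position four-dimensional, then apply three matrices.
    gl_Position = projectionMatrix * viewMatrix * modelMatrix *
                  vec4(position, 1.0);
  }
`;
```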
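And a reconstruction of the simplest diffuse fragment shader described above (no specular term); again a sketch under assumed names, not the talk's actual code:

```js
const fragmentShaderSource = `
  precision mediump float;

  uniform vec3 color;          // the object's color: constant, so a uniform
  uniform vec3 lightDirection; // assumed normalized, pointing at the light

  varying vec3 vNormal;        // defined at the corners, interpolated
                               // across the triangle by the hardware

  void main() {
    // The dot product measures how squarely the surface faces the light.
    float intensity = max(dot(normalize(vNormal), lightDirection), 0.0);
    gl_FragColor = vec4(color * intensity, 1.0);
  }
`;
```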
Here's an example of a vertex shader that does some real work: skeletal animation. The way to think about this is that the model, in this case the cyberdemon from Doom 3, is defined statically, usually in a sort of Vitruvian Man pose so it's nice and neutral, and then you feed in the orientations of all the bones in the skeleton. Every point is linked to a point on the skeleton, and by tuning the knobs, by changing the matrices, you can animate the character. This is all done on the GPU, on the graphics hardware, not on the CPU side.

An example of a very common fragment shader technique is normal mapping. All of a sudden it seems like this model has become way more detailed, and that's an illusion, because the geometry hasn't changed. All I'm doing is applying something called a normal map, which describes the orientation of the surface, so I'm cheating: it looks like it has tiny bumps and little crevices and all that, but it's painted on, in a way that changes the shading. So it's not just a texture; it's information that's being splattered onto a model, interpolated, and used to do the lighting on a per-pixel basis. Because this is such an easy trick, and because we care less about depth than about light and shadow, it's a very effective way of creating the appearance of hundreds of thousands of triangles when really there are only a couple of thousand. You combine that with, for example, a color map texture, and with something called a specular map, which determines which parts of this creature are glossy and which parts are metal or soft-shaded, and you end up with something like this, which I think looks pretty damn cool.

We're not done yet; I'll show you one more trick you can do with fragment shaders. Here I've put a floor of bricks underneath the monster, and the bricks are being normal mapped, so as I move the lights above it, you can see the light catch the edges of the bricks, and it looks like they have depth. But that's an illusion, because if I put the camera at a glancing angle, you can start to see that it's fake, that there's no actual depth here; it's just a texture that happens to have convincing shading. We can fix that: we apply something called parallax mapping, and suddenly it looks like these bricks actually have depth, and yet what I'm telling the GPU to draw is still just a flat square. How does that work? Why does this all of a sudden look correct? Here's the secret: I'm going to lock the uniform that tells the shader where the camera is, so the shader thinks the camera is staying right here when it's not, and I'm going to rotate to the other side. Now all of a sudden the illusion is completely destroyed, and if you look closely there's a sort of distortion going on in the texture, a streakiness and weirdness. The point of this effect is that it distorts the texture of the bricks in such a way that, from your point of view, it looks correct in 3D, but it's really cheap; it's completely fake. You can see that the silhouette at the top left, for example, stays a straight line: as far as the depth is concerned, this is a completely flat surface, but it's such an effective technique. And here, when I update the uniform that tells it where the camera is, all of a sudden the illusion is restored, and you get bricks that look real.
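For the skeletal animation shown earlier, a skinning vertex shader might look roughly like this; this is my reconstruction under stated assumptions (two bones per vertex, an arbitrary cap of 32 bones), not the Doom 3 demo's code:

```js
const skinningVertexShader = `
  uniform mat4 boneMatrices[32];  // fed in per frame: the knobs you turn
  uniform mat4 modelViewProjectionMatrix;

  attribute vec3 position;        // the static, neutral-pose model
  attribute vec2 boneIndices;     // which two bones this vertex follows
  attribute vec2 boneWeights;     // how strongly it follows each one

  void main() {
    // Blend the vertex between its two bones, then project as usual.
    vec4 p = vec4(position, 1.0);
    vec4 skinned =
        boneWeights.x * (boneMatrices[int(boneIndices.x)] * p)
      + boneWeights.y * (boneMatrices[int(boneIndices.y)] * p);
    gl_Position = modelViewProjectionMatrix * skinned;
  }
`;
```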
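And a much-simplified sketch of the parallax mapping trick just described: shift the texture lookup along the view direction by an amount driven by a height texture, so a flat quad appears to have depth. The structure and names are my own assumptions:

```js
const parallaxFragmentShader = `
  precision mediump float;

  uniform sampler2D heightMap;   // grayscale "how deep is this texel"
  uniform sampler2D colorMap;
  uniform float scale;           // strength of the effect, e.g. 0.05

  varying vec2 vUv;
  varying vec3 vViewDirTangent;  // view direction in tangent space,
                                 // assumed computed in the vertex shader

  void main() {
    vec3 v = normalize(vViewDirTangent);
    float height = texture2D(heightMap, vUv).r;
    // The cheat: slide the UVs along the view direction for "deep"
    // texels. Lie to it about the camera position and this smears.
    vec2 shiftedUv = vUv + v.xy * (height * scale);
    gl_FragColor = texture2D(colorMap, shiftedUv);
  }
`;
```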
That's pretty much it. I just want to leave you with a couple of things to look at, in case you liked this and want to try it yourself. There's aerotwist.com, which is Paul Lewis's site; it's not only beautiful, but the articles are really well written, simple, and straight to the point. It's mostly three.js, which is also what I use; I like three.js because it just takes care of the boilerplate. There's Inigo Quilez's site, which is a treasure trove of demoscene techniques and other interesting mathematical things. Mr.doob, the author of three.js, has a site full of demos and interesting things you can look at. And AlteredQualia is another person whose implementation, and part of the cyberdemon model, I used for this presentation. Thank you very much. [Applause]
Info
Channel: JSConf
Views: 52,352
Id: GNO_CYUjMK8
Length: 30min 55sec (1855 seconds)
Published: Mon Aug 05 2013