Quick Understanding of Homogeneous Coordinates for Computer Graphics

Video Statistics and Information

Captions

The usual way of representing objects in a 2D space involves using two coordinates, X and Y, to indicate their positions along the X and Y axes. But you may have encountered the same object represented with a third coordinate, or a fourth coordinate if you work in 3D, all without changing anything about the object. Okay, but why bother with the extra component then? This is what we call the homogeneous coordinate system, and it's probably more useful than you may think at first. There's a beautiful reason why it's used in computer graphics and why graphics libraries like Vulkan and OpenGL force you to use it.

But before we jump into it, let's revisit the two-coordinate system in 2D to understand its limitations. With magic from linear algebra, there are multiple ways to transform your 2D objects. With 2x2 matrix multiplication, you can scale an object along the X and Y axes, shear it, reflect it, and rotate it. All of these can be combined into a single matrix by multiplying the transformations in right-to-left order, meaning that the first transformation sits at the right of the product and the last transformation at the left. And all of them can be reverted by computing the inverse matrix.

But anyway, here's the catch: how do you manage to move the object away from its origin with such a matrix? The answer: you can't. The solution: you can add a translation vector to the mix, thereby moving your object in a specified direction along the X and Y axes. Quite simple, right?

A more convenient approach, however, involves adding a coordinate set to one to the vector and incorporating a new column and row into the matrix. Now, by placing the translation values in the upper right of the matrix, we achieve the exact same outcome. See for yourself: on the first row of the result, the Y component gets cancelled, leaving only X and the translation value dx. For the second row, the X component gets cancelled, leaving only Y and the translation value dy. The last row keeps the initial homogeneous value provided by the vector, which is one.

Okay, but how is that more convenient? Well, now you can put all your transformations in a single matrix again. The previous transformations we've seen can be put in the upper left of the matrix when you want to apply the translation at the end. For example, by writing the rotation values there, we rotate first, then we translate. You could also just multiply your new translation matrix somewhere in the mix, on the condition that they are all homogeneous 3x3 matrices. This is typically not doable with a vector sum. And don't forget that the order matters, as rotating then translating isn't the same as translating then rotating.

Now you can apply your new homogeneous matrix to each point of your object, thus reducing the number of steps by half, as you don't need to add a vector anymore. At the scale of thousands of points, it makes a substantial difference for performance or debugging. This is specifically used when you have an object defined in local coordinates and you want to represent it in absolute coordinates or in another local coordinate system, which, besides the rotations, is done by translating the whole coordinate system.

Another advantage of this newfound power is that you can disable the ability to translate a vector by setting its last coordinate to zero; thus the object gets locked in place. Let me prove it with the same translation matrix as before. We're going to calculate the result of the product: on the first row, both Y and dx get nullified, leaving only X. On the second row, it is X and dy that get cancelled, leaving only Y. And for the last row, everything gets nullified, which overall gives us back a copy of our input vector. We generally reserve this treatment for vectors that represent a direction, because translating them would change their orientation and their magnitude, which is not the intended behavior. You typically see this kind of vector used for light directions or normal vectors.

But then what happens when the homogeneous coordinate is neither one nor zero, but an arbitrary real number? Let's call it w. In such cases, the rule is to divide every component of the vector by the homogeneous coordinate to retrieve the original vector, bringing the homogeneous value back to one while rescaling the other components. This offers a notable benefit. Observe what happens when we modify the bottom values of the matrix as shown here: just by swapping the two values at the bottom right, X and the homogeneous value get cancelled, but Y escapes its terrible fate. Hence the last coordinate adopts the value of Y, meaning that every component will eventually get divided by Y, and we are still achieving that with just a single matrix.

This is precisely how we achieve perspective projection. In perspective, you need to downsize objects in proportion to their distance from the viewer. For that, we simply divide the X, Y, and Z components by their distance from the camera plane, which is often represented by the Z coordinate, or negative Z if the camera is looking toward, well, negative Z. This division can be geometrically illustrated as a 3D scene where you project the field of view, which is a pyramidal frustum, into a rectangular prism. As you can see, this operation shrinks the objects at the far end of the frustum, and then we neutralize the Z component to project onto the screen. Hence, if you look closely at the transformation matrix used for perspective projection and just ignore the top part, it includes the same bottom part we've seen before, which, multiplied by a 3D homogeneous vector, will result in Z being the homogeneous value and will eventually cause the division by Z.

Well, now you should understand why graphics APIs bother you with four-dimensional vectors. This unlocks three essential use cases: locked translation, midway translation, and self-division for perspective. It basically gives us a trick to play with coordinate systems, all with the convenience of using only one matrix, which is why we call it homogeneous.
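The three use cases named at the end of the captions can be tried out with a rough sketch in plain Python for the 2D case (3x3 matrices). The helper names (`matmul`, `apply`, `translation`, `rotation`, `dehomogenize`) and the sample numbers are assumptions made up for this illustration; they are not taken from the video or from any graphics library.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Multiply matrix m by the column vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def translation(dx, dy):
    """Homogeneous 3x3 translation: dx and dy sit in the last column."""
    return [[1, 0, dx],
            [0, 1, dy],
            [0, 0, 1]]

def rotation(theta):
    """Homogeneous 3x3 rotation: the 2x2 rotation sits in the upper left."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0],
            [s,  c, 0],
            [0,  0, 1]]

def dehomogenize(v):
    """Divide every component by the last one (w must be nonzero)."""
    w = v[-1]
    return [x / w for x in v]

# 1. Midway translation: rotation and translation combined in one matrix.
#    Right-to-left order: the rightmost factor is applied first.
point = [1, 0, 1]  # w = 1, so this is a position
rotate_then_translate = matmul(translation(3, 4), rotation(math.pi / 2))
translate_then_rotate = matmul(rotation(math.pi / 2), translation(3, 4))
moved_a = apply(rotate_then_translate, point)  # approximately (3, 5, 1)
moved_b = apply(translate_then_rotate, point)  # approximately (-4, 4, 1)

# 2. Locked translation: w = 0 marks a direction, which ignores dx and dy.
direction = [1, 0, 0]
unchanged = apply(translation(3, 4), direction)  # [1, 0, 0]

# 3. Self-division: with the two bottom-right values swapped, the last
#    coordinate of the result takes the value of y, so dividing by it
#    divides every component by y.
swap_bottom = [[1, 0, 0],
               [0, 1, 0],
               [0, 1, 0]]
projected = dehomogenize(apply(swap_bottom, [6, 2, 1]))  # [3.0, 1.0, 1.0]
```

The two results in step 1 differ, which matches the remark that rotating then translating is not the same as translating then rotating. The same pattern extends to 3D with 4x4 matrices, where the bottom row picks z instead of y.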

Info

Channel: Miolith

Views: 22,905

Keywords: Graphics Programming, Computer Graphics, Homogeneous Coordinates, 3D Object Representation, Linear Algebra in Graphics, Vulkan, OpenGL, Matrix Transformation, Perspective Projection, 4D Vectors, Coordinate Systems, Computer Graphics Libraries, Transformation Matrices, Performance Optimization, Debugging in Graphics, API Programming, Vector Manipulation, Perspective Rendering, Graphics Pipeline

Id: o-xwmTODTUI

Length: 6min 53sec (413 seconds)

Published: Sun Dec 03 2023