REVEALED: Quake III's SECRET Algorithm!

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments

Great bit shifting math wizardry

👍︎︎ 2 👤︎︎ u/BubbleBolha 📅︎︎ Aug 12 2021 🗫︎ replies
Captions
in the year 2005 something happened that shook the world of computer software it happened when id released a source code to the 1999 game quake 3 arena almost immediately programmers discovered a tiny algorithm buried within it to most programmers the algorithm didn't make any sense and it never would it was simply akin to magic others couldn't accept magic in their midst and a few would become obsessed with and consume by trying to understand the algorithm soon it took on a life of its own with almost a cult following as junior historians undertook to find out who invented this piece of computer science wizardry and not only how but why it worked at all somehow it was several times faster than its fastest known contemporaries but it didn't look like it did anything useful at all that algorithm is now known as the fast inverse square root and when i first saw it there was only one person i could think of that could have been responsible someone i worked with back at microsoft in the 1990s somebody insanely brilliant the kind of mind that regular geniuses look at with envy and wonder but if there's anything that bill gates taught me it was never to name drop so when i tell you that i work down the hallway from michael abrash it's not merely to impress you it's to help explain why when i first saw the brain crippling fast inverse square root algorithm i assumed it could only have come from his intellect abrash could well be the world's foremost expert on optimizing assembly language having written such books as the zen of assembly language and the zen of code optimization and the zen of graphics programming combine those three topics and you've got an obvious candidate for somebody who could have been qualified to have been the source the thing is however i had no proof i knew that michael left microsoft to join it for the work on quake just before this algorithm would have apparently been created but it was still a pure guess the algorithm might have come from id founder john carmack one of their other programmers or some old obscure soviet research paper for all i knew the more i dug into it though the more i needed to know because once i saw the algorithm well you'll know when you see it too hey i'm dave welcome to my shop well that's about as long as i can remain that serious especially on my birthday oh if you forgot to bring me a gift just drop me a sub or like on that intro and i'll be more than happy with that before we get too much further into it i want to reassure you that i'm not a math head i'll only resort to the math when it's essential and even then i'll make absolutely no attempt to impress anyone with my math chops if anything i will explain things using the same tiny little baby steps that my own brain requires when it comes to any kind of non-discrete mathematics check the video description for bonus links to the mathematics done with the appropriate amount of rigor with that out of the way back to the algorithm the first thing i did of course was to email mike and ask him what he knew of its origins his answer was simple he couldn't offer any insight because he'd never worked with it so suffice to say it wasn't his i wasn't the first to follow this trail however and it would turn out to lead straight past mike abraha and on through john carmack back to sgi indigo and beyond but what makes it so special what does it even do just from its name we can assume that its purpose is to calculate fast inverse square roots but precisely what is a fast inverse square root then who needs one and why do we care so much in the world of computer graphics vectors reign supreme at its most familiar a vector is just a 3d point in space with an x a y and a z value but we're not best served by thinking of vectors as merely pointing to locations in space recall that a vector is a quantity having both direction and magnitude oh yeah vectors can also be used to define other quantities such as planes a plane can be defined by a point in space plus the normal vector which is simply two vectors and is known as point normal form in other words vectors are everywhere in computer graphics when you hear how many polygons some incredible gpu can transform through its pipeline every second under the covers it's really operating on the vectors that define those polygons and the faster you can operate on the vectors the faster your graphics the higher your frame rates and the better your realism if you're a game publisher it translates directly to success in dollars and cents and it all comes down to fast vector math one curious fact about vector math is that it's usually much easier to perform when the vectors are in what is known as normalized form normalized means that a vector's magnitude value is set to 1 which can be done by dividing all the other components by the overall magnitude of the vector when you're done the vector will be exactly one unit long and point exactly to some location on the sphere of the unit sphere now let's imagine we have a vector of zero nine zero the magnitude of that vector is nine we divide each x y and z component by nine and we get zero one zero which is the length one vector that lands right on the unit sphere that's precisely what we want than what we expected if you're worried about what happens to the original magnitude it can often be discarded if only the direction is needed otherwise you can add another dimension to the vector which we will call w we'll start it at 1 and we'll scale it up as we scale down the other vector components that way if we ever need to get back to original form it's preserved in the w nothing is lost and we still get the convenience of normalized form but let's try a slightly less contrived case like 2 3 5. the magnitude of that vector is 6.16 dividing each component by the magnitude yields a vector of 0.32 0.49 0.81 which is also on the unit sphere these calculations were easy because i knew the magnitude but how did i obtain it let's look at the magnitude on a 2d vector first since we all saw this one back in elementary school the length is equal to the square root of x squared plus y squared our 2d vector has an x and a y or a run and a rise to get the length we simply use a pythagorean theorem and then to apply it in more dimensions we only need to include additional terms that means that the magnitude of a 3d vector is the length is equal to the square root of x squared plus y squared plus z squared the computer programmer would never actually square a term or raise it to the power of two they would simply multiply the number by itself which has the same effect but with multiplication being much easier than exponents it's just faster to do thus we wind up with length is equal to the square root of x times x plus y times y plus z times z going back to our simple two three five vector we apply the math the length is equal to the square root of two times two plus three times three plus five times five which is the square root of 38 which is that's six point one six four four one four zero zero two nine number and so on dividing each individual term by the overall magnitude we see that x over six point one six equals zero point three two y over six point one six equals zero point four nine and z over six point 6.16 equals 0.81 now we can easily check our work by plugging those numbers back into the pythagorean theorem where we should expect them to yield a length of one and indeed other than a small rounding error because i'm only working these out to two digits it works h or the length again equals the square root of x times x plus y times y plus z times z which is the square root of 0.32 squared and 0.49 squared sum with 0.81 squared that works out to the 9986 from basic algebra we can recall that dividing by the square root is the same as multiplying by the reciprocal of the square root or one over the square root thus the truly critical operation we're trying to perform here is the inverse square root if we grant for a moment this operation is critical to computer graphics performing it as quickly as possible is now our goal you're gonna think i'm just holding out on you if i don't show you this algorithm at some point so i'll do it but it's gonna look like voodoo nonsense we'll come back and revisit it once we've laid some more groundwork but for now here it is all right real quick three halves is just the constant for three over two and x two is the incoming number divided by two y is the original number next we take the address of the floating point number and then recast it as a long pointer and then take its value as a long which is actually an undefined operation in c we then take half of that result and subtract it from the constant 0x5f3759df we finally smash that integer result back into the floating point memory for some reason before applying what is apparently a newtonian iteration to remove any remaining error there's a second iteration pass that's commented out this thing didn't really do any work it just smashed some bits around in floating point form and then subtracted stuff from a constant so it doesn't make a width of sense at first glass but does it work if you type it in and compile it you bet is it fast absolutely and rest assured we'll test at least four variations of the inverse square root to find out just how fast it is but how accurate can it really be well it's within one percent for everything i tried if you uncomment the second iteration of the tony and refinement it's barely any slower and it's far more accurate coming in as precise as about one part in ten thousand before the quake version has any hope of making sense though we need to understand two things how to find a square root and how floating point numbers work on modern computers both are important to understanding how it works let's start with a trivial implementation that we have a better hope of understanding when i started coding the 8-bit processors lacked any form of division instruction or multiplication at the cpu level so square root would be a subroutine supplied by your compiler vendor or a subroutine you wrote and called in assembly language today most all chips provide a useful division instruction which means this is directly translated into a square root instruction using a compiler intrinsic thus our code can be as simple as an inverse root returning 1 over the square root of n it's just that easy if we're allowed to let the compiler and the floating point unit on the processor do the heavy lifting for us and while today the fpus are effective enough that this is a reasonable approach this was far less true back on the hardware of the 1990s and even today if you want to parallelize work on a gpu that doesn't natively support square roots or inverse square roots you absolutely want the fastest algorithm you can find to perform those discrete steps the quake version also accepts a floating point number but if we could live with an integer perhaps we could find a simple newtonian approximation of our own as follows now there are two points in the video where i'm going to give you some math and some commentary without going all the way back to first principles and this is one of them understanding newton's method for finding roots requires that you are comfortable with logarithms and understand some calculus but i don't want to lose a bunch of people here so keeping in mind that good compromise means everybody's a little unhappy here's my layman's version of how it all works newton's method is a trial and error approach for finding the solution to a function or its root essentially you start with a guess you want to then continually refine that guess and home in on the right answer so you iterate on it by making the next step equal to the previous step less the function's rate of change at that point multiplied by the current value wherever you're at why do we need to know the rate of change ah because it helps us to optimize where to make our next guess consider that a truly lame approach would be to just start too high and then subtract some amount each time try again and repeat until you're close enough that would be hugely inefficient however but we really want to know is how much lower can we make the next guess and still be safe why use the rate of change it's because newton's method employs the function's derivative a concept from calculus that amounts to finding a helper function that describes the function's original slope or the rate of change at any point along its curve we know that we've picked a point at x0 that is too high but how much too high is it well we can extrapolate where our x would cross at zero if we started here and applied the instantaneous rate of change all the way back down to zero using the current rate of change will mean that our line would actually be tangent to the real function we don't know where the real answer is but we know that it's at least safe to be this far away and so that's where we're safe to make our next guess x1 turns out to be too high as well so we extrapolate back to the point x2 and now we try there we simply keep repeating this process until we've reached whatever level of accuracy we require or until we drop below the target as you can see each iteration takes us incrementally closer it just takes some time but you can solve pretty much anything to any level of accuracy that you might need now back in high school i was fairly bad at math and yet pretty good at programming so i wrote a quadratic function solver for my trusty old sharp el5103 a calculator which i still have and love good old mr gazine was no fool however and promptly noticed that while i could solve pretty much anything i never showed my work i spent a lot of time entering iterations into my calculator i did rock the few multiple choice tests we had though one problem is that once you get up to polynomials of the fifth degree or higher it's been proven that no such formula can even exist but newton's method will still solve it by brute force now dave's easy cheaty way of finding the derivative of a simple function is to reduce the exponent by one and multiply by the old exponent thus if our function where f of x equals x squared the derivative f prime of x is 2x if you're not sure why just humor me on the calculus but at least you won't be able to spot all my mistakes a square root can be written as a function of x being x to the raised to the power of one half the derivative is then one over two times x to the power of one half or put another way it will be half of one over the old value to apply newton's method then at each step we need to move by that amount multiplied by the current position that's how much we will add or subtract depending on which direction we're moving in for our simple square root algorithm you must start somewhere so we'll make our initial guess being half of the input number our next guess at any stage will be the current guess plus or minus one over the old value multiplied by the input number all then divided by two that's our maximum safe rate of change at any point as long as the next guest remains too high we repeat rather than keeping a series of guesses like x 0 x 1 x 2 and x 3 x so on it uses a loop and it just alternates between x 0 and x 1. now in the end i don't know if i've made it any easier to understand newton's method but if nothing else you should be able to at least think of it in terms of an iterative approach at finding a solution to a problem that progressively improves its accuracy as you make each subsequent attempt that much will get you by now we leave the comfort of our integers and explore the world of floating point when bill gates and paul allen were confronted with the task of adding floating point math support to their fledgling altair basic interpreter they did what any reasonable young pair of go-getters equipped with an assembler and a dream would do they hired an expert for cash and that guy was monty davidoff college dorm made of bills who studied applied mathematics at harvard i guess one night in the harvard dining hall bill and paul were complaining that they weren't looking forward to writing all this floating point code and that's when monty chimed in to proclaim that he'd done it before he convinced them that he was more than capable of doing it again so they agreed to let him have the task at the time there was no standard for floating point numbers and so davidoff had to set one it became known as microsoft binary format and it's still in use today under the specification known as ieee 754. have you had that feeling yet the one that says oh no there's no way he can't wrap this up in time it must be a two-parter and indeed it is because in the conclusion we'll look at floating point format so that we can appreciate how the algorithm smashes all the rules stuffs bits where bits don't belong and generally abuses the compiler in ways that i'd never seen before all that's coming up in the next episode so please make sure you subscribe to the channel so that you don't miss it if you found this episode interesting or entertaining please do let me know by giving it a thumbs up and if you didn't by all means absolutely give it the old double thumbs down in the meantime in between time i hope to see you next time right here in dave's garage and a few would become obsessed with obsessed with this is the right tone yes now it is yes i enjoy this much better back in the year 2005. sorry i'm recording what's up in the second variation it loops thanks to c plus plus inheritance it's a little more incr new if that's the case you can simply pass ba the user or caller of the class need not know anything about this it just happens automatically because we've overloaded uh overwrote it we overwrote it we overwrote it with my range of roven rigid rover here you can see the audiobuffer class defines its own new and delete operators that call ps alec and ps3 ps3 freedom freedom freedom from audio lifers
Info
Channel: Dave's Garage
Views: 365,337
Rating: undefined out of 5
Keywords: fast inverse square root, quake 3 arena, quake iii, algorithm, quake 3, story, john carmack, mike abrash, c++, programming, id software, gaming, quake, history, math, mathematics, fun, Michael abrash
Id: uCv5VRf8op0
Channel Id: undefined
Length: 17min 10sec (1030 seconds)
Published: Wed Aug 11 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.