Spatial Hash Grids & Tales from Game Development

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments

Wow! You rock! Another awesome tutorial with unique content.

👍︎︎ 1 👤︎︎ u/krazybubbler 📅︎︎ Nov 30 2020 🗫︎ replies

the narration style and pacing are perfect. well done! it seems like a very hard balance to strike, and hard to find the right level of abstraction to avoid getting into the weeds, but i think you nailed it

👍︎︎ 1 👤︎︎ u/the_cat_kittles 📅︎︎ Dec 01 2020 🗫︎ replies

Apologies if I missed it somewhere in the video. How is the size of each cell determined? i.e. if the map is 100 x 100, and you want to split it into cells the size of 10, I didn't quite grasp what was determining the size of each cell.

👍︎︎ 1 👤︎︎ u/newguy2445 📅︎︎ Dec 01 2020 🗫︎ replies

Great content, man. I got a good laugh out of the game dev story. Sometimes the simplest solution is just don't.

👍︎︎ 1 👤︎︎ u/invisibo 📅︎︎ Dec 01 2020 🗫︎ replies
Captions
we're going to talk about spatial hash grids we'll go over the problem we're trying to solve we'll talk a bit about the alternatives why you may or may not want to go with them i'll tell a little story from my game development career that's kind of relevant to this and then we'll code it up so let's start by talking about the problem it's not uncommon for games to have a ton of units in the world especially games like starcraft that kind of thing those real-time strategy games and from a programmer's perspective you often need to figure out what's happening nearby in the vicinity of a single unit often many times and often you'll need to repeat that process again and again for every single unit for example this guy fires into an area an explosion goes off something happens we need to figure out who's dead or wounded and obviously the faster the better so how do we solve this well there's a whole bunch of ways the most obvious is the naive solution which is let's say we have a bunch of objects in the scene so let's say this array here is all of the entities in the world now you want to find the ones that are nearby so we'll define this function get nearby and pass as parameters the position and the radius you're interested in the naive solution is to simply iterate the list so first we declare the results array and then we start iterating the whole list and each object we compare the distance from the object and the position and check that it's less than the sum of the radius of the object and the radius we're searching if you were to draw a picture say you've got this object here with this radius so we just draw a circle around this thing now if we have this point and we want to see what's close to that point we calculate the distance to the object and compare it to the radius plus the search radius if the distance is greater than the sum of the radiuses it's too far away otherwise if it's less than the sum of the radiuses it's close it's nearby and so this solution sucks because it's slow as hell it's order n for a single call which means if every unit on the screen needs to go check what's nearby the whole thing ended up being order n squared so this solution is out the other thing that typically comes up is some sort of adaptive tree structure this is your quad trees your oak trees your kd trees your archeries your loose octrees your dynamic aabb trees your whatever random technical words tree and these solutions are great they can really accelerate things under the right conditions let's say you have a big world with clusters of objects and big vast sparse areas of nothing these kind of sparse data structures are great because they allow you to efficiently represent these large spaces they allow you to early out when a node has nothing allowing you to effectively cull and discard huge swallows of your world all at once but there's disadvantages too starting with the fact that they're not trivial to implement or maintain if you're a big company with resources to pour into just having a single engineer throw themselves at tweaking out and maintaining some hardcore data structure obviously go for it not sure why you're watching this channel or taking my advice but go for it but for the rest of us there's some advantages to going with something that isn't necessarily the absolute best but is simple to build understand and extend when needed so for my personal story this was back a couple years when i had just started in the industry i was working on a game that would go on to become prototype which while not being a massive blockbuster on the scale of other giants sold a couple million copies so that's respectable there's a chance someone watching this has played it anyway so what happened was we had this internal team called the advanced technology team or atg for short and the lead rendering engineer had spent a bunch of time implementing this fancy pants visibility system i don't recall exactly what it was but i do remember that it was some sort of adaptive aabb tree so fancy and complicated took a long time to get it polished production ready all that crap whatever and initial results seemed to be good so it was adopted by the prototype team at some point i was tasked with leading optimization efforts so i fired up the profiler and you can kind of guess where this is going so i load up the profiler one day and notice that the visibility system is taking up an enormous amount of frame time if i recall correctly we had about a 90 million cpu cycle per frame budget and this thing was taking about 70 to 80 million cycles meaning it was taking nearly the entire frame and remember how i said it's important for code to be easy to maintain and extend well nobody really understood this data structure there was no documentation of any kind because games and the original guy wrote it the lead engineer had left the company at this point so this was a complete mess like the code itself wasn't messy but it was complicated just sucking down cpu cycles pointers everywhere cached mrs galore it had it all so i started screwing around with it and what i found was if i took this tree structure and i just made everything stay in the root node and never split the whole thing ran considerably faster at least on the xbox 360. about 10 times faster to be exact i committed the change to perforce and reported the findings to the tech director and team and what ended up happening was that one of the senior engineers on the technology team ended up being tasked with building a new visibility system a simple one a glorified array basically all sim de-optimized with some extra handmade occluders that artists could build into the scene that's it that hand rolled and optimized version ended up being fast enough that the visibility system disappeared off the profiler so job done the moral of the story is that sometimes simple solutions implemented well are good enough i get that working on these other data structures is super sexy and fun but you know we're in the business of making games not visibility systems okay so let's talk spatial hash grids now spatial hash grids are conceptually really simple let's start with a simple scenario so let's say that our world is this rectangle so i'll just draw a quick rectangle on the screen and let's say you've got a whole bunch of crap scattered throughout your world so i've got trees and rocks and people scattered everywhere and the problem you're trying to solve is given say this character here you want to be able to query and find out who's nearby the way that you can do that is by dividing the world into fixed size cells and then for any given point in the world you basically hash it to one or more of these spatial cells so if i have this big tank for example it's bigger than a single cell so i'd insert it into all four of these cells that it's touching and then when you want to find out what's nearby it's a simple matter of checking all the cells in the area and compiling a list of everything you see so let's code up a simple one i'm not going to optimize this in any way because i want to make this super clear how it works i'll follow up with an optimized version so stay tuned for that the first thing we need to do is declare a class we'll call it spatial hashgrid and we'll start with just a basic constructor with a few parameters bounds and dimensions i'll explain what these parameters will mean bounds will be the min and max of the area that the grid will be operating on so if your world goes from negative thousand negative thousand to a thousand thousand then you're going to feed in two arrays to bounds dimensions is the other parameter we need to initialize the spatial hash grid and that controls how many cells we'll have along each dimensional axis so if the world is 100 units wide and we have 5 cells then each cell will span 20 units the only other thing we need to do here is initialize the cell's property and we can do this a number of ways but since i want to just illustrate the basics here we'll do this the simplest way possible and we'll allocate a map to hold the cells now the public way that you'd use this spatial hash grid is kind of like well let's say that i have this code here so i'll create a new grid by going const grid equals new spatial hash grid and let's pretend that the bounds and dimensions is declared somewhere now in a game setting you're going to need a few operations first off you'll need a way of creating a new client for the grid in other words normally you want to write something like const client equals grid dot new client and the parameters with parameters containing a bunch of crap like position size that sort of thing now once you have a client you'll need a way of updating that client so it would look something like client.position equals newposition and then you'd want to notify the grid of the changes with a call like grid.updateclient the reason being if that unit or entity is moving around in the world you obviously need a way of notifying the grid that its position has changed so the grid can do any bookkeeping necessary now for the most obvious usage and that's finding nearby units and that needs to look something like const nearby equals grid dot find nearby and pass it in the parameters and that would return some sort of list or something of the nearby clients lastly there's one final cleanup operation we need to do and that's when we want to destroy the clients at some point there's going to be a need to remove a client from the grid so we'll need an operation like this where we can go grid.remove client and pass in the client and that will just remove the client from the grid so that it isn't being actively maintained anymore and won't show up in queries so we got to implement these four interfaces which should be enough for most cases starting with new client so we'll write out the function definition here with two parameters those being position and dimensions that being the initial position of the client and the width height of the client respectively and all this function is going to do is create a new client dict so i'll copy these parameters in so we'll copy both this position and dimensions into the new client dict and we'll add this indices property which we'll get used in a bit now new client has to actually insert this into the grid so let's write a call to a non-existent function this dot insert client and of course we'll go and fill that in now the insert client function i'll just declare this below and it just has the one parameter that being the client being inserted by the way this isn't intended to be a public function conceptually this function is super simple using the position of the client and the size of the client we need to loop over the cells it potentially touches and insert it into those cells so you can see that i've just used javascript destructuring syntax to make this a bit more readable by turning position and dimensions into x y wh next i need to figure out the min max range of the cells we're iterating over and so we need some sort of function that does the conversion from a position to a cell index in other words if you're looking at the grid and i ask you what cell index this point is in you need to provide an index so that i can look up that cell specifically so we're doing just that with again some functions that haven't been filled in yet but we're calling them get cell index and the parameters to this function are just the x y coordinates we're interested in for this function we actually need to calculate both the min and the max range of the cells so what we need to do is take the position and subtract half the width and height for one index and add half the width and height for the other and that'll give us a range we can iterate on so now that we have these two we'll just store them in the client just because we'll need them later then we need to write two for loops that iterate from the min to the max in both dimensions both in x and y once you have the loops we'll need to use the two coordinates or cell indices to construct a key for the cell map so we're going to again call a function that doesn't exist in this case it's this dot key and we'll pass in the coordinates and it'll pass back a string that we can use to look up in the cell map now we just need to add a quick check to make sure the cell map has been initialized since we didn't do that right in the constructor lazy initialization isn't a good way to do this and i do not like doing this in real production code but this is a tutorial so if it's not constructed we allocate a new set lastly now that you have the key and you know what cell is initialized so you insert it and we're done except for the fact that we have two unimplemented functions so let's go implement them starting with the key function this is going to be relatively simple there's two parameters which just represents the x y indices of the cell we're interested in and the goal here is just to generate a hashable key that can be used in the cell map to uniquely identify this cell and that's as easy as just concatenating these two indices together as a string with a period in between that's it should work fine so let's move on the other function that we needed was the get cell index function which given the position of coordinates will compute the x y index of the cell this position belongs to and although this looks pretty involved it's not really that involved the first calculation here where we calculate the x value all this is doing is taking the position subtracting the minimum bound and then dividing by the max minus the min or the total bounds of the area what that gives you is a value between 0 and 1 and then i just clamp it to the range 0 1 just in case the position was outside that range after that it's a simple matter of taking that 0 to 1 value multiplying by the number of cells minus one and then turning that into an integer and of course we do it for both the x and the y value giving us a tuple of x y indices for the cells so we're all set for new client now we should move on to the most obvious usage which is to find nearby things so we start by declaring the function and takes two parameters position and dimensions and these are pretty self-explanatory position is the location you want to query for nearby clients and dimensions is the width height around that point to search i.e how big of an area do you want to search now keep in mind grid search isn't perfect it may return things slightly further away but that's normally a non-issue so this function fundamentally what it needs to do is figure out the cell coordinates for the bounds of the query so the position plus and minus the bounds after that it'll need to loop between the min and max indices and basically just collect every client from every cell concatenating them all into one big list so that's what we're doing i did some quick destructuring at the start of the function just to make things a bit easier to read and then these two calls to get cell index get the min and max indices for the cells after that it's a simple matter of iterating along the range of cells in both the x and y axis pretty much exactly the same thing as we did in the insert function the major difference here being we need to do a small check here so once we have the cell key we need to check that there's actually a cell with that key because there may never have been a unit inserted into it and so it doesn't exist remember these are lazily created after that i declare this client set because the clients may show up more than once for example if you have a big unit that spans multiple cells it gets inserted into multiple cells but here in the find nearby function we only want to record seeing it once so we can use a set to trivially make sure that the list of clients is unique so if the cell exists we just iterate it take every element from that set and insert it into the result set after that you return it job done that's pretty much everything find nearby needs to do let's move on to update so we'll declare this update client function and it has one parameter the client that's being updated and you can implement update trivially as just two operations removing the object from the grid and then inserting it back into the grid that essentially accomplishes what an update function needs to do so here we called this delete client followed by this call to insert and that's it although delete client doesn't exist yet but it's literally the next thing we're going to implement so we're already done here okay last function the delete client function and this is conceptually super simple what you're going to do is given the client you get the indices that were stored earlier in the client dict then we just iterate the range of cells that this client was inserted to and called delete so that's it each cell contains a set which has a delete function so you don't need to go searching for the item yourself set takes care of it reasonably fast too the ecma standard doesn't specify the exact data structure a set needs to have only that it provides sub-linear access times and that should be it we're all done i've put together a little test bed where we can see the whole thing in action if we load it up you can see red spheres moving around i've put a few hundred spheres into a loosely bounded area and they're just kind of wandering around randomly i've given them some basic flocking behaviors check out that video if you want to understand more but anyway they're just wandering around and we're doing a query for all nearby units for the red sphere and any nearby units we're coloring green now notice that it's not absolutely perfect because of the way the grid works you're guaranteed to get everything that's at least the distance you specify but you may also get a few that are slightly further away depending on where you are relative to the cell we can also run some numbers so i cropped out a quick naive version of the spatial grid using a linear algorithm it just straight up searches every client and now we can compare numbers on performance the basic setup i have is i have a hundred thousand clients in a dense area and we're going to do a test using a thousand random queries on both the naive version and the spatial grid we just built so as a baseline doing a thousand queries takes about six and a half seconds a little bit more than that on the naive version and that shouldn't be a surprise considering it has to iterate every single client doing bounding volume tests that means a thousand queries times a hundred thousand clients or a hundred million operations let's compare that to the basic spatial hash grid we built and in this case we're getting way way better numbers a thousand queries only takes about 18 milliseconds roughly that's an absolutely fantastic improvement but it does come at a slight cost because just raw queries isn't the only thing you need to do units need to move around so we can also compare forcing all the clients to move and see how long that takes and in this case the naive version is obviously better since it didn't have to do anything all we're recording here is roughly the overhead of the loop and function calls and probably the timer but the spatial hash grid doesn't do too bad takes around 350 milliseconds or so to get it done as a bonus treat here's a much improved version there was some easy low hanging fruit in here performance wise so i went ahead and made a faster version here are the numbers from the improved one query times improved significantly taking less than half the time as before same with the updates both happen significantly faster in the newer version in the next tutorial we'll step through with the profiler and i'll show you step by step how i looked at the performance here what i looked for and how i improved on this anyway this was just me trying out a new format for these tutorials hope you enjoyed it and learned something new in the process code should all be up on github if i remember to upload it let me know what you thought in the comments below also like and subscribe as always until next time cheers
Info
Channel: SimonDev
Views: 25,276
Rating: 4.9942818 out of 5
Keywords: simondev, game development, programming tutorial, three js, spatial hashing, spatial hash grid, spatial hashing collision detection
Id: sx4IIQL0x7c
Channel Id: undefined
Length: 19min 7sec (1147 seconds)
Published: Mon Nov 30 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.