CppCon 2015: Scott Wardle “Memory and C++ debugging at Electronic Arts”

Captions
I'm going to talk to you about memory and C++ debugging at Electronic Arts, and I'm going to talk about it in a bunch of eras. Basically, I'll talk about what we used to do back when game programming was more like embedded programming, back in 2000 on PS2. Then I'll talk about how things were on 360, when we got virtual memory for the first time and EASTL came about. Finally I'll talk about things now, with PS4 and Xbox One, and I'll show you a sneak peek of our current state-of-the-art debugging tools. This is me. I've been in games forever. I like to try and solve problems visually, draw pictures and do animations and stuff; that's probably what got me into games. And I'm badly dyslexic, so if you see any spelling mistakes please let me know, because I won't have noticed them. On that note, let's start with vocabulary, one of my favorite topics. In memory debugging systems I've always noticed that there seem to be a lot of different vocabularies around, like heaps and arenas and allocators and pools, so this is what I'm going to use. I'm going to talk about allocators, and they will be a class that can allocate or free memory, usually a pure virtual interface. Then I'm going to talk about arenas, and arenas are like a set of address ranges that you can find an allocator from, so you can go back and forth: from an allocator you can find what address ranges it covers, or you can take any particular pointer in memory and find out what arena it's in, and therefore what allocator it came from. And "heap" I'll say quite a bit, and that's kind of the combination of the two. So, back in 2000, we had just started converting from C; most people had turned on their C++ compilers by this point. We didn't have any virtual memory, there was no OS, it was very similar to embedded programming: a very simple system, only 32 megs of RAM. So what I started seeing, about how to get
debugging information into the game, and how to debug all of the allocations, was techniques like this one, where people would add operator new and operator delete to all of their classes. This is actually a good idea for performance's sake; people use fixed-size pools and slab allocators where they bump pointers and things like that. But if you're trying to use this as a debug technique, it falls down, because if you don't override your new and delete somewhere, you won't really notice; you just miss having the extra tracking information. We still have code like this around in EA, and it's something I would suggest not doing, but I understand why library teams do it, because this is sometimes hard. On a very large project you want to use global new and global delete, but what interface do you use? What kind of debugging parameters do you use? In a very large company like EA it was actually kind of tricky to get the politics of this right. Eventually we settled on something like this, where we had some flags which said whether we were in the top of RAM or the bottom of RAM, and we had a debug name that would be associated with every single allocation. To understand how we used that debug name, you kind of have to understand what a heap looks like for us. Usually it's got a header, an allocated block, and a footer, which is pretty simple, and we would store this debug name in the footer, fresh for being corrupted, because that's usually the memory you would corrupt, so maybe not the best of places to put it. It would look like "category::allocator", so we would have things like "rendering::player" or "gameplay::physics", those kinds of things, and we would sort on that. This was pretty good tech; we thought it was good. It was simple back then, because we didn't really have very
many pools or anything, because there wasn't that much RAM, only tens of thousands of objects. I guess we did have a small block allocator, and this wasn't quite used everywhere, but pretty close. And we had to work really hard at fragmentation. We thought this was the height of technology at the time: you would load compressed blocks into the bottom of RAM, then decompress them and put them in the top of RAM. That's what we would use those flags for, doing things like this, and you'd ping-pong things back and forth to get rid of all of the fragmentation. This worked pretty well, as long as you only had one CPU. Once we went to Xbox 360 and PS3, of course, our world had to change. We had virtual memory now, and this was a real advantage. We didn't have a hard drive, though, so we couldn't really use it the same way everybody else could, and the GPUs couldn't support it, and that's where a lot of our memory actually went. So it wasn't perfect, but it was definitely an improvement. Having all those CPUs meant that we wanted to have more allocation systems and divide memory up a little bit more, so we didn't have contention on it, and fragmentation became a higher-level problem. We needed better tracking and logging systems, and we invented something we call the stomp allocator, which was a huge advance in how we debugged memory corruption. Finally, EASTL came into existence, and this really algorithmically changed the way we work, and that was really powerful. But how did we deal with having multiple allocators? For the most part it's pretty simple: you just pass an extra parameter to your new operator and it can kind of work, and that's what we did. Delete, however, is a pain, so we had to wrap it in a macro and a crazy amount of technology to pass in an extra parameter in order to call the appropriate destructor. We still use that code today, and occasionally we think, hmm, why don't we have it so we can pass an
extra parameter to delete? That might be nice. We also had to think about how we were going to divide up memory, how to decrease fragmentation if we can't ping-pong back and forth as much. What we did was divide it based on time, which is sort of the most important, and on size. By time I mean: static allocations, which allocate once and never get freed; global allocations, which have a lifetime longer than a level anyway, and so they pass information between levels, like your front-end system or something like that; and sub-level information, maybe a cutscene that happens during a level. The level pool could just disappear after a level was done, so you could figure out that your whole system was in a steady state. You'd also have various types of temporary allocations for things like, okay, bullets are flying around, they only last for a few frames. We'd also do things by size, where, just like your cupboards or something, you take all your big pots and pans and put them in one location, and you take all of your smaller ones and put them somewhere else. We had the small block allocator, like I was talking about, and that's actually a completely custom system, but we would even just use general allocators and put different-sized things in different directions, and this actually reduced fragmentation. One thing that sort of depended on the team was whether we did this or not, but we'd also break it up on the boundaries, the political boundaries if you will, of the team, and you would then have generic heaps and small block allocator heaps for each team. This is actually a performance disadvantage, really, and smaller teams wouldn't want it, but big teams did, because it was very easy to set blame for things. If you ran a smaller team, you'd probably do something like this and just tag the allocations based on
categories, and you can see that you can just count up the orange blocks and count up the red blocks, see how much memory was being taken up, and set budgets based on that, and that's a really good thing to do. But you're going to have things like fragmentation, and fragmentation is hard: who's at fault, who did it? And when you have things like memory corruption between teams, it's quite interesting. Your shadows are suddenly being corrupted; of course you're going to blame the rendering team, they wrote the shadow system, it must be their fault. But they're going to be pretty upset at you when you told them to fix their bugs and then they figure out that it was actually the people writing the sim code who went in and corrupted one of their buffers, and so that's really too bad. Along the same lines, we started to notice that it was important to keep our debugging information safe from the rest of our system. There was some really important information there, and we could use it to determine what was actually going on around the corruption. Memory effectively looked like this: you had the traditional header, allocated block, footer, a normal heap there, and you would have a separate heap, a debug heap, that you would track by address, and that address would be a hash key. It was kind of like a hash table, and it would use that hash in order to figure out how to free things, so it wasn't completely free-form what we were able to do there. But this concept of tracking all of our live allocations somewhere else, somewhere safe, meant that when you saw corruption in a rendering block or something like that, you could take it, dump it out after you crashed, look at the previous block and find out: oh, it's a gameplay block, and it's this size, and you'd know a bunch of properties about it, and then you could write code to test for bugs and things based on it, and that was really good.
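The side-table idea described above, keeping allocation metadata in a separate debug heap keyed by address, can be sketched in miniature. This is an illustration, not EA's actual implementation; the names (`TrackedAlloc`, `AllocRecord`, `FindRecord`) are invented, and a real version would live in protected memory and record call stacks and more attributes per block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <unordered_map>

// Hypothetical metadata kept in a separate "debug heap", keyed by address,
// so it survives even if the user's block itself gets stomped.
struct AllocRecord {
    std::size_t size;
    std::string debugName;  // e.g. "rendering::player"
};

static std::unordered_map<void*, AllocRecord> g_liveAllocs;

void* TrackedAlloc(std::size_t size, const char* debugName) {
    void* p = std::malloc(size);
    if (p) g_liveAllocs[p] = AllocRecord{size, debugName};
    return p;
}

void TrackedFree(void* p) {
    g_liveAllocs.erase(p);  // the address is the hash key, as in the talk
    std::free(p);
}

// Given any pointer (say, the block just before a corrupted one), look up
// what it was, how big it was, and which system allocated it.
const AllocRecord* FindRecord(void* p) {
    auto it = g_liveAllocs.find(p);
    return it == g_liveAllocs.end() ? nullptr : &it->second;
}
```

After a crash, walking the heap and querying this table for the blocks surrounding the corruption gives you the "oh, it's a gameplay block of this size" answer without trusting the (possibly corrupted) in-band headers.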
You'll notice there was something else going on here: there's this whole memory logging system as well. We would take, maybe not the whole game (we wouldn't be able to run the whole game this way), but sections of the game anyway, maybe one pool at a time or one team's worth of memory at a time, and stream out to disk, change by change, every single transaction we did with the memory system, so we'd know where every alloc and free was. We would do this so that we could write a tool like this one, where you can see from the start of time to the end of time, and you can see that memory is slowly going up here. This is our game going through boot flow as we go through a bunch of changes and load various files; you can sort of see what's going on. You could pick any particular point in time and see a snapshot of what was going on, down in the spreadsheet below. But the other thing you could do is select regions, and see the delta changes that happened there, besides just seeing a whole snapshot. And you would have the information you would expect to be able to debug the system: the heap it was from, the category it was in, allocation names, the count of items, and allocation sizes. You could also get a call stack as well, but that isn't in this diagram. You could also see a block diagram of it, and you could see the thing I was talking about before with categories: the systems guys are all allocated over there, and the presentation information is down in purple there. You can see that we have some fragmentation in this heap, because gray is free and there's more than one section of gray, and you could highlight blocks and find out attributes about them: what the size of this block was, where it was in memory, and what was going on around it, if you had memory corruption or something like that.
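The streamed transaction log can be sketched in miniature like this. It is an illustrative sketch only; the record layout and names here are invented, not EA's actual format, and a real stream would also carry call stack IDs, scopes, and the other metadata mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One record per memory transaction, streamed out to disk in time order.
struct MemEvent {
    bool isAlloc;          // true = alloc, false = free
    std::uint64_t address;
    std::size_t size;      // recorded on both alloc and free, for simplicity
    std::uint32_t frame;   // when it happened
};

// Replay the log up to (and including) a frame and report live bytes:
// the "snapshot at a point in time" view the tool shows.
std::size_t LiveBytesAt(const std::vector<MemEvent>& log, std::uint32_t frame) {
    std::size_t live = 0;
    for (const MemEvent& e : log) {
        if (e.frame > frame) break;   // the log is in time order
        if (e.isAlloc) live += e.size;
        else           live -= e.size;
    }
    return live;
}

// The "select a region" view is just the difference of two snapshots.
long long DeltaBytes(const std::vector<MemEvent>& log,
                     std::uint32_t from, std::uint32_t to) {
    return static_cast<long long>(LiveBytesAt(log, to)) -
           static_cast<long long>(LiveBytesAt(log, from));
}
```

The key property is that the log is complete: any snapshot, delta, or leak report is a pure function of the stream, so the tool can be built offline, after the fact.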
This was very useful for reasoning about what was going on. The other thing that we added was something called the stomp allocator. Of course, we're working on consoles, so we can't use the same tools everybody else would, like Valgrind and things like that, so we ended up inventing some of our own. I think this one is similar to Electric Fence: we would use the virtual memory system to allocate a read-write page and then a read-only page right afterwards, and we would stuff our allocation right at the bottom, and if you had an array inside there and you walked off the end, you would immediately crash and stop, and you would get a call stack. You would know exactly where your bug was, and we could find tons of memory corruption with this tool. Now, because we over-allocate using the virtual memory system, we couldn't use it everywhere; we had no memory left. When you're working on an Xbox 360, especially near the end of its life cycle, we were scraping back memory by reducing code space and all sorts of craziness. So how we used these tools is: we would have all the attributes of allocations, and we would only apply them to a small set at a time. As we saw memory corruption happening in particular regions, we could turn this on for, say, all gameplay allocations between 256 bytes and maybe 300 bytes, a very small range of memory, and we could test it. One other thing I learned around this time was the pains of ref-counted pointers. I first saw them and thought, oh great, I should use these everywhere, and I started to, and as quickly as I started to use them everywhere I ran into this problem, and oh boy, I started ripping them out as fast as I could and never wanted to use them again. But I've learned my lesson now; really, they are very useful, especially in multi-threaded situations, things like that. I think I would start with unique pointers or bare pointers, though, because I've got these great
debugging systems to track all these allocations, so fighting memory leaks really isn't that hard for me. What's hard is tracking down circular dependencies, because I'm not really set up for it. I might be able to solve this problem; I think it's called garbage collection, and I'm pretty sure I would start from there, because I think they've solved it quite well. I could implement a logging system too, I mean, I could log every single AddRef and DecRef in my entire system, but that's going to be a lot of memory, way more than the number of blocks of memory I have, because I have so many more pointers than blocks, and however many times I use those pointers is how many times I might end up add-reffing and dec-reffing. The other thing that happened at this time was STL. There are a lot of reasons why we decided not to use the standard library and ended up inventing our own STL, but basically it came down to speed for our particular case. This is on Visual Studio 2015, but it's kind of been the same for quite a while; I tried it on 2012 as well and the results are similar. This is a bunch of tests, a benchmark, so I don't know, maybe the benchmark isn't that good, but out of the 188 tests that I had, in 71 cases we are 30% faster. There are 10 where we were slower, and we know what they are; they're not cases that we normally use, or we wanted to go for memory or something instead. The debug case is really interesting, actually: we're way faster. We're way faster because we've cut and pasted more code; the inliner doesn't really work when you have functions that call functions that call functions, so we just kind of flattened the whole thing. Hopefully we got it right and didn't write that many bugs, because I'm guessing that's why the standard STL implementations don't do that. One thing that is really cool is we're going to open-source EASTL, like, for real. Okay, so Rob over here is
going to take pull requests and things once we get all set up; keep an eye on this place here on GitHub and you'll see it soon. We'll talk about the details with the SG14 group. I don't know, we just barely decided all this stuff, so hopefully I'm not causing any trouble with this. But even though we invented our own version of STL, we didn't all think of the same thing, the same vision, I guess. Allocators, for example: do you want them to be polymorphic or not? We couldn't decide on that. Ours had state and things in them, so we had something more towards the allocators in C++11, and we could use them sort of like this, where we would be able to override and use our ICoreAllocator, which is our allocation interface, and we could use them everywhere. This worked okay, it was not bad, but you had this problem: you have a lot of default parameters, and if you keep the interfaces exactly the same as STL, which is what we wanted to do, you had to pass in all of this stuff that you didn't want to pass in, so it was a pain to use. This is okay, but it's not great. Another way to solve the problem was this: you could have your vector, and you could call get_allocator and then set_allocator on it and change it each time. The way EASTL works is you're guaranteed that it will not allocate memory right away; it will only allocate memory once you push something into it, so that worked okay. Our team is a pretty big team, and sometimes we think we can get away with just changing the interface, and we shouldn't; at least we try not to do these bad ones. So this was me: our system allows us to look up allocators by name, and I decided that was a good idea, and that was not a good idea. Fortunately nobody took it, and it should die one of these days, but there's a lot of code that looks like that. The other thing that happens with allocators and things is this kind of problem; I'm guessing many people would know the problem
here. Good, people know what the problem here is: this doesn't compile. They are different types, right? They're both vectors of int, but because their allocators are different, the compiler doesn't know what to do, and so it explodes. So we invented another thing. This is only internal, and I don't know if we're going to make this public, but I think we should, because it's not a bad idea. We took and wrapped all of STL. I wouldn't suggest that everybody does this, but if you've got a large enough system, maybe this is an idea. We changed the order of parameters for ourselves so that we could pass in the information, and we made sure that the allocator was always first, because I was telling all the engineers: you have to pass in an allocator. Of course they would tell me that if they have to pass an allocator everywhere, it's really hard to write their map of list of strings, and then I would say, please do not write a map of list of strings; at least use a hash table of vectors of strings. But that sort of thing worked well enough; it was sort of good, and if I gave it to a junior engineer, they could figure it out. We continued it a bit more, so that for a large system, if you were writing like 70 files or something, things on that scale, it starts making sense to make your own string type for that large system. So in our career mode case, we would make a new string type for it, and we would know that we just wanted to use the same allocator each time, and it was fine; it was close enough. And because of the way we did the inheritance here, you can see it's inheriting from an EASTL vector in this case, although it probably should be string, I guess, we could do the inheritance so this would work out: the allocators are the same, the type is the same, and I don't get a compiler error anymore, and so now I can transfer the types between them.
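The per-system string idea can be sketched like this. It's a toy stand-in, using plain std::string as the common base where EA's real code wraps EASTL and pins the per-system allocator; the type names are invented, and deriving from a standard container is normally discouraged, it's done here only to mirror the pattern from the slide.

```cpp
#include <cassert>
#include <string>

// Each big system gets its own string type derived from a common base.
// Because the base (and, in EA's real version, the allocator) matches,
// values copy cleanly across the type boundary: the "transfer between
// systems" described in the talk, with ownership made explicit by type.
struct LocalizationString : std::string {   // produced by the localization "factory"
    using std::string::string;
};

struct CareerModeString : std::string {     // owned by career mode for its lifetime
    using std::string::string;
    // Converting constructor: copies the character data across systems.
    CareerModeString(const LocalizationString& s) : std::string(s) {}
};
```

The type names now document ownership: localization produces, career mode owns, and when career mode ends you can check that every CareerModeString allocation was freed.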
This copies the data from the localization string into the career mode string, and that's what we wanted to achieve, because localization is supposed to be sort of like a factory, so it's supposed to be producing strings for the other system, and career mode should own them, because it has a duration: career mode starts and it ends, and so I can tell if all the strings got freed at some point, and I can write systems that prove that's the case. So this fixes my ownership issue, which is quite useful. So now we're off to today. Today we obviously have even more memory; we have 64-bit addresses, so that's really good. We have hard drives on all of the consoles, so we're allowed to use them now. The GPUs are not quite virtual memory, but they're getting close, and we're even allowed to not have linear textures for them, and so that's really good. So that means our debugging systems should change again; with EASTL tracking we took another crack at it; and finally I'm going to talk about our new debugging tools that we've been developing in-house. So one thing that started happening around this time is that people started throwing out the debug name; it started not necessarily being the way we wanted to solve problems. We still passed an ICoreAllocator pointer everywhere, so we still had our polymorphic allocator everywhere, but we wanted to do things like this, where we wanted to associate a lot more data with our allocations, not just a debug name. Before, if we loaded a file, we might make the debug name equal to the file name, but we couldn't tell that it was a resource or something like that, that it came from this resource. With scopes we could, so scopes are maybe the way we should go. We have to use more thread-local storage and things, so there are some performance-related problems there, but we started using scopes
quite a bit, and most of the debug information that we're trying to push in now is going this way. We still have the old interfaces, of course, nothing ever dies, but it's working pretty well. So our code kind of looks like this: we knew we passed in an allocator, we knew a team. EASTL is sort of still a problem, it's not quite fitting, so we invented this EASTLICA. I guess I didn't explain that before: EASTLICA is supposed to be pronounced like Metallica, and it's EASTL plus ICoreAllocator. So here is what we started to do with EASTLICA, because we wanted to solve it in a cleaner way. You can have an arena, like a gameplay arena or a level arena, it doesn't really matter which, and you could allocate your team, your home team or whatever, and maybe it has a team ID and a vector of players. Of course, everyone knows what a vector is going to have in it: it's going to have an allocator, which might be zero bytes, and first, last, and end pointers. But what you could do is check what arena that allocation came from, and you can do this really fast. Probably everybody can do this really fast these days, because if you've got delete working and it's going really fast, you must have global delete, you must have some way to implement global delete, look up your pointer, and figure out how you should free it. And because every team could implement this, it means we could use that information as a parameter to decide where the players should go. It didn't have to be exactly the same pool; we could do slightly different things, we could decide that players should go in a small block allocator, for example. But if you think about this: who owns this information, what team owns it? Well, probably the same people that created it. What lifetime is it going to have? Well, the team created it; players probably live as long as the team. Maybe, you know, that's
probably making a pretty good guess; at least it's better than what we were doing before. What would we do before? We would just allocate these things globally and be done with it, I guess, and that's really wrong. There are lots of interesting little problems with this. It's not free; you have to spend some time to look up what arena this thing is in. Is that a great thing to do? I don't know. Move operators, obviously: if you want to change ownership between systems, it's not going to work; it will move the vector, but it won't move the underlying allocations of all of the players. But it works 80% of the time, and maybe that's good enough. You made it, you own it: not a bad rule. For other cases, maybe you could use an EASTLICA-like pattern; that's a good idea for a factory like localization or something like that, where you don't want to keep ownership of everything you created. So finally I'm going to talk about DeltaViewer, which is a tool we've been working on internally, to show how we've combined all of our debugging tools together and tried to be able to see them in one environment, and to show you why we need that. What we do is we debug sessions of data. By a session I mean: when you play a game, you keep playing it until you crash or quit, and crashing is actually fairly common when you're working on a game. All of this data you stream out to the engineer's machine, and there's a server that runs on each engineer's machine to collect up all of this data, or on the QA machine if it's being run by QA. All this data is stored in tables, and these tables are then in views, so it's basically a database on everybody's machine. That's actually really useful, because we can build these views around it and have all of these views work with each other. Some of our popular views are sort of like a printf TTY channel kind of thing; we have I/O profiling, so the load profiling,
frame rates, jobs profiling, threads, those kinds of information. We also have a memory investigator tool, which helps us see changes in memory over time, and we have another sort of categorization system, which groups allocations together and allows us to see things at a high scale. So with TTY, even just TTY systems, you get kind of an interesting thing going on here: you can see channels, and in this case I'm looking at only one channel, LTC, which is the load time channel, so I can see when load time started and ended in this game, and I can see when level two happened. I know the data very well, and that's why I'm able to highlight it. So I've got a primitive little profiler right here, and that's kind of cool; that's a neat idea all on its own. It's nice to get the printfs off of the machine, because printfs on some consoles are actually very, very expensive. But of course that's a pretty primitive thing, and maybe you'd want a better version of a load profiler. This is what we have, and it's actually maybe overly complicated, but I'll give you the highlights anyway. There's this load profiler, and it has a timeline there; time goes from one side to the other. We have this concept called bundles: we group together files that we're going to load together, and then we load them all at once, and this is better for seeks and things. We also have things like chunks, which are like parts of movies, parts of video, and terrain in open-world games, that sort of thing, sub-parts of files. Bundles really tell me when I'm going to get my next level or sub-level, so usually when they're done, you're finished loading. And what you can do with these two views, the TTY channels and the load time profiler, is combine them together: you can see the printfs in the load time view, you can see when they happened, and this is
like adding events to it, and that's really cool. You can see when you were loading level one, you can see when you're playing the game, you can see you're playing level two, you can see when you're actually loading level two while playing level two, and you can see that I hadn't actually finished loading while I was playing, which is kind of odd, and I probably shouldn't have done that. If you actually played the game (unfortunately I don't have a video to go with it; it's a feature we were going to add but haven't yet) and you hovered over top of that file, you could find out that the crowds weren't loaded yet, and if you were playing the game you could actually see the crowds sort of pop in as the game was playing, and that isn't very good. We fixed that bug, and that was good. The next thing is the load profiler, and if you know anything about games, of course games are broken up into frames, and the height of each one of these things is a frame, so the higher it is, the worse it is. In this case, that is a very expensive frame, and that is not allowed; you've got to hammer that guy down somehow or another. You can highlight what's going on there, and that's what's going on there in blue: I actually highlighted this one, and all of that information shows up down at the bottom, and you can see this expensive frame, when it started and when it ended, and you can see it's slightly bigger than all the rest. All of that noise there is a thing called jobs, and jobs, let's just call them callbacks that happen on another thread, and you can see their function call stacks as you pop back and forth, happening over and over. They're not complete call stacks, but most of it is there. One thing I noticed, because I know this game really well, and in some games this is not the case: we shouldn't be waiting for render. A lot of games are highly coupled; our game can slide back and forth three or four frames between our renderer and our sim, so I
shouldn't have this; all of these yellow things should be gone. This particular yellow happened to be really expensive because the GPU was a little slow, and once I removed all of the yellow, everything ran a whole lot better. But you can still combine together more of these views, and you need to, because during load time the whole machine is running. You're not just spinning the disk trying to load things up: you have to load things, and then you'll have to decompress them, because loading compressed things is faster, and then you'll have the texture or whatever, and you'll start stamping names or something on it with a font, because writing the names is easy in 32 bits or whatever, and then you want to compress it down and make it a smaller type of texture, because the GPU will be happier running on that smaller type, and you do the color reduction and stuff. Sometimes that is done on the GPU, but a lot of these things are CPU-limited. So you can see here, in 4K glory, which is really hard for everybody to read, but you get the idea: I've got Turbo Tuner up at the top, I've got my frame rate meter there, and I can view what's going on on all of the CPUs. I can highlight what's going on in Turbo Tuner to see what file I/O is going on in a particular location, and what I'll see is: these frames are bad, and I can see what file I/O is going on there; or I can see, around this file I/O, is my frame rate good, and what kind of processing is happening right after these files? I can do it either way: no matter which one I select, the other one gets selected at the same time, and all of that ends up showing up on the bottom, and I can see what's going on, see my one big frame and what's going on on every single CPU, and the GPU as well. In this case I actually don't know what the bug is yet; I haven't figured it out, so I'll have to spend some more
time on it, but soon. Now, memory is kind of an interesting thing, especially memory leaks. It's funny, if I just google memory leaks I get some strange advice. They say, hey, make sure that you free all of your memory when you shut down, and I'm always like, really? I don't know, I just delete my process; the OS is better at it than me. Maybe doing that in debug mode or something is useful, I guess, but really what you want is to put the machine in some steady state and make sure that you never go above certain high-water marks, because, you know, if you're here on consoles you only have so much memory, and you don't want to lose all this performance and swap things out to disk or something like that. So the way I look at memory leaks, what I would call a memory leak, is I capture allocations between a particular period of time, like, okay, loading level one, from the beginning of the load to the end of the load. I've probably loaded most of my big assets there for the level, and then I can make sure that those are unloaded by the time I'm getting to level two, because I don't want that high-water mark. So for example, if I've got this allocation at time T1, I want to make sure that it's freed before C. In this case it's freed after C; it is freed, but it's still a leak because it's adding to my high-water mark. I said it was going to be free and it is not, and so I need to go dig into this and find out what that problem is. So what you can do is look at Turbo Tuner and determine where A, B and C are. You can go find out that, okay, my level one loading is here, and by C all of those allocations had better be gone, and then you can see that I've got a big list of memory leaks here, which kind of sucks. But fortunately I have good information: I know what scope they're from, I know their pointer, their size, I know their call stack IDs, and for a highlighted item I know the full call stack there. I don't know what assets were
associated with that location. So I have lots of this sort of metadata with each allocation, and that's really powerful. I can see other things beyond just this sort of ABC memory leak thing; there are other modes that can do similar things to the previous tool, where I can just highlight two points and see what growth I had or something like that, but I think this is pretty useful. You can also do memory categorization with this tool, and this is data driven, and you can sort of scrub between two times. You can start before the level is loaded and after the level is loaded and you can see how much memory goes up; in this case it goes from 2.6 gigs to 3.3 or so. I think this is kind of interesting, because everybody kind of wonders what goes on in games, and really I think they're very similar to a lot of programs. I have 2.2 million allocations, and 2.1 million are smaller than 512 bytes. This means my debugging systems really have to be focused on this number here; I have to figure out how I'm going to debug this large number of small block allocations and deal with that scale. On the other hand, my memory leak system, when I'm trying to figure out am I going to run out of RAM or something, really should pay attention to the higher numbers, where I've only got 208 allocations and somehow or another that adds up to two gigs. That was surprising for me, but these are allocations that are larger than two megs, so, you know, they're big. I think that happens with many programs; there's sort of this exponential curve that goes on. The other thing is, like I was saying, this is data-driven, so by loading another YAML file I can look at it in a different way. I can see how many cars there are and how much the cars took up, I can see how much trees take up, I could see how much any particular element that I wanted to in my game takes up: bullets, weapons, towns, I don't know, car parks. I can also see and show to you what, you know,
most games are sort of made out of. Most of their memory is really about rendering and about assets, and by assets I mean meshes and textures for the most part. One thing that surprised me quite a bit when I did this experiment is how many other allocations I had. I mean, I've got 1.2 million allocations in assets, and if they're supposed to be big things, what's going on there? There's actually a whole bunch of entities that glue all of these things together in our engine, and a lot of them are very small, and there's obviously quite a few of them; that's what's going on in this case. Rendering assets are quite similar to content in the fact that they're mainly like textures and buffers used to draw the scene, and those buffers sometimes were a lot like meshes, actually. Another thing that people should know is that code is small compared to all of this, so data is usually the problem. If you can reformat your data and make data faster or data better, then this is a good idea; trying to figure out how you can compress and make data smaller is usually what you should be focused on. Don't get me wrong, I mean, I don't know, three months ago I was scraping code out of EASTL and trying to shrink our executable size back on PowerPC, on gen 3, so PlayStation 3 and Xbox 360. So it does happen, and sometimes that is the best way to get back memory, but usually you should look at data first. So, I've introduced a delta viewer and I've shown you all of the different views that we've got. I've also shown you that I've got lots of work left to do; fortunately I have a year left, so I'm okay. I've talked a little about the differences that we've had between STL and EASTL, and how we're trying to track all of that memory and how we're trying to figure out how to debug it, and maybe using something like a you-made-it-you-own-it kind of style might be a good idea; that might be another way to solve this problem. It's an idea, I'm not sure if it's entirely fully
baked, but it does seem to work pretty well, and if I can give it to a junior engineer I think they'll solve it and figure it out right away, and so that's quite good. Finally, I hope you learned a little bit about games in general: what they're made out of, how much memory we have, and that they're pretty much similar to most styles of things, lots of small allocations, not so many big ones, most of it rendering and assets. You might also have seen that if you don't have things like stomp allocators you really should be using them, but I guess if you're on things like Macs and PCs then you probably have better tools than I do. There are a lot of different ways of having large amounts of allocation schemes, and really do pay attention to it and divide up memory in all sorts of ways, based on team or size or time of life, besides just the ones that everybody talks about here where it's about performance, you know, fixed size pools and bump pointer allocators and these kinds of things. I think that's it for me. Anybody have any questions or anything? Is that on? Okay, cool. All right. Yes: the profiler will not be open sourced, not yet anyways; I don't know if that will change. That's really cool. Oh yeah, thanks, it is pretty neat. I wanted to show people what we do internally. Maybe EASTL code or something like that can be, I don't know; I'll work on it behind the scenes and we'll see. I know we've just gotten the ability to do EASTL, but I think there is some momentum there, and so maybe there'll be other things that we'll be able to do, I don't know. Thank you. Sad to hear that you're not going to open source it, but the next best thing: is there any chance to see the features you showed here in the screenshots, or videos, so we can copy that to other tools, like the one I'm working on? I wish I could have done a live demo; I probably could have got away with a live demo except
for the fact that I'm working on a very tiny Mac, and as you saw it was, like, screen space; 4K is pretty good, I mean I can do it in HD, but no, I'm sorry, I didn't make any videos or anything of it. It's pretty new, actually; we just got it internally. Like, the memory stuff we just got in July, so I mean it is really our cutting edge, so as of yet I don't know what we'll do with it. Does EASTL also work with the standard memory smart pointers, like unique_ptr and shared_ptr, or are they completely... We would have our own implementations of those features as well, and we'll usually put things in EASTL to try and make it like the standard. You would know better than me, but okay, yeah, come talk to him and he can tell you more details about it; I'm a power user more than I'm an implementer. Oh yeah, most of the pains that we've been through as well. The particular question I have was this: you were describing how Dinkumware's debug performance was substantially worse. That's true. Which we've also discovered with our own STL. There you go. But we discovered really quickly that this was down to the iterator debug level; I wondered if you had taken account of that. What is it? The iterator debug level. No, in our case it's not; I've shut off all, like, the parameters, and everything is fairly optimal. You know, I am trying to give it a fair fight; I wasn't trying to disable all the things, although I don't know if I left the debug one at the default. I'm pretty sure, though, that even in the debug case I've turned off all the default debug attributes and things like that, and I'm not doing, you know, checking to see if I'm off the end, or all of those kinds of things that you could do in STL by default. There's one up there too. Oh, sorry. Yeah, which one? Back, back right? No, top? Hey, thanks. So do you have sort of similar empirical data for how the adoption of these tools actually reduced... I'm not sure, actually. I know, like, the stomp
allocator was like a revolution, in the fact that I could just start ignoring QA. I would just go and try and test with this thing, and only once I got rid of all the bugs that way would I start paying attention to QA's bug reports, because I would know more about the status of the game than they do. But besides that sort of non-empirical data, I don't have very much. For EASTL, were there any, you know, intrinsic decisions that allowed you to increase the performance as much as you did for certain cases, or how were you able to get such big speedups in certain cases? I probably would have to go look at Paul Pedriana's paper; I know we were talking about it on the way in. I think one has to do with the memory model of STL versus EASTL: I think we're allowed to have pointers as being our iterators in strings and in vectors, so that's one place, and the other place is we don't rely on inlining as much, we do more cutting and pasting, and I think those two account for much of the performance difference, off the top of my head. But I didn't have time to go dig into the numbers, because I knew everybody wanted to know that, but I haven't had time to go into the numbers. Okay, yes? Yeah, we only have some subset, we don't have all of them. You know which ones we have; I can only think of list. Yeah, okay, interesting. No, not yet, but that would be interesting. We do have many other things, but things that don't get into EASTL; like, people still write their own custom hash tables and things. Now, we're definitely interested in adding intrusive containers as the next move. SG14, so people interested, contact me. Yeah, if you guys get them into SG14, we'll probably pull them into EASTL. Sorry, I just wanted to say that the profiler you showed us is very impressive; it has a lot of similarities to Windows Performance Analyzer. It does. And I suspect that the logs that you generate
could be turned into ETL files and opened, and much of the functionality... or I could use DTrace. Of course, yes, I know. There is actually an internal library, and probably there are two forces going on; actually, I bet you this might become a GUI for another thing. I can see it happening: as I wrote this, I posted this, and one of my co-workers is like, I want to work on that, I want to do this cool thing with DTrace, and we're like, ha ha, I can't turn it all in at once. I'm actually surprised at how fast it is, actually. For example, when we first converted over to the every-single-alloc-and-free sort of model that is in that tool, well, in our previous system we were only able to turn it on in certain zones at a time; now I can turn it on everywhere, and, you know, what is it, maybe 10% or something. It's actually really impressive how fast that is. The profiling one could use maybe a bit more work, mainly because, like the things we were talking about with DTrace and other tools like that, we could probably get better information than we currently have, and maybe we could implement it better. But, you know, they're usable; you're going to lose three or four milliseconds a frame or something to them, and that's not too bad, I can still play my game in real time, ish, as long as you can afford to drop some frames. What portion of the team works on this? It's very small, and very big teams could afford to write some of their own tools, but in this case this is one of the good reasons to start using engines, and you can see why what's going on in the gaming community is there are becoming fewer and fewer engines, because just tooling up for these things is very expensive. Can you store logs over time and see...? Yeah, yeah, there's a whole other sort of thing that goes on here that I'm not showing. I mean, there's a project we call biometrics, because it came from BioWare, where we can see all of
this data over time, over many, many sessions as well. But yeah, it seems like it's per team at the moment, because teams need different types of data, they look at the problem differently; still, maybe eventually we'll be able to make it more central. Okay.
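The A/B/C leak check described in the talk (capture allocations during a window such as a level load, then verify they have all been freed by a later checkpoint) can be sketched roughly as below. This is a hypothetical illustration under stated assumptions, not EA's actual tool; the class and method names are invented, and "time" is just an event counter standing in for Turbo Tuner's timeline markers.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// One record per live allocation: when it was made, and how big it is.
struct AllocRecord {
    std::uint64_t time;
    std::size_t size;
};

class LeakTracker {
public:
    // Called from the allocator hooks on every alloc/free.
    void on_alloc(void* p, std::size_t size) { live_[p] = {now_++, size}; }
    void on_free(void* p) { live_.erase(p); now_++; }

    // Capture a checkpoint, e.g. A (load start), B (load end), C (next level).
    std::uint64_t checkpoint() { return now_; }

    // Allocations made in [begin, end) that are still live at the time this is
    // called: these are the candidate leaks ("I said it would be freed by C").
    std::vector<void*> leaks_since(std::uint64_t begin, std::uint64_t end) const {
        std::vector<void*> out;
        for (const auto& [p, rec] : live_)
            if (rec.time >= begin && rec.time < end)
                out.push_back(p);
        return out;
    }

private:
    std::uint64_t now_ = 0;
    std::map<void*, AllocRecord> live_;  // live allocations keyed by pointer
};
```

A real implementation would also attach the scope, call stack ID, and asset metadata mentioned in the talk to each record; this sketch only shows the time-window bookkeeping.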
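The size breakdown from the categorization view (2.1 million allocations under 512 bytes versus 208 allocations over two megs that add up to two gigs) comes from bucketing allocations by size. A minimal sketch of that bucketing, with invented bucket boundaries rather than whatever the real tool uses:

```cpp
#include <array>
#include <cstddef>

// Count allocations and bytes into a few size buckets, so you can see that
// almost all allocations are small while a handful of huge ones dominate the
// total bytes. Buckets here: <512 B, <64 KB, <2 MB, >=2 MB (illustrative).
struct SizeHistogram {
    std::array<std::size_t, 4> count{};
    std::array<std::size_t, 4> bytes{};

    void record(std::size_t size) {
        std::size_t b = size < 512               ? 0
                      : size < 64 * 1024         ? 1
                      : size < 2 * 1024 * 1024   ? 2
                      : 3;
        count[b]++;
        bytes[b] += size;
    }
};
```

The talk's point falls out of data like this: the small-block bucket drives the debugging-at-scale problem, while the big-block bucket drives the out-of-RAM problem.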
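On the stomp allocators recommended above: the idea is to make buffer overruns visible instead of silently corrupting a neighbor. The serious versions place each allocation against a protected guard page so the overrun faults on the exact instruction; the sketch below is a much weaker canary variant (pattern bytes after the block, checked on free), shown only to illustrate the concept. All names here are invented.

```cpp
#include <cstdlib>
#include <cstring>

// Canary-style stomp detection: over-allocate, fill the tail with a known
// pattern, and verify the pattern on free. Catches the stomp late (at free
// time) rather than at the faulting write, unlike a guard-page allocator.
constexpr std::size_t kCanaryBytes = 16;
constexpr unsigned char kCanaryPattern = 0xFD;

void* stomp_alloc(std::size_t size) {
    auto* p = static_cast<unsigned char*>(std::malloc(size + kCanaryBytes));
    std::memset(p + size, kCanaryPattern, kCanaryBytes);  // plant the canary
    return p;
}

// Frees the block; returns true if the canary was intact (no overrun seen).
bool stomp_free(void* p, std::size_t size) {
    auto* bytes = static_cast<unsigned char*>(p);
    bool ok = true;
    for (std::size_t i = 0; i < kCanaryBytes; ++i)
        ok = ok && bytes[size + i] == kCanaryPattern;
    std::free(p);
    return ok;
}
```

In a real system the failure path would assert or log the allocation's call stack rather than return a bool, but the return value keeps the sketch testable.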
Info
Channel: CppCon
Views: 33,105
Rating: 4.94382 out of 5
Keywords: CppCon 2015, Computer Science (Field), Bash Films, Conference Video Recording, Event Video Recording, Video Conferencing, Video Services, Scott Wardle
Id: 8KIvWJUYbDA
Length: 57min 50sec (3470 seconds)
Published: Wed Oct 07 2015