HandmadeCon 2016 - Large-scale Systems Architecture

Reddit Comments

Awesome. Thanks to this I've discovered all of the other "Handmade" stuff - I wasn't aware of it before. This is a great resource.

u/ignotos, Nov 10 2017, 5 upvotes

Programming talks are usually discouraged around here because they get too deep into the tech for non-professionals to follow. We don't want TMoG to become r/gamedev2. But, this is a great example of describing the process in a way that consumers can relate.

Thanks for posting!

u/corysama, Nov 06 2017, 2 upvotes
Captions
Again, this is kind of in keeping with the three game industry talks from today. The compression talk was definitely the more theoretical one, but I tried to select all of the talks from today to be about the sorts of things that I can't really address on Handmade Hero directly. They're not the kind of things you're going to talk about when you're looking at something from the perspective of one or two people working on it. You're never really going to get to the personnel scale or data scale, or any of the other things that start to introduce problems that exist for no reason other than scale. They're scale-only problems. So what I wanted to do was get some perspective on a big triple-A title, one that's supposed to be a flagship game with high-end visuals, high-end audio, these sorts of things. What exactly is the workflow like on one of these, especially when you have a pedigree engine that's been refined every time? What does it look like when this thing's working well, and also when it doesn't go well? Just give us some perspective on the things that have to happen there. So I invited Jason Gregory to talk to us about how they do things at Naughty Dog, which I think is going to be fantastic. As I know nothing about how your stuff is architected, I'm really excited to learn. So please give him a warm welcome.

So give us some background on you, so the people who aren't familiar with you know: how did you get into the industry, what's your background?

Yeah, absolutely. I'm actually Canadian. I went to the University of Waterloo and I took something
called systems design... Any Waterloo people? That sounds like it. Okay, awesome. I took something called systems design engineering, and at the time I thought I'd get into robotics or process control or working on rocket ships, I don't know, something that melds different kinds of engineering, because that's what we got: a little mechanical, a little electrical, and so on. But as it turned out, programming was something I was doing at home and had a lot of experience with, and all the jobs I was getting were in that vein, so I ended up going down the straight-up programming route. So I'm just a software guy now. I don't do a lot of hardware, though I have a little bit of experience. And it's funny: I worked outside the game industry for a while, and when I got into it, I was talking with my mom and she goes, "Oh, that's really great news, but when do you ever want to get a real job?" [Laughter] I had to explain that games is actually probably one of the best places to be as an engineer, because you get this wide breadth of different technology. So that's how I got into it. I started at Midway in San Diego, then EA, and then finally Naughty Dog.

Okay, so that's actually quite a bit of experience with different places.

Yeah, absolutely.

And at Naughty Dog, I guess you've been there for a pretty wide span of games. Give us a little bit of background: when you came into Naughty Dog, and what you've been involved in.

Yeah, I'm very close to being a "decade dog", as we call it. I started right when we were in the middle of Uncharted 1, and it was an interesting time, because we were going through a big change from the way we used to do things on the Jak games to what we were going to do on the new engine, and
everything was being rewritten in C++, so there was a lot of technical upheaval, if you will.

When you say it was being rewritten in C++, was it originally written in straight C?

Actually, Jak and the games before that, and Crash, everything was written in a language called GOAL, which is a Lisp-based thing, amazingly. It was pretty amazing because they wrote their own custom language. Lisp is not a difficult language to parse and it's a relatively minimal language, so it's conceivable that you could do this, but they basically built their own compiler and their own language. There was C under the hood and some low-level assembly language code for the really high-performance stuff, but everything else was built in this Lisp language.

So what do you mean by everything? Give us an example of the lowest-level component that was still built in it. I mean, it wasn't your code base, so...

Yeah, I don't actually know for sure. But I do know that, for example, the game object model, so any time you have a crate or Jak himself or whatever, that was a GOAL object, and there was GOAL code. The cool thing is you could do this thing called Control-T where you could rebuild the code and have it live-update in the game, so you could be iterating like crazy. As it turns out, we actually still have that in C++. You know how in Visual Studio you can do Apply Code Changes under the Debug menu? Before that, when we were on PS3 with SN Systems' compilers, we actually worked with SN to make sure that we could live-update your code, because it's super important to be able to iterate like that. That's the workflow that everyone was used to, and you don't want to give that up just because you're moving away from GOAL.

Right, okay.

And another interesting point is that we still use a Lisp-like language today, but only for two things. We use it for data definition, so this is a rapid way of getting data that
can be expressed easily in a text format into the game, and I can talk in more detail about that, it's kind of interesting. And then we also use it as a runtime scripting language. So our scripters, when you're a designer at Naughty Dog, the first thing you do is get a book on Scheme so you can learn the basic syntax. What you basically learn is that function(a, b) just turns into (function a b). There, you're done, you've got it. Well, not really, but it's kind of like that.

That is actually pretty fascinating.

Yeah. It comes partly from that history. We've considered changing this many times, because, believe me, this language has its issues. For example, to build our entire game from scratch, well, now we're actually doing a distributed build and it's even faster, but it used to be maybe seven to ten minutes to build the entire game and then 15 to build this stuff.

Okay, so you mean just to ingest the text and output whatever the final format was for these data definitions?

Yeah, because what we're actually doing is running Scheme code, and herein lies the reason why we still use it. Is anybody familiar with Scheme, or Racket now? One of the beautiful things about this language is that it basically blurs the line between code and data. There could be a function f with arguments a and b, or it could be a data array that just contains the values f, a, and b. Moreover, there's a very powerful macro language built into Racket and Scheme that allows you to transform one of what we call an S-expression, which is a parenthesized expression like this, into another. So you can basically have code that writes code. What that means is we can define custom data structures. For example, we have a system called the
animation overlay system, where we can take animations from one character and say, for this new character, all of the animations are the same except for these ones, where we're going to overlay new animations. To do that, we just make a little syntax where you say something like: source animation gets overwritten by destination animation. Then you have a whole list of these in some other big data structure that's like define-anim-overlays or something. Oh, and by the way, hyphens in this language are just symbols, so it's not a minus, it's just part of the name. It's weird. But anyway, we can just define custom syntax, and then that just spits out data that our engine can suck in.

So essentially, in your data description language you have metaprogramming, which you are doing on the fly here. And I guess as a consequence of that, there could be fairly complicated metaprograms running at ingest time, so it's not as straightforward as just saying, I took in this text and output the binary.

Exactly. The alternative that most other studios use is to say, let's go with XML or JSON, or let's write our own custom parsers. Then any time you want to add new data, you have to write more parser code, and some programmer has to do that. Here, it's the same, some programmer has to do it, but they do it in Scheme. They'll literally just write a macro, and a lot of times it's really simple. There's this thing called syntax-rules that literally says: if you see this pattern of code, transform it to this pattern of code, and that's it. There's a pattern-matching element to it. But then it can get way more complicated than that, and so usually it's the programmers that are doing that stuff more than the designers, although they dabble in it too.

So how essential is that? I guess, not to dive in too fast here, but if I'm looking at this
and I'm saying, wow, it's taking 17 minutes or something to process this stuff, how essential is it to have that flexibility there? Is it, no, we really use this stuff, we couldn't just build twelve transforms that we tend to call upon and have that work? Could you give some examples?

Yeah. So first of all, when I say it's twelve or fifteen minutes a build, that's if you rebuild everything from scratch. The iteration time, though, if I just make a small change to my animation overlays: we have this thing where you can say, just build that one file. It builds in maybe 30 seconds, ten seconds, it depends, but it's pretty quick. Then it sends a message directly to the game using a Redis server, where it's like a remote procedure call to the game, basically, and pipes in the new data. So we can just live-update that data, and a designer or an animator or whoever can be working in the game and make changes very quickly.

I want to unpack that whole thing in a second, but before I get into that: you said just before we started that you wanted to tell an emergency fire story, after having listened to Chris's. Why don't you go ahead and do that?

Totally. I might draw some more. So it's interesting, because every studio has this. You develop a big game, and you start off thinking, okay, we've got all our tools in place, everything's good, we've shipped a bunch of games, it's going to be fine. And then you realize, okay, the jump, let's say, from Uncharted 3 to The Last of Us: Uncharted 3 we thought was a big game, and then we got to The Last of Us, which honestly was really more of a PS4 game that was just being squeezed into a PS3. That's why we made the Remastered version, actually, because honestly it was really almost PS4
content. And then the jump from The Last of Us Remastered to Uncharted 4 was another, I'd say, depending on the kind of asset you're dealing with, it could be 2x, it could be 5x.

Wow.

In terms of the amount of data, the size of the levels, everything was much, much bigger. So we had our own growing pains, and I used to think our problems were bad until I heard Chris's talk. But what's really interesting about this whole thing, and this is something you learn with experience, is that depending on the decisions you make early on, or the decisions that have been made over years and years at your studio, and the technology you have, it leads to a certain class of problems that you'll hit. The problems that we hit were completely different from the problems that they had with Halo and Destiny, just because we made different choices. It's kind of like a chaotic system, depending on the conditions.

Yeah, so you know you're going to have something bad happen, it's going to be...

Exactly. So in our case, okay, let me actually back up and give you a bit of a flavor of the way our pipelines look at a high level. This is something that I found really interesting when I first got to Naughty Dog. We have this philosophy that I haven't seen at any other game studio. At EA, we spent a lot of time dealing with, okay, we're going to publish a stable version of the game, and that will go out every couple of days, and then we'll have people working on branches, and they can integrate in carefully. It was all very staged. At Naughty Dog it's the Wild West. We just run the latest version of the code all the time, and we run the latest versions of the assets all the time. There is no such thing as "it works on my machine".

Really? Okay.

Which is amazing, actually, in a way. By the way, EA has a conference room whose name is It
Works On My Machine. So it's a really different way of thinking about things. So check it out. Whenever we check in code, we have different code branches for things like: this game has already shipped and we're doing DLC, versus this is a game that's in live development.

When you say code branches, you mean in your source code control system?

Yes. So in Perforce we'll have a branch that is main, which is the main tip of development for all the games that are actively in development. And we'll have branches for, like, when Uncharted 4 shipped, we branched that off, and now we have U4 final, and that branch is pristine and clean. Then if we're making bug fixes or doing a DLC, we can do it in that branch, and meanwhile they can be Wild West on the main branch again for whatever else.

So that's kind of split off, and the people who are trying to maintain that already-shipped project are in there, and everyone else is on your Wild West mainline.

Exactly. And people will jump back and forth. The other thing that's interesting about Naughty Dog is we have a very small engineering team considering the size of the game. Again, orders of magnitude: we have about 10 programmers who are just focused on graphics, maybe 15 to 20 who do what we call gameplay, which is basically everything else that's runtime that isn't graphics. There are specialties in there, like AI and physics. To give you a sense, for a while it was me and one other guy working on physics, and then it's this other guy, but it's generally one or two people who own a big system like physics for a while, and then they might move on and own something else and somebody else comes into that role. So maybe 15 to 20 gameplay programmers doing all sorts of stuff, and for most of the time I've been there, two tools
programmers, which is crazy. So how in the heck do we make a game? Honestly, I don't really know sometimes. I'm still trying to figure it out. Since then, though, one of our leads, Christian Gyrling, has moved into the role of, look, I'm going to own the tools, and he's now built up that team to be four people including himself, which helps a lot. I mean, it's doubling the team size, and they're starting to really help solve some of the technical debt that we have. So what I describe here is not my work. This is mostly Christian and his team, and also Christophe Balestra, who is our co-president and sort of acts like a CTO for the company. He's also working on a lot of this stuff; it's sort of his area, that and back-end servers and things like that. So we've built up the tools department to some degree. But anyway, what that means is a relatively small team, and people take big chunks of the engine and own them.

Okay, so back to the rough structure of the way we do things. We run the latest version of the game. A programmer will check something in, and we have this little bot that's running. It's just a Python script running on a machine, and any time it sees a change in Perforce, it says, well, is it relevant to the branch that I'm trying to build? Yeah, okay, it's in this branch, great, build.

Meaning each of these branches has its own build bot?

Yes, exactly.

And its own machine, or is it one machine running lots of builds?

It can be both or either. Sometimes we have one machine dedicated, or whatever. And maybe different flavors too: we might be building the development version of the game and also a specialized final build of the game.

Where there are debug overlays or these sorts of things, and you do have two different builds. So everyone's running the latest version of the code, but different people could be running different flavors of that piece of code based on whether they want...

Yeah,
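The build bot described above (a Python script that watches for check-ins, decides whether each one is relevant to its branch, and builds it) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Naughty Dog's script: the depot paths, the `Change` record, and the `build` and `publish` callbacks are all hypothetical stand-ins, and no real Perforce API is called.

```python
from dataclasses import dataclass

# Two flavors, matching the "release" and "final" builds mentioned in the
# talk. The identifiers themselves are assumptions made for this sketch.
FLAVORS = ("release", "final")


@dataclass(frozen=True)
class Change:
    """A submitted change, reduced to what this sketch needs (hypothetical)."""
    number: int
    paths: tuple  # depot-style paths touched by the change


def is_relevant(change, branch_root):
    """Return True if any file in the change lives under the bot's branch."""
    prefix = branch_root.rstrip("/") + "/"
    return any(path.startswith(prefix) for path in change.paths)


def run_bot(changes, branch_root, build, publish, last_seen=0):
    """Build every new, relevant change in order and publish the successes.

    build(change, flavor) returns True on success; publish(change, flavor)
    copies the result somewhere shared. Both are supplied by the caller.
    Returns a list of (change number, flavor, ok) for a status page.
    """
    status = []
    for change in sorted(changes, key=lambda c: c.number):
        if change.number <= last_seen or not is_relevant(change, branch_root):
            continue
        for flavor in FLAVORS:
            ok = build(change, flavor)
            if ok:
                publish(change, flavor)
            status.append((change.number, flavor, ok))
    return status


if __name__ == "__main__":
    # Made-up changes: two on a hypothetical main branch, one elsewhere.
    demo = [
        Change(101, ("//depot/main/game/anim.cpp",)),
        Change(102, ("//depot/u4-final/game/fix.cpp",)),
        Change(103, ("//depot/main/game/broken.cpp",)),
    ]
    result = run_bot(
        demo,
        "//depot/main",
        build=lambda c, f: "broken" not in c.paths[0],  # pretend compile
        publish=lambda c, f: None,                      # pretend copy
    )
    for number, flavor, ok in result:
        print(number, flavor, "green" if ok else "red")
```

Running one such loop per branch mirrors the "each branch has its own build bot" arrangement, and the returned status list is the kind of data a green/red check-in page could be drawn from.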
yeah. And actually, as an aside, it's an interesting point: building different flavors of the game has pros and cons, and one of the cons is that if you're not testing the final version, then you don't really know what you're doing. So we limit ourselves to two flavors: we have the normal release build and we have this final build, and QA needs to test both. As we get closer to the end of a project, we start building real packages or disc images and they test off of the real deal, because if they don't, we're kind of in trouble.

So you don't have a debug build at all?

No, but we do have, and this is another cool thing, there are a lot of segues here, sorry for that, but a lot of this is stuff I hadn't seen at other studios. We have a thing called a hybrid build. We used to do everything with makefiles, and what you do with this is you have a special makefile that's for you, like, okay, a Gregory makefile, and it lists a bunch of .cpp files that I want to have built in debug, and everything else will be release. So I can say, I'm working on animation, I'm just going to make all the animation files debug and everything else will be release, so the game still runs reasonably well. It's super, super useful.

Okay, so that makes sense. Basically, individual programmers are in fact running some debug code, but they're only running the debug code that they actually care about at a particular time. And I guess there's nothing to stop them from having star in there, like, just give me the whole thing?

Yes, although there is actually one thing that stops them, which is that now our game is so big that if you do that, there isn't enough room for all the code in the space of our budget. So you really don't want to do star, because it cannot actually run. But yeah, generally that's how we work. Okay, so when I check
something in, the build bot builds it, and there's a web page you can go to that shows you each check-in, and it goes green if it built successfully. Then it publishes it, and that just means copying it up to this network drive that we call the Z drive. Now, another interesting thing is that Naughty Dog is a hybrid Windows and Linux studio. We're really old-school. The reason for it is partly historical, and partly because Linux, honestly, is just better at certain things. I'll give you an example a little bit later of one place where Linux really shines, or a couple of examples maybe.

So what ends up happening is the build gets copied out there, and then we have all of our source assets, like Maya files and so on. If I can, maybe I'll just sketch it out. So you've got source assets over here. Can everyone see that? Terrible handwriting, right. So you've got Maya files, and you've got Photoshop PSD files, and you've maybe got some ZBrush files and who knows what else. You've also got these .dc files, which are text files that are our Scheme/Racket stuff. We've also got a world builder tool called Charter, and it has a little database of files. This is like, I'm laying out a world, and there are going to be some crates over here, and some bad guys spawning over here, that kind of stuff. In Charter there are going to be some splines or trigger regions. It's all that kind of, if you will, light data that goes into describing a game world. And then on top of all this is what you might call a metadata asset database. Metadata, oops, DB. What this is, it's metadata about all these other assets.

In cylinder form.

In cylinder form, I'm trying to use the traditional database symbol. So, for example, let's say you're building an animation. An animator might find it really convenient to
do a bunch of run cycles and squatting and jumping, all in one big Maya file with a timeline. But I want to break that out and say, okay, this is the run cycle, this is idle, this is jump, and so on, and so I have frame ranges. That's a really simple example of some metadata that would go into this thing, where we can say, okay, there's an asset called player, you know, Drake jump or something, and it is this Maya file with this frame range, and also these compression settings and whatever other metadata you've got.

I see. So in some sense it's like a manifest. It's like, hey, look, you're going to have to pack this stuff up and deal with it, so I'm giving you not only the source of it but also the ways that the source is going to get extracted. And just to be clear, that is not also in the Lisp language? That's got some separate listing format that's more straightforward, and not a full programming language?

Yes. And in fact, I say database because it acts like a database, but in our current implementation it's a bunch of XML files, or sometimes just custom text formats, that are stored on a Perforce server. When you're dealing with that metadata, there's a tool that we call Builder, although it's a silly name because it's really more about the data than building. You go in there and you can specify assets and all this stuff, and it just saves it by checking files in and out of Perforce and updating data that way. What's nice about that is text formats are great. This is another fundamental philosophy at Naughty Dog, the KISS philosophy: keep it simple, and we leave out the "stupid" part. So keep it simple, and if something really, really simple works, do it, and only solve the problem when it becomes a problem. Sometimes that's bad, because you can sometimes think a little bit too near-term. The good thing is we have a lot of, I think,
very good senior engineers who are good at thinking about, okay, here's my near-term solution, but in the back of my head I know that I can jump from there to here to here to get to this final solution. It might take me two projects to get there, but I've got the plan.

Yeah, exactly. That's really important, because it allows you to not make the mistake of doing too much work right here and building something we didn't really need and overcomplicating things, but it also means you're not going to hit that integration wall where it's like, oh, now we have to tear the whole thing down.

Exactly. I'd say the younger you are as a programmer, the less experienced you are, the more prone you are to this sense of: programming is amazing, I can do anything, so I'm going to solve the general problem. In fact, I'm going to file everyone's taxes while I'm making the game.

A tax-solving system that they can put their own taxes in.

Yeah, taxes in any country, exactly. So you really want to avoid that. Everyone, from Christophe to the leads to just everybody, tries to exude this idea of keep it simple, solve the immediate problem, have a plan for the long term, and deal with the issues as they come up. It also comes back to premature optimization, for example the old 80/20 rule. Who's heard of the 80/20 rule? Most people, good. So 80% of your time is spent in 20% of the code, or sometimes it's more like 90/10 or whatever, so you don't want to be optimizing that 80% that doesn't matter. We're very big on that.

So anyway, there's this metadata, and we have all these source files, and then what happens is we have various tools. Now, there are two ways you can do this. One, you can say, okay, I've got a final asset at the end, so let's say it's
an animation, and we say, do this.

You mean do an asset build?

Yeah, some sort of asset build. So we have some animation, and we actually package these things into, so let's say this is maybe animation A, and we have, like, anim-player, and this is a so-called pack file. A pack file is just a binary format. It contains sections of data, so it's reasonably robust in being able to divide it up and say, I've got different kinds of data in this pack file. But it's usually in the realm of, okay, I'm anim-player, so I've got all the core animations for the player, and that includes his run cycles and jumps and everything else, maybe in different demeanors. But then let's say the player needs to swim: that might be a different pack file of all the swimming animations. And then there might be another one where one in-game cinematic is all just in one pack file. Pack files can contain other things too. They can contain animations, skeletal rigs, geometry, textures (although we've split those out a little bit), materials, all that kind of stuff. And then we have special pack files for this Charter data that we call in-game pack files, but really it's a silly name again. What it really means is a light pack file that contains rapid-iteration type data, so things like the spawners in your world, splines, regions: lightweight data that can be built very quickly and that you're going to want to iterate on quickly.

Okay, so thinking back to what Chris was saying, to get a little more perspective on that: are you saying that these are actually separate formats?

No, it's actually built into the same format, but it's a different kind of file that's built separately in this pipeline.

So it's really just more about what goes where, but the format's still the same. I'm not fundamentally
building a different kind of file just because it's light.

Right, exactly. The contents of the file are different, because each of these files has segments of data of various formats.

Got it.

So there's a process whereby we take, let's say, assets that come from Maya files and textures. First of all, the Maya file can be exported out, and this is kind of funny too, we have a tool called Maya3NDB. Now what does this mean? Well, NDB is the Naughty Dog database format or something, it's like the Naughty Dog intermediate format. Maya3NDB used to be called Maya2NDB, with a number 2, but it meant the word "to", and then somebody thought, next version: Maya3NDB. There's a lot of fun stuff like that in there. So you get this crazy NDB file, and then it goes through various other processing and eventually gets packed together with other assets into a pack file. And it could be ZBrush assets, it could be whatever. DC works differently: it goes through our DC compiler. DC actually stands for data compiler, although, with the programmer Dan who works on it, we used to say it was Dan's cool language, but it's actually data compiler. That turns into, I think, a .bin file, which is a special format. The Charter database, like I say, goes not through an NDB file per se, but it eventually also becomes a pack file. And all of this metadata is getting sucked in just to know how to build all this stuff.

So just to try to get an idea here: when these lines are going across, we're probably talking about fundamentally different programs doing each of them, because, for example, you said Maya3NDB is its own program, it's not some subsection of the asset build. So all of these programs are reading out of that metadata to figure out how they pack things. It's kind of like a common format among everyone: no matter what
you're reading, you're out there touching it.

Exactly. And I'm oversimplifying here. Maya3NDB is the first step, where you're taking a Maya format and extracting only the bits of the scene graph that matter and putting them into this format that we can then process. From there we might send it through Havok to generate some collision data and other things. We might send it through our geometry processing to get it down to that format that Chris Butcher was talking about, where it's highly optimized for the platform. There are bits that process shaders and build shader code, bits that deal with textures, and so on. So there's a whole pipeline, and that's why these arrows really represent various tools that might be running. Usually what we'll do is I'll say, I'd like to rebuild anim-player. We have a little tool called ba, which stands for build actor, which is another really weird name, but actor really just means, basically, a pack file. So we say ba anim-player, and what that does is it looks through the dependencies. If you think about this, there's a whole dependency graph. Given anim-player, I say, oh well, it contains animations A, B, C, and D, those animations depend on these Maya files, and those depend on these metadata database assets, and these things depend on these ZBrush assets, and whatever it is.

In general, is all of that graph coming from the metadata DB, or is some of that coming from inside the Maya file, like a texture reference?

It's coming from a bunch of different pieces.

So basically the links in this graph could come from a variety of sources.

Yeah, and there are pros and cons to this. For example, as many of you probably know, Maya is good at referencing other Maya files, and you can build up a composite Maya file out of hundreds of other Maya files if you want. So there's those kinds of
references there's references in those Maya files to our materials and our textures and stuff which are coded relative to the project but just as paths in a file system and so like if I've got a texture that's blah blah blah dot TGA and it's sitting in this folder then the relative path to that is embedded somewhere in that Maya file it's good in the sense that it's super simple and it's easy to reason about and so on but it's hard it's bad in the sense that it's actually tough for us to pull on an asset and find out all the dependencies because we have to parse the Maya file to see what all the things are and presumably and I'm just guessing here knowing the way artists would typically use a Maya setup yeah it's not as simple as saying anything that's referenced is used I actually need to do the work to walk the tree and say oh that was a hidden node so I don't need to look at that reference and that was a you know this texture is just for testing or something so exactly it gets more complicated than just what early ex-raf signifies right exactly exactly it's it's tricky so and we're actually working on solving that problem right now Christian and his team are doing some really great work on building up an actual like SQL database you know relational database that stores this information anytime you build an asset it updates those references so that the next time someone builds you've got that data yeah because I actually wrote a tool at one point we wanted to be able to grab some you know assets from let's say Uncharted 3 and pull them over into The Last of Us just as a test you know just to until we had real assets yeah and so I wrote this thing called grab asset that would like grab those things one told me I should shorten it to grab we're trying so but what that thing did is it literally would have to it was slow because it would have to open up all these Maya files and read and our technical directors who are that's our our TDs are not like a normal TD they're they're more
like technical Maya people who do Python and they do rigging and they're they're super smart and they know all that stuff and that's kind of their area they're like honorary programmers that work in a kind of extended tools programmers um so those guys actually came up with a way of storing all of those references at the top of the Maya file in a special node so you could just only read the first you know K of the file and then get all the references so so that's good but even then it's ugly so we're moving towards having an actual database where you can just query and find information a lot easier so that's gonna help a lot but and so essentially that database in if we look at how this thing is structured what we're really saying is okay when I build an asset I'm essentially producing two files right I'm producing or two outputs I'm producing the actual output that I was trying to build but I'm also producing this secondary output which is possibly going to be shared with other people because normally I guess you see you only build the one for whatever the particular machine is here yep in the other case it's like know every time we parse it's Maya file we know that no matter who else will ever build this for any reason we only need this particular set of data to be processed when that asset changes so let's share it yeah and in fact there's more than one kind of intermediate file so when I'm building a single thing like an employer there might be a bunch of intermediate files a good example is we have a thing called streaming animations where you can have a very long animation and it gets chunked up into into little chunks that can be streamed in over time and then I started and so we might have like a little intermediate file for each chunk we might we might have intermediate files for havoc etcetera so so basically a small number of or well sometimes a large number of source assets turn into a very large number of intermediate files eventually boiled down to a 
single pack file and yeah so and are these intermediate files in general I mean obviously we're talking about the one where okay we want to preprocess sort of the dependencies out of here and put that in great file you it's obvious what that gets used for are these intermediate files generally just byproducts of the build process or they but do they have a lifecycle where it's like oh yeah no we'll reuse one of those if the person you know like what kind of intermediate file we're talking yeah yeah so in theory we could reuse them and right now we don't and it and it's a problem so yeah so and actually all right so here's the thing when you build one of these pack files there's two places they can go and one is and so one is our so-called Z drive and in here maybe we've got like an Uncharted 4 folder and then in here we have like the build folder and in here are various subfolders for different kinds of assets and all of the assets that the game reads come out of there but then every user also has their own Y Drive and the Y Drive is actually it actually maps to Z slash your user name so it's all in the same file server and I could have my own local U4 and my own local build so what I can do is I can say alright I'm going to build this asset locally and the game has a mode where you can say run it in in local mode where it says alright I'm gonna try to find the asset on Y and if I can find it I'll use it otherwise I'll go back to global and that means I can be as an animator testing just my animations but still having the rest of the game be current now the Z Drive is shared by everybody so if somebody builds an asset globally it goes up to Z everybody will see it immediately as soon as they rerun the game or reload that level BAM they see it right so now just literally a byproduct of the mapped drive it's like hey Windows will try to load it from there and so it's gonna get whatever the last thing copied up there yeah basically it's actually the PS4
reading that and I can talk a little bit about how that works as well because normally you go through this thing called the target manager through your PC and then it talks but we actually wrote a special file server to make it faster but I can talk more about that later but anyway so the idea though is that I'm building these assets globally I'm building the glue the game globally which also gets published up to the Z drive and anytime anybody runs they're running the latest latest version of everything and of course the code and asset code in assets yeah and also those those bin files and just coding assets and so when you first tell people that they usually look at you with this kind of horrific face like what how can you be do that that's totally wild wild west it won't work but it totally works and here's why because so as a programmer if I check something in and and it doesn't compile I see it immediately in the build button I get a get a message and that doesn't break anyone because they're not gonna get an on built executable exactly exactly so that doesn't break anyone it just makes the programmers that go Harry I can't build the code right so you immediately quickly jump on that you fix it worst case you just roll your change back and then you take your time to figure out what you did wrong or if you can fix it you just fix it if you check in something that breaks the game at runtime it's kind of the same thing you start getting a barrage of emails from people saying hey by the way what did I do roll it back or you figure it out quickly and you solve it but here's the thing what's the easiest problem to solve the one that you that you introduced into the codebase a month ago and you're not really sure what you did or the problem that you just introduced 10 minutes ago right so it turns out that that's just really really valuable for us and it same applies to an artist you know I check in something and I and somehow I managed to unlink a bunch of 
animations from the pack file so now nothing works oh my gosh what did I do well I was just working on this oh that's what I did so it was exactly the reason why pretty much the standard reason why we always try to reduce the turnaround time is because hey seeing the results has all kinds of positive benefits to it so yes there may be some downsides to it but that's really the most important word yeah exactly yeah and it turns out stability getting your stability that way is better than getting your stability in kind of a fake way by staging it because now you've got stability because you're running a game that's a week old yeah and so so we like it that's not for everybody I'd say it works better when you have a maybe a slightly smaller team yeah yeah yeah but it definitely works because coming total people are we talking about running so I'd say at the height of uncharted 4 we were probably about 300 plus people in the studio close it's pretty high yes it's you know every hundred you add is is that as a bad curb but it's like it's like it's not like oh we had fifty exactly and in terms of programmers programmers plus the TDS who are kind of like our extended tools Department it's maybe total of thirty forty people at most right so in fact probably more like well let's see and yeah maybe about thirty or so so that's a small enough teaming and we also kind of have the luxury of we tend to just we hire people who are pretty senior so we don't have an unfortunate that it's bad news in firstly that there aren't any internships at Naughty Dog generally because we just don't have the bandwidth with such a small engineering team yeah to be able to manage them to just to invest the time in training up a new intern or whatever yeah so I will point out the fact that although we've talked about a huge number of things so far you've still not told us the story that was supposed to start yes before we got to yeah no no it's good this is background that's actually important to 
understand that story okay yeah okay so so in fact the nice segue because the Y and the Z Drive what do these things really look like well they're really this thing called the NetApp which is a kind of like a giant NAS if you will like a giant network attached storage super-powerful machine that serves up these massive drives right this is just tons of drives stuck in it sitting in some room somewhere yeah exactly basically yeah and I mean I honestly I don't I don't administer the machine I don't think I've ever physically seen it but okay I've seen it in my head and it's very scary okay so so we believe there to be yes you must seem of this nature right if there isn't then it's it maybe it's some guy who's just writing I don't know but I think it's I think it's a machine so anyway what we discovered is so for all the way even into the into The Last of Us which was a big game we had some problems intermittent problems we'd have like tools randomly failing or you'd fail to read in a file or things would get slow and nobody quite knew why but and we would look at various things on Uncharted 4 as we got into crunch it really hit us because we actually have we have a farm of machines that build our assets so when you build locally what it's really doing is it's sending off jobs to this farm and on our previous games it was maybe you know something on the order of I don't know four hundred nodes on this on this farm and now we've got like 2600 nodes on the farm just to give you a sense of how much bigger Uncharted 4 got and what we started having are these kind of nightmarish things where the whole company would kind of go down and the NetApp wouldn't be responding and we didn't know what's going on it took forever for it to serve up files so we started and by we I mean not me but you know so they what they would do is the NetApp has some nice graphing facilities on the web so you can kind of see what's going on they would do packet traces and bear in mind that
this is like we're talking you do a ten second packet trace and get you know the tens of megabytes of data that you have to sort through and they because all of those packets are just asset Transport for it's gonna be I mean it's Consul exact response yeah people are building things all the time and everything and so they would trace it down to one person's machine let's say that's like whoa whoa there's one person kicked off these jobs that are really slowing things down and they'd go maybe say okay kill your build and so it was kind of like a manual form of load balancing where he'd say here too many people are building at once so this guy's gonna stop and you know she's gonna stop and he's gonna stop and then everybody else can finish their builds and then they'll kick theirs off later and okay it got kind of crazy we didn't but why was the NetApp machine having trouble yeah okay that's about so so so your business just my mitigation you realize okay we just need this to go away there right now and we're gonna dig into this exactly so good so as we dug in we discovered for example that so one of the issues was our exes sometimes our tools would just fail they wouldn't launch yes what's going on and so we found out that Windows actually because Windows is now talking through SMB to the NetApp which is effectively a Linux kind of world right and Windows has a timeout where if if just too long goes by between when you've requested something to run and so it'll just timeout and it'll stop so we okay well to deal with that let's move our executables only on to this low pressure drive a different drive right we started moving a lot of our assets off of the Y Drive on to our local C drives so like the programmers all have their code on C now instead of Y just doing a lot of those kinds of things and that helped another one that we discovered that was awesome is you know how Windows if you change a file and you're in an editor the editor goes bling such and such file has
changed so called change notifications well the entire Z Drive was doing changed notifications for everybody so we turned that off and that was very bad so that was an issue okay and so then we're like we need to upgrade the hardware and so we're like we got a better net app that's wider it's got more cores but each core is a little slower but we're like yeah it's okay it's good it's a lot wider it's me good and it actually got worse like what the heck's going on okay so Horace you mean so that drives 19 cores yes actual the CPUs yeah okay yeah because this thing is basically a bunch of drives and some CPUs that are sitting there running like an operating system that can deal with all these requests that are coming in which are coming in over over network packets right to say things like stat this file or give me open this file give me a file handle to this and so on it seems kind of surprising to me that you would need particularly heavy hitting hardware for this process but you would think but okay because there are just so many I mean and it's partly because of the granularity like just to contrast it with like in Chris's world you have you have like a giant blob that is the whole game or that is a whole level and and so there's almost no granularity which led to a whole bunch of terrible problems right but on our side we have very fine-grained files right so like each individual animation and there's lots of little intermediate files so now we have the problem of large numbers of files all in different folders and so on and so we discovered something about the way that the net app works and when you think about it it makes sense if you've got lots of different people you know doing requests like give me file handle stat this file these kinds of things it turns out that the net up has no option other than to serialize all those requests because otherwise you get into a situation where you you say oh so-and-so this file is this big but meanwhile someone else is 
changing it and right it's a sort of a basic multi-threaded kind of issue you've got a choke point and that choke point is is the NetApp serializing everything so I mean clearly it has no choice but to do it I mean it would take more logic for it to do it which it does not have so it cannot go like oh I realize that these two file operations will not conflict right or go I'll let them go in parallel whereas these two I'd better hold one of them right okay so it just goes anyone who touches the metadata is going in step yeah basically yeah and so by getting a machine with more cores where each core was a little slower it actually made our process worse and so with that insight we were able to again change some of the ways that we're doing things moving files around not having for example a giant folder with all the material the intermediate files all in one giant folder but putting them into subfolders so that you know you could so that if you're querying something you're not querying a huge amount of metadata you're querying small amounts of metadata that kind of thing and it's still to be honest we managed to ship Uncharted 4 with a lot of these problems but we didn't fully solve it and now we're actually looking at doing some other things like for example taking some blobs of data that really don't need to be files and serving them up some other way ok so so we'll see how far that goes but it just I guess kind of goes to show that you know this mental model that you might have of a file server is just this idealized thing that yeah you ask for a file it'll give you a file right no actually it's a it's a computer just like you know your PC and it's got software and it's got limitations and yes also they are presumably optimized for a specific thing too and if NetApp's primary thing that they tended to test on was not lots of little operations hitting the metadata and rather was just like what's the sustained transfer rates for large files and that sort of thing then
obviously they're going to have you know no idea that this is a crucial problem exactly exactly I think they learned a lot from from us howling about food saying ok ok this is a problem yes ok but so yeah so those are kinds of some of the things we did to to deal with that with that issue yeah no okay so let me ask some questions here from because there's a lot of stuff now that we for sure that we got a lot of yeah so you kind of talked a little bit about everyone running the same version of the game right and so when I'm when I'm thinking about that I'm just going like ok so we're all using the same asset base yeah and I am you know if this is a very mature type of asset so we've gotten to the point where we totally know how it works we're not really fussing with anymore I get how all the things you said are kind of just true it's like yeah ok can you know all makes sense is good but I'm curious about the the points in time when either like I need to make a change to how this asset fundamentally is stored or something like that because the runtime will read it in or you know maybe I I'm sure I think it like cases where that sort of thing happens how do you deal with the fact that ok I need to check in a change which the runtime is gonna need to read this new format of asset and I know all the assets are not in that format but everyone's reading off that and they may not quite have the look like so they seem like there's this kind of weird issue I don't want to say hey everyone stop for a second I'm gonna update all the ass all right so I'm gonna see like do you put in both versions of the code you do like to tell me a little bit about you know am i imagining this problem does it actually exist what doing what you know when I'm sure yeah basically we just never change anything no you're absolutely right there's a whole staging process and so again SuperDuper simple what's the simplest thing you can think of for doing this we do we probably do that ok so one 
example is let's say there's a wholesale change to the pack file format like we're introducing some you know new type of thing that just goes in there yeah what we actually do inside these build folders if I go to oops gonna get it right if I go to you know build and maybe I've got like animations and so it's like anim you know like a streaming animation and so we just have like a version number on this folder so it might be like anim 50 and then there's a bunch of assets and the game is configured to read anim 50 if I need to make an animation format change that that is just sweeping and I need to really make some major changes I'll introduce a new folder called anim 51 all right and I'll start building those assets as a tools programmer I can just build all those assets I can kick off because we do have scripts that will automatically build all our assets and then we do actually towards the end of a project do nightly builds of everything okay so we're cycling through everything and making sure that it's all checked and it's all working okay um and so I can just build everything it might take all night but I just build everything the next morning we test the game maybe I get a few people to switch over to 51 and we try a bunch of stuff and when we're confident okay it looks looks good fingers crossed hold on to your butts and then we we flip it yeah okay and then everybody is on 51 and worst case it all blows up and everybody says we get a hundred emails saying it's broke and then we flip back to 50 and we figure out what we did wrong and then we stage again that's one approach for smaller grained changes a lot of times you can get away with just doing something that's much more incremental so for example some of our some of our data formats might have a little bit of padding so that you know room for growth kind of thing and then I can introduce some new flags or some new fields in there and it's not gonna break anything because nothing's reading that data it's just zeros in the
original they have it another thing we sometimes do the geometry the folks were working on geometry processing so on introduced this their own little mechanism where they have a versioning system within the pack file where they can say okay I've got geometry version 47 and now I'm gonna do geometry version 48 and I'm gonna write some code that says if I ever read a 47 here's how to transform it into a 48 okay so we see that might put some like logic in there that's basically saying like hey okay I think it's not too big of a deal to put a little translation in there let's just do that and then I don't have to bother with this whole thing like hey everyone switch over to 51 I think we're good exactly and so that helps you and the nice thing about doing it that way actually is that yeah you can gradually introduce the asset another good example actually excuse me is animation so you know an animation is dependent on the structure of your of your skeleton of your hierarchy right and if you introduce a change to that hierarchy you have to rebuild all the animations in theory but we introduced the ability to do some live retargeting which is super useful so we can take an animation of Drake and play it on Chloe just as Anna you know maybe just as a temporary thing or maybe even to ship the game if it looks good and so we use that also to to basically say all right well if I've got an old version of the rig or animation that's targeting the old version of the rig and I've updated the rig I can have all those old animations still reference it so now you have this situation where the game is never a hundred percent on a certain version it's got like some of its current versions some of its one version back maybe some of its even two versions back and then hopefully by the time you ship the game you've been doing nightly builds and we do this thing where we lock down towards the end of the project so there's a there's a there's sort of a fallow period where people aren't 
changing too much yeah and where things can kind of settle out and at that point pretty much everything is 100% on the latest thing you know since you mention it here this is one things on the list to talk about too let's let's ask that lockdown question you said that you do have this sort of lockdown process and you said you have soft lockdown and hard lockdown yeah there's it's kind of not just one lockdown there's like you know lockdown and locked in there's really like that exactly so yeah just since it came up watch it give a little bit can give you a little yeah exactly I mean every studio does this in some form right you it at some point you got a ship and you've got to just stop changing stuff right because and you know you've got the designer it's like oh I just want to tweak this one thing it's gonna be so much cooler and you got to make that judgment call and I think this is one place where Naughty Dog's really it's one of our core competencies is making that judgment call between what's worth fixing and what's not and there are times when we'll do stuff right up to the end that other Studios would say that's crazy you can't do there like some some you know development directors somewhere with his you know spreadsheet or her spreadsheet is going oh my god like don't you know yeah how can you be change to this but we feel super important to tell the story properly or to or to just get that level of quality and on the other hand there might be something that nobody's ever going to see and like no we're not fixing that one breakable cover way over there that nobody's ever going to go to we're just going to leave it because it's too risky to fix it so the way we do that is we basically before a big deadline like III or we're actually shipping the game we go into lockdown periods in in there's a soft lockdown where basically you have to coordinate with your lead to check something in but usually we say yes so people are still going at maybe 75 cents speed 
right so it's almost a nominal part to just say like okay look now you're gonna think about it a little because you have to at least tell me yes before you do it exactly exactly and that's probably not gonna say now yes exactly and and speaking of the way Blizzard doesn't where they have code reviews for every check-in for us we feel like that's - it's too constraining so we'll do like little code reviews and stuff during that time like someone will you know one of the programs will come to me say and say okay I'm fixing bug number whatever and I'll say okay well tell me what the problem is and what your solution is they'll talk through it and we might even do a little code review look at the diffs figure out what what happened and then I might raise a few issues like I say well wait a sec what's the ramifications of that and say oh I already thought about that it's blah blah oh I haven't thought about that and so that's a really nice period of time to just make sure things are gelling and it also gets people thinking about to get them out of that mindset right because when you're developing it's full steam ahead make as many changes you can and get all the cool stuff in that you want and it's simple you have to make that mental shift - okay it's more important that the game is stable than that your little feature makes it in right like um so that helps with that now when you get really close this is it's essentially honor system it's like we're not actually sitting there like you know having to push a button for the check in as your lead I'd do it it's more just like look you're supposed to tell me if you didn't tell me like what's going on yeah exactly and we do once in a while someone will be like oh I didn't hear I didn't hear that we were in lockdown sorry I checked this in and then we have a look at it you know post facto and we say all right well it was all right it's good or okay let's back that out or whatever and so that might be depending on the deadline 
it might be that might be a week before it might be just a few days before or whatever and then just like really close to the deadline like on the order of days we'll this hard lockdown where you have to get approval for any change from like one of the principals in the company so be like heaven and Christophe or the game director Bruce Straley or a creative director or like Neil druckmann in the case of the Last of Us or in check for whoever is in that role and then possibly also like a programming lead so that we can kind of chime in as far as the the technical ramifications of that change and so at that point it slows us way down right and now any change we make we're making it with a full understanding of the risk versus reward hopefully okay that's the principle of it anyway and so that's the complete process essentially for going into those lockdown things so it's kind of just like basically saying we're we're putting impediments in place yeah in increasing like sort of orders of magnitude until we kind of get to the point where everyone agrees the game it's done out we go and a few other subtleties there to mention so different departments lock down at different rates - so a good example is you know like sound and particle effects tend to be at the end of the pipeline right because they can't be done until other things are done right you need to see how the animation looks at a time the sounds and exactly and if a change in the animations timing happens the sound has to be already done sometimes so so what you end up with is you have you know design and story at one end and then you have you know like maybe animation and geometry and these kinds of things in physics and then you have sound and particles at the very end of the pipe and so usually the lockdown will be everybody except particles and sound and they just keep going okay you know it's it's a it's a wave process it's like as people you know the earlier you are in the dependency chain you're getting 
locked earlier a little bit earlier because you know there's these people need to go home to see their families yeah there's certainly and and more more also and just equally important that because there's there's a chain effect right so if a design decision has changed it can have a big ripple effect now that said yeah and this is another thing I think not abducted as well is that we we don't make that a hard and fast rule it's not like right what design is locked I'm sorry even though the game sucks if we don't do this we can't do it no sometimes we'll do it and sometimes protocol and sound will hate us for it right right usually those decisions are made like a you know vast majority the time those decisions can be good decisions that have a really good impact on the game although again that's one of those places where we're trying to find the right balance because you know there is a very real impact on people's lives yes it's not just the game at that point it's like well you know we have to sort of be considerate because they are in a different position than we are because they are there are the people who have to come in last right exactly and in fact it's a big it is a big issue especially for particles and sound where they do tend to work till 2:00 a.m. 
for a I'm crazy crazy crazy stuff during the end and we are trying to to be more cognizant of those ripple effects so that we can recognize when they happen and we can make those decisions really really carefully actually so we're trying to get better at that now awesome yeah okay so let me look at first of all did you have anything else that you wanted to go down there because I don't want to hear okay one other detail I should mention is Rory said that we branch our code whenever we've shipped a product it isn't after it shipped it's actually well before we ship and so good example was during the development of the Last of Us uncharted 3 was still going in fact 1 2 3 was close to shipping while myself and a small skeleton team was starting to get the last of us going ok and during that time the last was branched off into u3 final branch or whatever and they were all working there meanwhile we could continue along in in the main code branch and push forward on the last of us unfettered right so that's another aspect of the lockdown so there's partly on our system and there's partly a branch and that's because like in that mainline branch you actually don't really distinguish between you know either that's the engine codes the engine code and so on and so forth or actually that's another thing that's interesting is that is that our code structure is kind of like there's a bunch of shared libraries down here and then there's there's something that we call common which is really part of the game but it's built like it so it's shared code but it's built in the context of that game so command whoops comm yeah common you can use that eraser that you did you kept yeah I don't know where it sits on the back of the mmm oh this isn't I believe look at that like it okay so common so this is code that's shared but needs to be built slightly differently for different games okay and then there's the game code itself and so and all of this lives in let's say the main branch when 
we branch off, we would make another branch that is, let's say, U3 final, back in those days, and it would have the shared code. Oh, and the game, of course: there'd be two, there'd be U3 and there'd be TLoU, or whatever. So in this one here we would still have the shared code and we'd have U3, but we wouldn't have The Last of Us in here at all, because who cares? It's U3 final. Meanwhile, The Last of Us can still continue developing here. It can change common code, it can change shared code, and not break this, which is now kind of locked.

Give me an example of something that goes in each of these layers. I mean, I can maybe guess, but just to be explicit about it: what goes in shared, what goes in common, what goes in game?

Yeah, perfect. So something that would go in shared would be the core animation engine, a lot of the low-level rendering pipeline, and a large part of even the Havok stuff, dealing with rigid bodies and ragdolls and these kinds of things. Common might be something like initialization code that sets the engine up. It's like boilerplate, where the setup is mostly the same but a little bit different for different games, so we want to be able to build it differently for different games, but it's still shared. And then game code would be things like Drake's code that does the rope and the sliding, all of these player mechanics, and very custom weapon things, or whatever.

Now, another thing is that, just for expediency, we sometimes develop code in the game and then eventually we realize, hey, actually this is shareable, let's move it down. A good example is this system we came up with for character dialogue that we call the Vox system. It started out in game and then it got migrated down into common, just as a
staging point, because there were still some dependencies on the game. We're actually still in the process now of removing the last few of those dependencies so that we can migrate it and promote it, if you will, to true shared code, at which point any game could use it and it has zero dependency on the game itself.

Sorry to drill down on this one point, because I'm not sure I totally get it. I've got my shared layer, and that seems pretty straightforward, because I'm literally thinking of this almost as a middleware library: I could give this to another game team and they could potentially use it. But the common layer, I'm not sure I quite understand what something in there looks like. You said, for example, the initialization code might get built multiple ways, but I'm still fundamentally talking about the same piece of code. Are we talking about #ifdefs? Are we talking about...

So the difference is that common lives in the same directory structure that all the shared code lives in, and it's treated as shared code, but in terms of the makefiles and how it's built, it's built as part of the game. So I can build the shared libraries and not even know which game I'm building. I say I want to build such-and-such shared library, and I don't know what game I'm building. It's just code, it's a library. But with common, I'm building it in the context of either U3 or The Last of Us or U4 or whatever. And so I can do things like #if, and I say Big 4 (that was our code name for U4), so #if Big 4, I do some special things. So that's all that is.

So essentially the code itself is actually not changing at all. We're strictly talking about preprocessing kinds of operations that are happening on it, but we like to keep
those segments separated out, because for anyone who puts something in shared, we want to make it very clear: look, this thing has to not have any of those #ifs in there. We don't want this to be something that's got janky stuff in it, maybe building differently.

Yeah. Just as an anecdote, when I was at EA I was looking at low-level material code, like shader code or whatever, and there was something in there that said #if Pacific Assault, do this thing for the player's helmet, or whatever. It's like, really? In the lowest level? So we actually disallow it. Our makefiles don't even pass those #defines down. You can't do #if U4 in the shared code because it just won't compile. But in the common code you can, and so it's a nice staging area for promoting things down to shared, and also just for places where you don't have any other choice.

Now, obviously, if I'm understanding you correctly, there would really be no cost to you, in the pragmatic sense, of allowing people to do such #ifs anywhere they wanted to. So this is more of a discipline thing, where it's like, look, we don't want to constantly be having this thing where I go into, like you said, the shader code and I see all these #ifs in there, and I just started on The Last of Us and suddenly I'm trying to wade through the Jak and Daxter ones or whatever. So we're really just doing this as a way of articulating to the programmer what is expected by the time we get to this layer?

Yeah, no, this is something that we can just go with. But chances are we don't really want to make all of this code 100% general, because there are some things that are just a lot more expedient to do that way, and those can go into the common layer. Now there's another... so,
actually, let me speak to the reason why we do that first. It's partly just that it's good housekeeping, and it sort of feeds my OCD, which is nice.

It keeps you honest, I guess, a little bit.

Exactly. But there are other reasons. For example, the shared library is quite large; a good majority of our code can be in shared. And if I can rebuild a shared library and then literally share that binary, that lib, between two different games, rather than having to build it differently for different games, that gives us a benefit. So there's that.

And that is from the perspective of, for example, an engineer making a change who wishes to see that change in both engines, if it's relevant. So at that point it's like, well, I cut the time in half, or perhaps not half, but I am saving a little bit here, a little bit there.

Exactly. So those are two good reasons. I guess the third is that it, again, kind of forces you to think about your architecture and make a good mental distinction between something that is game-specific and something that isn't. Now, that said, you're quite right, there are various hooks that we have in the shared library. Like in the dialogue system, there might be a place where you need to make a game-specific decision as part of the normal logic, and so for that we use a little callback system, or a virtual function, or a function pointer, or some mechanism in there that will allow us to say, now thunk over to some game code, run that game code, and then come back with whatever the results are. So there are mechanisms for talking to the shared library in that way. I think of shared as almost a framework for our engine, more than a library per se.

Got it. And then the game kind of slots in the back of it. So in some sense shared is a very large piece of the codebase, and in
general we're trying to keep the game part that we swap in and out much, much smaller. So if we can promote something to shared, we will?

Yeah, exactly. And again, all this is tempered with pragmatism. There might be times, especially, let's say, when we do this branch, U3 final or U4 final: if I'm making changes to the shared code in the main branch, I certainly don't want to be propagating them up to the other branch, because it could be risky. So it's a judgment call, but that's the basic principle that guides how the shared stuff works.

Okay, that makes more sense. All right. I guess I should ask, then, is that everything you wanted to cover from the stuff that we talked about there before?

I think so, yes.

I just want to make sure, because I feel like I dropped the ball a little bit on this one. I'm imagining, although this may be an incorrect imagination, the mythical NetApp device and so on. You mentioned in the preamble that there are some places where Linux really shines. Was this related to the Y/Z drive situation? And if so, can you tell us what that was? I've got to crack that Easter egg.

Yeah, we kind of skipped over that a little bit. And again, I'm not a network administrator, so I'm kind of echoing what my colleagues have told me about what happened. But basically, if you're on a Windows machine and trying to talk over Samba to this Linux box that's serving up all your files, it turns out Samba as a protocol is very chatty. There's a lot of network traffic that goes back and forth. And then you've got issues with Windows itself, like the timeouts on the executables and these kinds of things. And so, the standard way that a
PS3 or PS4 comes to a developer, there's a thing called the Target Manager that you run on your PC, and the idea is that it'll serve up files, probably just from your drive or from a networked PC drive. It wasn't really designed for this whole hybrid Linux/Windows thing, and it's kind of slow to load data that way. So we realized that way back, and Christophe, our co-president, said, all right, I'm going to just write a server on Linux that's effectively like a web server that just serves up blobs of data. And then the PS3 or the PS4 can just resolve that IP address and talk directly to it and get its data that way.

So effectively, all the normal things that Samba does, like discovery of who's on the network, and why do you have this drive name, all of that: it's like, look, we already know who we're talking to, let's just get rid of all this and give me the data.

Exactly. And so although, from a sort of morphological point of view, you're still going from the PS4 through a Linux machine and then to the NetApp, versus PS4 to PC to the NetApp, it turns out that path is quite a bit faster. And like you say, we can eliminate a lot of these extra bells and whistles that we don't need and just serve up the data. We got to a point, actually, where the game couldn't keep up with the amount of data we were feeding it, because it was just so efficient. So it's really nice. Now, it's not exactly the same as if I'm running off of an actual pack file that I've built, or rather, I should say, a Sony package file, a disc image.

Or if I'm running off...

Yes, exactly. The loading behavior, the streaming behavior of our engine is actually a little different off of that than it is over our network, but we aimed to get the throughput to be... like, actually, the network's faster than reading off a Blu-ray or reading off of the hard drive. So anyway, that's another aspect, actually,
to making a console game: you've got to think about it and test it both ways, because the performance might be different.

I'm not sure I understood why Linux shines in that example, though. I think I might have missed that part. It sounds more like your team shined.

It's a little of that, but it's also that, I guess, the NFS protocol is just inherently a little more efficient too. I'm not a hundred percent clear on that, but I believe... yeah, it's one of those hand-wavy things where, again, Linux is more efficient on that side.

Okay. So I guess that took care of those two things, because I had file server as well, which I guess is exactly that thing with Linux. So, one of the things that you had mentioned when you were saying here's some stuff that could be talked about from the sorts of things that we do... I guess I'll randomly pick the order here, because I'd like to get to both of them; we have about 20 minutes left, I think. You mentioned that you do custom version control for assets, and I was wondering about that, because you described a thing where, okay, I can have different directories with data. Is there something beyond this? Give us a little perspective on what you've built there. You didn't just want to check it into your source code control system, right? Perforce, for code?

Yeah, it's good to remember that one. So when you think about the different kinds of asset files that you're dealing with: source code, and also these Racket or Scheme files, these DC files, JSON files, text files, all that stuff works well under Perforce. Perforce is great for those kinds of things: relatively small files, text formats that can be diffed, you can look back and see the history, and so on. Not so good for gigantic Maya files, or big textures, or PSDs, or ZBrush files. So what we do is, anything that's
on that sort of big-binary-blob side, versus the small text file... and I've seen this at some other software companies, so it's not a new idea, I don't think, but we came up with a new kind of asset control system, and we custom-built it, for better or worse. Now it's a bit of a thorn in our side, because we have to maintain it, but it can also be very powerful. It's got some limitations, but it works pretty well.

The idea of it is this. Any of you who have worked in a studio might know that whole ritual where you come in in the morning and you sync the latest assets, and then it takes fifteen to twenty minutes, or an hour, or two hours, and you go have a coffee and talk with your friends while you're waiting for all your assets to come down.

On The Witness, I believe we actually had a thing that someone wrote that does this process for you before you get in. Let's just make a thing that does it for you.

Yeah, you talk to the iPhone, like the location services, or you just run it at 4:00 in the morning. But the thing is, think about it: we've got a giant database of very, very large files, Maya files all over the place, and any one person doesn't need all of them. Well, we need all of them, but we don't need all of them to be editable and to be actually on a machine. Each artist is only really working on some very specific, small set of Maya files and so on. And so this kind of asset control system, like I say, I've seen it elsewhere. It uses Unix or Linux symlinks. Basically, you have a master repository with all the assets, and there's one copy of everything that's the latest version, and there are also previous versions that are encoded in some way using directories or file names or something. And then you just have symlinks to the latest version on your local machine. So on your Y drive, under, you
know, Y, U4, there would be an art folder, and that art folder is actually managed by this tool that we call BAM, which... I don't know what it stands for. Oh, actually, I think it stands for Big Asset Manager, because Big was the codename for Uncharted. That's an interesting anecdote: Jak had a codename of Next, because they were like, it's the next... okay. And then Uncharted was Big, and then The Last of Us was Thing. So then we had the Next Big Thing.

That's pretty amazing. I mean, you guys can't start another franchise. You've got to come up with another catchphrase here. I guess this period could be an ellipsis, and we go on.

Yeah, I don't know. Anyway, what was I saying?

You've got symlinks.

Symlinks, yeah. So any given artist, on their Y drive, they've got all these symlinks to every asset in the game, and it's the latest version. And then they've got this tool, which we've also integrated directly into Maya, so they can just go in and check out a file. What it's really doing is removing your symlink and copying the real asset down, so you now have a copy on your Y drive. You edit it and do whatever you want to do. It's now locked by the server so nobody else can edit it, and when you're done you check it in and it becomes a symlink for everybody. And then there's a little server that...

And the link is marked as read-only or something, so someone doesn't...

Yes, so part of that process of checking out, which breaks the symlink, also turns it into a writable file. So Maya would say "I can't write file" or something if you were to do the process wrong.

Exactly.

So obviously there are a lot of benefits to that, because everybody has the latest versions of assets all the time. If someone checks something in, everybody just updates, kind of magically, which is cool. But there are obviously downsides as well. Like,
that's a custom tool, so if something breaks, who are you going to go to? We've got to fix it. And the people who wrote it have since left, so now somebody else has taken it over. We've got poor Jerome, who's on our tools team, and he valiantly took up the cause and he's owning it now, but there was another engineer before that. So there's a cost to it.

And I feel like that kind of tool especially has a cost, because once you're talking about something that the artists are supposed to use, typically it makes a big difference if it's integrated into an art package. And once you're talking about integrating into an art package, you're talking about maintaining compiled builds for each version of this art package, which definitely kind of opens up a can of worms.

We deal with some of that complexity by... I believe I'm correct about this, we exposed a Python interface to that tool.

So, that coupling...

Yeah. And that's the thing, actually: too much coupling in general is always bad, especially very tight coupling, as evidenced by Chris's talk, where everything was super tightly coupled and you change one float and everything rebuilds. So here we're breaking that by just saying, okay, we've got Python interfaces to things, which is much more portable, and you don't have to worry too much. But even then, every time we move to a new Maya version there's a big effort of just making sure everything works, because they'll change stuff under the hood.

So, implicit in what you described there, I guess: this would mean that you essentially just store every version of every asset, meaning you just buy more drives? You just keep piling drives in every day to this NetApp that we don't know if it exists? Or is that kind of...

Except that we do allow ourselves to purge assets after a
certain time. So we might keep the last ten versions of something, on a purge, say. What we usually do is wait until our head of IT starts screaming that our drive is running out of space: okay, we'd better run a purge on this stuff. And we'll get rid of some old friends. But it's dangerous, a little bit, because what if there's something that we need? So, back to the compression thing: if you could predict exactly what you needed, then it would be great, but we can't. We've definitely had a few cases, it's just very rare, where we've had to go back to a tape backup or something to get something.

Oh, so you do, like, off-site... maybe not off-site, but you do have a sort of purge to permanent storage? It doesn't purge like it went away forever?

Right. Well, what it is, really, is that we're backing up the Z drive and everything else periodically, so even if we purge something, there's probably a backup from before that has it.

Okay. So I'm going to ask one more question, and then I wanted to ask a sort of tie-in question. You mentioned, and this was again in the email about what we can talk about, that you tend to use command-line tools, simple command-line tools, a lot, and that was part of the keep-it-simple mentality there. But then you said lately you've started to do more stuff like C#, PyQt, Qt tools, right?

Sure, yeah.

And so is this just that you were finding, like, we're not able to really do everything we want to do in Maya or something, and we need our own custom editors, and so you've been experimenting with that? Can you give us a little... you know, that's three different things there, right? Why three different things? Is it because of what people are comfortable with? Are you experimenting? Are you going to consolidate? What's going on there?

I would say it's largely that, excuse me, that you
have a team, and different people have different ways that they want to solve the problem, so we'll end up having different solutions coming from different areas of the company, and that's fine. A good example: if there's a tool and the artists come to the TDs and say, hey, we really need this thing to be automated, I need some way to manage my rig and I'd like to have a graphical interface that looks like this, they go, I'll do that in PyQt, because I know Python, and it fits really well into Maya, because Maya has got the whole Python thing now. So that's a good solution for that problem. Whereas, let's say, our level editor: the programmer who worked on that, Dave Smith, was well versed in C#, and it's got OpenGL hooks and all that stuff, so it was like, let's just do that. I wouldn't say that we've got any kind of overarching studio policy that says thou must write thy thing in Qt or whatever. So we're pretty flexible on that. And that sometimes bites us in the butt, because some people might not be familiar with C#, and they go to try to change something and they've got to go and talk to Dave: okay, let me give you a quick primer on how to build it and everything. And then you learn it. So, pros and cons. I think we are starting to move more towards Python for those kinds of things, and Qt, kind of as a whole, but it's not a mandate. It's still kind of in its earlier stages.

So maybe five years from now you're like, okay, this is how we generally go about this process?

Right, we're kind of feeling our way there. And the other thing to be aware of, too, is that not every tool is ideal as a GUI. There have been ideas floated sometimes: well, let's put all this stuff in a SQL database or something. And then we're like, but wait a sec, what
about looking at the version history? Or what about doing a simple text search to just be able to find something? So we tend to, again, try to take those ramifications into account and say, let's choose things that are going to remain simple enough, and let's not lose some of the benefits as we move towards these things.

I will say, too, this is kind of interesting: the way we build assets right now tends to be on a per-asset basis. The way our game worlds work, we have a streaming engine, so we can stream in pieces of a world. Just as an example, think about a scene from Uncharted 4 where there's maybe a big cityscape or something. That geometry would be built as, maybe, the buildings that you're actually standing on and moving around; that's one of what we would call a level, but it's really just a chunk of geometry data. Then there's a mid-ground, where there might be a bunch of levels that are like this area of town and that area of town. And then there's maybe one that's a wide, which is all the really low-res geometry that's way out there that you're never going to get to. Those assets are divided up into pages of some standard size. I think we're at one-meg or two-meg pages now, but it used to be 512K. We can load pages in and out, so as you're walking through a level you can stream pages in and out. But each one of those levels is a pack file. So if I'm trying to build this entire cityscape, I might need to know that I've got seven levels that have to be built. As the background artist, I know this, so I'm like, I was working on the wide, so I'm just going to do a BL, which is the equivalent of BA but for levels, still just generating a pack file. So BL this, the wide for the city. And so that model is, you could
think of it as a pull model, right? I say I want this end product, and it pulls back on all the dependencies and figures it out and eventually sucks in...

Like traditional make.

Yeah, very much like make. Although the difference is that make is based on file times; that's really the only criterion it can use. And so it's usually actually way too coarse. A really good example: let's say I add some special joint to this rig that's only used by the game to query the position of my nose, a nose-position joint. It doesn't affect any animations at all, but if you were to use make, it would say, oh, the file's new, rebuild all the animations. So we want to be careful of that.

So you really want your graph to be as explicit as possible, because that reduces the number of things that get touched on a rebuild.

Exactly. You want it to be granular and you want it to be explicit. In that case we have this idea of a hierarchy ID, and these little helper joints aren't part of that, so it doesn't cause a rebuild, as an example.

So anyway, it's a pull model. What we don't have right now, and we're moving towards it, is... well, we have nightly builds, but that's really just a bot that is pretending to be a user, pretending to be manually going BL this, now BL that. It might be out on a bunch of machines so it's in parallel, but it's really just somebody building. So the problem that we have right now is that when I'm Joe User at Naughty Dog and I sit down and say BL some level, it's actually pulling my version of the data. The reason for that is we want to be able to iterate. We have this idea of a local build, but there were bugs in it and sometimes the local build wouldn't turn out properly, so we wanted to reserve the right to be able to do, effectively, a local build of a global asset using my
data. And the idea is I would do this, I would test the game, and if it's all good then I would check in my files and I'm done. If something broke, then I can undo my change, BL again, and it would put it back to the old way, and we're good to go. But what this means is that mistakes can happen. For example, let's say I go into Perforce or BAM and I check out a file, and I'm working on it and trying something, and I build a few times, and then I'm like, I don't really like that. So I just say, all right, I'll wait till the nightly build overwrites my file and I'm good. But I forget to undo that checkout. So now I've got an old version of the file. Now if anybody else in the company builds this level, it'll be fine, but whenever I build it, it'll go back to the old version that's checked out on my machine. We had a bunch of problems like that, actually, where it's like, why does this asset keep reverting? I thought that it was... And so we go over and say, undo the checkout, and, oh, sorry, okay. And then it was good. So we're moving now towards having a pristine global version of the assets that are built by this bot and that nobody can touch. That way it's absolutely for sure, and the only way to get a global build to actually go out to the company would be to actually check it in. That way you just don't have any of these issues. So we're moving towards that, but it's not quite there yet.

So, last question; I'll try to sneak it in here with five minutes left, and it's not from the list. When Chris was describing their asset system... this is why I kind of like having listened to both of them now. It sounded like one of the really difficult problems in some of the things they were dealing with was just that a lot of their processing wanted to look across a large number of disparate assets, right?

Yep.

And this was a fundamental problem in terms of why the system got
slow, because if that was not happening then it would be very easy to start isolating these graphs. So it sounds like your system just doesn't really even have that happening very much. And so what I'm wondering is, did you find this to be a limitation, and you just live with the limitation? Or is it that, no, actually, we compensate for that by doing a lot more runtime processing of things? Do you see where I'm going with this? Why don't you find that you need these things that kind of umbrella out?

Right. Well, I was thinking about this myself during his talk, trying to get my head around why they had these big problems that we didn't. And unfortunately he's not here; I would have liked to get his input on this. But it seems to me like there are two factors at play that caused that monolithic thing. One of them is that you have a bunch of assets, just little dots, each of these is an asset that goes into the game, and you've got a whole bunch of source assets. And it seems like their dependency graph looked something like this, where everything depends on everything. So what I end up with is, you have the make-game button, and when you press that, 15 million tasks go off and it tries to build everything. But if you think about the structure of this... and maybe some of that comes, actually, from premature optimization. I don't want to put words in his mouth, but it seems to me like, for example, not building a certain shader feature on a certain crate in a level just because the game isn't referencing it anymore is maybe one bridge too far. What we would do is say, okay, this crate needs to have that capability. I know that it's used in at least one place in the game with that capability. So just in the asset database, in our metadata, we
just say, support this feature in the shader. And if nobody pulls on it, whatever, it just supports it. So it's a little less optimal, but you now have a very loose template. The game object can either pull on it or not; who cares?

One way to say this would be: rather than trying to automate all this stuff and have this intelligent system that goes out and figures these things out, just let the artist say that this is what they want to have happen during the build. And if they get it wrong, it's okay, because they'll see the results.

Yes.

And using their intuition, they can fine-tune that to the degree that it's needed.

Exactly. So it boils down to fully automated versus maybe a little bit more human intervention, and the human intervention always wins, because humans are smart. So if you think about the dependency tree in our game, it might look more like this, where this thing depends on those things, and this depends on these things, and this depends on these things, and once in a while you might have something that depends across. But the tree looks more like this. And so, talking about splitting the graph, there are all these lovely natural split points in this graph. What you really have is a forest of lots of little trees instead of one giant Gordian knot of dependencies. For that reason we can just pull on one of these things and the build maybe takes five minutes, or fifteen. Now, if it's a light bake then maybe it takes hours, but it depends on the type of asset. And we also leverage the fact that we have different flavors of assets and each one has its own requirements. Whereas it seems to me like in the Halo system it was kind of like, ooh, here's a cool hammer, everything's a nail,
you know, put it all into this one tag file system and everything's a tag, which sounds cool. It's kind of like, you know, in UNIX everything's a file. It's awesome, but maybe again just one step too far, and maybe not leveraging the fact that different assets have different qualities. And so we can say like, hey, animations only depend on this stuff, and we know that it's a bounded problem. So I think that's probably why there's that difference. Yeah. But then, like you said, the flip side is that we then buy into this idea of having large numbers of little files on a file server, and we ran into that.

And also I suppose you also have the situation where it's like, well, okay, so now there is cognitive load for everybody who's ever dealing with these, where like I now have to understand the concept that I need to set these dependencies up properly and think about how I'm, you know, specifying what needs to get built for what, which, you know, if you imagine the intelligent system had worked, right, you could get rid of. So it's kind of this thing, it's like, all right, but we're just saying, you know, we just don't think we're gonna get there. Yeah, exactly.

I mean, think about it like, any game you ship, there's this absolute sort of platonic perfection of the system and then there's the reality of what you actually ship, right? And it sounds like they were trying to go for as close to that platonic perfection as possible with zero human error, but the cost they paid for that was just massive build times and this explosion, whereas we actually go the other way, where we say let's just make errors as obvious as possible. So one example is if there's a missing texture, it's the scrolling, you know, rainbow thing. It's just very obvious, right? Or if something fails to connect properly, like, you know, this guy said that he wanted that asset but the asset was renamed and so the connection is broken, then it'll just have a red error
message over the guy's head saying "could not connect to my nav master," whatever these records, right. So that's the approach we take. Yes, we're gonna miss some things. There's definitely stuff in our game where technically, like, we might not notice it till we're playing it at home and we're like, oh darn it, look at that, that's a stupid thing, how did that get through, you know. But that's the trade-off. Yeah. Well, we are out of time. Thank you so much for [Applause]
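The two ideas discussed above, pulling on one target in a forest of small dependency trees and making broken connections loudly visible instead of failing the build, can be sketched roughly as follows. This is a minimal illustration only, not Naughty Dog's actual build system: the asset names, the `DEPENDENCIES` table, the `PLACEHOLDER` value, and the `pull` function are all hypothetical, and the sketch assumes the dependency graph has no cycles.

```python
# Hypothetical asset names and dependency table, for illustration only.
# Two separate little trees: one rooted at "level-a", one at "level-b".
DEPENDENCIES = {
    "level-a": ["hero-model", "hero-anim"],
    "hero-model": ["hero-texture"],
    "hero-anim": [],
    "hero-texture": [],
    "level-b": ["crate-model"],
    # "crate-texture" has no entry (imagine it was renamed), so this link is broken.
    "crate-model": ["crate-texture"],
}

# Stand-in for an impossible-to-miss fallback such as a scrolling rainbow texture.
PLACEHOLDER = "RAINBOW-PLACEHOLDER"


def pull(target, built=None, errors=None):
    """Build `target` and only the assets it transitively depends on.

    Assumes the dependency graph is acyclic.
    """
    built = {} if built is None else built
    errors = [] if errors is None else errors
    if target in built:
        return built, errors
    if target not in DEPENDENCIES:
        # Do not abort the build: substitute something obvious and record the error.
        built[target] = PLACEHOLDER
        errors.append(f"could not connect to {target}")
        return built, errors
    for dep in DEPENDENCIES[target]:
        pull(dep, built, errors)
    built[target] = f"built({target})"
    return built, errors


built_a, errors_a = pull("level-a")
print(sorted(built_a))  # ['hero-anim', 'hero-model', 'hero-texture', 'level-a']
print(errors_a)         # []

built_b, errors_b = pull("level-b")
print(errors_b)         # ['could not connect to crate-texture']
```

Pulling on `level-a` never touches anything under `level-b`, which is the point of having natural split points in the graph. The broken link under `level-b` still produces a build, just with a placeholder and an error message that a person will notice.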
Info
Channel: Molly Rocket
Views: 21,556
Rating: 4.9238095 out of 5
Keywords: Handmade Hero, HandmadeCon 2016
Id: gpINOFQ32o0
Length: 91min 29sec (5489 seconds)
Published: Sat Sep 23 2017