The Importance of Scalable Code // Code Review

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey what's up guys my name is China welcome back to code review Series where we take people's code we review it we talk about it we're still on this game about the Flying Dog situation but I think that's coming to a close so Channel review gmail.com email address will be in the description below as well with some instructions send in your code we're going to use it we're going to make an example out of you I mean that's really that's that's what's going to happen so uh for those of you who are brave enough to do that please do that uh I just want to basically today talk about this thing and I think we're going to be more or less done with this code base um I noticed this last time so last time by the way we talked about like how to write better header files which is quite subjective but my opinions on that at least and we also talked about how to debug the preprocessor so you can see what's going on behind the scenes how to like track which header files are included where quite a lot of useful tips I think for those of you who are new to that kind of thing I'll have that linked up there definitely check that out and I remember noticing um all of this code down here which you know when I see like this many vectors of what looks like like the same kind of stuff uh that's that's that immediately jumps out at me and I think we can talk about how this can be improved and I have a quite a few things to say I think about this because you can see as well that in the actual update like function which I guess happens somewhere over here game render for example game update same kind of thing it's going through like all of these uh collections here iterating through all of these Collections and doing stuff and a lot of the stuff is doing is pretty much identical um and yet they're split up and what's even more interesting about this particular implementation is they're not actually different types they're split up uh well it looks like they're split up just for the sake of being like this is the black birds these are the green birds these are the blue birds and it sounds a bit weird like why would you split them up but I can kind of see why that's done because just from this context it makes no sense but uh you know what's important when you're code reviewing code and when you're writing code when you're reading other people's code is contacts you need to know the whole story like you need to know more and I mean this isn't even purely about context it's more about like well okay what are the other uses of this kind of thing and if we scroll down a little bit you can see that in game Collision there's a different kind of situation that happens there's different Behavior based on what is colliding with what and so like player versus Blackbird player versus green bird is presumably different now it's not I'm immediately clear why it's different I mean you can see the score number uh changes differently and the number of lives for example goes down if you hit a blackbird but if you hit a green bird you just lose score it looks like you don't actually lose any lives um you know we spawn looks like some Sparks probably the same amount of Sparks this uh set to kill gets gets triggered I don't know why this is called Val as well it's a little bit weird I would just call this bird or something so it's like very clear what it is but instead Val which I'm assuming stands for value is a little bit of an interesting uh use of language there anyway so down here if you collide with the bluebird you get score right and and all all of the time you can see the amount of code that's duplicated here right like this exact code you can see it's it's showing up here it's being highlighted by Visual Studio that's the same this is the same this is the same the only thing that changes or the only two very variables we really have here are basically like this score number and the lives number so these are the two things that could be somehow parameterized right so that basically the same exact thing happens and by the way the actual if check that's doing the Collision right that's obviously going to be identical because it's just a like one hitbox versus another hitbox right we're colliding two rectangles together we're seeing if two rectangles overlap it's really what we're doing like that's obviously going to be generic and the same as well so my point is my like my really big issue with this is not even getting into like the fact that these are all in different Collections and this is bad and we'll talk about why that's bad in a minute kind of from the computer's point of view because you can probably see why it's bad from a code point of view but then also like with the whole Collision situation like you know this just really really needs to be made to be more generic so that we don't have to repeat this uh it it could be collapsed into a function really easily as well which would be very very helpful and really I think this could actually benefit from being handled a little bit more in the way that like a physics engine and would do this which is basically like you could have a collision callback and you're like okay here is object a here is object B what do I do and technically like down the whole physics engine like kind of path if this was implemented in a game engine you know like hazel or Unity then you might actually have written the Collision callback on the player class so you could have like a player entity class and then you could just have a collision call back there and then that way you at least know that you're definitely checking the player's collision with something like one of those objects is immediately known because that callback wouldn't be called if it wasn't you the player who collided with something therefore you would only ever have to be like is it a blackbird okay then you know subtract this amount of score is it the green bird then okay dude this is the blue bird do this right so you could kind of have that all collapse within that one function and then if you want it to be even more kind of special you could just have like a function called like get Bird score or something which could take in like I don't know the ID of the object or whatever the you could even be like you know I guess you could be like the type ID of the entity that you've collated with and then if it's like the Blackbirds type ID like you could have it mapped out basically in a literal map where you just have to look up a type ID to get the score that should be added and so in this case it would be -50 in this case it would be -30 and in this case it would be just 30 right because you're adding it lots of different ideas here all of which I think would be an improvement um so yeah let's kind of dive in and talk about this in more detail but first I want to mention that this video is sponsored by brilliant.org for those of you who don't know what brilliant.org is it's a fantastic website filled with a large number of high quality courses on various stem topics they have a great selection of beginner computer science courses that can really help some of you out if you're just taking your first steps to being a programmer and then personally what I love brilliant for is their catalog of math courses they have everything from like the the beginning like the Everyday Math course which is really simple and we'll cover some basic math concepts all the way up to to something that would be more useful for even someone like me who has quite a lot of experience and by that I'm talking about things like calculus which every now and then I do need a refresher on and Brilliant is great at that why specifically brilliant because all of their courses are extremely engaging and interactive they present all of their content visually and on top of that give you like these widgets that you can play with to see like how the numbers affect the results this kind of visual way of learning I find is well for me it's the best way to learn and on top of that they'll also quiz you after all of these lessons to make sure that you're actually learning what they're teaching you and that really really helps to retain that information the best thing about brilliant is you can get started for free they have a free 30-day trial which you can use to check out everything they have to offer brilliant.org the Cherno is the link for that it will be in the description below and if you do go on to like it brilliant have been nice enough to offer the first 200 subscribers 20 off an annual membership huge thank you to brilliant as always for sponsoring this video how cool is this uh coffee cup all right let's get back to it by the way as I'm scrolling through all this all of this stuff like my plan honestly for this video just to kind of talk about what I would do I realized that on a surface level like heart like hopefully that's helpful to you guys but I also realized that it might be better maybe for me to actually physically rewrite this code the way that I would the reason why I don't tend to do that is because you know just talking about it and trying to teach it that takes up like 40 minutes right and then if I try and implement it that might be another 40 minutes to an hour and so it's just too much so I'm just wondering like if that would be something that you guys would like to see in a dedicated video maybe even it could be like a patreon only video yeah I don't know so let me know like if that would be useful or if maybe my explanations are more or less like in depth enough that it makes sense for you guys to do it because that's the other thing all right it's good I I think from like a teacher's point of view like it's kind of a it's a good thing for me to explain all this for you and it's almost an exercise for you to try and go and figure it out on your own rather than having me kind of give the answers to though I guess that would be useful if you're lost or you know just to check if you were more or less on the right track I don't know what I'm doing all right so the first thing I want to talk about is why this is not great um so we when we iterate through all of these things you have to remember that there's a there's like I don't know I I think it's important and maybe I'm just a little bit too deep in like the low level game engine side of programming and that's totally okay these code reviews are really just about me expressing my opinions uh obviously this game runs fine like there's nothing really wrong with the way this game runs I don't think I didn't really measure the performance but it was fine I mean I do have a pretty beefy computer maybe it struggles on some platform problems but that's besides the point the reason why I'm giving the advice that I give is kind of twofold first of all it's because when you scale these projects and when you kind of maybe work in the context of a larger system in a team or even this game you know presumably you want to make something more complicated than this game in your life and when that happens and you scale these things some of these kind of programming techniques don't extrapolate particularly well meaning that if you start scaling them like if you suddenly make a game and there's 50 different types of birds like are you really going to have this here right it reminds me of something polymar's tweeted actually like yesterday polymods Twitter is just literally full of filled with just like memes yeah here it is this was like yesterday how high did you make your is even function it's taking forever and the code is this so my point is that like I want to like help you so that it avoids situations like this where clearly like something like this as stupid as it looks might be okay if you were just doing it for like four numbers right like of course there's an easier way to do this for those of you who don't get it you would just do mod 2 equals equals zero that's like what you return you know number mod 2 equals equals zero whatever that will tell you if it's even or not and I would still write that to be clear but if you if this is all you're dealing with then it's not really a burden to write that versus this and maybe you could make a case for the CPU being able to execute this a bit faster I actually don't know if that would be faster than like a few branches especially if it happens to be like number one also what happens if it's zero I guess it doesn't I guess there's like an else at the very bottom that just says false or something but that would be wrong because zeros I guess zero is an even number is zero an even number this is deep you know this is a deeper tweet than I thought it was but anyway the point is like if this is all you're dealing with like you're just dealing with this block here then like yeah it might be not the best way to do it but it works for your case and at the end of the day most the most important thing at least to someone who I guess releases software as well not just like makes it for the fun of programming and then never ships it as someone who ships software like the most important thing in my head is like the consumer right like the user the the person the player of my game the person who's gonna actually like be the client right download this software and run it on their computer that is the most important thing the second most important thing is probably like the rest of the team and how they can deal with the code and if they can like if it's good for maintenance if it's good for kind of you know building upon is it easy to read that kind of stuff but the most important thing is always like the client and the reason why I think that's a really that's a really important distinction is because code might look very very beautiful but it might not perform optimal and so I guess because cause the client is number one in my mind I would prioritize performance over code readability and I think that like that's important that's what I think anyway all of that aside the point is you can see how this doesn't quite extrapolate nicely like if you need you know ten thousand then you're gonna have like 10 000 lines of code and I think he even had a follow-up tweet to this follow me for an exclusive discount on my new is even function featuring over 20 millions of 20 million lines of code support for numbers above 20 million coming soon which is kind of hilarious I think so I guess props to polymers for uh that tweet I don't know if he actually came up with this if it is like recycling memes but you know I won't get into that so the 50 bird scenario is not going to work and so I just want to be clear that like that's kind of why you know we're talking about this stuff um from like a code point of view this doesn't extrapolate nicely it doesn't scale well but also from like a performance point of view I don't want to mention some things that are going on here so from a performance point of view I think one of the most critical things that you can think about that isn't like immediately obvious to people with less experience is memory and how memory is more important than you might think and the reason why see this is why these videos are so long I think it's not because I it's because I just talk about all right whatever when people or should I say less experienced like beginners and I'm not saying this in a condescending way I'm just saying it in like a factual way maybe that is conflict the kind of entry-level way to look at memory is okay I have 32 gigs of RAM so I better make sure that I don't allocate more than that that's it right so if I'm allocating more memory that I need that's a waste of memory and so people look at it as just like this is a resource you know you have a finite amount of it like 32 gigs or whatever on the system obviously other applications are using some of that I can use like 30 gig or whatever people who have computers that have eight gigs of RAM well I better make sure that if I want them to be able to run out I better not use more than like maybe like five or six gig or whatever it may be that's like the level one way of looking at it the level two way of looking at it or like the rest of the your life looking it is kind of like okay but the way that the memory is laid out and the more memory that there is basically means my CPU runs slower and that's the really really important takeaway from this it's not the fact that okay I'm gonna use more of a resource it's that my program is going to run slower if I use more of the resource and a really easy way to look at it is very simple you have an array with like a million elements in it and you need to iterate through all of them and print them is that faster or slower than if you had an array with a thousand elements that you had to iterate over and print the 1000 elements would be faster than one million elements now you are obviously just physically doing less work but that's the point if you can get your program to function the same way but do less work it will be faster and so what I'm getting at is it doesn't matter if you have enough memory for a million elements it's still going to be faster to process less data and then like I guess the level beyond that is kind of like okay well what about like the location of the memory what about like fragmentation of memory what if my CPU starts iterating over a data structure like this and then suddenly has to jump to another data structure that could be somewhere else in memory is that faster or slower than if it just had to iterate through a single data structure and to that I say pretend that this is just a off the top of my head analogy but pretend that you have three kitchens in your house with dirty dishes that you have to wash once you finish with one kitchen you have to walk over to the next kitchen and begin your dishwashing process once you finish with that one you have to move on to the next kitchen and do that as well if someone brought you all the dishes into just a single kitchen it would be faster right or at least at the very least more efficient and so when I look at all these collections there's two things in my mind and again all both of them are from a memory point of view one of them is immediately okay I literally have one two three four five six seven eight I guess if you count that one two three four five six seven eight nine I have nine different vectors that I have to iterate over presumably every frame those nine vectors are going to have nine Heap allocations that are going to be in presumably different places in memory I say presumably because like there is a way to write a vector and there is a way that you can configure this it doesn't it's not done here but you could have it like allocate from some kind of Arena allocator where it actually is more or less contiguously in memory or at least very close by which would help the CPU prefetching and all of that but obviously this is just like a bunch of vectors and they're not really you know nothing complex is going on here it's just at face value what you see nine blocks of memory now the next part of the indirection is the fact that these are all unique pointers now they more or less have to be I actually thought they had to be more than they did but like there's quite a decent chunk of data actually in here that's like step one I did think they were all kind of uh subclasses so I thought there was an actual class hierarchy here but there's really not from what I can see like none of these are subclasses of anything else which is um I'm sure a lot of you would say that's a good thing I mean Sparks is okay it's a class of explosion a lot of you would probably say that's a good thing um all of you are a p haters out there I would say that this could benefit from having some inheritance because this is clearly not set up to be a component-based system and we can talk about like you know an anti-component kind of system situation as well there's infinite stuff to talk about but let's just start with this so the fact that they are unique pointers that's okay what that basically means is that somewhere in your memory right you have this for example this is the background little static object let's just go with the birds because they're more fun to visualize we have like the black birds right and what that is is it's not actually the Blackbirds right the Blackbirds meaning like their positions and like in in the world in like the map or whatever their like velocity where they're traveling like this which Sprite to use for them whatever it's not that what it is instead is a pointer so what we're seeing here is kind of just a collection of pointers right and then that means that as we iterate through the black birds so the step one is to be like all right I'm writing a for Loop I'm iterating through this Blackbird data structure so Step One is I'm going to meet the CPU is going to be like all right well I need to iterate through this let me just grab like some memory to start iterating over it and it being the CPU is going to pre-fetch more than just that byte of memory or that like one element's worth of memory it's going to pre-fetch a bunch more which is important then what's going to happen is it's going to be like all right well this is actually a pointer so I'm going to read the contents of that pointer because that's the memory address of where I need to go next and then I'm going to reach out and grab the actual location of my bird I'm going to go to the location of this bird as dictated by this pointer and that's the that those are the bytes the data that I need to read so I'm going to grab that as well then I'm going to go back and read the next bird which is going to be in a different position in memory and this one's over here or whatever now again realistically if you allocate stuff more or less one after the other it is kind of likely to be be Sim in a similar place in memory so like when I draw these lines one going that way one going that way it's not like it's completely random it usually isn't especially because the kind of Heap space that you get allocated for your application is more or less like you know the Heap can grow but it's more or less like the operating system will try and give you something that's kind of contiguous e-ish like it's not going to deliberately try to make everything run slowly um but it's still obviously not guaranteed which is why we're kind of going by those rules uh anyway so that's kind of you going through all of this now this is still better though having like a contiguous block of pointers even though you do have to go one step further to get the data from them that's still better than basically having like you know one two three of them here and then having like another Vector somewhere else in memory that has the next three birds because you've decided to separate your black and green birds right so you have that kind of segregation going on and you got like uh you suddenly have to jump to this next Vector to get the next three birds and you you can see that after that it's like the next one and then the fireballs and then the buildings if they were all at least in one kind of collection then you know they'd be like one less kind of indirect step to get and also all of these pointers all of these locations of every element of data we need would kind of be all contiguous at least and so that's still kind of better so I guess my point is when I look at something like this uh you know from a code point of view it's immediately clear that like this isn't gonna scale if I add 50 different bird types but also from a performance point of view and from like the computer's point of view I'm seeing like obviously each one of these is is a is kind of a different place in memory and then also this is a memory interaction because this is just really a pointer to the actual block of data and so I'm kind of picturing the CPU having to iterate through all of these guys and it's like that's you know a cash Miss that's a cash Miss that's a cash Miss that's a cash Miss they're all kind of cash misses which essentially just means that there might be a higher cost to actually getting the memory we care about so that we we can it right over it because it's not already in the CPU cache and to be honest like there are tools that can actually tell you like valgrind comes to mind obviously but I actually haven't I don't think I've used any tools on Windows that do this I'm sure that maybe Microsoft probably has something kind of like address sanitizer but a little bit different maybe there are tools that you can use to actually literally measure how many cache hits and misses you have so this isn't all arbitrary like you can actually find out and that's why like I'm I'm really thinking that rewriting some of this code might be useful because then we can kind of compare the before and after but anyway long story short you can see all of these are practically the same like they all do Val update if the update functions were different from from blackbirds to green birds I would probably use inheritance here to make a base bird class and then have a subclass called Blackbird greenbird Bluebird that had a custom update function that would be totally fine and that would kind of help you collapse them in this case they are the same and again they are the same and they're all birds right there's only one type of bird here um they're all birds but they're split up because they need to be for other things but if you look at something like you know um the static background for example right so here it is background right this is a static object like I feel like you could probably have a superclass called object that has because look I mean you know some of these things are kill kill enabled I guess that's whether or not you should destroy its next frame uh but like you know all of this kind of stuff it pro this I'm sure this applies to birds as well you know and also not sure why you're maintaining both of these things because this looks like it's just made up of that and then the hitbox presumably is also similar to that but um anyway you know you could kind of have yeah like you can see there's a lot of functionality here that really is duplicated and so I would kind of organize this a little bit more so that you have all of your common things in one object then you can have just a um kind of a vector of all of those specific objects the other thing that you're doing here is like you're keeping track of individual um objects as unique pointers here instead of having them in a collection and that means that if you want to draw stuff like this instead of it already being in a for Loop so that you can draw it you have to call it by its variable name specifically because it's not part of a collection and if you want to destroy it you'll have to also do it by name so that's also unnecessary what I would do is again I would ultimately the way I would rewrite this is I would just have one collection that would basically be like of Base Class object or something that was just like all of my entities right or all of my objects they're all in here and then if I wanted to I could have some kind of like weak ref or whatever like you could literally just have I guess a static object pointer or something of that but the key here is that would point to something inside the vector now that itself is really dangerous to do because if the vector ever reallocates this point is now lost right so that's probably not the best way to do it and that's why at the end of the day like in I guess more complicated systems you might have like an actual map and every entity could have like a uuid or some kind of identifier and then you could just look up the map to get that object's data by its ID but that's more or less how I would do that part uh when it comes down to all of the physics collisions the way I would rewrite these is well first of all straight up this should just be a utility function right like this should just be one object versus the other let me know if they they overlap if they Collide right the reason I guess it's not like this is because it's I mean here these could all easily be collapsed right like literally in two seconds I can just make a little namespace called utils even if I wanted it to be like a utility class static bull you know uh collides uh you know we could take in like a player and we could take in this bird presumably um and then we could just return that right like that's done in like two seconds and then suddenly that means that we just need to do player and Val inside that like utilities function or whatever now play is a unique point of one thing you could do as well is uh you could just make this be unique pointer reference a const one because you can't copy unique pointers of course get rid of this last parentheses um yeah and that's kind of it right so this just immediately gets you can see how much that kind of collapses this code which is nice makes it a lot more readable um the reason why it also makes it readable I think which is really important is because like you know it's pretty obvious to someone who's been around the block a few times what this is doing right like we're comparing like hitbox things and that plus that is that greater than that like you know obviously I can tell that this is too uh you know two abbs or two hitboxes two rectangles whatever and we're just checking to see if any kind of corner of them overlaps that's what's going on with this code here that's easy to read uh to someone who knows what that looks like and who has done it before now this is easier to read though utils collides right so like you know if collides player and this bird right which again I would rename Val to bird I already said that but if if collides play a bird then then do this like that just makes sense more in English and it's a lot I think better to read objectively you know than this code here so that's why I like you know I think that's something that you should always think about is not only can I save code duplication here a billion times but also like what does the code actually read like when it's done right and this would just be a lot kind of easier to read and you can see that it's literally like four times I think I've replaced this so it really was just the same code over and over again now this actually though is player building that's where things change and that's why you can see that what was going on here was the same thing it was still get hitbox because this kind of get hitbox is present in basically everywhere like if I search for this you can see we've got it in bird we've got it in Fireball we've got it in player we've got it in static object and that again is why it would be so nice to just set up a bit of an inheritance hierarchy here with the common functions some people will say that's bad I honestly don't see the problem here um and if you want to argue about like the V table and interaction like I guarantee that's not going to show up in any profiles or have any problems with this specific use case and from a code point of view it obviously makes it a lot easier now the last thing that I want to talk about um because I think you guys kind of get the point with this and I could go through and rewrite all of this but you know that would take a lot of time and hopefully what I've said is kind of clear enough the last point of view would be like what if this wasn't because there's kind of two ways you could go about this you could put all of your common functionality into like a base class you know that would be like your object class or your entity class and then you could derive from that for things like the player the birds the Fireballs whatever that's kind of The Inheritance approach and that's kind of like the more old school approach the newer School the newer School approach the kind of modern approach is more data oriented and that is through an entity component system and the reason why uh that is better is because you can see that if I want to go through and for example you know check the physics or you know move one of these objects or whatever what I'm kind of doing is I'm iterating through every single Bird right which is going to have a different location in memory and then I'm going through and I'm trying to kind of get all of its data and move it but I don't need all of the data that's in the bed for example if I'm just simulating physics I kind of just need like that hitbox right the bounding box maybe like its position as well in World space that's all I really need I don't need information about like is it dead is it like what Sprite do I use to render it uh should there be some kind of piece of audio that is playing and what's the you know what's the point to the audio like uh data or whatever right I don't need any of that however I've got all of that because when the CPU looks up this bird and gets you that memory it doesn't just get you what you need it prefetches obviously a bunch more and so you have all this memory that you never end up using so a more data oriented approach would be okay well why don't we bundle up like say we need to go through and like just move every bird by its velocity right what if we just had the position and the velocity of every bird like that in one little package here and then we had a collection of that so position and velocity and position velocity of the net of an expert so this is about zero bird one bird two if I iterate through a vector of these guys especially if they're not pointers and they're just kind of you know in the actual memory because they're just going to be like two vect threes or two vectors or whatever that would obviously be a really nice scenario because all of this that's here is all contiguous in memory it's all very cache friendly and I don't have to I don't have to kind of look up this bird get a bunch of memory that I don't care about and discard half of it and just use like you know two vectors we'll say it's a 2d game so like 16 bytes of it for these two vectors just use that portion and then discard all the rest of the bird information because that's not useful to me that would obviously be a better approach than any of this but it's also a little bit harder to set up an anti-component system you can use something like entity en Double T that's what we use in Hazel for our N2 component system it's really good performs really well fairly easy to use like it'd still probably be easier to set it up like this not to mentioned you wouldn't need any libraries but what it does enable you to do is kind of also also forget about the kind of different nature of all of these entities so what I mean is at the moment we obviously have like even if we do make a Base Class called object we have all of these different classes and the reason why they're different is because they have different attributes to it so the idea of having an energy component system is obviously you have components and you can use them to compose these objects and so what that means is that like all of the things that these have in common for example they have a position in the world and maybe the static objects don't have a velocity but like you know birds or whatever might all of these probably have like some kind of physics bounding box as well you could just simply add those components to each of these entities and so the idea is you have like these building blocks that you use to build up all of these different entities rather than having to write everything in like a static kind of class you just instead write all of your components separately and then kind of like Lego just put the ones you want onto the entities that you want and you end up with this kind of really cool system of just flexible objects that you can just pile on different components onto but again it retains that data oriented aspect of I can Loop through just the parts that I need and the systems like the physics systems they don't have to worry about all the other functionality that might be present in that entity they only care about the data that they need to care about and that improves performance so that could be another interesting exercise taking something like this uh through a few levels and maybe making it first The Inheritance based approach then maybe like a data oriented approach um and like with an anti-component system that could be a really good exercise for you guys as well the code like this code is linked in the description below I hope this person who sent it in doesn't mind us using it as an example but I guess that's the point of the series I honestly would do something like that if there's enough interest for the for this I might do it and if I have enough time maybe it will be useful to show like the conversion of something like this into various different systems and also maybe good measure performance or CPU cache hits and misses that kind of stuff but other than that I hope you guys enjoyed this video if you did please don't forget to hit the like button don't forget to check out brilliant.org the channel link will be in description below send in your code to channelreview gmail.com and I will see you next time goodbye
Info
Channel: The Cherno
Views: 130,420
Rating: undefined out of 5
Keywords: thecherno, thechernoproject, cherno, c++, programming, gamedev, game development, learn c++, c++ tutorial, code review, scaling code, polymars, the importance of scaling code, how to not repeat code, code duplication, better code, better c++
Id: DphbUR_s1K4
Channel Id: undefined
Length: 32min 10sec (1930 seconds)
Published: Fri May 12 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.