How to Optimize Your Unity Projects for Max Performance - P3 Optimization Framework + Example

Captions
all right welcome to this lesson on the p3 optimization framework so before we get into the abstract world of theory and blah blah blah let's start with one simple question that i want to ask you so why do we need game performance nowadays the simple reason is that players demand it you just need to look on the internet search anywhere and you will see that the trend is to increase the frame rate and the hertz that our hardware and software have to support you will very often see monitors popping up with 240 hertz and this is only increasing you will also see android phones gladly supporting 120 hertz and also vr headsets like quest 2 supporting 120 hertz so what do players expect from you that your game or experience also renders at these high frame rates so the problem with this is that 60 fps can be hard okay i take that but to achieve 120 fps is hell you only have 8 milliseconds as budget to reach 120 fps and if we are talking about 240 fps that is almost impossible that is lots of sweat blood and tears very often a combination of them so if you think that 8 milliseconds is a lot of time think about it just the baseline rendering process let's assume that you are just drawing a cube in urp on a mobile device that can take very well three milliseconds of your cpu time that leaves you with just five milliseconds and guess what unity takes even more time to do other types of processing like physics okay so this is really hard okay so what can we do about this well you need a structured optimization process to succeed in the industry of video games in the upcoming future okay you need a framework that whispers to you what your best next step is at any given point in your game optimization journey okay so you need a framework that leads you from 0 to 72 fps from 72 to 90 fps from 90 to 120 from 120 to whatever else the market is going to demand in the future just like an algorithm you need a process that takes your input for example the profiler data that shows 
the information about your bottlenecks and gives you an output it gives you a high performing experience or video game so let's keep this short okay i heard that some people don't like theory so let's keep it really short so the p3 optimization framework seen from a simplified high level overview is this first you want to define your performance targets okay i know how to spell i know targets is with one s but i put three on purpose to make sure that you understand the importance of this there is not just one target there is not just one frame rate we also care about loading times we also care about memory budget we also care about these little things that players notice and review your game for okay so there is more than one performance goal that you have to set for your game after that what you want to do is to iterate over the p3 optimization loop this loop every time that you iterate over this is going to give you a substantial performance boost that will get you closer to your performance goal the first part of this loop is the profile phase here this is all about gathering information gathering intel that is going to let you decide in the future what your options might be okay this includes any type of profiling tool like renderdoc 
ovr metrics unity profiler pix vtune etc so now that you have your information documented by the way you are going to move into the plan stage here you are going to investigate and weigh at least three different options that you have in order to combat this bottleneck and you are going to weigh these options according to four metrics the third stage of this performance loop is the perform stage here it is all about executing the performance optimization option right and i'm talking about the performance optimization strategy that fits you best for this specific iteration and this is something that you will be able to decide according to the four metrics that you used in the plan phase finally after you have repeated several iterations of the p3 optimization loop you will just end at your target fps and at your target performance goals whatever they might be here you want to do a performance retrospective in order to learn from previous mistakes and here you will just create your do's and don'ts list so that you avoid repeating the same mistakes in future projects now this is a very simplified view each of these steps has sub steps but i cannot just reveal to you all of this because this is a paid product and i would be doing a disservice to the customers that i have so if you want to learn more about the p3 optimization framework this step-by-step framework that allows you to optimize your game from zero to your target fps go to p3 framework dot the gamedev.guru but let's cut to the chase i said i would keep it short so what i'm going to do right now is to show you a quick demonstration on how i apply the p3 optimization framework to one specific problem in unity now remember this is just the simplified version of the p3 optimization framework so you will not see all of the steps that a normal user will have to do but this is going to be nevertheless quite useful to you so let's watch this video that i pre-recorded together and let's have some fun all right let's 
see what we have here for optimization in unity does this sound familiar to you well if it doesn't then i still have spoiled that you can probably see the title of this project this is a project called 3d beginner by unity obviously this is not a project that was intended for vr let alone mobile so we can kind of expect this not to perform so well okay but nevertheless what should we start with you know just deploy it to quest and let's see how it goes i would say right let's see what this looks like okay this is nice 72 fps so let what's this oh crap oh you don't really want to see this man okay we're at about 20 fps if i just turn around that's not good like i feel already super dizzy don't trust me here have a look by the way don't try this at home okay it's quite dangerous so you see that 20 fps you don't want to do this at home okay not really neither playing at 20 fps nor putting the headset onto the webcam so you saw these numbers just a second ago right like the frame rate and such these numbers are coming from a tool called ovr metrics and sadly this ovr metrics tool is not displayed anytime that you stream through the oculus developer dashboard like here okay you have the streaming option and sadly it doesn't show but it shows on the device and that's what you saw okay you saw the numbers and you saw that if i turned my back and stopped facing the wall 20 fps is not good okay you're just going to make everyone sick so the first step here in the p3 optimization framework was that you have identified that you are not reaching your performance goals in our case it is quest so it could be about 72 fps once you know that you are not there what we are going to do is to execute several iterations on the p3 optimization loop if you remember from the presentation there are three steps in this loop one is profile then plan and then perform but we are going to focus right now on profile we know that we are not reaching 72 fps which is our performance goal actually 
we are rather far away from that we are at about 20 fps so i wonder what is going on and i guess that you wonder that as well so the first thing we are going to do is to profile and see what is happening how do we do this with tools now we have over 10 tools that we could use to see what is going on but one of my favorite ones is the unity profiler also ovr metrics and we might as well need the frame debugger okay but for the sake of being fast for this presentation we are just going to start with the unity profiler so for this you have to press ctrl 7 on the editor or command 7 if you are on mac then you need to attach to the player to your quest device it's normally called autoconnected player make sure that you're recording and just put the headset on again and don't look like just close your eyes if you are at 20 fps otherwise you're going to feel very sick now it might not work because oculus developer hub might be still in the adb server instance if that is the case just try again okay just clear the errors and then go to android player let's see if it works now maybe we can just select the other oculus quest there it is so here i am facing the wall okay so it's looking good now as soon as i turn around to see the main contents of the scene just be careful and watch what's going to happen ok now that we have this capture what we're going to do is to analyze the numbers all right so here we can see that indeed there is a period in which we are below 30 fps let's see what is happening here you can do this several ways the way i like to do this is by using the timeline itself so let's see what the cpu is doing here okay so basically we have our main thread and we are spending about 40 milliseconds in rendering you see that this is in general way too much for rendering if you target quest you always want to be at most six milliseconds in rendering when it comes at least to the main thread so 40 is definitely not good we could as well inspect the render 
thread but we are going to see something quite similar right taking way too much time now you have to ask yourself first if you see this where is this bottleneck coming from we're talking here about the cpu right and the cpu is being very busy you see we cannot find any label saying something like waiting for gpu or similar we see the cpu doing actual work you see the green blocks this means that the cpu has quite some work to do can you guess why this is the case well if we run this scene in the editor we can open the stats panel and here you will see a few numbers that we need to examine right we see batches about 2000 triangles about 3 million and blah blah blah the rest is not that critical so if you have some experience developing for quest you will instantly know that we are way above our budget regarding batches and triangles batches is just another word for draw calls okay and draw calls is just the way that your cpu has to tell the gpu to render something so basically we are just rendering too many objects okay basically this is the way that we can simplify the explanation for now so this number that you see here 2000 is above our budget do you know what our budget is for quest about 200 so this is 10 times above this number this explains why if you go to the profiler you will see that we are spending way too many milliseconds on the rendering side and if we go to the wall you know this is not the case we can see that in the rendering section when we are looking at the wall we have 20 23 calls and when we are not we have about actually 5k right so it is even worse on mobile so the question is why do we have so many draw calls and what can we do about that does it make sense remember we are still in the profile phase so we are just gathering intel we are just gathering information that will let us decide later on okay so what other type of information do we need we need to know what is happening with the batches what is happening with the draw calls and 
for this we have a very juicy tool that is called the frame debugger just go to window rendering and not rendering analysis and frame debugger okay so here you can just click enable don't do this on quest uh for now just do this on the editor it is not going to be one to one the same but this is going to be much faster for you to examine if you want it to be super accurate just do it on your target device but for a fast and brief look you don't have to do this just click enable and of course click on the game tab because unity decides for some reason that it is nice to troll you and switch back to the scene view so here we are going to see the composition of our draw calls we see that most of our draw calls are coming from the opaque geometry and here we see that 300 of them are coming from shadows so that's very good information to write down yes you need to write down in the profile phase you can do that somewhere wherever you want right you can do that in the notepad in a physical one in your hand it is up to you we just need this information to be documented so 1700 draw calls are being spent here in the opaque rendering section so if we have a look at these draw calls you will see the reasons that unity has in order not to batch it okay so we can just be going down and down and down and it is quite easy to see that most of the times we have way too many draw calls because draw call batching is not working because objects are affected by different forward lights okay so that is very interesting if that is the case let's continue our profiling phase so it is complaining that we have too many forward lights what are these lights we can just type t:light and indeed we have quite many lights and it should be no surprise to you they are almost all of them real time is anyone surprised at why this is at 20 fps i am not sure it looks fancy sure it looks good but if it runs at 20 fps i'm not going to care about it okay i'm just going to probably spend a few hours shaking in 
bed with fever and super pale so anyway what can we do about this if it says that we have way too many lights then the obvious thing we can do is to remove some lights or at least to you know bake them in a way that we don't have to do these calculations in real time that's one option okay and by thinking about options we already advanced to the plan stage okay so option number one remove lights okay so one example could be to select these lights and disable them in play mode okay i'm just giving you an example if we do this you see that the number of batches dropped to 400 and this is a really juicy improvement okay so that's one option and most people will just go for this option and call it a day they would just do this optimization go home and say hey i'm proud of what i have done and be done with this if you do this however you will see a huge difference in quality right just compare the visual difference here it is pretty significant so more options i don't know what do you think because we're in a presentation i cannot just wait 10 minutes for people you know to come up with questions so what i'm going to do instead is to give you a few more options because remember in the plan stage what we want to do is always to think about three options otherwise you risk going for the cheapest option that is usually not the best option for the players okay so second option would be to go to these lights and say something like baked okay now this is grayed out for some reasons that are not important but nevertheless i will show you so basically we need to go to the lighting tab then and enable the baked global illumination okay so again we can go to mode and go and say baked okay we don't want these lights to be calculated in real time however if we do this and then we bake the lighting let's select for now uh something like subtractive and then generate lighting okay so this is done i by the way reduced the lightmap resolution to four texels per unit just to iterate faster 
so this is not the final result but don't worry about that let's see what happens if i play now okay so 400 batches that's amazing that's what we wanted for at least in the first p3 optimization loop however you know this still looks better than first option which is just to remove the lights however we lose quite some you know use effects like for example the flickering of the light is gone okay so you know this is still a better option so you could totally go for this of course you need to tweak a bit how the scene looks like right you need maybe to adjust the static flags and change the lightmap resolution and all of the settings so that it looks good and you know this could be one option however we said that we want to at least consider three options so just think about that what other options do we have now if you have some experience in game performance optimization you will instantly recognize something here if you go through the scene and disable the gizmos of course you'll see that this looks quite like an indoor scene doesn't it and what techniques can we apply when we are in an indoor scene and we want to reduce the number of draw calls that we have how cute is this guy let me tell you that occlusion culling occlusion culling is all about not rendering what is being covered by other objects so if we have a wall here then no need to render this stuff okay so that will be option three let's just give it a try okay we are just doing super fast testing here we don't care about the perfect solution like you can see on the left right i mean this doesn't look perfect to me although it looks fine enough okay so what we are going to do here is to undo our optimizations so we are going to our lights because control z is not working i suppose and actually it worked my mistake all right and then we're going to go for option c which is to bake occlusion culling how do we do that window rendering occlusion culling and then let's just change the parameters to something that 
makes more sense don't worry about these parameters and bake all right so now that we baked occlusion culling everything disappeared what happened here no worries we are in the visualization mode of occlusion culling so that means that for this camera that we see here we are going to see what we render and what we don't render we can disable that visualization by just switching to another tab like bake so i would say that we are already making some substantial gains okay so basically now if i hit play we were at 2000 batches and now we are at 79 very interesting however the lights are still not flickering so something might have gone wrong let me check i know what went wrong i just forgot to clear the baked data bad boy ruben let's play it again alright so let's see the stats 200 batches all right this is still quite good it is quite similar to what we had before by removing the real-time lights and even by converting the real-time lights to baked lighting and this looks much better we're talking about this room of course if we go to another room of course we will need to do some other types of optimization okay but step by step we don't want to optimize everything at once we want to do small steps that lead to small p3 optimization loops that lead to big gains so now that we have played with three options we could ask ourselves is this actually what we need for a performance jump you know we might not have come up with the perfect parameters here but that's fine what we can do is just to deploy a build and see the difference so let's just press ctrl b and be right back all right apparently i have found a way to stream this screen including ovr metrics without putting the headset on the webcam i think this is a better alternative now 72 fps when i'm looking at the wall i wonder what's going to happen once i turn around i'm still going to keep my eyes closed just in case all right so ready let's go whoop okay don't move all right that's it i saw the 
number i think i saw a 40. okay that was maybe a bit too optimistic i think i saw a 38. all right so we went from let's say 20 fps to 38 fps just by doing this i wonder what would be the result if we did it correctly so what you have discovered right now in the plan phase after some investigation is that you have three alternatives one of them is really detrimental to the player that would be the first one which is all about removing the lights that are not the main one the main directional one that's no good even though for me as a developer it is quite convenient to do right but i care about people and i wouldn't do that second option is to make them static but if we do that then we will miss out on the flickering effects okay option three activate occlusion culling and call it a day so you know those baking parameters might not be the perfect ones but they are a good start from all of these three options i would say that my favorite one is by far baking occlusion culling this could be a good p3 optimization loop now it will be the time to go to the perform stage which is the last step of the p3 optimization loop here we will just need to come up with the right baking parameters for the occlusion culling okay this would be the time to do it right okay then once you're done then you submit to perforce git tortoise svn whatever it is and then there you can call it a day that will give you a jump of you know about 20 fps if you want to calculate that in milliseconds then just do the math of course milliseconds is a better metric right and since we would still not be reaching 72 fps then you would need to execute more p3 optimization loops okay of course there might be more options than the three options that we have seen feel free to explore them in different iterations but honestly if you manage to go from 20 to 40 fps in just one iteration i myself will be super happy about it and now you just need to iterate over this p3 optimization loop until you 
reach your performance goal what's your performance goal is a question okay it could be 72 fps if you're targeting quest but it could also be 120 fps if you're targeting quest because as you know oculus just rolled out the 120 fps upgrade so now performance optimization especially in the area of rendering becomes more important than ever can you imagine how well your application or game will just sell if you manage to promote the message that your application supports 120 fps and hertz that would be a nice unique selling proposition i would say and you know reaching 120 fps is complicated but if you have a structure if you have a framework like the p3 optimization framework you can just do that you just need to follow a series of steps that will lead you from 0 to 120 fps and that is the power of having a proven framework alright i hope this was useful to you and now is the time for you to shoot questions and you know i might not know all of them but i will try my best to answer each of your questions let's go okay shoot if you like this video then let me tell you there is something much better than this join my game performance task force because here is where i post all the good stuff here i'm talking about 30 to 40 minute videos where i explain all i have to explain in the area of game performance let's have a sneak preview in this membership you are going to get one video per week each month that means in each month you're going to get the following in the first week what you're going to get is a lesson on professional performance here it is all about making you a high performing developer the second week is all about making your cpu run faster so yes that includes physics ai rendering and everything alike in the third week you are going to receive a lesson on gpu optimization here it is all about making better graphics cost less and finally in the fourth week you are going to get a lesson on memory performance this is all about reducing 
the memory usage the memory bandwidth the loading times and things alike related to memory access now the best thing of this is that once a month you are going to get access to one live lesson there you will be able to ask your questions live and connect to other high-performing unity developers so if you would like to stay up to date in the games industry then join my game performance task force because there is a risk-free guarantee you can just join for the trial and if you don't like it you can leave at any time but i'm sure you will love this this is the reason i'm promoting this for you i know that it is going to be super useful for you no matter at which stage of your game performance journey you are so go ahead to taskforce.thegamedev.guru and see you there
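A minimal sketch of the frame-time arithmetic the captions refer to ("if you want to calculate that in milliseconds then just do the math"). It assumes only that the per-frame budget is 1000 ms divided by the target frame rate. The function names are hypothetical, and Python is used purely for illustration; nothing here comes from the video or from Unity.

```python
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available per frame at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def frame_time_saved_ms(fps_before: float, fps_after: float) -> float:
    """Frame time saved per frame when going from fps_before to fps_after."""
    return frame_budget_ms(fps_before) - frame_budget_ms(fps_after)


if __name__ == "__main__":
    # Frame rates mentioned in the captions as performance targets.
    for target in (72, 90, 120, 240):
        print(f"{target} fps -> {frame_budget_ms(target):.2f} ms per frame")

    # The demo went from roughly 20 fps to roughly 38 fps in one iteration.
    print(f"20 -> 38 fps saves {frame_time_saved_ms(20, 38):.2f} ms per frame")
    print(f"38 -> 72 fps still needs {frame_time_saved_ms(38, 72):.2f} ms per frame")
```

With the numbers from the demo, going from 20 to 38 fps saves about 23.7 ms per frame, while the remaining step from 38 to 72 fps needs about 12.4 ms more. The fps jumps look similar in size (18 versus 34), but the millisecond cost differs a lot, which is one way to read the remark that milliseconds are the better metric.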
Info
Channel: The Gamedev Guru | Unity Performance Expertise
Views: 3,836
Keywords: unity, unity3d, the gamedev guru, thegamedevguru, unity tutorial, unity csharp tutorial, unity c#, unity csharp, unity optimization tips, game optimization unity, unity 3d, unity game engine, csharp
Id: SsGdyRye1xI
Length: 30min 41sec (1841 seconds)
Published: Tue May 11 2021