Unite Europe 2017 - Squeezing Unity: Tips for raising performance

Video Statistics and Information

Reddit Comments

Great content, great presentation, should be required viewing material for all coders.

3 upvotes · u/Hightree · Feb 15 2018

Really interesting stuff. Had no clue how broken Layout Groups and Animators on UI elements were.

2 upvotes · u/ayumumu · Feb 16 2018

Probably one of my favorite talks. Really good insight on getting the best performance.

2 upvotes · u/joofay · Feb 16 2018

I really wish they would finally fix those UGUI oddities that he exposes every year. It's baffling to me why they haven't even fixed the more trivial aspects mentioned in this talk.

1 upvote · u/Xane · Feb 16 2018
Captions
Good afternoon. Hello everybody, welcome to Squeezing Unity. If you are in this room looking for a nap, you are probably in the wrong place: for the next hour we are going to have a speedrun through the world of performance and how it relates to Unity. I do want to say a few things before I start. The first is that I am going to post these slides online later. In fact, I have had to cut so much material, so many examples for some of the performance metrics I'm going to show you, that I'll have to post the full, unredacted version online, which includes a lot more background information than I will be able to present in the next hour.

Okay, so hello, welcome. My name is Ian. I am a developer relations engineer with Unity's Enterprise Support team. What that means is that I visit our enterprise customers and help them solve problems, whatever those problems may be. As you may have guessed from the topic of this talk, they're mostly performance problems. People have a game, they've written some code, and the way their code works and the way Unity works do not necessarily align, so I come in and try to find some way to make these two things play together nicely. Now, you'll note there's not just one name up on this slide. Unfortunately my colleague Mark could not be here today, so I've had to take over his half of the talk. I've only had a few days to practice his slides, so if I stutter or halt a little bit, please bear with me.

All right, so what are we going to talk about today? Of course we're going to talk about performance, but that's a very broad area. As with most of my performance talks, this is going to be something of a grab bag. My original agenda, when I first conceived this talk, was as follows. I was going to talk about a bunch of Unity APIs that may do things you do not intend them to do when you call them. We'll then go and talk about various ways to use C#'s built-in data structures, and to misuse them. Then we're going to have an add-on
to a topic I talked about last year: inlining functions in order to eliminate method call overhead. After that, I was originally going to talk about the real ways to squeeze performance out of C#, the ways you can get an extra three or four nanoseconds out of your boolean comparisons. But when I actually finished building the material, it really looked more like code golf. I was just doing it to show off, and it wasn't really going to help anyone in the real world. Instead, I looked at the Unite schedule and realized that the top creator of performance problems in Unity projects was not going to be discussed. So I'm sorry for the bait-and-switch, but we'll be finishing with Unity UI.

Okay, this is a reminder. We're all professionals here, but when we're talking about performance I am going to issue a lot of things that sound like hard recommendations, that you should always do it this way. Please remember that you should make your game work first, and you should make your game fun, before you try to make it fast. A lot of the things that I'm going to tell you to do will introduce additional complexity, and additional complexity means higher maintenance overhead and the possibility for more bugs. So profile your code, profile your game, before you begin applying this advice.

Okay, let's go right into it. We're going to start with Unity APIs that do things you may not intend, the Unity APIs you can't necessarily trust. The system I'm going to start with is one that's not often discussed when talking about scripting performance: the particle system. In Unity 5, when you call one of the particle system's main APIs, like start or stop or even the simple IsAlive check, it recurses by default through all the children in the particle system's hierarchy. Now, it doesn't do this in any particularly intelligent way. It actually goes to the particle system, finds all the children of that particle system's transform, calls get
components on every single one of those transforms and, if there is a particle system, it invokes the appropriate stop, start, or IsAlive method. But if those child transforms have their own children, it recurses into those as well. If you have a particularly deep transform hierarchy, this can definitely become a problem. So what can you do? Well, all these APIs have a withChildren parameter that defaults to true. Set it to false and it eliminates this behavior: it will only change the particle system that you're calling directly. Now, it's common for VFX artists to create visual effects that have particle systems spread across several child transforms, so you may actually want to start and stop all of these at the same time. Apply the old standard Unity remedy to GetComponent calls: cache a list of them at initialization time, then call start, stop, or IsAlive on each of those in turn, and make sure to pass false as the withChildren parameter.

Now, there is one other thing about particle systems that might make you sad. When we're working with Unity, we often want to drive down the number of times we allocate memory on the managed heap; we want to drive our GC allocs down to zero. There's a problem here. As of Unity 5.4, we introduced a little problem into a couple of the particle system APIs: if you call stop or you call simulate, we will allocate a small amount of memory. Yes, it doesn't matter if the particle system is already stopped; it's still going to allocate memory every time you call it. This is because most of the Unity APIs that you use are actually just C# wrapper functions. In the particle system's case, they actually wrap some internal C# functions, which then do the actual work. The problem is that in 5.4, and all the way through 2017.1, we introduced a closure, and as all good C# programmers know, when you close over a local variable we must allocate a reference on the heap to keep
track of it. We've talked to the particle system team about this, and it is already fixed in 2017.2, but some of you may be close to shipping and can't wait that long. So let me give you a little hint. Remember I said that those public APIs are C# wrapper functions around other C# functions? Here are the internal ones. All the internal APIs are conveniently named Internal_ and then the actual function that you wanted to call. So what you could do is either write an extension method or address them via reflection, cache the function reference, and then call them yourself. If you want the function signatures, here they are. All the arguments to these functions are identical to the ones in the public API, except for the first, which is just a reference to the particle system you want to stop or simulate. Again, these slides will be posted later if you want reference material.

Let's talk about a classic one next: APIs that return arrays. You may not be aware of this (you probably are), but any time you access a Unity API or property that returns an array, it's going to allocate a fresh copy of that array, every single time you access it or call it. This is primarily for safety reasons. If one system accesses an array and modifies it somehow for its own use, we don't want some other system elsewhere to get an incorrect view of what Unity's internal state is. Canonical examples of this: access Mesh.vertices and you'll get a fresh copy of the mesh's vertices; access Input.touches and you'll get an array containing all the touches currently down in your frame. So if you are using these APIs, minimize the calls to them. There are, however, many Unity APIs that now have non-allocating versions, and of course it's better to allocate nothing than to allocate a small amount. So with Input.touches there's of course
Input.GetTouch and Input.touchCount. For all the Physics and Physics2D cast-alls, that is RaycastAll, SphereCastAll, BoxCastAll, we now have RaycastNonAlloc, BoxCastNonAlloc, and so on. These have existed since 5.3. Similarly, if you do use GetComponents or GetComponentsInChildren at runtime, there are now versions which accept a generic list and fill it with the results of the GetComponents call. This is not purely non-allocating: if the results of GetComponents exceed the capacity of the list, the list will be resized. But if you're reusing, recycling, or pooling the list somehow, then at least the frequency of allocations will drop as your application's lifetime continues.

Now for my favorite one: Camera.main, which looks innocuous. I hear a lot of laughter; some people already know the punchline. Camera.main is not a direct reference to the main camera. We're actually calling Object.FindObjectWithTag every time you access this property. Remember this one, because I'm actually going to come back to it in a little while.

Okay, let's talk about data structures. It is key to know how C#'s data structures work internally, so you can pick the correct one to accomplish the goals of the algorithm or game system that you are writing. Often, though, I find people picking data structures that are convenient to use rather than the ones whose performance characteristics align with their goals. Let's have a quick review. If you have a list, internally it's just an array. If you are randomly indexing into an array or a list, each access happens in effectively constant time, and iterating through one has very, very low overhead. So if you're iterating through a list of things every frame: list or array. On the other hand, if you need constant-time addition or removal, you may want to use a dictionary or a hash set. These are backed by
a hash table. A hash table has some number of buckets. Each bucket has a hash value, which comes from the hash code of the objects passed to it, and it stores the objects with the matching hash code in that bucket. The bucket itself is basically just a list, so adding and removing things from there is close to constant time, depending on the capacity of your hash table and the number of items you're inserting into it. If you're actually relating data in a key-value manner, saying this piece of data is related to this other piece of data in a one-way manner, you probably want to use a dictionary, because you're probably trying to look up that other data by your key data. That's what a dictionary does.

But I often find people misusing dictionaries. If you're simply trying to relate two pieces of data, saying there is a relation between data A and data B, but you want to iterate over that relation every frame, often people use a dictionary because it's expedient: it already has a pairwise relationship between two objects. The problem is that you're iterating over a hash table, and when you iterate over a hash table we must iterate over every single bucket in that hash table, whether or not it's full. So there is considerable overhead in iterating over a dictionary. Instead, consider creating a structure or a tuple and storing a list or array of those structures or tuples that contain your data relations. That is much faster to iterate over every frame.

Now, it is not always as cut and dried as this. I realize that sometimes you have multiple concerns. Let's consider a very common case. Something I discussed last year was the Update Manager pattern. This is where you write a system that distributes update callbacks to different systems inside your game, and systems can subscribe to receive updates when they want them. So what are our requirements? Well, because we are conceivably updating many, many systems, and because we are updating
them every frame, we want our overhead to be low; we don't want our Update Manager itself to become a performance problem. At the same time, because any system can subscribe to updates at any point, we effectively want constant-time insertion: we want to be able to add things to our Update Manager without much overhead. Finally, we don't want to allow systems to subscribe twice. That could cause unintended bugs elsewhere in our game; a character might run twice as fast, for example. So we need to be able to perform duplication checks, and we don't want that to take too long either.

If we examine each of these requirements, we can see that there's no single data structure that actually meets all of them. If we want low-overhead iteration, we want an array or list. If we want constant-time insertion, yes, we could also use an array or a list, as long as we make sure to insert at the end, and we could perform some tricks to get constant-time removal by swapping items from the end into the position of the removed item. But the duplication checks we cannot resolve unless we sort the array or list, so we effectively need to use a dictionary or a hash set if we want quick duplication checks. What's the answer here? Don't just use one data structure; use two. Maintain a list or an array for iteration, but before you change the list, use a hash set or some other kind of indexing set to check whether the item that you're adding or removing is actually present. The downside, of course, is that you are maintaining multiple data structures, so there's a higher cost to addition and removal and there's more memory overhead.

Now, if you are using reference types, you could also use a linked list or an intrusive linked list to represent your data. An intrusive linked list is where you take your data item and actually add the previous and next links into the data item itself, so you're basically mixing the
concerns of the list and the storage of data. It's a little bit dirty, but fairly common in video games. I don't have much more time to describe that pattern, but Stack Overflow has several excellent articles.

Another quick tip about dictionaries. A common thing I see is when people have some UnityEngine content, some MonoBehaviours, some ScriptableObjects, and they want to relate that MonoBehaviour or ScriptableObject to some other piece of content. So they use a dictionary, and they use the MonoBehaviour or the ScriptableObject as the key to that dictionary. This isn't a priori bad. What happens, if you use a UnityEngine.Object (of which ScriptableObjects and MonoBehaviours are both derivatives), is that we use the default dictionary comparer. By default, that calls the Object.Equals method, and when I say Object I mean the plain C# Object.Equals method. Now, UnityEngine.Object overrides that method and forwards to another method called CompareBaseObjects. As long as you're not running in editor mode, that performs a bunch of additional checking and makes sure there are no nulls involved before calling some additional methods, and finally it ends up in the C# Object.ReferenceEquals method, which just checks that the two references you've passed it both point to the same object. That doesn't sound too expensive, it's just a few extra checks, but we've introduced a couple of extra branches to our equality checking, and that introduces a small amount of extra overhead.

Hold that thought; I'm going to give you another piece of advice. Every UnityEngine.Object has an instance ID. I don't have too much time to go into what this is, but the instance ID is guaranteed to be unique: no two UnityEngine.Objects during the lifetime of an application will have the same instance ID. The instance ID never changes from when an object is created until that object is destroyed, and it is just an integer. You can access it by calling the
GetInstanceID method. Now, what am I going to be telling you to do? Well, since it's invariant, since it never changes over the lifetime of an object, in your Awake or OnEnable callbacks you could call GetInstanceID, store the instance ID in a public field or a public property, and then use that as the key into a dictionary. How does that compare in terms of performance? Because I'm telling you to do it, you can probably guess, but the degree is actually quite surprising. The cached int key version, where I grabbed the integer ID at initialization time and reused it as my dictionary key over many, many iterations, is three times faster when indexing into a dictionary compared to using an object, regardless of the platform you're on. I used a very slow tablet and a very fast laptop, IL2CPP and Mono; the results were effectively the same. The difference comes if I did not cache the instance ID, if I called GetInstanceID every time I wanted to index into my dictionary. On IL2CPP I still realized some performance benefit; it's still about thirty percent faster than the object version. Under Mono, though, it ends up being slower. The thing is, GetInstanceID actually invokes some unsafe code, so this seems to affect Mono's JIT compiler's ability to optimize my loop for me, whereas IL2CPP doesn't seem to care as much. Further, if you end up using the integer-keyed dictionary, you're not using the default equality comparer anymore. Built-in types like float and int have comparers in the standard C# library that have been hand-optimized by the Xamarin team. That is handwritten IL doing your comparisons for you, which is going to be faster than using the default one.

Now, don't do this all the time. This does, again, introduce some additional complexity. Reserve it for the dictionaries that you are accessing hundreds or thousands of times per frame. For example, if you're building a strategy game
or an RPG, you may have some dictionaries containing your characters' attributes that are keyed off of, perhaps, a ScriptableObject. In those cases, when you are indexing into them thousands of times per frame, this can shave a millisecond or more off of your frame time. I've seen this happen in the real world.

Okay, the next thing we're going to talk about is method call overhead. Sorry for the abrupt cut. Every time you call a method in C# there's a small amount of overhead involved. We have to maintain the stack: we have to push variables onto the stack and pop them back off when we exit the function, and we have to jump the instruction pointer around. Normally this is not much of a concern. The amount of time this costs is very, very small, usually measured in nanoseconds or, sorry, microseconds. But if you come from a C or C++ background, you know there's a way around this: you can use the inline keyword on your methods to get rid of it. What inlining does is instruct the compiler to take your method body and effectively copy-paste it into the place where you're calling the method. So the easiest way to get rid of method call overhead is simply not to call a method in the first place. Hypothetically this also works in C#: the C# JIT compiler should inline trivial functions. In practice, when we measure this with Unity's C# compilers, both the old one and the new one, it does not seem to occur. We still see performance penalties regardless of whether we ask them to inline or not.

So I ran a little test. I wrote a simple program that iterates through a bunch of lines of text and tries to count the number of times it sees a specific word in that text. The key function here is find number of instances of word. I passed in Samuel Taylor Coleridge's The Rime of the Ancient Mariner, which is 1,900 lines of poetry. That was enough to expose performance problems.
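[Editor's sketch: the slide with the test code is not reproduced in these captions, so the following is only a rough guess at the kind of function being benchmarked, not the speaker's actual code. The class name, method names, and method body are hypothetical. The one thing taken from standard .NET is `MethodImplAttribute` with `MethodImplOptions.NoInlining`, which asks the compiler not to inline the method and serves as the control case.]

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for the benchmark described in the talk.
public static class WordCounter
{
    // Control case: request that this method is never inlined,
    // so every call pays the normal method call overhead.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int FindNumberOfInstancesOfWord(string line, string word)
    {
        // An empty search term would never advance the index below.
        if (string.IsNullOrEmpty(word))
            return 0;

        int count = 0;
        int index = 0;
        // Counts non-overlapping substring matches (assumed behavior;
        // the talk does not say whether whole words are required).
        while ((index = line.IndexOf(word, index, StringComparison.Ordinal)) >= 0)
        {
            count++;
            index += word.Length;
        }
        return count;
    }

    // The hot loop: one call to the function above per line of text.
    public static int CountInLines(string[] lines, string word)
    {
        int total = 0;
        for (int i = 0; i < lines.Length; i++)
            total += FindNumberOfInstancesOfWord(lines[i], word);
        return total;
    }
}
```

[Calling `WordCounter.CountInLines(lines, "water")` over the lines of a poem makes one non-inlined call per line, which is the per-call cost the measurements in the talk compare against.]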
So, find number of instances of word: I'm not going to show you the method body; you can kind of guess how it works. The important part is at the top, highlighted. There's a thing in C# called the method implementation attribute, and this provides the compiler with hints as to how it should treat your method when JITting it or when cross-compiling it with IL2CPP. In this case, as a control, I have used the NoInlining option, and what this does, of course, is instruct the compiler not to inline: any time I call this function, please make it an actual function call. This is not the only thing you can do. If you're using the new .NET 4.6 runtime, there is now an AggressiveInlining option, and this is like attaching the inline keyword to your function in C or C++: it asks the compiler to copy-paste your function body in. And again as a control, I did what I call manual inlining and what other people might call disgusting copy-paste: I simply put my method body into the middle of the method that was calling it.

How does this perform? In 3.5, between no inlining and manual inlining there's about a 10% performance benefit in this case. Depending on the size of your method, on how heavy the body of your method is, you will realize more or less performance benefit; there's a greater benefit for smaller methods. The interesting thing is that this is actually relatively invariant between Mono and IL2CPP. On the other hand, when we asked for aggressive inlining in Mono, we almost caught up with the manually inlined function, but when we cross-compiled it with IL2CPP, that performance benefit disappeared.

The results, effectively: if you have some code that is in a very, very hot path and is very trivial, consider manually inlining it if you're on the 3.5 runtime. This is a maintenance nightmare, so use this technique very sparingly, but I have seen it used in real-world game studios, and it has brought their computation times
down significantly, on the order of sometimes 10%. On the other hand, if you are fortunate enough to use the 4.6 experimental runtime, then you can use aggressive inlining and achieve almost all of the benefits of copying and pasting your code around with none of the maintenance headaches. I went and asked the scripting team about these results. Spoiler alert: aggressive inlining is simply not yet implemented in IL2CPP. However, it is coming Soon™. Actually, I do expect it to be quite soon. I don't have a specific release date to give you right now, but please continue to check our patch release notes; it will be in there whenever it comes out.

There are other places where this method call overhead is relevant, though, such as when you create a trivial property. The C#-y way of declaring a variable on a class is often to create a property with just a public getter and a public setter. However, every time you get or set the value of this variable, you are actually invoking a method, at least under Unity's C# compiler. So again, if there are variables that you are using in the hot paths in your code, consider just converting them to public fields. There's no functional difference and no additional protection for your code, but it is actually more performant.

Okay, now we're going to pivot. I've just been talking about a lot of things where I've had to say only in the hot path, only in the hot path, test this first. Now I'm going to talk about something that affects pretty much every single project that I visit. It is one of the biggest causes of performance problems in Unity, and of course that is Unity UI itself. So let's start with the basics: how does Unity UI work? The basic component of Unity UI is the canvas. The canvas owns the meshes that are generated from the UI elements that you place on it, and it takes these meshes and submits them to the GPU for drawing. It actually issues
the draw calls. It is also responsible for ordering the regeneration of those meshes whenever necessary. Generating those meshes can be quite expensive, because we want to collect them into batches, and that is not cheap. So we want to do this as few times as possible; we only want to do it when something changes. The problem is that 'something changes' means when one or more things on your canvas change, when one or more UI elements are updated or otherwise changed. Yes, one UI element on your entire canvas can dirty it. Many people build their entire game's UI in one single canvas with thousands and thousands of elements; they change one text property and suddenly they have a five-millisecond spike. This is why.

Now, what's actually going on in that rebuild? What's so expensive? There are three major steps. The first thing we do is go to all of Unity's automatic layout systems, the vertical layout groups and horizontal layout groups, and if they're dirty we tell them to rebuild their layouts, to re-lay out their child items. The second thing we do is go to every enabled graphical element on the canvas, every UI text, every UI image, and ask it to regenerate its vertices. And yes, we do this for every enabled image. We don't look at the color property; we don't look at the rect transform's position. If you have moved things off screen, or if you have alpha'd them out to zero, they are still going to generate meshes and they are still going to be submitted to the GPU. If you have alpha'd-out quads on screen, yes, you are still paying the sampling cost for them. Don't do that. The last thing we do is regenerate the materials that we use to draw our UI. This is a relatively quick step, and I'm not going to talk about it more; I've never found it to cause performance problems, which is a rare thing to say.

Now, the thing is, while these systems all appear to be able to be dirtied individually, in practice whenever we dirty one of them we end up dirtying all of them. There are a few very notable
exceptions. If you change the color of a graphic, we will only dirty its vertices. If you have a UI image and you change one of its fill properties, like fillAmount or fillCenter, we will also only dirty its vertices.

After we've run those three updates, the canvas takes the meshes and materials that have been submitted by its UI elements and tries to divide them into batches. It wants to draw them in the fewest draw calls possible. Longtime Unity veterans will be familiar with this problem: back in the 3.5 days we all struggled to get our draw calls down below 100, and UI was often one of the major sources of draw calls. So Unity UI was built to try to solve this problem for you by doing it automatically. What it does is take the input set of meshes and sort them by depth. The reason for this is that all UI geometry is considered transparent. It is submitted to the GPU in the transparent queue no matter what. If you have an alpha-less image, it doesn't matter; we're still going to consider it transparent, and it's still going to be drawn back to front. So yes, if you have a lot of large stacked-up greebles, dashboards, or background decorations, those can cause significant overdraw problems. You can use the overdraw view in the editor's scene view to diagnose this; just look for those nice hot yellow areas in the scene view.

And you'll note that I said it involves a sort by depth. Yes, this is a regular old sort operation, which scales as n log n with its input set. This means performance drops faster as you add more and more things to the canvas. But worse, as you add more and more things to the canvas, it's also more likely that in any given frame one of those elements is going to get updated. So you're more likely to dirty your canvas, and by dirtying your canvas you're spending more time rebuilding it, too. Vicious cycle.

How do you solve this? I've hinted at it already: you use more canvases. Each canvas is an island. It isolates the
things on that canvas from the things on other canvases. If you have UI elements on one canvas and UI elements on another canvas, and the first canvas updates, the second canvas is still happy and will not regenerate its batches. This is the main tool you have for resolving batching problems in Unity UI. Fortunately, you don't just have to use many different root canvases. You can nest canvases, which allows designers to create their nice big hierarchical UIs without having to think about where different things are on screen across many canvases. The other nice thing is that you still maintain that island characteristic: child canvases isolate their contents from their parent canvases and their sibling canvases. They also inherit their parent's rendering settings but can override some of them, and this can become important.

So, some quick general guidelines. If you have big canvases, divide them, but try to group things together based on when they get updated. If you have multiple elements that are updated at the same time, they will all have to be rebuilt in the same frame, so put them on one canvas. The general way you do this is to take the things that aren't ever updating, or that are only updated once when the canvas is shown, like a background element, some static text, static icons, and put them on one static canvas. Then take the things that update frequently, anything that your designers want to bounce and dance around and blink in and out, and put those on a separate dynamic canvas. That dynamic canvas will update every frame, but everything else will remain pre-batched.

As an example, I took Andy Touch's inventory screen demo from his Unite Asia 'Optimizing the Unity UI' project. As you can see, we just have some character statistics for Unity-chan, some descriptive text, and her inventory. This was meant to simulate a late-game scenario, so we have a scroll rect with a thousand entries, each entry having three graphical items. So we effectively have three
thousand different things that will need to be batched when we scroll that scroll rect. Anyone who has used a scroll rect for a large number of items knows what the next slide is: it takes way too dang long. It takes 25 milliseconds on a very high-end laptop, standalone, just to scroll that list. If you're on mobile, it's going to take more than 100 milliseconds. That's not acceptable. So what can we do? The way we built this UI, I've taken a rect transform and made it the parent of my scroll rect. This is important: I've made that parent of the scroll rect occupy the area of the screen that the scroll rect exists in, so that inventory outline with the background and the heading, as well as the scroll rect itself, are inside one rect transform. Then I added a canvas to that rect transform. This means that when I scroll the scroll rect, we're not rebuilding the other 100 or so elements on the outer canvas.

But more importantly, I've also turned off pixel perfect. Designers usually want static UIs to be pixel perfect; they don't want any fuzziness on the text or the icons. But when you're scrolling a list, you're not going to be able to tell when something's a little bit fuzzy for a single frame, and turning off pixel perfect is actually one of the biggest performance gains you can make when you have dynamic content. In fact, just making this one change, isolating this sub-canvas and turning off pixel perfect, brings our updates down to 'only' 5 milliseconds. There's a reason that 'only' is in quotes, because if I were the programmer on this project I would still not consider that acceptable. We would have to keep going; we would have to optimize our scroll rect more. How would we do that? The first and most direct way, of course, would be to pool the elements that we have in the scroll rect and disable the ones that aren't visible: either add new elements in at the bottom and remove them at the top, or enable and disable them as they enter or leave view. This will require
custom code. The other thing you may want to do is add some code to clamp your scroll rect velocity. Unity's scroll rect code right now does not aggressively clamp velocity, so if you have inertia turned on and you flick your scroll rect, it will keep scrolling for many frames. Often, for two or three seconds after the thing has apparently finished scrolling, it is still actually moving by about a hundredth or a thousandth of a pixel every frame. So it is still marking your canvas as dirty, and you're still seeing that five milliseconds being taken up even though nothing is apparently changing on-screen. Ugly.

OK, now I'm going to talk about the Graphic Raycaster. This is the component that translates your input into UI events. It takes the mouse position on-screen, takes the touch events from your screen, and translates them into things like mouse enter events, button click or pointer click events, and so on, and it sends these events to the interested UI elements: the UI elements on the canvas that are interested in receiving input. Do note that you require one of these on every canvas that requires input, even sub-canvases. So when I added that new canvas to my scroll rect, I also had to add a Graphic Raycaster to make sure I could still actually interact with it. Now, despite its name, the Graphic Raycaster is not really a raycaster. What it actually does is take the set of UI elements that are interested in receiving input on a given canvas and perform point-to-rectangle intersection checks. It takes the input point on the screen and says: OK, is this inside the rectangle of this UI element, is it inside its RectTransform? If so, it dispatches the UI event and allows the element to handle it. And yes, this really is just a simple for loop; there's no intelligent code inside of here. Now, how do we know which UI elements are interested in receiving updates? That is what the Raycast Target property does on UI Text and UI Images. So the first thing
you can do is, if you have some things that are not interactive, if you have some text on a button or some static icons on your UI, turn off Raycast Target. This directly shrinks the amount of work that your Graphic Raycaster must do every frame. Now, I did say the Graphic Raycaster isn't really a raycaster, but it kind of is. If you have world space canvases or camera space canvases (you know, Screen Space - Camera), then you can set a blocking mask. The blocking mask instructs the raycaster whether you would like it to cast rays through 2D or 3D physics space. What this does is take the input screen point, construct a ray from the origin point through the near clip plane of the camera all the way to the edge of the far clip plane, and see if anything in those physics worlds, the 2D or 3D physics worlds, intercepts the ray before it hits your UI. And it does that from the event camera. The event camera, that's this property: the Event Camera property on a world space canvas, and on a Screen Space - Camera canvas it is actually the Render Camera property. Now, we do yell at you if you don't set the Render Camera property on a camera space canvas; you get a big yellow warning box in the UI. But you can leave the world space canvas's Event Camera blank. Now, what happens if we leave the world space canvas's Event Camera blank? We are going to use this in the Graphic Raycaster; we've got an eventCamera property that we access. Now, some people tend to believe that if you leave the Event Camera blank, that means your canvas is not interested in receiving events, so they don't set one up. Let me show you the eventCamera property. Look at the final line there: if we are in a world space canvas and we do not set the Event Camera property, we instead fetch Camera.main. Dang it. Now, OK, I heard at least one facepalm out in the audience, but you're supposing: OK, that's bad, but it can't be that bad, they're probably caching the access they make to that property. Nope. Depending on the code
path we take, we can access eventCamera between seven and ten times per frame, per Graphic Raycaster, per world space canvas. That is an excellent question: why? And as a reminder, I think you all remember this, Camera.main is a find-object-with-tag call. Now here's the other thing. I know everyone in here shivers when they see "find" in a function call from Unity. We all know that finding objects by type is not a good thing to be calling at runtime. Fortunately, finding an object with a tag is not quite as bad as finding objects by type. We maintain an index of all the items that have tags in your scene, a special index, and we only iterate over that one when looking for tags. That said, many games end up using tags for design things, for QA things, for describing gameplay, determining who your players are and what your obstacles are, so it is very common to have thousands or more game objects that are tagged. So how does this perform? I created ten world space canvases, and in one case I left the Event Camera unassigned and in another case I assigned the Event Camera. Now, in the case where I assigned the Event Camera, there is no performance difference in the Graphic Raycaster when accessing the eventCamera property; no matter how many tagged objects you have, it doesn't matter. But when it is left unassigned, performance degrades rapidly as the number of tagged objects increases. I'd say a thousand objects, which in this case takes nearly a millisecond per frame just to access the Camera.main property, is actually low, and this is being run on a high-end laptop. If you were running this on an iPhone, this would be a much larger number, possibly as large as four or five milliseconds. Unacceptable. So what can you do? First, avoid using Camera.main. Cache the references to your cameras at initialization time, or at the very least at the start of your update loop. Unity could also take this advice, and I know the lead UI programmer is sitting in the back somewhere. The other thing we can do is create a system to track which
the main camera is. Now, we can't do this for you. We can't actually cache this in between frames, because you can change what the main camera is at any given time. But if you have a complex camera setup, you could at least create some way to track your camera system and tell your code which camera is the one it should care about. Of course, this matters mostly in things that update every frame, but Graphic Raycasters do update every frame. So what can you do there? Always assign a world camera. Always, always, always. If you have to, write a MonoBehaviour or some kind of code that will update the Event Camera property when the main camera changes. Do not leave this empty. I want us to actually add a warning in the editor UI about this. The other thing you can do is, if you have UIs that are not interactive, regardless of whether they're world space, Screen Space - Overlay, or Screen Space - Camera, don't add the Graphic Raycaster. You are directly reducing the amount of work Unity UI has to do every frame. And this is actually quite common: people often have world space UIs for, like, the health bars and names above their characters' heads, and by default when they create these world space canvases Unity adds a Graphic Raycaster, and they don't remove it because they don't think it's going to be a problem. Get rid of that; save yourself some time.

OK, let's move over to talking about layouts, the layout system. Now, many people are probably thinking: I don't use Unity's layout system, so I don't have to care about this section. No, it's not going to be that simple. Unity's layout system, like I said before, works on a sort of dirty flag. We tell a layout system that it is dirty when one or more of its child elements changes, which means that whenever a child element changes, it must invalidate the layout system that owns it. The layout system, and you'll note this terminology, is the set of contiguous layout groups that are directly above a
layout element. Now, a layout element is not just the Layout Element component. UI Images are layout elements, UI Texts are layout elements, scroll rects are layout elements, and also, actually, layout groups. You see, layout groups are just components too. You've seen these: Vertical Layout Group, Horizontal Layout Group, and so on, and they are always components on the direct parent game objects of layout elements. So how do the layout elements know which layout system they need to dirty, or if they need to dirty one at all? I could describe this to you, but I would rather give you a tour. We're going to go through the Unity UI source code. You can download this online, and I strongly recommend you do; it's a great way to find out why the UI is behaving the way it does. The one we're going to start with is Graphic.cs. This is the base Graphic class, which is the parent of both UI Text and UI Image. This class has a SetLayoutDirty method, and it does what it says on the tin: it is called whenever we need this layout element, this graphic, to mark its layout system as dirty. The first thing you can see is that we have an early out: if our component is disabled or if our game object is disabled, we don't actually try to mark the layout as dirty. This is important. Then we go into the LayoutRebuilder system and we say: here is my RectTransform, this is the RectTransform that is trying to dirty a layout system, please mark my layout system as dirty. Well, that looks innocuous; it looks like it could be built smartly. OK, let's go into that method. The MarkLayoutForRebuild method (I've omitted the unimportant parts) takes the transform you pass it and begins walking up the hierarchy. So on the first parent it says: is there a valid layout group here? And as long as there is a valid layout group, it continues walking up the hierarchy. It's trying to discover that contiguous set of valid layout groups that I mentioned earlier, and until it
finds either a root transform or finds no valid layout group, it continues this loop. Then it will break out and mark that root layout group as dirty, because it's the root layout group that is the master of the layout system. So we can immediately see that the number of times we are going to check for a valid layout group increases linearly as the depth of our consecutive layout groups increases. If we have one layout group, we're going to double the number of valid-layout calls; two layout groups will triple the number of valid-layout calls; three layout groups will quadruple it. So what does that function do? Surely it does something intelligent. Surely. GetComponents. We go to the transform, we call GetComponents, and try to find any enabled layout groups on it. If there is one or more, we return true and we've found a valid layout group. Now, how do we actually determine whether or not there's one or more valid enabled layout groups? Nervous laughter from the crowd. We have a LINQ query that modifies a list. LINQ does not belong in the hot path. List modification is also a linear-time operation and does not belong in the hot path. Worse, everyone in here can think of a better way of doing this. We've got a list, and we only care whether there's one enabled thing in it. Why don't we just iterate with a for loop through that list and early out at the first enabled layout group? Where's Phil?

So what did we learn? First, even if you are not using the layout system, every default Unity UI element that you use will try to dirty its layout, which will result in at least one GetComponents call, and it will do this every time it tries to mark the layout as dirty. Remember, we don't cache that layout group, so we do not know if the layout group has already been marked dirty; we must do this every time. Further, as you begin nesting layout groups, the cost rises multiplicatively: just adding one layout group doubles the
amount of GetComponents calls that we perform, two layout groups triples it, and so on. Now, what marks layout elements as dirty? OnEnable and OnDisable: both of them mark layout groups as dirty. This is usually the cause of performance spikes when people are enabling or disabling Unity UI. They're actually seeing the results of a massive number of UI components all walking up their hierarchy, trying to mark layouts as dirty and calling GetComponents dozens or hundreds of times. Similarly, if you reparent an active component, we have to dirty the layout of the old parent, because we removed it from the set of things that it cares about, and we have to dirty the new parent, because we've added something to the set of things that that layout group cares about. Further, if we have Mecanim applying any properties to our UI, we also mark the layouts as dirty; we actually mark the vertices as dirty as well. And also, if we resize the RectTransform, if we change the size or the anchors of the RectTransform, we also mark that layout as dirty. In fact, we mark the layout of all that RectTransform's children as dirty as well. So everything dirties layouts. Absolutely, positively everything. There are very few exceptions.

So let's have some solutions. Step one: avoid layout groups. Use anchors. If you have to do proportional layouts, if you have to lay out two or three or four things side by side or vertically, you can set the anchors to say: OK, I occupy from the left side to 25% of my parent's width, with 25% of my parent's height. Now, it's not always as cut and dried as this, I realize that, but as much as you can, use static layouts. If you have to have some kind of complex dynamic layout where things are being added or removed at runtime, try to write your own code to do this. You know how your game operates. You know when you're adding or removing something that will actually cause multiple components to have to move around on screen or resize,
so you could restrict updates to those cases. We can't do that; we aggressively dirty layouts. The other thing is, if you're pooling UI objects, don't do it the naive way. Don't do it the way people normally do it, where you reparent something and then disable it. Instead, disable it first. You will pay the cost of dirtying that hierarchy once, but then when you reparent it you will not pay the cost of dirtying the old hierarchy a second time, and you will not dirty the new hierarchy at all. Similarly, if you're removing something from the pool, reparent it out of the pool first: again, you don't dirty the old hierarchy and you don't dirty the new hierarchy. Then, when you're removing something from the pool, you're usually about to use it, so update all of your data first, before you enable it, because each time you change a UI Text's text, each time you change an Image's sprite, we are again going to mark the layout as dirty. So you can skip all those mark-layout-as-dirty calls by updating it while it's disabled and then enabling it.

You may also want to show or hide some UIs; you want to show or hide a particular canvas or sub-canvas. What you can do instead is disable the Canvas component itself. The Canvas component will then not discard its vertex buffer. It will keep all of its meshes, it will keep all of its vertices, and when you re-enable it, it won't trigger a rebuild; it'll just start drawing them again. Also, because you're not enabling and disabling the game object that the canvas is attached to, we're not sending those OnEnable and OnDisable callbacks through your entire hierarchy, so you're again immediately skipping lots of GetComponent calls. The main thing you have to be careful of when doing this is that if you have written some code and it has, in an Update or a LateUpdate, some kind of expensive operation, you will need to manually disable that whenever you enable or disable your canvas. You no longer get those convenient enable/disable
callbacks, and this can become a bit of a maintenance headache. Now, in extreme cases, in really extreme cases, the source is open: you could rebuild UnityEngine.UI from source and just strip out the layout system. This is a lot of work, because now your designers are going to want to add layout groups and have them function, so you may end up having to write your own layout system. That's not going to be fun.

OK, we're almost done up here. The last thing I'm going to talk about is animators. In all of Unity's tutorials, we show you how to animate your UI by adding an Animator to it. In fact, if you actually go into, like, the UI Button component and you say, oh, I want to animate this UI component, it actually has a button for adding an Animator, and it says: here's the four states, just put your animations in these four states. That's fine; it puts empty animations in the states by default. Let me give you a quick tip. Remember that OnDidApplyAnimationProperties callback? It means that if you're thinking about using Animators on UI, don't. Here's the reason. The Animator was built to be performant, the animation system was built to be performant, and it was built kind of on the assumptions of character animation, which means that the properties it's animating, it expects to basically change every frame. It's going to be interpolating between two keyframes, and if we had to introduce a branch every time we interpolated two keyframes, on each property, to check whether or not it had changed, that would actually slow down the Animator considerably. So if you have any animation at all in the active state that you're animating, or that you're blending to or from, then the Animator will say: OK, well, I have to update these properties. It'll write those properties no matter whether they've changed or not, fire the OnDidApplyAnimationProperties callback, and you will dirty your layout even if nothing has apparently changed in your UI. Now, if you do have
things that are always changing, like a little bouncing widget or something that your designer wants to put in, that's OK. You know that's going to be changing every frame already, so it's not so bad to use an Animator there. But if you have things that change rarely, or that only change in response to events for a short period of time, write your own code. Use a coroutine, write your own tweening system or something. I can guarantee you it will be a lot more efficient than using Unity's built-in Animator. And that's all I have to say about that. OK, we are effectively out of time, and you
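The hierarchy walk described in the layout section of the talk can be sketched as a small model. This is not Unity's source code and not C#: it is a simplified Python stand-in that only counts how many "is there a valid layout group here?" checks one dirtying call would make, assuming one check per step of the walk. The names `Node`, `has_enabled_layout_group`, `mark_layout_for_rebuild` (as a free function) and `build_chain` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a RectTransform in a UI hierarchy."""
    name: str
    parent: Optional["Node"] = None
    has_enabled_layout_group: bool = False


def mark_layout_for_rebuild(start: Node) -> Tuple[Node, int]:
    """Walk up from `start` while each parent holds an enabled layout group.

    Returns the root of the contiguous layout system and the number of
    validity checks performed. In this model each check stands in for one
    GetComponents call, which is an assumption based on the talk.
    """
    checks = 0
    layout_root = start
    while True:
        parent = layout_root.parent
        checks += 1  # one lookup per step, nothing is cached between calls
        if parent is None or not parent.has_enabled_layout_group:
            break
        layout_root = parent
    return layout_root, checks


def build_chain(layout_group_depth: int) -> Node:
    """Build canvas -> N nested layout groups -> graphic; return the graphic."""
    current = Node("canvas")
    for i in range(layout_group_depth):
        current = Node(f"group{i}", parent=current, has_enabled_layout_group=True)
    return Node("graphic", parent=current)


if __name__ == "__main__":
    for depth in range(4):
        root, checks = mark_layout_for_rebuild(build_chain(depth))
        print(f"{depth} nested layout groups: root={root.name}, checks={checks}")
```

In this model an element with no layout group above it still costs one check per dirtying call, one layout group costs two, two cost three, and so on, which mirrors the "doubles, triples" growth the talk describes. Real costs depend on Unity's actual implementation and version.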
Info
Channel: Unity
Views: 91,622
Rating: 4.9800363 out of 5
Keywords: Unite Europe 2017, unity, unity3d, gamedev, indiedev, game engine, performance, best practice
Id: _wxitgdx-UI
Length: 50min 6sec (3006 seconds)
Published: Mon Jul 10 2017