Unity Code Optimization - Do you know them all?

Reddit Comments

Before anyone now goes through their Unity project and spends hours upon hours on applying each of these things and in the process completely messes up the readability of their codebase and introduces a dozen bugs: Don't optimize what doesn't need to be optimized!

Use the Profiler to find out which parts of your code are actually responsible for the most processing time. Then see what you can do performance-wise for those particular code sections. If some code only runs once every couple of updates, it makes no difference whether it takes 200 microseconds or 500 microseconds.

👍︎︎ 97 👤︎︎ u/PhilippTheProgrammer 📅︎︎ Mar 13 2022 🗫︎ replies
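
As a rough sketch of the profile-first advice above: wrapping a suspect code path in a named sample makes it appear as its own entry in the Profiler's CPU view, so its cost can be read off before any optimization is attempted. `ProfiledWork`, `ExpensiveStep` and the sample name are hypothetical; `Profiler.BeginSample` and `Profiler.EndSample` come from `UnityEngine.Profiling`.

```csharp
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical component; the names are illustrative only.
public class ProfiledWork : MonoBehaviour
{
    private void Update()
    {
        // Everything between BeginSample and EndSample is grouped under
        // this label in the Profiler window.
        Profiler.BeginSample("ProfiledWork.ExpensiveStep");
        ExpensiveStep();
        Profiler.EndSample();
    }

    // Stand-in for whatever code is suspected of being slow.
    private void ExpensiveStep()
    {
        float sum = 0f;
        for (int i = 0; i < 10000; i++)
        {
            sum += Mathf.Sqrt(i);
        }
    }
}
```

If the labelled sample turns out to be a small slice of the frame, that code is probably not worth rewriting.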

"... kaesh ... kaesh ... kaesh"

Say it one more time, I dare you!

👍︎︎ 37 👤︎︎ u/SendingTurtle 📅︎︎ Mar 13 2022 🗫︎ replies

LINQ gets lots of hate for no reason. If you're sorting/ordering something, it's not usually at a performance-critical point in your game; at least it shouldn't be.

👍︎︎ 18 👤︎︎ u/henryreign 📅︎︎ Mar 13 2022 🗫︎ replies

The comparison between finding by tag and finding by type is a bit misleading because usually if you're finding by type it's because you're interested in a specific component, not a specific game object. The fastest way to get a specific component might still be to use tags on GameObjects and then use GetComponent from there, but I expect it depends on a few other factors.

👍︎︎ 4 👤︎︎ u/kylotan 📅︎︎ Mar 13 2022 🗫︎ replies
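
A minimal sketch of the tag-then-`GetComponent` approach the comment above describes, assuming the objects carry a tag named "Find" and a component called `FindHelper` (both mirror the video's test setup, but the exact spelling is an assumption). The tag has to be defined in the project's tag list, otherwise `GameObject.FindGameObjectsWithTag` throws.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Empty marker component, standing in for the one used in the video's test.
// In a real project each MonoBehaviour would live in its own file.
public class FindHelper : MonoBehaviour { }

// Hypothetical lookup that runs once at startup, not every frame.
public class FindHelperLookup : MonoBehaviour
{
    private readonly List<FindHelper> helpers = new List<FindHelper>();

    private void Start()
    {
        // Step 1: narrow the search by tag.
        GameObject[] tagged = GameObject.FindGameObjectsWithTag("Find");

        // Step 2: fetch the component of interest from each tagged object.
        foreach (GameObject go in tagged)
        {
            FindHelper helper = go.GetComponent<FindHelper>();
            if (helper != null)
            {
                helpers.Add(helper);
            }
        }

        Debug.Log("Found " + helpers.Count + " helpers");
    }
}
```

Whether this beats a search by type depends on scene size and on how many objects carry the tag, so it is worth measuring in a build.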

If you're serious about this, you have to compare performance in builds, not in the editor. I don't know about FindObjectsByType specifically, but similar functions are significantly slower in the editor, and generate garbage too.

👍︎︎ 5 👤︎︎ u/lbpixels 📅︎︎ Mar 13 2022 🗫︎ replies

You should also try these in a build with IL2CPP. Caching in particular will yield different results.

👍︎︎ 13 👤︎︎ u/notsocasualgamedev 📅︎︎ Mar 13 2022 🗫︎ replies

One to test is TryGetComponent(out component) vs GetComponent where you are also checking if the component is on the GameObject after the call.

Since you need to check if (component != null) in this case, is it faster to use TryGetComponent?

👍︎︎ 3 👤︎︎ u/Romestus 📅︎︎ Mar 13 2022 🗫︎ replies
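
The two patterns being compared in the comment above look roughly like this. `ComponentLookupSketch` is a hypothetical name and `Rigidbody` is only an example component. The sketch shows the shape of each call, not which one is faster; per Unity's documentation, one known difference is that `TryGetComponent` does not allocate in the Editor when the component is missing.

```csharp
using UnityEngine;

// Hypothetical component; Rigidbody is just an example target type.
public class ComponentLookupSketch : MonoBehaviour
{
    private void Start()
    {
        // Pattern 1: GetComponent followed by an explicit null check.
        Rigidbody body = GetComponent<Rigidbody>();
        if (body != null)
        {
            body.mass = 2f;
        }

        // Pattern 2: TryGetComponent folds the lookup and the check
        // into one call that returns false when nothing is found.
        if (TryGetComponent(out Rigidbody sameBody))
        {
            sameBody.mass = 2f;
        }
    }
}
```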

How is assigning transform to a local variable ("caching it") mitigating the performance cost of the property accesses? It's still an instance of Transform and the implementation isn't changing, no?

👍︎︎ 3 👤︎︎ u/BHSPitMonkey 📅︎︎ Mar 13 2022 🗫︎ replies
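
One way to read the caching claim, sketched under the assumption (taken from the captions) that the `transform` property getter itself crosses into native code. Caching does not make `Transform` any cheaper to use; it only avoids repeating that getter call. `TransformCachingSketch` and its field name are hypothetical.

```csharp
using UnityEngine;

// Hypothetical component showing the three access patterns from the video.
public class TransformCachingSketch : MonoBehaviour
{
    private Transform cachedTransform; // fetched once, reused every frame

    private void Awake()
    {
        cachedTransform = transform;
    }

    private void Update()
    {
        // Uncached: the 'transform' getter runs three times.
        Vector3 p1 = transform.position;
        Quaternion r1 = transform.rotation;
        Vector3 s1 = transform.localScale;

        // Locally cached: the getter runs once per Update.
        Transform t = transform;
        Vector3 p2 = t.position;
        Quaternion r2 = t.rotation;
        Vector3 s2 = t.localScale;

        // Field cached: the getter ran once in Awake.
        Vector3 p3 = cachedTransform.position;
        Quaternion r3 = cachedTransform.rotation;
        Vector3 s3 = cachedTransform.localScale;
    }
}
```

Reading `position`, `rotation` and `localScale` still goes through the same `Transform` object in all three cases; only the number of `transform` getter calls changes.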

You put out good content, Taro! Keep it up.

👍︎︎ 2 👤︎︎ u/midge 📅︎︎ Mar 13 2022 🗫︎ replies
Captions
Hey there. Today I thought it would be fun to go through some Unity optimization techniques that you hear around, to see how much impact they actually make and whether they're even worth using. I've just set up this little test bay here.

All right, so the first one up is SendMessage. When I first started Unity I found this SendMessage function and I thought, well, this is super convenient: as long as you've got a reference to the object, you can just call SendMessage. But as you hear time and time again, SendMessage is not good practice and it's slow. In the first test here I'm running SendMessage. In the second test I'm grabbing the component directly and then calling the function directly. So let's run SendMessage with max iterations, and there we go. As you can see, GetComponent is two times faster than SendMessage, and on top of that you also get the class ready to go, so we can now manipulate it and call other functions on this class if we want. With SendMessage you have no idea; you're kind of just shooting in the dark. There's no way to check whether this object we've got actually has this function. There's really no reason to use SendMessage ever, right? You should always just grab the class. Well, at least I haven't found a good reason to use SendMessage. Maybe someone else has; you could leave it in the comments, I'd be interested to know.

Okay, next is extern call caching. I've just started using Rider over the last few months and I had no idea this was even a thing, but every single time you use transform, if we just decompile this here, you'll see that it's actually an external call. Basically this is them shooting across to the C++ side, and then it shoots back and we get our transform. So for example, if we were just doing this in an Update loop, this would be three calls to the C++ side and then back again for us to get the position, rotation and scale. A more favorable way of doing this is caching it locally and then grabbing it from that cache, but an even better way is to globally cache it and then grab them from the global cache, and I will show you. So if we do 900,000 calls, as you can see, there is actually a decent difference here: fully caching it is two times as fast as not caching it at all. But this is 900,000 calls, so honestly the difference is not too major. If you're talking about something like 3,700 calls, you're looking at not even one millisecond, and I highly doubt you're doing that many calls to the Unity API per frame. So take it with a grain of salt. Don't do it for performance; do it maybe just for the reason that you could rename this to t or something and make your code a little bit shorter. I know a lot of people don't like doing that, they like descriptive names, and so do I. But performance-wise, it's not going to save your game.

Okay, so this next one is interesting, because it goes against what I see everybody suggesting, and that is that you should not use Vector3.Distance, this top one here, because if we decompile it, it makes use of a square root, which is known to be slow. You should use this alternative here, which is just a simple sqrMagnitude. But according to all of my tests, it is one or the other. Okay, so 900,000 calls: square root is actually faster there, slower there, slightly slower there. They're honestly neck and neck. To me, this is also far more readable than this; it distinctly tells me I'm trying to find the distance here. So when it comes to distance, in my opinion just use Vector3.Distance and scrap this sqrMagnitude nonsense, because it's simply not routinely faster; sometimes it's slower. Obviously this is one simple test, with hard-coded numbers here while these are random, and there could be a time where sqrMagnitude might be faster, but I have not found it. So there you go.

Okay, so this next one is quite interesting: find objects. When this test starts up, all I'm doing is generating a bunch of objects. You can see under here a bunch of trees; in each tree there's a bunch of layers, and in the layers there's a bunch of objects. On the objects, I've got them tagged "Find", and I've also got this FindHelper, which is just an empty class. In the first benchmark I'm just finding them by tag; in the second benchmark, finding by type. So let's do that. Right, this is actually very slow, so I'm doing the recommended 1,000 iterations. As you can see, it's quite slow, and there we go: FindObjectOfType is significantly slower. I actually don't know how these work behind the scenes, but someone in my Discord made a good point, in that FindGameObjectsWithTag probably just looks at the transform level, checks the tag and says, yup, good to go, whereas FindObjectOfType probably has to go through every single object looking at every single component: the Transform, whatever else it's got, SpriteRenderer, Image, Collider, all this stuff, going through them and then finally returning it. So it has to do an exhaustive search of every single component on every single object, and that would explain why it takes so much longer.

So after seeing these, let's maybe do four thousand. We're probably going to be waiting for a second. Four thousand, let that load. [Music] I've actually got a WebGL build of this, and if you did this amount on WebGL it would absolutely crash your browser. There we go: FindObjectOfType took 18 seconds to do 4,300 of them. So my recommendation is never use these two functions in any game loop, definitely not in Update. I honestly wouldn't even use them in a state change in, like, a turn-based combat game, because they're honestly so damn slow. Usually there's a better way to find your objects: have them in a list of some kind on a manager, or any number of things. The only time I would ever say use this is once, on the initialization of your classes at the very start of the scene, in Start or Awake. That is the only time. Otherwise avoid them like the plague, because they're super damn slow.

Okay, so this next one is very interesting as well. It's about using the NonAlloc versions of the physics functions. So for example, here in benchmark one we're grabbing the results of Physics.OverlapSphere. If we just have a look here, I've got a bunch of colliders in this area, and when we click it, it's just going to OverlapSphere and grab them all. In the second benchmark we're using OverlapSphereNonAlloc and sending in our pre-made collisions array, so it's just going to be reusing that same array instead of creating a new one and returning it to us. Let's see how that one goes. Let's use max iterations. Okay, apparently I said that that should be the maximum, so let's do that instead. And as you can see, the NonAlloc version is slower. Let's actually do a little bit more than that. Something like that. Yeah, so it's coming in at close to double the time, but that doesn't mean you shouldn't use it.

So let's remove the normal one, the OverlapSphere one, and head back into Unity. Let's press play, open the Profiler and run the NonAlloc one, so this will be calling the actual non-allocating version. Let's call it again and again. As you can see, there's no garbage being allocated, which makes sense, since we're using the non-allocating version. Now if we just swap those, so we're using the normal version, the allocating version, and we press play and run that, we will see here that it just allocated 38 MB of garbage. Obviously an array is a reference type, so it goes straight to the heap, and when that goes out of scope, eventually the garbage collector needs to come and pick it up. Let's just run that again. Yeah, 38 MB of garbage. So yes, the NonAlloc version is slower to run, but it doesn't generate any garbage whatsoever. You really just need to know: do I want it to be super, super fast, or do I want to allocate no garbage? In most cases you're probably going to want to not allocate any garbage, so this one will win most times. But everybody's game is different, and you might not give a damn about garbage; you might just care about the speed.

Okay, the next one: camera access. Whenever you see any code snippet of someone using Camera.main in Update, you will absolutely see the next comment being someone saying you shouldn't use Camera.main in the Update function. Which is fine, because that's what we're all told. But just recently I was making a tutorial and I was caching the camera just like this here, and someone made a comment saying you don't need to do that anymore because Unity now caches it. I thought, oh, that's cool, but then I tested it and there are still some weird results. So let's just run this with max iterations here, and you can see using Camera.main. FindWithTag is obviously slow, as we've discovered, and FindObjectOfType is even slower. But as you can see, caching the camera is still superior. And if we look at Camera.main, we will see that it is still an external call, so it's still going to C++. So they may have cached it, but they've cached it on the C++ side. So if for some reason in Update you're calling Camera.main 53,000 times, you're only going to lag for eight milliseconds. Calling it one time, or even two thousand times, is not even going to slow you down by one millisecond.

Okay, LINQ versus loop. Everybody says don't use LINQ in Unity, ever. I think you certainly should use LINQ, but ensure that you use it in the correct places and at the correct time. So let's just run this LINQ loop. It's saying 1,000 max iterations, or else we'd be here all night. Okay, so LINQ is obviously the slowest here. We've got the for loop, which is faster; the cached for loop, which is faster still (I'll show you what that is in a second); and a foreach loop, which is even faster. Now I'll show you the code. We're just making a list here of this, which is just an int and a float, and all these tests are doing is filtering a little bit and then adding to a list. This LINQ one is just checking that this int value is more than this threshold here, and then it's selecting the float value of all the remaining ones. These are doing the exact same thing, just with loops. As you can see here, we're looping through them all, and if it is over the threshold, we add it to this list. This cached for loop is actually just caching the count. So instead of having data.Count here and doing it every iteration, we're caching it. My buddy just wanted to check whether that actually makes a difference. I was curious too, and it actually does; it always seems to be just that little bit faster. And then the foreach loop. foreach is generally faster if your for loop has to access the index of the array more than once. So if you're only accessing it once, the for loop will always be faster; twice or more, you should probably use a foreach.

But yeah, I built this and put it in WebGL, and these numbers are all back to front. So the editor is completely different to your built WebGL game. Who knows, if it's a built standalone game it might be different still. I would be really curious if you guys want to go to the WebGL build and run these yourself and tell me if any of these are different to what you found here. It's really weird; I really want to know what is up with that. Also, my friend did this test in both Edge and Firefox: drastically different results. So, man, it's super hard to know what is performant and what is not.

And the very last one here is StringBuilder. I know I've done a few community posts saying you should definitely use StringBuilder; I just want to show you why. Here we've just got a phrase, "subscribe" (which you should do), and a count of 100, so we're basically going to do "subscribe" 100 times in a string. This top one is just simple concatenation: we're creating a string and concatenating to it. The second one is using a StringBuilder: looping, appending and then finally calling ToString on it. If we run this now, max 3,000 just so we don't lag, you'll see the difference is ginormous. Let's actually do a bit more than that. Let's do 13, oops, 22, why not. Oh, it's a lot of string. So yeah, absolutely use a StringBuilder, not just for speed but also for garbage allocation. StringBuilder is the way to go, unless it's just two things. If you're just concatenating two things together, do it, who cares. But use a StringBuilder otherwise.

And I've actually got this one last one that I wanted to show you, which is order of operations. The idea is just that floats are more expensive to do arithmetic on than integers, vectors more expensive than floats, quaternions more expensive than vectors, and so on, so you should order your operations in that logical order if you can. Say, for example, this top one is float times float times vector, this next one is float times vector times float, and then vector times float times float. If we run this with max iterations, you'll see that it is actually two times faster. Let me give you a little example of this in action. Let's say transform.position plus equals, say you're wanting to move left, and you're doing the left vector times speed times Time.deltaTime. You even see this in the docs; if you go look through the docs, you will see Unity doing this. This is an example of doing it incorrectly: this is vector times float times float. This would actually be two times faster if we flipped this to the other side. So just keep that in mind. And if you go to this link here, you will see that Unity themselves actually do recommend doing it this way.

So yeah, that's it. I hope you enjoyed these as much as I enjoyed making them, because I thought they were quite interesting. If you've got any other benchmarks that I should add, let me know in the comments and I'll add them here, because I'm interested in making an exhaustive list. And yeah, that's it. See you in the next video. Bye.
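
Three of the techniques from the captions (operand ordering, the NonAlloc physics query and StringBuilder) could be sketched together as below. This is an illustration under stated assumptions, not the benchmark code from the video: `CaptionTechniquesSketch`, its fields, its method names and the buffer size of 32 are all hypothetical, while `Physics.OverlapSphereNonAlloc` and `System.Text.StringBuilder` are the real APIs.

```csharp
using System.Text;
using UnityEngine;

// Hypothetical component; every name below is illustrative.
public class CaptionTechniquesSketch : MonoBehaviour
{
    [SerializeField] private float speed = 5f;
    [SerializeField] private float radius = 3f;

    // Reused result buffer. The size 32 is an arbitrary assumption;
    // colliders beyond the buffer length are simply not reported.
    private readonly Collider[] hits = new Collider[32];

    private void Update()
    {
        // Order of operations: float * float is evaluated first, leaving a
        // single float * vector multiply. Vector3.left * speed * Time.deltaTime
        // would perform two vector multiplies instead.
        transform.position += speed * Time.deltaTime * Vector3.left;
    }

    // Returns how many colliders overlap the sphere. The colliders are
    // written into hits[0..count-1]; no new array is allocated per call.
    public int CountNearbyColliders()
    {
        return Physics.OverlapSphereNonAlloc(transform.position, radius, hits);
    }

    // Builds the repeated phrase in one growing buffer rather than
    // creating a new string on every concatenation.
    public static string Repeat(string phrase, int count)
    {
        var builder = new StringBuilder(phrase.Length * count);
        for (int i = 0; i < count; i++)
        {
            builder.Append(phrase);
        }
        return builder.ToString();
    }
}
```

For example, `CaptionTechniquesSketch.Repeat("subscribe", 100)` corresponds to the string test described in the captions. As the captions and several comments point out, timings differ between the editor, WebGL and standalone builds, so any of these is worth measuring in the target build before adopting it.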
Info
Channel: Tarodev
Views: 34,099
Keywords: unity optimization, code optimization, unity code, unity tips, speed up code, speed up unity, unity efficient code, coding tips, unity coding tips, unity code optimizations, sendmessage, vector3.distance, game development, unity tutorial, unity 3d, unity performance, improve performance
Id: Xd4UhJufTx4
Length: 15min 49sec (949 seconds)
Published: Sun Mar 13 2022