Multi-threading in Unity: Introduction to DOTS Job System

Captions
Hello everyone, today we will learn the fundamentals of the Unity job system. The job system is part of the Data-Oriented Technology Stack and lets you take advantage of modern multi-core CPU architectures. Thanks to its dependency management, you will be able to write multi-threaded code in a safe way. Let's do a quick overview of what multithreading is and what issues it may introduce, and then I will explain how the Jobs package solves them.

Up until now we have been executing all our code on a single thread, called the main thread. Modern CPUs have from 2 to 64 cores and often twice as many hardware threads. Multi-threaded code lets you spread the workload across many CPU cores, making each simulation tick faster. But this is usually tricky, since you need to be wary of race conditions. A race condition occurs when two threads work on the same area of memory or data. Say, for example, we want to increment a value by one. If only one thread operates on it, everything is fine. But if two threads try to increment the same variable, we may end up in a situation where both threads read the same value, each increments it, and each writes back its result. The result is wrong: we performed two increments, yet the variable only increased by one. This kind of behavior is difficult to debug, and if the execution order varies from tick to tick, it's even worse. Thankfully the job system solves this issue by enforcing some constraints and handling job dependencies.

Let's see the basic syntax for writing a job. For that I made a new project using Unity 2022.2.1. There is no need to import any package, since basic jobs are part of the Unity core. I can now create a MonoBehaviour script. Yes, this is not an ECS system: the job system is independent from the ECS package, so you can use jobs, as well as Burst, in your regular MonoBehaviour projects. Now, in your file, create a struct that implements the IJob interface. Declare an int value and perform a simple operation before logging the result. I'll compute the factorials from zero to the number and log each intermediate result. Then, in your Update method, instantiate that job, pass in the parameters and schedule it. Set up your scene in Unity and enter play mode. As you can see in the console, the job logs each factorial. In the profiler's timeline we see our system; below, in the jobs section, we can see all the worker threads and the jobs they executed.

Great, the job is executed, but the result of the computation remains inside the job. Let's change that and log the result from the main thread. Enter play mode and you will notice two things. First, the final result is not correct: it remains at its initial value. Second, the final result is logged before the intermediate results. That's partially because when we schedule a job, it doesn't start right away; it's put in a queue and picked up by an available worker thread. We can force the job to finish by calling the Complete method on the job handle returned by the Schedule method. This lets us make sure the job has finished processing the data before we use it for something else. Entering play mode after these changes, you will notice we did not completely fix our issue. The final result is now logged after the intermediate results, but it is still incorrect. That's because jobs are limited to blittable types, just as Burst is. This also means data is passed by value to a job.
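A minimal sketch of the setup described so far (the names FactorialJob and FactorialRunner are illustrative, not the video's exact code): an IJob that computes factorials, scheduled from a MonoBehaviour, with Complete called on the returned handle. It also demonstrates the by-value copy problem mentioned above, since the result read on the main thread stays at its initial value.

```csharp
using Unity.Jobs;
using UnityEngine;

// Illustrative job: computes factorials from 1 to Number and logs each step.
public struct FactorialJob : IJob
{
    public int Number;   // copied by value when the job is scheduled
    public int Result;   // changes made here are NOT visible to the main-thread copy

    public void Execute()
    {
        Result = 1;
        for (int i = 1; i <= Number; i++)
        {
            Result *= i;
            Debug.Log($"Factorial {i} = {Result}");
        }
    }
}

public class FactorialRunner : MonoBehaviour
{
    void Update()
    {
        var job = new FactorialJob { Number = 5 };

        // Schedule puts the job in a queue for an available worker thread and returns a handle.
        JobHandle handle = job.Schedule();

        // Complete blocks until the job has finished executing.
        handle.Complete();

        // Because the struct was copied by value, this still logs the initial value (0).
        Debug.Log($"Final result on main thread: {job.Result}");
    }
}
```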
When a variable is passed by value, as opposed to by reference, any changes made inside the method won't be reflected in the original variable. And since jobs can run at any time after they are scheduled, this eliminates the possibility of a race condition with other threads. To get the result out of the job, Unity provides us with various native containers. We'll see the simplest one in this video. If you want me to cover all the other types of native containers, let me know in the comment section below. And while you're at it, don't forget to leave a thumbs up if you enjoy learning from me. It means a lot to me and shows the YouTube algorithm that my content is valuable to you. Thanks!

Now, to use native containers, we need to import the Collections package. As with the ECS package, we can do that through the package manager by importing it by its name, "com.unity.collections". Go back to your code and, instead of a simple int, use the NativeReference<int> generic type from the Unity.Collections namespace. We can use the TempJob allocator for it. As a reminder, this allocator needs to be disposed and can live for up to 4 frames before being considered a memory leak. Use it in your job and your Debug.Log statement, and don't forget to dispose of it. Now, if we enter play mode, we should see the final result being correctly logged. But didn't we just create the same situation we described for the race condition in the beginning? No, we did not, thanks to the safety system that comes with the native containers. If you don't trust me, try commenting out the line where we complete the job handle, and let me know in the comments what you see in your console.

OK, I'm about halfway through the fundamentals. We have solved the race condition problem and managed to perform some work off the main thread. Or did we? If you look at your profiler now, you will notice something like this: your job is executed by a worker thread, but your main thread isn't doing anything in the meantime. That's because the Complete method of the job handle forces the job to finish before the main thread can continue. Sometimes the main thread will even act as a worker thread to help complete the job faster. Under these conditions we are not actually improving our performance. The thing to keep in mind with jobs is that it's always best to schedule as early as possible and complete as late as possible.

For that we have several options. The first one is to call the Complete method in LateUpdate, but that's not always possible or what we want. Another option is to use the dependency management to schedule another job. To do that, we create another job; I'll simply duplicate mine. And when we schedule it, we pass in the job handle returned by the previous Schedule call. If you do that and enter play mode, you'll see that we now log the results of factorial 0 to N, and then the results of factorial 0 to N factorial. And as you can see in the profiler, the second job waits for the first job to execute. That's great, but we are still waiting on both jobs to finish before continuing with the main thread. Here they are even performed by the main thread, like we said earlier. It would be great if we could specify a callback to execute when the job is complete. We can't exactly do that, but we can use a coroutine to mimic that behavior. Let's create one and move the logging and dispose logic into it.
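A rough sketch of this stage, assuming the Collections package is installed; the names FactorialRefJob, FactorialChainRunner and LogWhenDone are made up for illustration. The result is written into a NativeReference<int> allocated with TempJob, a second job is chained onto the first through its handle, and a coroutine waits for completion instead of blocking the main thread in Update.

```csharp
using System.Collections;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Illustrative job that writes its result into a native container,
// so the value is visible to the main thread after the job completes.
public struct FactorialRefJob : IJob
{
    public int Number;
    public NativeReference<int> Result;

    public void Execute()
    {
        Result.Value = 1;
        for (int i = 1; i <= Number; i++)
            Result.Value *= i;
    }
}

public class FactorialChainRunner : MonoBehaviour
{
    void Update()
    {
        var result = new NativeReference<int>(Allocator.TempJob);

        // Schedule the first job, then a second job that depends on it.
        JobHandle first = new FactorialRefJob { Number = 5, Result = result }.Schedule();
        JobHandle second = new FactorialRefJob { Number = 6, Result = result }.Schedule(first);

        // Instead of calling Complete here (which would stall the main thread),
        // hand the handle to a coroutine that waits for the chain to finish.
        StartCoroutine(LogWhenDone(second, result));
    }

    IEnumerator LogWhenDone(JobHandle handle, NativeReference<int> result)
    {
        // Yield until the worker threads have finished the whole chain.
        while (!handle.IsCompleted)
            yield return null;

        handle.Complete();                           // syncs with the safety system before reading
        Debug.Log($"Final result: {result.Value}");
        result.Dispose();                            // TempJob allocations must be disposed
    }
}
```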
Now, if we execute that code and look at the profiler, we can see that our jobs execute sequentially while the main thread keeps doing its work in the meantime. This example is very interesting because the second job even overlaps the next frame, and on the next frame the coroutine on the main thread waits for that job to complete before logging the final result. That's awesome: we can execute code on a worker thread without stalling our main thread.

But I'm only using one worker thread, and I have many more than that. How can I perform parallel execution not only between the main thread and a worker thread but across all available worker threads? The solution is so simple that you will like it, just like this video. All we have to do is use a different job handle to schedule our jobs. So, if I duplicate both my jobs but don't pass a dependency handle to the first of them, I create a new job chain parallel to the first one. As you can see in the profiler, I now have 4 instances of the job running in parallel, two by two. The first job of the first chain runs at the same time as the first job of the second chain, and when the first job of each chain completes, the second one starts. I can even combine dependencies using JobHandle.CombineDependencies. That lets both chains run in parallel, and then we wait for the result of the combined chains to perform some other work.

You can now admire the result of your work in the profiler. In this example we can see it performs faster than executing everything on the main thread, and that's not even using the Burst compiler. With what we learned today and what we have seen throughout the series (go check it out if you haven't yet), you now have the fundamental knowledge to unlock the full potential of Unity. I'll see you in the next video to combine this knowledge and take our tower defense prototype to the next level. Take care and see you next time.
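A hedged illustration of the parallel chains and combined dependencies described above, again with made-up names and the same kind of factorial job as in the previous sketch: two chains are scheduled with no dependency between them, so they can run on different worker threads, and JobHandle.CombineDependencies produces a single handle to wait on.

```csharp
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Same illustrative job as before: writes its result into a native container.
public struct FactorialRefJob : IJob
{
    public int Number;
    public NativeReference<int> Result;

    public void Execute()
    {
        Result.Value = 1;
        for (int i = 1; i <= Number; i++)
            Result.Value *= i;
    }
}

public class ParallelChainsRunner : MonoBehaviour
{
    void Update()
    {
        var resultA = new NativeReference<int>(Allocator.TempJob);
        var resultB = new NativeReference<int>(Allocator.TempJob);

        // Chain A: two jobs executed one after the other.
        JobHandle a1 = new FactorialRefJob { Number = 5, Result = resultA }.Schedule();
        JobHandle a2 = new FactorialRefJob { Number = 6, Result = resultA }.Schedule(a1);

        // Chain B: its first job has no dependency, so it runs in parallel with chain A.
        JobHandle b1 = new FactorialRefJob { Number = 7, Result = resultB }.Schedule();
        JobHandle b2 = new FactorialRefJob { Number = 8, Result = resultB }.Schedule(b1);

        // Combine both chains into a single handle and wait on it once.
        JobHandle combined = JobHandle.CombineDependencies(a2, b2);
        combined.Complete();

        Debug.Log($"Chain A: {resultA.Value}, Chain B: {resultB.Value}");

        resultA.Dispose();
        resultB.Dispose();
    }
}
```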
Info
Channel: WAYN Games
Views: 2,604
Keywords: Unity game development, DOTS Job System, Multi-threading in Unity, Performance optimization, C# programming, Entity Component System (ECS), Parallel computing, Game development skills, Data-Oriented Technology Stack, High-performance games, Unity engine, Game development tutorial, Game programming, Unity performance, Unity optimization, Unity 3D, Unity game engine, DOTS framework, DOTS ECS, Unity game design, Performance Optimization, Cross-Platform Game Development
Id: LP0wmX9dzAM
Length: 10min 50sec (650 seconds)
Published: Fri Feb 24 2023