The ECS Architecture - Performance in UE4

Video Statistics and Information

Captions
Over recent years, the ECS architecture has become increasingly popular in the games industry. The most appealing feature of an ECS is the excellent performance it can bring to a program. It does this by using data-oriented design techniques to get the maximum out of the hardware.

To understand why data-oriented design leads to speedups in a program, we need to understand the hardware we're working with, in particular the CPU and main memory. The root of the problem is that there is a significant performance gap between processors and memory. The performance of CPUs increases at a rate of 60 percent each year, while memory has been improving at less than 10 percent each year, meaning that the gap in performance grows by 50 percent every year. Accessing data or instructions from main memory is a terribly slow operation that can take hundreds of clock cycles.

However, hardware designers have acknowledged this problem, and as a solution they put small bits of fast memory into the CPU, named caches. Accessing data stored in caches is very fast, typically taking only a few cycles. However, because the caches are small, it is important that the data transferred to the caches is not wasted. If data doesn't exist in any cache, it is loaded in from main memory, which again is slow. This is known as a cache miss. Once loaded, the data is then transferred into the cache so we won't get another cache miss, but in order to fit the new data, something most likely needs to be evicted from the cache. This is usually something that hasn't been used in a while. But imagine if the data that was just evicted from the cache is needed again sometime later in the program. This will cause another cache miss, and the data needs to be loaded in from main memory again, and then something else needs to be evicted from the cache, and so the cycle will continue. This is going to lead to unnecessary latencies in your program, simply waiting for data to be loaded because of poor
cache utilization. To use the cache more efficiently, data should be used all at once so that it doesn't run the risk of a cache miss. Data is transferred in blocks of a fixed size, typically 64 bytes, called cache lines. This means that when a piece of data is accessed, you don't just get the memory for that data; you also get all of the nearby data as a cache line. If you were to access an element of an array of 4-byte integers, you'd get 16 integers at once from memory as a cache line, even if you're only interested in the single element you're accessing. So because of cache lines, it's important that data which is used together is stored next to each other in memory so that cache lines are not wasted.

The ECS architecture is specifically designed for game development and aims to use the hardware efficiently by applying the things we just discussed. ECS stands for entity component system. An entity is usually just a unique identifier for a set of components. A component is just a passive data structure, meaning it only contains data and no behavior. A system, however, only has behavior and no data. Systems take components as inputs and transform their data, which is usually picked up by another system.

The ECS architecture favors composition over inheritance, meaning an entity is built using components instead of inheriting from a base class. To give an example, say we have two actors, A and B. Actor A moves towards a random location each frame. B has the same behavior but also rotates each frame. Using inheritance, we would write the logic for moving in actor A, then use that as a parent for actor B. We then write the rotation logic in B. If we instead use ECS with component composition, we would create a location component and a rotation component. We would then have two systems, a move system and a rotation system. The move system updates the location of all location components, while the rotation system updates the rotation of all the rotation components. Actor A would only
have the location component, while actor B has both the location component and the rotation component. We've now achieved the same behavior using composition instead of inheritance, while also allowing for greater flexibility and code reusability, since we can more easily construct entities from these small components. We could now have a third type that only rotates; it just needs the rotation component. Because systems run over all components, we don't need to write any additional code. It's already done for us. Adding this third type using inheritance would already be a problem, since UE4 doesn't allow multiple inheritance of UObject types, so working around that will usually just create a mess.

Components are laid out contiguously in memory to give better cache utilization. We rarely want to update the location of a single entity in a frame; we'd probably like to update multiple locations at once. It is for this reason that all components of a certain type are stored next to each other in memory, so a cache line only contains the data of those components. This way we reduce the number of cache misses when iterating over the components, giving us a significant performance boost.

In short, you could describe an ECS as a database, as they share a lot of similarities. You have rows of components and columns of entities. Each row stores all components of a certain type, and each column therefore translates into an entity, complete with all its data. Obviously, not all entities should have the same types of components, which is why an ECS, like a database, needs a different table for each unique combination of components. In this example, we have a rotation, location, move composition, which is just something that can move in the world. Then there is a rotation, location, move, health composition, which is also something movable like the previous composition, but this one has a health component, implying that it can be damaged or healed. This is done by the location, targeter, and damager or healer composition, which is
static in the world, since it only has a location component and no movement component. These entities target anything with a health component by using the target component and then deal damage to them or heal them.

But why should we care about this? UE4 is a very capable engine that drives some of the best games out there today and provides frameworks for optimizing games without using any of the aforementioned techniques. To give some understanding, I'm going to quote Noel at Games from Within: "Picture this: towards the end of the development cycle, your game crawls, but you don't see any obvious hot spots in the profiler. The culprit? Random memory access patterns and constant cache misses. In an attempt to improve performance, you try to parallelize parts of the code, but it takes heroic efforts, and in the end you barely get much of a speedup due to all the synchronization you had to add. To top it off, the code is so complex that fixing bugs creates even more problems, and the thought of adding new features is discarded right away. Sound familiar?"

For those of us who have shipped games, this does indeed sound familiar, at least to some extent. I know I can relate to the problems he describes. We constantly have to trade the vision and scale of games to meet performance constraints, and this is especially true for consoles and low to mid-end PCs. I think at some point soon, as we set out to make more ambitious games, we'll hit the limit of what the traditional way of writing games can give us performance-wise.

So how well does the ECS work in practice? Here you can see a boid simulation running on my work-in-progress ECS framework in Unreal. A boid simulation replicates the flocking behavior of birds and is defined by three primary rules: avoiding other flockmates, steering towards the average heading of nearby flockmates, and steering towards the average location of flockmates. So there is some fairly complex logic involved. Each entity needs information about all nearby entities so that the correct calculations
can be made. On my machine with the Ryzen 3950X, this demo runs 1000 actors with a total simulation time of only 4.8 milliseconds. This means that in theory I'd be able to run this demo at almost 210 Hz. However, the demo is GPU bound, so I could only run it at around 100 Hz. If we compare this demo written using an ECS to something written in the traditional way, the differences are substantial. With the same 1000 actors, the total simulation time is 14.4 milliseconds. This means that the version using the ECS is three times faster. But keep in mind that my CPU has a 1 MB L1 cache, which is more than most CPUs on the market, so the effects of caching are not felt as greatly. The difference is probably going to be even bigger on a less powerful machine because of the smaller cache size.

So how was this demo actually implemented using ECS? The start of anything in ECS is the data layout, in this case the components. First we have a move component. These components store all movement-related data, such as the location, velocity, and heading. Then there's a boid component. This stores the information about nearby entities and is used to calculate the data in the move components. Finally, there's the owner component, which is just there to give a reference to the actor so that the world location and rotation can be set on the actor itself.

Next up are the systems, starting with the boid system. The purpose of this system is to calculate the data in the boid components by iterating over all entities. In order to access the components, a query is defined that selects all entities with a move component and a boid component. The first step is extracting the location and heading of each entity, since those are the only properties needed from these components. We do this in order to fully pack the cache line. The next step iterates over all entities in the query and, inside that loop, iterates over all entities again. Here we measure the distance and, if it is within range, the data for
the boid component is set. Next is the move system. This system calculates the values in the move components using the new data set in the boid components. First, the acceleration is calculated using the boid component data, and finally the velocity is calculated, and thus a new location and direction can be determined. Finally, there's the Unreal move system. This system is just there to set the actor location and rotation on the owner by using the move component data.

That's the meat of the implementation. Now all that's left to do is hook up our actor to the ECS module. In the actor's BeginPlay, an archetype is created using the components, then an entity is created, and finally the new entity is added to the archetype along with the default values for the components. And that's it: a highly performant boid system in Unreal using an ECS architecture.

Simply putting an ECS in your game is not going to solve every issue, but it will help a lot with writing high-performance code and creating highly modular and flexible systems. As with everything, it does have some drawbacks. This is very different from what programmers learn and are used to. We are taught to write code in the same way that we view the world. While that is easy to understand and learn, it won't yield the best performance. Unlearning and changing your mindset from what you've been doing for years or even decades is going to be challenging. It took quite some time for me to adjust. It's also very difficult to interface with existing code and systems, as they're usually not developed in isolation, so you won't really get the benefits of an ECS unless the entire subsystem is written using it. It's for this reason that a game probably needs to be built using an ECS from the start.

But as I've demonstrated, the advantages of an ECS are clear. It truly helps with writing high-performance code. But that doesn't mean ECS is right for every game. It highly depends on the scale, performance targets, and the target
platforms of your project. A very simple 5v5 shooter intended for PC at 60 Hz might not be the best candidate for an ECS. However, a large-scale open-world MMO game, or something that needs to ship on consoles and low to mid-end PCs, might benefit from an ECS.

These topics first caught my attention when watching Rare's Unreal Fest 2019 presentation on how they wrote a custom scheduler to increase instruction cache coherency. Ever since, it's led me down a path that's given me a new view on programming as a whole and made me question the way I've learned and written code for years, and it has in a way made programming fun for me again.
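The actor A / actor B example from the captions can be sketched roughly as follows. This is a minimal illustration in plain C++, not the framework shown in the video: `Archetype`, `AddEntity`, `MoveSystem`, `RotationSystem`, `TickWorld`, and the fixed velocity and rotation rate are all hypothetical names and values chosen for the sketch (the video's actors move towards a random location instead). Each table keeps one contiguous array per component type, so a system walks tightly packed data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A 64-byte cache line holds 16 four-byte integers, as in the captions.
constexpr std::size_t CacheLineBytes = 64;
static_assert(CacheLineBytes / sizeof(std::int32_t) == 16, "16 ints per line");

// Components are passive data: no behavior lives here.
struct LocationComponent { float X = 0.f; float Y = 0.f; };
struct RotationComponent { float YawDegrees = 0.f; };

// Hypothetical archetype table: one contiguous array per component type.
// Index i across the arrays is entity i of this composition.
struct Archetype {
    bool bHasLocation = false;
    bool bHasRotation = false;
    std::size_t Count = 0;
    std::vector<LocationComponent> Locations;
    std::vector<RotationComponent> Rotations;

    // Adds one entity with default component values and returns its index.
    std::size_t AddEntity() {
        assert(bHasLocation || bHasRotation);  // needs at least one component
        if (bHasLocation) Locations.emplace_back();
        if (bHasRotation) Rotations.emplace_back();
        return Count++;
    }
};

// Systems are behavior only: each transforms every component of one type.
void MoveSystem(std::vector<LocationComponent>& Locations,
                float VelocityX, float VelocityY, float DeltaSeconds) {
    for (LocationComponent& Location : Locations) {
        Location.X += VelocityX * DeltaSeconds;
        Location.Y += VelocityY * DeltaSeconds;
    }
}

void RotationSystem(std::vector<RotationComponent>& Rotations,
                    float DegreesPerSecond, float DeltaSeconds) {
    for (RotationComponent& Rotation : Rotations) {
        Rotation.YawDegrees += DegreesPerSecond * DeltaSeconds;
    }
}

// Runs both systems over every table. A table without a component type has
// an empty array for it, so the matching system simply does nothing there.
// The velocity (100 units/s along X) and rate (90 deg/s) are assumptions.
void TickWorld(std::vector<Archetype>& Tables, float DeltaSeconds) {
    for (Archetype& Table : Tables) {
        MoveSystem(Table.Locations, 100.f, 0.f, DeltaSeconds);
        RotationSystem(Table.Rotations, 90.f, DeltaSeconds);
    }
}
```

With this layout, a third, rotate-only type is just another table with `bHasRotation` set; neither system needs to change.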
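The boid system pass described in the captions (extract location and heading, then compare every entity with every other entity) might look something like the sketch below. The component fields, the `BoidSystem` signature, and the choice to store a neighbor count plus averaged location and heading are assumptions made for illustration; the captions only say that the boid component stores information about nearby entities.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { float X = 0.f; float Y = 0.f; };

// Hypothetical component layouts; the field names are assumptions.
struct MoveComponent { Vec2 Location; Vec2 Velocity; Vec2 Heading; };
struct BoidComponent {
    int NeighborCount = 0;
    Vec2 AverageLocation;  // average location of nearby flockmates
    Vec2 AverageHeading;   // average heading of nearby flockmates
};

// Fills Boids[i] from all other entities within Range of entity i.
void BoidSystem(const std::vector<MoveComponent>& Moves,
                std::vector<BoidComponent>& Boids, float Range) {
    assert(Moves.size() == Boids.size());
    const std::size_t Count = Moves.size();

    // Step 1: copy only the fields the inner loop reads into packed arrays,
    // so each cache line fetched there holds nothing but useful data.
    std::vector<Vec2> Locations(Count);
    std::vector<Vec2> Headings(Count);
    for (std::size_t i = 0; i < Count; ++i) {
        Locations[i] = Moves[i].Location;
        Headings[i] = Moves[i].Heading;
    }

    // Step 2: compare every entity with every other entity (O(n^2)).
    const float RangeSquared = Range * Range;
    for (std::size_t i = 0; i < Count; ++i) {
        BoidComponent Result;
        for (std::size_t j = 0; j < Count; ++j) {
            if (i == j) continue;
            const float DX = Locations[j].X - Locations[i].X;
            const float DY = Locations[j].Y - Locations[i].Y;
            if (DX * DX + DY * DY > RangeSquared) continue;  // out of range
            ++Result.NeighborCount;
            Result.AverageLocation.X += Locations[j].X;
            Result.AverageLocation.Y += Locations[j].Y;
            Result.AverageHeading.X += Headings[j].X;
            Result.AverageHeading.Y += Headings[j].Y;
        }
        if (Result.NeighborCount > 0) {
            // Turn the accumulated sums into averages.
            const float Inv = 1.f / static_cast<float>(Result.NeighborCount);
            Result.AverageLocation.X *= Inv;
            Result.AverageLocation.Y *= Inv;
            Result.AverageHeading.X *= Inv;
            Result.AverageHeading.Y *= Inv;
        }
        Boids[i] = Result;
    }
}
```

A move system would then read each `BoidComponent` to derive an acceleration and update the velocity and location in the matching `MoveComponent`. The squared-distance comparison avoids a square root per pair; it is a common choice, not something the captions state.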
Info
Channel: MazyModz
Views: 4,701
Keywords: UE4, ECS, Performance, C++
Id: kXd0VDZDSks
Length: 16min 28sec (988 seconds)
Published: Sat Feb 06 2021