CppCon 2018: Stoyan Nikolov “OOP Is Dead, Long Live Data-oriented Design”

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments

Did anyone watch this who can summarize?

👍︎︎ 20 👤︎︎ u/shortstomp 📅︎︎ Oct 26 2018 🗫︎ replies

Haven't seen (yet) the talk, limited connection right now, but glad to see another bulgarian presenting it, but coming from gamedev - Mike Acton, and his https://www.youtube.com/watch?v=rX0ItVEVjHc video should be for everyone to see.

👍︎︎ 7 👤︎︎ u/malkia 📅︎︎ Oct 27 2018 🗫︎ replies

I skimmed the talk; for me it's the usual DOD stuff:

  • lots of claims about how OO is bad in every way; performance, maintainability, etc
  • lots of concrete stuff about the performance aspects of it
  • nothing even close to concrete about the claims that DOD is easier to maintain, test, etc

I'm still waiting for a DOD talk that has anything interesting to say that's not basically summarized by: memory layout is important to performance and typical OO doesn't always yield optimal memory layout. That's fine, and the specifics of how performance is improved are interesting. But if you don't actually have anything strong to back up the superiority of DOD in terms of anything outside of performance, maybe stop making claims? The non-performance related aspects of DOD are honestly (after watching grandiose claims for years that are never backed up by much) feeling like snake oil to me.

👍︎︎ 15 👤︎︎ u/quicknir 📅︎︎ Oct 27 2018 🗫︎ replies

So I wonder: In what ways is data-oriented design different from functional programming?

👍︎︎ 6 👤︎︎ u/qqwy 📅︎︎ Oct 27 2018 🗫︎ replies

C is a great language for data oriented design.

👍︎︎ 2 👤︎︎ u/cafguy 📅︎︎ Oct 27 2018 🗫︎ replies

This talk goes on to use object oriented programming concepts in the talk. Kind of muddies the message a bit.

👍︎︎ 1 👤︎︎ u/PeabodyEagleFace 📅︎︎ Oct 27 2018 🗫︎ replies
Captions
- Good morning. My name is Stoyan and first of all, thank you for waking up early to come to this talk. Today we'll be chatting about data-oriented design and I decided to do this talk because, when looking at all the resources, all the talks, all the blocks on data-oriented design, I was unable to find a good example of people showing a real-world production system made with data-oriented design, comparing it with objected-oriented programming and outside of games. So, I've been in the video games industry for about 10 years now and I've been doing games, I've been doing mostly, however, game technology. I work at a company called Coherent Labs where we create the technology for games, so it's very likely that if you're playing and if you're a gamer, you machine has executed some of the code we'll be looking at. For the past six and a half years, we've been working on products based on Chromium, WebKit, and now our in-house proprietary game UI and the browser engine. If you're not familiar with the terms, Chromium is the open-source project that is at the heart of the Chrome browser, of Slack, of Skype, and many other applications that are probably running on your machines, and WebKit is the original project that the core of Chromium is forked from and works in Safari, on IOS, on Macs, so if you're an Apple user, you're using it as well. So, we have these successful projects and around two and a half years ago when I was working on optimizing and improving the software based on these technologies, the whole time I felt like I was in a cage and that cage was the architecture of WebKit, which is also the architecture in many ways of Chromium, of those layout engines. So, we gather up with my co-founders and did the only sensible thing, of course, which is to start from scratch, leave all that behind and build our own in-house browser engine for games, and there was a very big question for us: can a browser engine be successful with data-oriented design? We didn't know that. We had successful projects with data-oriented design, our graphics engine was built with that, we knew about a lot of games that have been successful by employing the design, but we didn't know of anything with the scale that we were looking at and we didn't know if it scaled to a software that is very object-oriented in its core. If you look at all those open-source projects, they are all object-oriented and this has many reasons: it's part legacy, it's part the way that the standard of HTML rendering requires you to do some things. So, to see how that went, let's do a very quick demo. Okay. Let's run a test page that I did with Chromium. This is a build that I have here. Okay, it's a very simple example. These are 3000 rectangles. All that they do is they move left and right and they change their color and at the center there's a big rectangle whose only idea is for us to gain an insight on the performance. So, if you're a keen gamer or somebody who works a lot on performance, you already see that the frame rate is not so great, and the exact number is... It's a bit difficult to see. It's around 13, 15-ish frames per second. So, to zoom that a bit I'll use a very old school method of doing this, I'll paste it in Paint and zoom it and the reason is that the Windows magnifier actually reduces the performance of the Windows presenter, so we will skew our test if we did it with the magnifier. So, we get 15 point seven frames per second. It's very far from the 60 frames that we would like to have for a smooth animation. Okay, let's see what we did. Okay. If you look down on the right, we got around 50 frames. Actually, in my hotel room, it was 70, but who knows what's happening? So, we have a significant improvement on the performance. And the answer to that question happens to be yes, we were successful, and we'll see how that was implemented. So, today's agenda is we'll be looking at the basic issues of object-oriented programming. I'll just glance at the things that we believe can be done better. We'll look at the basics of data-oriented design. There are a lot of free sources. I have a final slide with references that you can look up later. The idea for today will be to design a specific system, to look at a design made with object-oriented programming, then to redo it with data-oriented design and see the difference between those two. What's so wrong with object-oriented programming? Why people have a problem with it? I believe that a picture speaks 1000 words, so this is an inheritance hierarchy. It's a bit difficult to follow, right? So, you've got diamond inheritance and whatever. So, object-oriented programming at it's gist marries the operations with the data they're done on and it's not a really happy marriage, because in the end, you end up with those giant objects with a lot of unrelated data in them that you poke in different parts of your software. If we imagine a game and you have an enemy, you might have a data about its physics, some data about it's AI, some data about it's rendering, and so on, but every time you access that object, for instance to draw it, you're actually polluting your cache and getting a lot of data from the other stuff that you're not using right now, and object-oriented programming also has the feature of hiding state in many places, and by hiding I mean that you have all these Booleans, all these hidden ifs in your methods that, in the end, increase substantially the complexity of your code. I know those things have a real impact on performance, scalability, modifiability, and testability and these will be the four quality attributes we'll be analyzing. Data-oriented design takes a different approach. It puts the data first. You can imagine it's in the title. So, if we look at the lower side of our slide, in object-oriented programming you usually have this logical entity with all its fields, all the stuff that logically belongs to that object. In data-oriented design, we do a bit of a different approach: we separate the data according to its usage. So, we might have data A to C with data A with fields A to C that is used in system Alpha, we get data B from fields from D to F, we use that in Alpha and in Beta, and we transform that data to a new collection that is used down the pipeline. So, the gist of data-oriented design is to separate the data from the logic. We try to use the data where necessary without polluting our caches and trying to keep our logic simpler. So, we embrace the data, we don't try to hide it. We avoid hidden usage. We'll see how. It's an interesting technique. We try to avoid virtual calls because, very often, we just don't need them, and data-oriented design is one of those things that, at its core, are simple, but they're very difficult to master, a bit like the game of Go. It promotes domain knowledge. You cannot separate well your data and build a good system if you don't know that data, if you don't know the problem that you're solving and the system you're designing. And as I mentioned, there are references at the end of the talk for additional information. So, let's design a system and the system I have selected is a system for animations, from the browser engines that I showed earlier, and to give you an idea about what is possible with such a system, this is a very small example from one of our samples, but it gives you the idea that with animations you can do a lot of things in CSS, so the example that we showed earlier for the performance was just moving rectangles and changing their color, but you can do more complex things like transformations, moving text, opacity, and so on, so you can modify different properties. The details of the system are not that important. It's more important for us to gain an insight on how to create it. The definition of the animation is again very simple. We have textual definition, we have our key frames, everybody who's doing animations know what the key frame is. It's I want the animation to start here and end here and has this property here and here and we have the duration of the animation. In a real world system, there are other properties, obviously, but these are the basics and the others don't change the design that much. We already saw that we can modify different types of properties and this will have a real impact on our design. This means that when we implement this in C plus plus, we'll have to interpolate different types of C plus plus structures or objects or whatever. We can have numbers, we can have colors, we can have transformations, and so on. And another important design is that this definition is not static. We want to allow the user at runtime, to modify the animations. So, let's try to implement this with object-oriented programming and for the example today, I've selected Chromium, and I don't know how many people are familiar with the Chromium code base, but I really, really encourage you to go and study it. It's an incredible piece of software done by one of the best engineers in the world, so there's a lot to learn. Of course, it's a giant code base, many millions line of code, but when you get the gist of it, it's very cool. So, Chromium actually has two animation systems. We'll be looking at just one of them. The other is similar in its design, so all the things that I'm talking apply to that as well, and it very closely follows the HTML standard, so that's why probably they have this object-oriented design in the first place. As it's object-oriented and we are creating animations, what do we expect to have? A class called animation. Right, that's exactly what I looked up in the code. There is a class called animation. That's great, but wow. It inherits six other non-trivial classes. At least it's final so we know there are no others down the line. And you remember the picture that I showed in the beginning about the inheritance hierarchy? It's actually the inheritance hierarchy of this class. So, up you get a lot of inheritance. It's already difficult to follow, right? Okay, so to gain a better understanding of the system, we'll be looking at the most common operation, which is ticking or updating an animation, that is from the time and the definition to generate the new value for this frame of the property that we're changing, the new color or the new position of our elements, and we'll delve into the code of Chromium. It starts very as we would expect. There is a method called service animations. It takes time at the second line. It takes the current time and it calls update on every animation pointer that is in our collection. There is a collection of animations needing update, so clear so far. However, we already see some strange things happening. They are not walking over the animations needing update vector. They are copying all the pointers and walking over and iterating over another temporary vector, and the reason for that is that, in this call stack, but also probably in other call stacks, they are deleting from the original vector. So, we have some unclear lifetime semantics. It is very difficult for us now to know when those animations get created and, most importantly, when they get deleted. If they are writing this code, there is a reason for that. So, let's look at the update method, what's happening right there. Fairly simple, it takes the reason, but the very first line is the hidden state I was talking to you about. If there is no timeline, which is some pointer, whatever, they return false. So, the animation will do nothing in some condition, but we're already paying a big cost just for this if because the animations are all located on the heap, so we're just touching that field just to see if it's set or not and we are paying for a cache miss and we might be having a lot of branch miss predictions as well, because if we have a good mix of active and inactive animations, the branch predictor will probably get confused, so we're paying some performance cost and we're exponentially the states and the complexity of our system. Okay, so if we have this content, which is another condition, what do we do? We delegate, actually, the animation to another class, which is held in the declaration of our animation as a member. So, Chromium does garbage collection in C plus plus, but you can think of this member type as a shared pointer. So, we are again paying the cost of a possible cache miss and this animation effect read only is actually the definition of our animation. All right. So, I'll be skipping some of the call stacks. I had a lot more slides, but had to remove them for time constraints. We'll be skipping the call stacks and looking just at the interesting parts. Next up, we're going to update our value, so we skipped a bit and we see some interesting stuff that happens a lot in object-oriented programming. In some cases, the animations have to call an event. Happens. So, if there is an event delegate, we'll call an event condition, and here we have brought coupling between our animation system and the event system. Moreover, we have no idea what the on event condition will actually do, so it might call JavaScript, which will destroy all the whole data that we have in our caches, it'll have to load all the call data for the JavaScript, execute whatever it's doing, and they you'll have to reload all the data that we just threw away. So, a possible very heavy performance heap. There are probably also some instruction cache misses because we don't know where this code is going in our giant source. So, we have calculated all this and comes an interesting part that I mentioned in the beginning. We want to be able to interpolate different C plus plus types because our animation could be on a color, could be on a number, whatever. So however, we still want to keep that in our system somehow, so how do we do that in classic C plus plus, in classic object-oriented programming? We want the method that is different depending on the type that we want to interpolate and they have different sizes. Anyone? Yep, virtual functions on abstract class. That's exactly how they do it. So, there is a class interpolation, which is abstract and it is inherited for every possible type that can be interpolated and the virtual interpolate method does the magic. Unfortunately, this type of dynamic type erasure in C plus plus is very costly to do. You have the cost of allocating all those interpolations and in our example earlier, that's 6000 animation, so 6000 of those, and you have the very likely cache miss when you call the virtual, and a very likely instruction cache miss, depending on the types of properties you're animating. Finally, we have calculated the new value, the interpolation did its magic. We have to apply that new value to the properties, to the objects that are being animated, and again we do it in a very straightforward manner. We have a member variable target, which is the element that will receive the new value. And again, we bring some coupling between the systems because now our animation system, whose only job was to calculate new values according to time, knows about the document object model, about the styling system, so that it can call the appropriate method of the element. In our case, the element is called set needs animation style recalc, and if we look at the implementation, we'll see something very, very scary. This is again a symptom of object-oriented programming, I believe, where you call a method, but as it is a black box, we have no idea what's going on, so you might end paying a very big cost on that and we're indeed paying a very big cost because set need style recalc, which is called there, actually walks up the document object model tree for every element and sets a Boolean, and on each call up this tree, we'll very likely be getting a cache miss, so we were doing simple animations, now we're walking up the DOM tree, we're again changing the context of our program. So, to recap very quickly, to do our animation, we've used more than six non-trivial classes, the objects contain smart pointers to the other systems, they are coupled with the other systems, for the interpolation we use abstract classes, and we are basically drinking the poison of object-oriented programming all the way. All right, now let's try do it with data-oriented design. Let's clear our mind, forget what we saw, and return to the drawing board. So, let's design our system around the most common operation and this is the operation of ticking, of updating the animation. It happens like 99% of the time, we have other operations obviously, like adding, removing, posing an animation, whatever, but the most important one, the one that defines our performance, is the tick. For this tick, we have two very simple pieces of data as input. We have the definition, obviously, of the animation, and we have time, and as output, if you think about it, we also have some very simple piece of data. We need the properties that have changed, we need the new property values, and we need a mapping the elements that will receive those properties. And again, as a cornerstone of data-oriented design, we will work on this system like there are many animations. With object-oriented programming and the thing that we saw earlier, it works as if there is just one animation, so you call update on the animation, it does what it does, but this is not what's actually happening. You don't have one animation, you have dozens, hundreds, even thousands. So, how will our tick work? It can be done very simply. Let's create a class called animation controller or whatever name we find interesting. It will contain an array of our animation states, it will take time as input. Those states are generated from definitions, and it will output exactly what we need: An array, a vector of the changed properties, an array of the elements that have changed with the links back to the new properties. And here, the element pointers are just DOM pointers for our system. It doesn't know what they are. It doesn't directly call to the document object model. With that data as output, we can leave it to the next systems in our pipeline to decide what to do. So, the events system can walk these changed properties, these changed elements, and decide to call the events where necessary. The styling system can walk over these elements and decide what to do, knowing their new properties. So, we've reached a lot of the coupling, a lot of the separation of concerns. How do we build those animation states? Well, it's very easy. Again, go flat, forget about methods, just have plain old data style struct with all the data that we need. The details are not important. This is the runtime data that we need, so the start time, the post time of the animation, and here is a bit of a twist: we also have all the animation definition copied for each animation state, and we did this for two reasons: the first one is that empirically, we found out that the performance is better this way and second, it makes some operations very easy. Imagine you want to change the duration of a certain animation. If the definitions were shared pointers, you would have to implement some sort of copy on write or other EDM to do that, but now that they are separate, you change just the property for that animation, all the others are completely unaffected. So, think about how that might help us, for instance, doing this in multiple threads. Okay, so we have these properties. However, we haven't solved the problem of having to interpolate different types of different C plus plus types. Remember, they were doing it with the abstract class, but we don't want to pay the cost for that. So, to avoid type erasure, C plus plus has a very good way to do it statically: we can use a template. We can use a template because we know exactly all the types of properties that we will be modifying. There are no new ones. So, we now have different types templated by the animation key frame and we would like to iterate over them to run the animation. However, obviously they have different sizes in memory, so we cannot put them in a nice, neat vector and walk it over, but as I said, we know each on of them, so what we can do is very simple. Let's have a vector for each type. Now, they are the same size, so we have as many vectors as types as needed, and when we want to interpolate them, it's very simple. We go over all the types A, all the vectors of that type, then we go to the second one, then we go to the third one. So, we are now having an order of magnitude less cache misses than we had with the object-oriented case. The reason is that we will be having a cache miss when we change the type of the vector we are iterating on. However, and this is again a bit of the main knowledge, but we know that we'll usually have a handful of properties that are really animated, so we'll end up with a bunch of very large vectors that we can linearly iterate and the prefecture of the CPU will do its magic and we'll not be paying the cost of cache misses and we'll end up with a lot of vectors that are simply empty and we'll skip over them. So, instead of having one cache miss for each animation, as it was in the object-oriented case, so O N, we're actually having one cache miss for each type of animation, which is an order of magnitude less. Actually, in the performance example I showed earlier, there were 6000 animations because there were 3000 rectangles, but they were changing the color and the position, so these are different animations, so in the object-oriented case, we got 6000 cache misses each frame. In this case, we're just walking over the color and the position, so we'll be getting just two cache misses, probably. And on the implementation level, it's very easy. We can template our decontamination function, according to the property and just have the same code running. I mentioned that object-oriented programming tends to hide a lot of state. We saw that with the activeness of the animations, so you can have some that are enabled, some that are disabled, and in the easiest way of implementation, you can have a Boolean is active or in the case of Chromium, it was if there is a timeline, which is basically the same thing, and do if this, then that, and whatever. And in our system, the activeness, inactiveness is the most important state, the most important Boolean that we have, so we can employ something that we call existence-based predication. We just have to erase. One array is for the active animations, one is for the inactive animations. The act of enabling an animation is just moving data from one to the other or vice versa. In this way, when we iterate and have to apply an operation on our active animations, we know that there is no hidden state, we can apply the exact same code, the exact same transformation on all of them. This is a bit difficult to do for every possible state that you have, so you can prioritize by the most important, by the one that has the most volatility, or you can try to reduce the number of states. So, to look a bit at the code that's final, the details are not that important, but if you look at it, it's very simple and in many ways beautiful because there are no ifs, there are no branches, and you can read it like a book. So, we interpolate the key frames, we interpolate the values, everything works as we would expect. And this get interpolated value function that I have highlighted is just a template that will do the right thing and call the right function for the property we're changing. And though this code can be safely in our C plus plus file, it won't pollute with templates all over the place, and will also keep the code together, very likely, in our final executable, so instruction cache misses are very unlikely. Okay however, our system is now working. It's very high performance, we are very happy with it, but we want to add an API to the user and we want it to be a convenient API. And here, object-oriented style of classes with methods and so on is really, really convenient. You can give the user just an animation object and she can call play, pause, and whatever. So, we want to give that to the user. So, let's create our animation object. Now is the time. It has all of the methods that we would like, but contains only one simple piece of data and that's an animation handle, just an ID, and the first all the operations to there functions the animation controller. So, our animation class is very DOM, but it gives this convenient API to the user while we can still rework our internal data and make our system as performant as possible, and we are using a simple handle to simplify the lifetime of those objects because now we know that the lifetime is completely controlled by our controller. There is no shared ownership over this data. And to implement the DOM API in the controller is very simple, you just have all the methods that take the ID. So, to quickly recap the concepts between OOP and data-oriented design that we saw, in the Chromium case we were using an inheritance of six classes and all the logic was crammed in this animation object and its links. In D O D, we did a very simple flat animation state struct, templated, and we had other methods that worked on that. We used a list of dynamically allocated interpolation, so abstract classes, but in C plus plus you can do better. You can use templates and if you know the types that you're having, you can just have an array for each one of them. OOP used Boolean flags for activeness and again, by flags it's not just a Boolean, it can be the presence or not of a pointer, whatever. And in data-oriented design, we used different arrays for the different states. And to give an API to the user, with object-oriented programming we directly give the API of the class. However, our user just wants to play an animation and immediately gets 100 methods that doesn't know what to do. We solve that by giving a very clear API only with the stuff that is needed. And finally with object-orientation, you reach out to the other systems in the program directly. With data-oriented design, we just output tables with the new data and let the other guys in the pipeline decide what to do with them. So, the key points with data-oriented design: Try to keep your data flat, try to use existence-based predication to get rid of this state all over the place, try to use ID-based handles to reduce the pointers and to simplify the lifetime of your objects, and try to use table-based output because this table-based output will actually be the input for the next thing down the pipeline. Let's analyze how we did. Okay, the first thing we'll be looking at is performance and I'll be looking at the performance of this system. It's six times faster. So, this is a hidden treasure for us. We did not change the algorithm, we did not invent some very clever way to do animation, we're executing and doing pretty much the same thing. Only by reorganizing how the data is used, how the code is structured, we've gained six times better performance on this system, and think about it, it's very likely that you have such systems in your code bases as well and the treasure is waiting to be found. Next up, let's talk about scalability. How could we hypothetically scale those systems and hereby scaling, I mean a very simple thing, making them work on multiple threads. Well, in the object-oriented case, it will be very difficult because we will have to account for all those dependencies for the data races that are very probable in the system, so we don't know when we say to the target to change its style if some other thread is doing that, we don't know if all the methods we are calling are thread-safe and we have to make sure they are. Another problem with the hidden state is that if we hypothetically solve these issues and run, let's say, half the animation on core A, half the animation core B, if they have a very different runtime profile, so if core A is lucky and gets a lot of inactive animations, then it will be idle for a lot of time. We won't have an even distribution of work. With data-oriented design, things look simpler. Our animation states are completely independent of each other, we duplicated some data to achieve that, so what we can do is just let's cut our vectors in half, half goes here, half goes on the other core, every one of them outputs its data, and in the end, in a classic fore-conjoined fashion, we can merge that data together and leave it to the next system. What about testability? Well again, with the object-oriented, it looks difficult. I wouldn't be the one that tries the only test for all of this because I will have to mock a lot of things, for instance, the events, the document object model just to test my animation system. Moreover, I have a lot of hidden states, so a lot of ifs and elses and so on. So, if I want to cover all of them, I have this combinatorial explosion of tests that I would have to do. In the data-oriented cases, again, looks a bit more simpler just because we have really isolated the animation system from everything else. We have our very clear input, our very clear output, so as long as that contract is not breached and the right input gives you the right output, you are fine, the system works, and the other systems down the line have the responsibility to use that output in the correct way. For the modifiability, such large and complex objects throughout the hierarchy stand to harden, and by that I mean they become so complicated that the programmer is very much inclined not to modify them, so it's very difficult for us to, for instance, to experiment with changing the layout of the object or moving some data from one class to the other, just because everything will break and we have to fix a lot of things. So, those things are usually left to big re-factorings that in the end, never happen. In the data-oriented case, as we are completely owning our data and it's very clear, we have a lot more wiggle room. We could separate our animation state into arrays, our animation could continue to use both of those, but we can send part of that data to another system and we can try and reorder some of the functions, it's a bit clearer. However, there is one thing that object-oriented programming excels at and this is what I call quick modifications. Imagine that we add a new state, like if it's Tuesday, the animation runs twice as fast. With object-oriented programming, it's very easy to do. If Tuesday, do whatever, else do the other thing. So, we add a new state and it just works. If we are to keep the pure data-oriented design, this is a lot more difficult to do because we would have to reanalyze our data, we would have to look for ways to keep this state out of our main kernel of the function that is running the animation, and it will probably take us a lot more time, but in the end, it will probably pay off. Obviously, data-oriented design is a tool like object-oriented programming is, like everything we do in programming is, and as every tool has, it has its place, but it also has some downsides and the biggest downside, I believe, is that the current data separation is actually very hard to do. I mentioned that earlier, but data-oriented design is easy to explain, but very difficult to master and very difficult, in the end, to really achieve what you're looking for; and here comes a bit the domain knowledge that is good for you to have. If you don't know very well the problem that you're solving, it's very difficult for you to design the data in the way that it will be optimal, so know your problem. As existence-based predication is also a bit difficult to achieve if you have a lot of those states because you start having these multi-dimensional arrays with this and this and whatever and my advice there is try to reduce the state. If you've done GPU programming, there are great examples there. You can try and reduce the state by executing the same operation on multiple data or other techniques. Just get rid of those ifs and elses. Quick modifications can be tough. That, I mentioned already. And we as programmers might have to unlearn a thing or two. So, we have been wired with this object-oriented type of thinking since forever, actually, so in the beginning especially, it's difficult to start and rethink everything in this new framework. And unfortunately, the language C plus plus is a bit difficult to use in some cases, is not always your friend. So, should we get rid of object-oriented programming altogether? I don't think so. It is a tool, as well. It has its place in our code bases and it's the antithesis of my title. But sometimes we have no choice. You might be using third party libraries that use object-oriented programming, so you're out of luck. You might have API requirements that make you need to use object-oriented programming. You might have seen that, but object-oriented programming is not a synonym for classes and methods and so on. With data-oriented design, I still had classes, I still had structs, it was that I put them first, I put the data first and not the operations, not the code. Polymorphism and interfaces have to be kept under control. I don't believe that their role is in a very low-level system, like interpolating a couple of numbers in the animations, but they have their place in high-level systems, in client-facing APIs, they are amazing if you do a type of plugin and you really need some dynamic polymorphism. And remember that C plus plus has amazing facilities for static polymorphism. You can go a long way with templates or just when you know all the types that you have, you can generate everything, actually. It works surprisingly well. Finally, are there some changes in C plus plus that will make it easier for us to write data-oriented design code? I have a dream, which will probably never come true, but there are languages that allow you to change the memory layout of objects very easily and to experiment, for instance, with array of structures, structure of arrays, and not having to rewrite half of your code for that. I don't see that coming, but we can always hope. I didn't show that, but in games, very often data-oriented design goes hand in hand with entity component systems, and even if you don't need the whole flexibility of an entity component system, it will be great if it was possible for us without paying a big cost to reorder and to split our classes into components, and this is actually doable in C plus plus and I'm talking about doing it without a pointer indirection because if you do it with pointer indirection, it's very easy, but it requires a lot of custom code that it's a bit of a mess. For me, the ranges proposal looks very exciting. I believe it will be a great addition and it will help a lot with some of the constructs, especially to make the code more clear, more readable, and this is a bit of a pet peeve of mine, but unordered maps and unordered sets are not that great because they do allocations. I think that we need in the standard another hash table that has some relaxed requirements and will allow us to use another scheme of addressing and reduce those allocations. So, obviously there are a lot of open-source solutions, we wrote our own hash table, and actually reduced the allocation count in one of our versions by 10%, just by changing the unordered maps and sets with a scheme that uses open addressing; and this again brings you back to the know your machine and design software for the machine that will run it. Don't design for a hypothetical abstract machine like it's in the standard because it doesn't exist. So to conclude, object-oriented programming is not a silver bullet, but data-oriented design is not one either. The thing is that you must use your best judgment. They are just tools in your toolbox, so they have their places. It's your job to find them. Thank you. (audience applauding) We have around 10 minutes for questions, so please. - [Man 1] Yeah, you're advocating and compile time polymorphism, and I like that as well, and then there are always people who say to me, yeah, but then you probably will stamp out a lot of codes and from your experience in also knowing the machine, when do you hit problems in that area? That there's too much code, so that you get instruction cache misses, and do you know of tools, how I can see whether those problems occur? - Yes, this is a concern. So, what we try to do is keep the templates inside implementation files. In the animation system, this is doable. If we see an explosion of templates, this is a problem that we try to cope to case by case, so you have two problems that you might explode your build time, which is obviously very bad for productivity, and the other thing is what you mentioned, and we have seen in some cases, the compiler producing a lot of binary, and increasing our instruction cache misses. There is no formula to solve that, to be perfectly honest. Either you have to reduce the templates or you can rely on some black magic and this is what we've done in some places, honestly. So, if you have the fortune on working on a platform that really easily shows you where the instruction cache misses happen, which we have access to, so on game consoles, for instance, it's very easy to see. You can go there and try to reorder your functions or you can try to use link time optimization. Actually, in our experience, link time optimization helps a lot with some of those cases, but sometimes you're out of luck and you have version one where everything is fine and then you change a line somewhere in code and you see five percent slower, just because Clang has reordered some stuff, and when you're working really close to the maximum possible performance, it's a thing of life. I don't have a formula to solve this all the time, unfortunately. - [Man 1] Great, thank you very much. - Thank you. - [Man 2] Hi. - Hi. - You've described and architecture for this system that includes data and operations on that data and I'm not familiar with this domain frame much, but I'll take your word for it that it's a good choice. But with those structures, you may want to maintain certain variance in the data, for example, and you can do that with constructors or methods. - Yes. - And you've also described that for certain parts of your system, like the user facing part or the client facing part, you may want to have those abstractions. - Yes. - And so, I guess my biggest question is what exactly do you consider to be object-oriented programming and why is what you've described not that? Because I think really you're just talking about choosing different abstractions for different parts of your system, and I don't really think that that contradicts this idea of OOP unless you have a different idea of what that is. - Got it. Yes, it's different way of solving the same problem, so it's the same in variance, that's true. However, we have turned a bit around the way it is done because with object-oriented programming, what I believe it means and what usually people end up doing is having this object that we saw animation that contains all the data, all the invariance, and all the operations for one animation, right? That's what we saw. However, this does not map well to the hardware. It also does not map well to the complexity of our system because we end up with these giant clouds that knows everything and can do everything. - [Man 2] Let me try to refine my question a little bit. So, the title of your talk is OOP is Dead, and I know you yourself sort of weakened that stance at the end a little bit, but what exactly is dead is what I'm asking? What precisely is OOP that you want to die? - The notion of having these giant logical classes that contain unrelated data and I believe they have to be reworked in different collections of data used by the different systems. Did that answer your question? - [Man 2] That just sorta sounds to me like bad engineering versus good engineering and I don't know if you can call that OOP. - Well, a lot of code base, unfortunately, end up with the thing that I showed because when you have the class animation, every developer usually goes and adds a new Boolean or adds a new data or adds a new method and you can call it bad engineering, you can call it however you want, but it's a fact of life. No matter how much you try and definitely the Chromium and WebKit are done by some of the best engineers, but in the end, you end up with this giant inheritance hierarchy and so on. - [Man 3] Hi, so if I understand correctly, there's a difference of about 15 to 20 milliseconds per frame in your initial demo versus the Chromium demo. Do you have a sense of where those 15 to 20 milliseconds were going? We talked a lot about virtual indirections and cache misses, but we didn't really talk about how many milliseconds per frame those were taking up, just that they were occurring. So for example, Chrome's GPU process is heavily sand boxed and it has to actually communicate via IPC just to even read the frame buffer, and so that's a potentially dominating source of cycles that obviously your program doesn't have. - Absolutely. So, my demo ran a bit slower than I was expecting, to be honest, but it's probably because the PC's not plugged, but I have an idea. This is definitely... It has some affect on the performance. That's why when I do the final comparison, I compare just this system. The first one was just a way to engage and to see that the overall system works better and there are a lot of reasons for it to work better. The GPU process, honestly, is probably not the biggest one. The biggest one is usually the less cache misses, the simpler call stacks. I don't know if you're familiar with the code bases of Chromium, et cetera, but they have very long call stacks and there are reasons for them to do that. I get that. But for instance, the animation system could be made. It just does the same thing and it's a lot faster and the milliseconds are here, so six, eight, and this is on this PC versus one, one. - [Man 2] So yeah, that kinda jump me to my next question which is that this simple demo is simple and Chromium is not simple and so, do you think that the data-oriented design scales well with actual required design complexity? For example, you got probably 20 other sub-systems that need to hook into the animation system and modify it or be notified when things happen or run JavaScript code and maybe yours does that, I don't know, but - Yeah, it does that. - Yeah. - It does that. And one other point on the performance, here, we usually see on PC around four to five times better performance and this was an artificial test, but I'm talking about real UIs and so on, but this is because this PC has eight megabytes of health-free cache. (alarm ringing) I don't know what that is. (alarm ringing) Okay? - Okay. All right, yeah. - If you go to a platform that has a lot less cache, like a mobile phone or a console, the difference in performance goes way up. - [Man 3] Okay. (alarm ringing) - [Man 4] Hi. - Hi. - [Man 4] So yeah, continuing on the line of what the previous person on this microphone was saying, I don't think you're being entirely fair to object-oriented programming and not all implementations of object-oriented programming feature all of those features you listed. - Sure. - [Man 4] And also, yeah, this is a case for, obviously, the classic approach with object-oriented programming is not applicable, but actually, I wanted to ask you a bit about this: six point 12 on what type of CPU is this? - On this? - [Man 4] On this one, yeah, because honestly, I think that it would be even more on an ARM CPU, for example. - Oh, yeah. - [Man 4] That have much lower tolerance for cache misses. So, perhaps use the ARM CPU statistics for your next - Yeah, I mentioned that, but the problem was that I had to build Chromium for ARM and I didn't want to do that, but definitely, if you have less cache on your processor, this number goes up, and to answer the first part, yes, I'm sure that there are in the world well-written object-oriented systems that have their place. That's what I say at the end. - [Man 5] Hi, I can see some challenges in debugging a data-driven system like this. I wonder if you had some challenges and how did you manage to solve them? - Yes, there are some challenges. The biggest challenge is we are accustomed to we see a bug, we go into the debugger, we have the subject, we can inspect all of the data, but when you're doing data-oriented design, you might know which data belongs to which logical entity and what we usually do is that, in the bug, we annotate our data so that it's easy for us to relate it to other stuff in they systems. That's what we do usually. I don't have a lot of - [Man 5] Do you think there could be a tool that would help debugging that would access the data because the data, if you have one entity, the data is scattered all over the place, so for a human to understand it, it's way harder. - Sure, I haven't seen generic tools. I have seen a lot of tools ad hoc like the ones that we have just to solve these, so we annotate the data, have some extensions for the debugger to be a bit easier for us to find it. I've seen a lot of game companies, especially when things go multi-threaded, it's even more difficult to find it. A lot of companies have some type of graph that they visualize or try to do it that way, but I'm not aware and I cannot really think of a generic tool that will really help. It's very domain-specific. - Thank you. - Thank you. - [Man 6] What is you're experience with long-term support and maintenance using this approach? My biggest concern is that, as time goes, you add more data to the structures and you keep multiple coop is, so the data would be like data upload, so in a few years, you pretty much has to reorganize all the layers and adjust them to the architecture the CPU's using to the caches and pretty much redoing the whole thing. - Yes. Well, we pay attention to what we do in the first place and second, I believe that this goes a bit on the slide that I said about the systems that tend to harden. I believe that it's easier to re-work and eventually split the data if it is done in this way than if it is done in a giant hierarchy of classes. So as time goes by, requirements change and the system has to change. We've had so far very good experience, so we've worked for years on Chromium's base and for some years on this stuff and actually, the maintenance time and the solving a bug and implementing a new future time is an order of magnitude faster and of course, it might be related to the fact that Chromium is millions of lines of code. Ours is very big, but it's not that big, so it depends. I believe that in time, this is easier to maintain. - Thank you. - Thank you. - [Man 7] You talked about that there was read-only common state between all of these classes that you needed to duplicate across all of this state. Could you talk about the challenge as to why that couldn't be simply pulled out and read-only state separated from mutable state? - Sure. Because it's not read-only. The fact is that the user can modify that definition at any time, so one way to do it is to do a copy on write because you might have multiple animation that refer that. The other is to copy it. So, as we have little data, it's an empiric finding that we did that it's easier and faster for us to copy the data. As it changes, that fact of life might change and you saw in the Chromium base, you might have noticed that the class was called animation effect read only, it's not read only. They have cost methods that do cost cache, remove the cost, and change them, so that's not true. So, it's mutable in their state as well. Unfortunately, we are out of time. I'll be here, so everybody who has questions is welcome and thank you again for your time. (audience applauding)
Info
Channel: CppCon
Views: 153,213
Rating: undefined out of 5
Keywords: Stoyan Nikolov, CppCon 2018, Computer Science (Field), + C (Programming Language), Bash Films, conference video recording services, conference recording services, nationwide conference recording services, conference videography services, conference video recording, conference filming services, conference services, conference recording, conference live streaming, event videographers, capture presentation slides, record presentation slides, event video recording
Id: yy8jQgmhbAU
Channel Id: undefined
Length: 60min 46sec (3646 seconds)
Published: Fri Oct 26 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.