Mathieu Ropert “This Videogame Programmer Used the STL and You Will Never Guess What Happened Next”

Reddit Comments

Was hoping for a more thorough performance analysis. Having just one "accumulate" benchmark doesn't really carry his point across.

👍︎︎ 47 👤︎︎ u/flactemrove 📅︎︎ Oct 10 2019 🗫︎ replies

His point at 31:35 about open addressing hash maps is beyond ridiculous. He says that, unless you are on a server cluster, "where memory is free", bucket-list based hash map is the better choice, because it doesn't waste as much memory.

WHAT

You use the standard library, which allocates all over the place and fragments your memory, but in your quest to prove the haters wrong, you suddenly care about that 10KB that your open addressing hash map wastes due to keeping a low load factor?

Never mind your vectors potentially doubling in size when you insert an element. Never mind your textures being multiple MB each. Never mind even phones having multiple GB of memory. Never mind that entire Lua interpreter running in the background, because gameplay wanted a scripting engine instead of native code. Never mind the use of virtual functions, which adds a vtable pointer to each of your objects. Hash maps are where we have to save memory!

This talk is ridiculous.

👍︎︎ 55 👤︎︎ u/pulp_user 📅︎︎ Oct 10 2019 🗫︎ replies

Clickbait titles are particularly annoying for videos (since I can't even skim for a spoiler). What happened? I used to work on games and templates were banned (along with STL) on my teams. There were numerous reasons, and one was past experience. It wouldn't be as bad with just one disciplined developer (rather than a team), but sub-optimal compile times for a large project could also be a concern.

👍︎︎ 45 👤︎︎ u/NewFolgers 📅︎︎ Oct 10 2019 🗫︎ replies

I stopped watching this video because there was so much strawmanning going on. Taking common complaints about the stl and disproving them is a neat idea, but if you then proceed to strawman the hell out of those complaints, you won't convince anybody. I don't know if he does this intentionally or not, but it leaves a very sour taste.

For example, his take on "STL performance is bad" criticism:

The function he benchmarks is a completely random example: a very specific case that doesn't have the bottlenecks of a lot of real code. He then builds his entire argument on profiling that one function.

He shows a 4x speed difference in debug for stl vs non-stl and then basically doesn't consider it relevant?

He compares C and C-style C++. That is not a relevant comparison; comparing C and "idiomatic" C++ would be meaningful. The point of modern C++ is that you DON'T want/need to program like in C.

He basically concludes/implies that, since both C and C++ are an order of magnitude slower in debug, the differences between them don't matter. Which is a strange thing to say, since I can definitely still tell the difference between running my game at 60fps while debugging and running it at 15fps.

👍︎︎ 32 👤︎︎ u/pulp_user 📅︎︎ Oct 10 2019 🗫︎ replies

Honestly I don't quite understand why people are so eager to make game programmers use the standard library. The way I see it, if you're making a "small" game, even the STL debug performance will be good enough, and you're not going to be that bothered by compile times. And if you're not making a small game, writing some data structures that the STL has equivalent classes for is a tiny amount of work compared to writing the rest of the game, so even if the benefit of having your own classes is very small, you're not going to convince anyone to use the STL by saying "it's not worse". It would have to be substantially better. But obviously it fundamentally can't be better than something purpose-built for the game.

IMO the standard library just isn't right for everyone, and I don't see how it conceivably could be in the next 10 years.

Language features on the other hand I understand. "Sutter-Exceptions" are something that I am excited for and I hope a lot of people will adopt. Or at least, I hope that eventually we will have some sort of unified error handling mechanism that most people are happy with. But that's because using these language features actually means using a completely different paradigm. Using or not using exceptions is almost like writing in a different language. Not using the standard library really isn't. You can still use all the same principles when "rolling your own". If the classes are well designed they're just a couple of custom data structures in the midst of probably 100s of other custom classes and data-structures, they just happen to be similar to things found in the STL.

👍︎︎ 20 👤︎︎ u/vaynebot 📅︎︎ Oct 10 2019 🗫︎ replies

I haven't seen the full video, but I do like that it sparks so much conversation. It specifically interests me as I am a gameplay programmer myself.

Our company does not use the STL, but I kind of wish it did. The library we use is very old, so my guess is that back then the STL was not compatible with everything yet, or just did not offer enough.

In my opinion, I like the STL. It is standard: everybody knows it, and you can take that knowledge with you to other companies. I suspect the performance in release mode would be pretty much the same as with the alternatives. Debug is a whole different story, though. You can use alternatives in the places where it is slow in debug, rather than banning it from an entire project.

I don't see a reason to ban it from a project... there is so much useful stuff. It specifically bothers me at the company I work at, because we don't use any other libraries either. If we want a thread class, we need to implement it ourselves and make it compatible with the other systems. It's such a pain in the butt. You can save yourself a week of work by just using 'std::thread', which takes a second... and I'm not even talking about the mutexes yet.

I'm very curious to hear the opinion of somebody who has worked at a company that actually uses the STL and dislikes it, since I have never really experienced that (only personal projects). I can only imagine it being a better place, but that's all it is: imagination.

👍︎︎ 9 👤︎︎ u/jessedis 📅︎︎ Oct 10 2019 🗫︎ replies

There was a comment about flat_map and flat_set at 52:57 which I suspect the speaker misunderstood because he starts talking about "open addressing" which I think has more to do with hashed containers. The "flat" containers are closer to using a sorted vector as mentioned at 22:16 except that they use binary search instead of linear search.

👍︎︎ 5 👤︎︎ u/HappyFruitTree 📅︎︎ Oct 10 2019 🗫︎ replies

Doesn't MSVC's license disallow publishing benchmarks? With so many MS people here, maybe we can get an answer to this?

👍︎︎ 4 👤︎︎ u/ReDucTor 📅︎︎ Oct 10 2019 🗫︎ replies

The point about std::accumulate is misinformed. std::accumulate by definition does not assume its operation is associative or commutative, meaning all operations happen sequentially and are applied strictly left-to-right.

// Roughly what std::accumulate does (simplified):
template<typename InIter, typename T, typename BinOp = std::plus<>>
T accumulate(InIter begin, InIter end, T val, BinOp op = {})
{
    while (begin != end)
    {
        val = op(std::move(val), *begin); // strictly left-to-right
        ++begin;
    }

    return val;
}

Invoking op this way prevents the compiler from reordering the operands. Because val is carried as a dependency through each iteration of the loop, the compiler cannot apply the operation out of order to two members of the range. This prevents incorrect results when calling std::accumulate with an operator that isn't commutative or associative (e.g. std::minus<>).

It would be better to instead compare "RawAccumulate" with std::reduce, which explicitly permits the implementation to reorder operands and regroup operations (and therefore requires the operation to be associative and commutative).

👍︎︎ 1 👤︎︎ u/NoHomeLikeLocalHost 📅︎︎ Oct 11 2019 🗫︎ replies
Captions
Welcome, everybody, to "This Video Game Programmer Used the STL and You Will Never Guess What Happened Next". Even I can barely pronounce it. So, about this title: it started as a joke. I make jokes on Twitter sometimes, people like them, and I took that as encouragement, so I went clickbait. I could have titled it something plain like "the STL and video games", but then you would have had to go to the program page and read the abstract to know what the talk was actually about, and we all know reading is for nerds, so I saved you some time. I could also have gone for something like "STL in video games considered harmful", but I don't think I'm the right person for that talk. Anyway: this video game programmer used the STL. Quick show of hands, who knows the STL or has heard about it? Everyone, I guess.

It all comes back to the work of Stepanov a long time ago: proposed in 1993, adopted in 1994. I don't know what the committee is doing these days, but at the time it was efficient; the STL was proposed, adopted and standardized in one year. And it offers a lot: generic containers, algorithms, iterators, the whole shebang. For people who are not as old as me, this is what 1994 looked like, just for frame of reference. Quick check: who was born in or after 1994? Not many hands. I feel old now, congratulations. It has evolved since then: we got all the great books Scott Meyers wrote about the STL, like Effective STL. We even have a map of the STL, so it's literally not uncharted territory; we have a map, we know where to go. And then you move to the video game industry, and you hear stuff like "no, we don't use the STL here". What? Alright. Hello, my name is Mathieu, I make video games, and I talk about stuff on my blog, not only about build systems.

So what are we going to talk about today? First we're going to see the case against the STL, because there has been some controversy, to say the least, this past year or so against the STL from the video game community, so we're going to delve a bit into that and see if there is any point to all that noise. We're going to see a few containers in practice; every time I give this talk I cut some containers out because I run out of time, but we can talk about them afterwards. We're going to talk about some alternatives, because sometimes we don't use the STL and we have alternatives, and there is going to be a word about debug performance, because that's one of the big things that usually comes back. What this talk is not about, because I only have an hour, questions included: I won't talk about allocators and all the problems with allocators, I won't talk about exceptions, and I won't talk about build times. You can bring questions about those after the talk, but I don't have time for them here.

So let's go with the most obvious question: is the STL so bad? What exactly do we hear when people complain about the STL? We hear that it's unfamiliar, with the iterators, and why aren't these just member functions on the containers? Maybe you'll say "the platform I'm using doesn't support the STL"; that's also a problem, you can't use it if you don't have it. We hear that it's bloated, that it keeps adding stuff you really don't need. And of course, the performance. Let's go over these one by one.

Familiarity is probably the argument I have the hardest time hearing, because it's been 25 years; some people in this room are younger than the STL. You've had virtually your whole lifetime to learn it, so it's not like it was a one-shot thing that we then forgot about. If you look at the APIs of other popular libraries, Boost, EASTL, you name it, they follow much the same conventions, probably because these guys were onto something. Resources are plentiful, too: even at this conference there is probably a bunch of talks about containers, algorithms, the STL, all that stuff; there are a million reference books and talks, and if you have the internet and you're doing C++ (if you're watching this online, you definitely have the internet), the suggestions on the right will include something about the STL. And given the research from Alex Stepanov that gave us the STL, I don't want to be the guy who has to argue against Stepanov that his approach to generics is wrong. Maybe you can; if you can, please do it, because there is probably something interesting there, and I will just watch the two of you argue. Maybe, though, we need to be better at teaching it, because I don't know about you, but the way I learned C++ was the hard way, as in: I made a great UML diagram, then I put in a lot of inheritance and virtual classes, and I was super proud of myself. These days, not so much; I try to forget those episodes. What I'm saying is that maybe we could be better at teaching it. We have the resources; maybe universities could use them. That's a big topic that SG20 is working on, and I think we're making progress.

Availability is another concern, but the major vendors have a reasonably good implementation. If you use GCC or Clang or MSVC, which is probably the vast majority of you, you have it. If you use consoles, well, if there is one thing about consoles, it's that some of the SDKs are under NDA, so you can't even talk
about it, but I have it from sources that they also use one of those implementations, so you probably have it too. And it's software; like any software, maybe it has bugs, and they should definitely be raised. Talk to your maintainer or your vendor, the same way you expect your own users to report bugs in your software instead of hiding them; just tell people about it. And of course keep up with updates, because sure, if you're still on VC6, maybe you have a few bugs. There is also a more important issue: if your vendor is not able to provide you with a good implementation of the STL, I would be worried about everything else it provides. I'm not going to name anybody in particular, Oracle, but chances are that if you can't get a reasonably good STL from your compiler vendor, I would be worried about the kind of code it generates. So consider open-source alternatives; I don't have time to talk about an experience I had where GCC was so much better, even on a proprietary microprocessor.

That brings us to the bloat, and I'm somewhat with those guys on this one. I want to use some containers, some algorithms, some functionality, but it comes with a bunch of stuff I don't need and don't want to pay for, and it can look overcomplicated for what I'm trying to achieve. Have you ever tried to make a random number generator? I look up the Mersenne twister on the internet every time I need one; I just can't remember it. The problem is that the standard has to work for everybody; it's general-purpose. It has to fit everything from video games to embedded to huge server build farms, and that's a hard position to be in. I'm not sure anyone could fully pull that off, but they try very hard, and I think they get somewhere. And sure, the design principle says "if I don't use it, I shouldn't pay for it"; in practice you would sometimes need ten policies per class just to make sure you can disable all the bits you don't want, and even then there would be a compile-time overhead from all the template machinery needed to actually disable everything. With concepts it's getting better, but there would still be a cost.

For example, a big thing I keep hearing is: the vendors have debug features, and I don't want to be paying for debug features, I know what I'm doing. I don't want these debugging iterators that warn me when I compare iterators from different ranges or increment past the end; I know what I'm doing, I don't want to pay for this. Well, it turns out you don't have to. I almost promised on Twitter I would not talk about builds; I lied, sorry, there is this one slide. There is a define you can set to turn off most of those debugging features and stop paying for them. And another thing: debug checks do not mean no optimization. There is no reason why you can't have asserts and a reasonable amount of optimization. CMake gets this wrong, but that's no reason you shouldn't be able to do it.

And since we're talking about performance, because that's the big thing: the reason we worry about debug builds and extra features is mostly performance. We're doing video games; we have a time budget, we have to render in X milliseconds or there are going to be jitters or frozen frames, lots of things we don't want to see. We don't want something unpredictable; we worry about worst-case scenarios. And there is a piece of common wisdom about that: if you think about it, the
first thing that comes to mind is that the lower level you go, the better control you have over performance, because you don't have abstractions. You can just write pure raw C, or C++ but mostly C, and then you're in control: you're low level, close to the machine, you know what you're doing, and you avoid all those abstraction penalties. That's the common idea you might hear. Show of hands: who thinks this is actually true, that the lower you go, the better control you have, and that's the way to go for performance? Come on, don't be shy. Okay, not many people; maybe you see where I'm going.

So sure, the STL comes with a degree of abstraction: we have templates, we have checked iterators, we have proxy iterators, and it requires a good optimizer to yield good performance. For a good example, let's do a benchmark. I write a stupid accumulate function: I generate 10,000 integers with a Mersenne twister I found on the internet, then I loop over them and accumulate with a raw loop; this is as close to the machine as I can get without writing assembly. Then we do the same thing with the STL: we run std::accumulate, and we benchmark it. Yeah, that sucks: it's about five times slower. That is not good. If I try with a newer implementation it's a bit better; it's only four times slower instead of five. Still not great. I'm talking debug here, no optimization whatsoever. So when you think about it, it sounds true: if you have abstractions, you're going to pay for them at runtime unless you optimize; it's going to be slower. Go closer to the machine and you have fewer problems. And that's where this guy comes in saying "that's why C++ is so bad without optimization, just forget it."

So let's go back a bit to 1994, which is an interesting year; for example, it's my tenth birthday, so it's obviously an interesting year. Do you guys know what this is? I guess it's the same people who are older than 1994: it's a 486. Do you know what's so great about the 486? It had a math coprocessor, yes, but that's not what I was leading to. The great thing about it was that it was a simpler time: that's the last CPU that did not do out-of-order execution. That's the last time your CPU, when you wrote C in the dumbest way possible, executed the instructions one by one, in order, waiting on whatever memory fetch you hit, without trying to do anything else, even something completely unrelated. 1994. Most of you are older than this, and most of you have never programmed against a CPU that actually executed instructions sequentially. That's the thing: your assembly is lying to you, your C is lying to you. What you write is not what happens; your CPU does out-of-order execution, because it turns out that's much faster.

So here comes a question. If I write raw C that gets translated into the pure raw assembly I think it does, assembly that was made for my great in-order CPU, then on an out-of-order CPU there shouldn't be too much of a difference, right? Anyone want to guess: if I take the same algorithm I showed and enable optimization on the C, is it going to be in the same ballpark, or faster? It's probably going to vectorize, sure, so I'd at least expect 4x, maybe 8x depending on what SIMD my CPU has, because my dumb C does not use SIMD but my optimizer might. Four to eight times, that sounds reasonable, right? So I write this benchmark, and I abuse quick-bench a lot (sorry, Fred, it's not the worst I've done to it yet): I tell Clang not to optimize one function and put both in the same bench. Wow. Turns out I have no idea what I'm doing with my raw C. If my compiler doesn't come and take my pure C and save me from myself, completely reordering whatever I wrote, I'm 26 times slower. Sure, if I manage to vectorize manually, I can probably get 4x of that back, so, I'm bad at math, but something around six times slower. Still terrible.

So as it turns out, spoiler warning: yes, C++ is slow in debug, but C is not much better. I mean, sure, it's better, but it's still terrible compared to what you actually get with the optimizer. You have the illusion that by writing raw C you get closer to the metal and get the best out of it. No: your compiler gets the best out of it; you're just playing a losing game. And since I like torturing quick-bench, I managed to benchmark different optimization levels in the same translation unit; I'm wary of sharing the code because of what people would do with it, but here is the idea. You have -Og, which is GCC's minimal optimization for debugging, with both the raw C version and the C++ version, and as it turns out, the C++ version at -Og is faster than the raw, pure C version without any optimization whatsoever. Then you go up through the levels, and at -O3 vectorization kicks in, so you see another boost of performance. The thing to remember is: raw C is terrible if you don't optimize it, raw C++ is even worse, but if you just turn on the minimum optimization in your compiler, you can even beat unoptimized C. That's basically what I just said: C++ abstractions will be slower than raw C, but
actually both C and C++ are stupidly slow compared to truly optimized builds. We're very far away from a 1994 processor. You practically need a PhD or two to write good assembly for our CPUs, and it turns out your compiler was made by several PhDs, so it knows how to do that; it knows which version of the CPU you have and what, in this particular case, it can do better. Sure, you might be smarter in some cases, but on average, even if you're a raw-C guru, it's going to be completely one-sided. And again, minimal optimization brings enormous gains: if you're working with GCC, -Og is going to save you by a huge margin. Then we get into the area of compilers and actual vendor differences. Some are good at it; GCC is good at it. MSVC has some knobs you can turn to get okay performance and still be able to debug, and Visual has made a lot of effort towards debugging optimized builds. There is probably room for improvement; just ask Chandler about -O1 on Clang and he'll tell you it's slower than -O0. I benched it; it's true. Know your build flags. Sorry, that's the second slide about builds.

Anyway, let's talk a bit about those containers now. I need to cut slides every time I rehearse this talk, so I'm going to focus on the most used containers: we'll talk arrays, dynamic arrays, associative containers, and hash tables. Those are basically the big four things you use. I mean, you guys know you shouldn't be using lists, right? Good.

So let's talk about vector. We all know vector, we all love vector. It's a heap-allocated array that can be resized; it's the go-to container in the STL. It's probably what people recommend more and more: if you don't want to think, go for vector; even if you do want to think a lot, still go for vector. It's cheap to move, it has random access, and it's as fast as it gets to iterate over. The big reason, and I'm not the first to tell you this, is caching. Again, your CPU is not a 486 from 25 years ago; it has a lot of cache, because it turns out accessing RAM is slow. Let me highlight the important part. Going from register to register is basically one cycle, maybe even less; I don't even know how vendors manage less than one cycle, that's magic. If you hit something in the level-one cache, it's going to be three or four cycles; still slow, but acceptable. The further down you go, the worse it gets, and if you have to hit main memory (this graph is two or three years old), you're going to waste 100, 150, 200 cycles just to get your data and continue. That's a lot. And that's why we like vector: everything is contiguous in memory, and you're constantly, or most of the time, hitting the L1 cache, which is mega fast.

Since caching has so much impact, the big takeaway is that big-O is sometimes lying to you: sometimes a linear operation on a vector is faster than a logarithmic operation on another container, because of caching, because the other container starts with a constant-factor penalty of around a hundred. So if you have a small set, and "small" may vary, you might have to bench it: brute force through a vector will be faster than a map or whatever. Another quick suggestion: if you have a really lookup-intensive associative set that you mostly read, consider sorting a vector and doing a binary search over it. It might be faster, depending on what you're doing; some well-tuned hash tables will manage to do better, but not all of them. And of course, if you need to keep references to elements in a vector, prefer indices to pointers, because when you push_back something, the vector might reallocate and move the storage away, whereas your indices stay valid as long as you only push at the back; just store small IDs.

But what are the limitations? Because we're not here to talk about how great vector is; we're here to talk about the limitations that we, as video game programmers, find in the STL. So what are the limitations of vector? There are none. Vector is awesome; enough said. Okay, maybe I'm biased; I can find some limitations. The growth factor is not specified by the standard, and it's not configurable either; most implementations will not let you change it. There has been a fair deal of research by people who do math that I don't understand, which says that 1.5 is very good and 2 is not that good, with some relationship to Fibonacci and the golden ratio; I have no idea what it means, but the gist of it is that a growth factor around 1.5 is better, and you can't control it. That's not great. Also, by the standard's specification you cannot do small buffer optimization, and allocation may be costly: in the worst case, an allocation might have to go to the operating system for RAM, the operating system might have to start swapping, and maybe it will deschedule your process while doing that, which, as far as cycles are concerned, is forever. So we might want small buffer optimization, which basically means: when I create my vector, I get storage for a bunch of elements immediately, in the same go, possibly on the stack, and then I don't have to pay for that completely unpredictable allocation. That would be great, but the standard prohibits it, because the standard says that moving a vector does not invalidate references, and you can't guarantee that with small buffer optimization, because you actually have to move the bytes. And let's not talk about
vector of blue just I would say forget it exists but no remember it exists and you should not use it so what are the alternatives the big one of course is mobile for optimization boosters one Facebook is one Google is one it's great it avoids if allocation for small sizes you can specify which size yeah just just remember that it might be a linear move and not and not constant because you actually have to move the bits from one pair from one vector to the other so depending if like if you make a small vector of very big objects you might actually notice something if it's small object sure go for it I I don't think there is a paper on this or there might have been one at some point I'm not sure where it's going now but if anyone has something to suggest just come to me after there's a raid an array is like vector except it's fixed size at compile time you say okay I want an array of X elements it's come from C++ 11 but I mean who is not using C++ 11 here come on don't oh you should shy oh is literally nobody Wow time flies great so it's great because blah blah blah has the same properties that uh that vector has in terms of cash and mundum access and yeah you have to move all the beats every time you move it so remember that not because it's move that it's cheap and that's that's the thing I'm trying to hammer on here move doesn't mean free it might mean cheaper but sometime it's not sometimes it's as expensive as copy the biggest problem with array that I usually find and I think I'm not the only one is that it's fixed size in that capacity and there is a couple of use cases when you say I know I'm not gonna be more than like 60 elements or a hundred elements at one time but I'm not gonna allocate all of them now sure you can try to put like Sentinel value and remember how far you got but that's just messy what you would like is clearly to have a stud array that you start with zero capacity we've X cap asset II you you set but zero size and then you can push back 
in it and then you can pop back in it and yeah steadily is not suitable for any kind of dynamic insertion and that's that's just not great so we want a fixed capacity vector basically boost is one es TLS one Facebook is one and actually someone proposed it to the standard I'm not sure what's the status on the of the paper right now but there's paper on it and I think it's still moving forward so hopefully we will get it at some point the name is static vector I don't like it but touch with me now let's talk about maps and sets because vectors probably your bread and butter but then the next thing we have is index search because we won't like fast food cups so map and said classic sort is associative containers do you want you want some you want you want some mapping and you want you want it solid it gives you usually a log right big time on accessing session and erase it keeps iterate or valid even if you remove everything else like as long as you don't remove an element dieter a toast a valid you can put as much as you want in it referenced ability might be good especially for newcomers people learning C++ or others because it's well you much less likely to shoot yourself in the foot and since everything behind the scenes is just a located somewhere it's a one to move so again cheap to move and we like it but it's almost always implemented as an hour between and that's really not cash friendly I mean if you're lucky if you have the right platform and you feel it in one go chances are all you nodes are gonna be stored contiguously memory because you kept calling the same allocate or over and over again with the same size and possibly just cut keep changing your data over another so you access one node the over nodes that you inserted might be close in memory might not be salted closed but they might be close but still not great and especially if you have something that leaves a long time in memory it's not cache friendly at all because there's no way you can 
And of course it's still logarithmic and not constant, and I want performance: O(1) is better than O(log n). Maybe there are exceptions, I'm not good at math; as long as n is bigger than one I guess it's true, whatever. So yeah, this guy again: maps have terrible performance, please don't use them. Can we do better? Can we do better than this terrible red-black tree with its very bad cache coherency and all that? Well, actually, not really, because of the standard's constraints. The standard says if you insert something you don't invalidate anything else, and if you remove something it only invalidates that element. That's basically impossible to do without a tree: you have to have some kind of structure that does not reallocate and move things around. You could try a custom allocator, but I said I'm not going to talk about allocators, so let's leave that for now. But if we start dropping constraints, it gets interesting. For example, it doesn't have to be sorted, and the minute it doesn't have to be sorted I don't have to use a tree: I can use a hash table. And that's exactly what we have: it's called unordered_map and unordered_set. It's great: average constant time on insert, erase and lookup, and we're getting close to what we wanted. Oh, here we go again, because there's this thing, you know, called open addressing, and it's the gold standard of hash tables, and if you don't use it, just go home, just stop doing computer science. I'm just kidding, but that's the kind of thing you might hear, because I don't want second best, I want the best. So sure, you can get better performance with open addressing. There are a bunch of papers on that, and a bunch of public hash table implementations that use it; Google has one, Facebook has one, and it works great.
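To make "open addressing" concrete, here is a toy linear-probing map, invented for this sketch: on collision it walks to the next slot rather than chaining through a bucket list. It has no deletion and no rehashing, and it assumes the table never fills up — which is exactly the load-factor trade-off discussed next:

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Toy open-addressing (linear probing) hash map for int keys.
// All slots live in one contiguous array, which is what gives open
// addressing its cache friendliness. Must be kept well below a load
// factor of 1.0 or probing degenerates (and insert would loop forever
// on a full table).
class ProbingMap {
    struct Slot { bool used = false; int key = 0; int value = 0; };
    std::vector<Slot> slots_;
public:
    explicit ProbingMap(std::size_t capacity) : slots_(capacity) {}

    void insert(int key, int value) {
        std::size_t i = std::hash<int>{}(key) % slots_.size();
        while (slots_[i].used && slots_[i].key != key)
            i = (i + 1) % slots_.size();      // probe the next slot
        slots_[i] = Slot{true, key, value};
    }

    std::optional<int> find(int key) const {
        std::size_t i = std::hash<int>{}(key) % slots_.size();
        while (slots_[i].used) {
            if (slots_[i].key == key) return slots_[i].value;
            i = (i + 1) % slots_.size();
        }
        return std::nullopt;                  // hit an empty slot: absent
    }
};
```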
But it's actually incompatible with the standard. Again! Damn those standard people, why do they keep blocking performance? Well, actually, it doesn't come for free, because nothing is free — Chandler told us that a couple of days ago. There's a time and space trade-off. If you do open addressing — I'm not going to do the whole demo here because I don't have the time, but we can talk about it later — the big idea is the load factor. The optimal load factor for an open-addressing hash table is much lower than for a bucket-list-based hash table, which means on average you're going to waste much more memory on empty slots. So sure, that's a trade-off, and maybe you're on a platform where memory is just free — you're running some kind of server cluster and memory is just a number — and then go for it. But some people actually have to pay for that memory, some people have platforms that don't even have swap, and they just can't afford that cost. Also, open addressing usually means that even if you don't rehash, you will break reference stability, and again that's something that might shoot people in the foot. And also — and this is the biggest thing I discovered when I researched this talk — caching: open addressing is supposed to be faster because of caches, because you don't have to chase a bucket list and so you get better cache coherency and all that. But that is not even the biggest reason why some STL implementations are slow, and it has nothing to do with the standard: just don't use modulo. Quickly: when you hash something, you have to map the hash to a bucket, and the common, maybe naive, idea is just to take the hash modulo the bucket count. But as it turns out — I forgot a slide somewhere — modulo is super slow, as in a lot of cycles, and cycles that block the next memory fetch. So: very slow, and you don't have to use it.
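The fix the speaker alludes to: if the bucket count is a power of two, the modulo can be replaced by a single bit mask (one cheap instruction instead of a long-latency integer division); another option, not shown here, is a multiply-shift range reduction. The function names below are invented for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Naive bucket mapping: integer modulo. Division/modulo can cost tens of
// cycles on many CPUs, and the result is needed before the next memory
// fetch can even start.
inline std::size_t bucket_mod(std::uint64_t hash, std::size_t n) {
    return hash % n;
}

// If the bucket count is a power of two, a single AND gives the same
// bucket index.
inline std::size_t bucket_mask(std::uint64_t hash, std::size_t pow2_n) {
    return hash & (pow2_n - 1);  // only valid when pow2_n is a power of two
}
```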
As Malte showed at C++Now — I think it was last year, I should have the date on the slide — Microsoft does this right: the Dinkumware implementation of the STL that Microsoft uses doesn't use modulo, and guess what, it has actually reasonable performance. clang and GCC: maybe don't use modulo. I could talk a lot more about this, but Malte just does it better than me, and there's this technique I learned recently, the technique of the pros, which is: at some point, just insert a reference to another talk and then continue. So watch that talk, it's great; it tells you everything you want to know about hash tables, how you can get even more performance if you're okay with even more trade-offs, and how his implementation, if you're stuck with Linux and GCC or clang, can give you reasonable performance similar to Microsoft's implementation. But now I want to talk about the STL and the online crowd. Is the slide up there? I hope it is, or I'd look really silly. It was the kind of tweet that started this talk; I'm paraphrasing because I don't want to name anyone in particular, but this is the kind of thing I read on my Twitter sometimes. That's not okay. First of all, and that's usually the first thing people reply: the committee makes a specification, not an implementation. That might sound like a cop-out, but it's actually how it works, and those are the rules: the standards committee makes the standard — a reference, a specification — and other people implement it. So when people go after quality of implementation, they're usually blaming the wrong person. I mean, some people on the committee happen to be compiler vendors or work for compiler vendors, so they actually implement it, but that's not most of them, and most of the time not the people who push the papers through the committee. Also, C++ is a general-purpose language, and we have this tendency of asking: why isn't my use case
the only use case that matters? Well, because we all have use cases that matter, and at some point we have to do something that is okay for 99% of the use cases. That means, for example, that we will not make trade-offs that are way too expensive for some people, and that, for example, is the reason the standard hash table is not using open addressing even though there is research suggesting it's faster: it's a huge memory trade-off, and you can't impose that on everybody. And I don't know if I should have to say this, but a social media rant is actually a very bad way of getting your point across — believe me, I've tried; is anyone following me on Twitter? Okay. So let's talk about burden of proof for a second. People use the STL every day. Do you have a phone? Of course you have a phone; I have one, it's Android, so basically it's clang. If my phone works, I guess clang works, and clang's STL works. My laptop works, it's running Microsoft software, and I think they probably dogfood their own tools. Everybody in the world uses clang, GCC and MSVC on a daily basis, and if there was a huge problem, someone would have noticed. I'm not saying there are no issues — there are issues in every piece of software, nothing is perfect; all those jokes about Microsoft or whatever came from somewhere, there used to be real problems. I don't make perfect software, and I don't think anybody does. But the burden of proof is on you, because the rest of the world is just using this software and it kind of works. Yet people start saying things like: oh no, no, we found a bug somewhere in unordered_map, or in array, or whatever; it's just bad, but we have an alternative. And then you ask, what was the bug? And they say: we don't remember, but at some point someone here found a bug. Do you know the story about the monkeys? It's a social
experiment that was supposedly done some years ago, probably 10 or 20 years ago. They put some monkeys in a cage, with a banana hanging from the ceiling that they could grab and eat. At some point a new mechanism was introduced: if a monkey grabbed the banana, everybody got sprayed with some kind of water jet. So the monkeys quickly learned that they should not grab the banana, that it was a trap. Then the experimenters started slowly introducing new monkeys and removing the old ones, and of course when a new monkey came in, it didn't know about the trap and tried to grab the banana, and all the other monkeys jumped on it: no, no, don't do it, it's bad. They replaced all of them over time, until there was no monkey left that had actually seen someone grab the banana and get sprayed. Then they disabled the mechanism and introduced new monkeys, and when those tried to grab the banana they still got bashed. Why don't you grab the banana? Nobody knows anymore, but you don't grab that banana. It's the same idea: yeah, we don't use thread, because it's... broken somehow? I don't know, this guy told me it was. If you find a bug, there might well be one; sure, report it to your vendor. But also, if you need a fix and you can't wait for an update, or you don't get an update: write a test somewhere that shows there is a bug in the STL, show that your alternative passes the test and the STL doesn't, and revisit it from time to time. Don't be the banana-eating monkey, because people do fix bugs. Also, the STL has to be good enough for everyone, which means it cannot make unsafe assumptions — reference stability, memory layout, I said that before. The standard targets the most common use cases, so it has to make some decisions, and maybe you
disagree with the trade-offs, and that's okay. Specific cases can benefit from specific implementations; of course they can, that's the basis of computer science. If you have a very specific use case, you can make trade-off decisions that a generic implementation can't, and I'm not going to stop you — I should not stop you; do it. But remember the corollary: if you make a fit-for-purpose alternative for the one specific case you encountered, that doesn't mean it's going to be great for every other case that never had the problem in the first place. As Knuth said, the real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times. The idea is: there being a good reason for an optimization in one specific case doesn't mean it should be the default for everybody else. I've seen too many people say: oh, I have this one corner case where a small buffer vector is better than vector, so now the coding standard says you don't use vector, you use small buffer vector everywhere. A quick conclusion, and finally I'll talk about engaging, because we all sometimes have things we want to say, and that's fine, but usually ranting about C++ on Twitter doesn't make C++ better. Sometimes you make jokes and they become great things — I mean, we have someone in the room who made a joke once, and now we have that in C++, which is great — and sometimes people make a joke and I end up making a talk. I'm not saying it's great that you get filmed at the end, but it's great. Also, if you give a talk about a problem, please don't give it at the conference that I am totally not naming here — a good conference, but one that keeps all of its videos behind a paywall — because most of the people here and on the committee will never see it. Make sure that your voice is heard where the actual C++ community is, and where the people who can do something about it, or decide to help you do
something about it, can hear it. Go to meetups, go to conferences, go to your ISO national body and the study groups — SG14 is the one for games — because if everybody collaborates, we will make faster and better progress. And the bigger the sample, the better the result: when a paper is pushed to the committee, the bigger the sample of people that were consulted, the surer you can be that the thing is actually sound and everybody is okay with it. If you rant on Twitter and then don't go to committee meetings, sure, nobody is going to hear you, your arguments will not be heard, and maybe decisions will be made that are actually not good for you. If all of your money depends on C++, maybe you should try to have your voice heard on C++. So don't be afraid of speaking up. And I'm not just addressing video game programmers; that's true for any field. We have this unicorn syndrome, this way of thinking that we have problems that nobody else has. Most of the time it's wrong. I have been in three different industries and I've seen it all the time: you bring your expertise and they say, no, no, you don't understand, you come from another place, we have very specific problems that nobody else has. Everybody has performance problems, everybody has memory problems, everybody has that awkward compiler they have to work with. I don't need to work on a PlayStation to have problems with my compilers; I've worked with Sun's. But don't be afraid — and that's the big thing — we should not stop challenging quality of implementation, because sure, as I said before, it might seem like a cop-out to say the committee is not responsible, but someone has to be responsible for quality of implementation. Do not be afraid of holding your compiler vendor, or whatever vendor you have, accountable for the performance of something they ship. And if you find something, if you
find — and actually, we had a bunch of examples in the past few days of features, in the STL or in vendors' implementations, that were clearly badly optimized and could do a lot better — publish your findings. Don't just write a wrapper or an alternative inside your company and tell everybody there to use it: publish it, and provide benchmarks that everybody else can use. Because if there were a reference benchmark for unordered_map, vector, whatever, we could have the equivalent of the Acid test that we had for CSS in browsers for so long: we could run it all the time, and everybody could see that this compiler has an issue and it should probably be fixed. The minute there is a contest and everybody wants bigger numbers, things get better; that's just how we are, we love a challenge. Oh, and if you need help with the packaging, you can ask me, I'm the guy who talks about package management all the time. So, to conclude: the STL tries to be a good-enough default for everybody, and it will be, as long as some optimizations are enabled — if you go full debug you will be slow; you would also be slow in C, but you will be even slower in C++. Sure, if you have a specific case and you can drop some constraints from the standard, you will get better results, and you should; I'm not telling you not to, and there are great libraries out there that do it. If you think you have a better case, a specific case that is not covered, and you want to make your own, go ahead, I'm not going to stop you. But feedback is needed to improve the experience for everybody: if you have a use case, make your voice heard; if you have a benchmark that proves an implementation is bad, make it heard, publish it, tell other people, so that it can actually be fixed — because instead of fixing it for you in your own code line, you can fix it for everybody else, and that's not a much bigger effort. How am I doing on time? Ah, not too bad. Oh, and
furthermore, I think your build should be destroyed. Thank you. We have 12 minutes for questions. I'm sorry, can you repeat that? I have some echo and I'm not sure I got it. Come here and I will repeat your question into my mic. So: do I see a trend of more and more people using the STL in video games? I don't have enough data to answer. I can tell you that I'm pushing people at my company to do it, and since my tech lead stopped flagging my commits in review, I guess it's catching on, but no, I can't answer for the whole industry. I hope so, because I think there are more and more talks about it, but maybe I'm not the best person to answer that. You're welcome. Another question — you might have to yell it because of the feedback, but go ahead. Great, so the question is: is it okay to do dynamic allocation in game logic? I'm not saying I'm right, but I do it all the time. This is not the talk about allocators, but yes, you can: you might have to allocate some things during your game logic at some point, and you need a way to do it, an allocator or an allocation strategy. Usually what you want to avoid is going to the system allocator, because of unpredictability: you can hit a case where you go to the system to ask for memory, the kernel has to page out, it takes forever, you can't predict when it's going to happen, and you sure as hell won't be able to debug it. For my use case it has been good enough. I'm not saying we were the poster child of performance, but my games are mostly CPU-bound and we manage with dynamic allocation. Custom allocators would probably make that better, but they also require some thinking, and there's a great talk about allocators by Andrei Alexandrescu at ACCU that I encourage you to look at; he goes over the different options.
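One common strategy for keeping game-logic allocations away from the system allocator — not something the speaker spells out, just one of the options such allocator talks cover — is a bump (arena) allocator: grab one block up front, hand out slices by pointer arithmetic, and "free" everything at once, for example once per frame. A toy sketch with invented names:

```cpp
#include <cstddef>

// Toy bump/arena allocator. Every allocation is a pointer bump inside a
// fixed block, so the cost is tiny and predictable: no trip to the
// system allocator, no page-out surprises mid-frame. Individual frees
// are impossible; reset() reclaims everything at once.
class FrameArena {
    alignas(alignof(long double)) unsigned char buf_[1 << 16];
    std::size_t used_ = 0;
public:
    void* allocate(std::size_t n,
                   std::size_t align = alignof(long double)) {
        std::size_t p = (used_ + align - 1) / align * align;  // align up
        if (p + n > sizeof(buf_)) return nullptr;             // arena full
        used_ = p + n;
        return buf_ + p;
    }
    void reset() { used_ = 0; }  // e.g. called once per game frame
    std::size_t bytes_used() const { return used_; }
};
```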
You can probably also grab John Lakos — I don't think he's in the room, but I've seen him at CppCon — he's the allocator guy, and he can tell you much more about this. Does that answer your question? Perfect. Next question. You like the monkey story? Oh, really? Right, so someone is telling me that my story is, well, just a story. Maybe it is; I didn't know that. Okay, so the monkey story is not true, it was a lie all along, it's just a story — but you believed it, right? So: check your data. Yes, as I said, the remark was that, for example in this person's company, in some cases they use the STL and in others they don't, because there's a better alternative, and that's exactly what I'm saying: good enough. Maybe sometimes it's fine, and maybe you have a specific case that warrants something better. Maybe you have a place where you need the small buffer optimization, because most of the time you won't allocate enough to need the heap and it's much faster. Maybe you can actually afford to pay the memory penalty for your hash map and you want to use open addressing, and I think that's good. What features of the new standards do I see having the most effect on the STL? I really like C++17's variant, because it's just a better union. I like what C++11 brought with array. Specifically about containers and performance in C++20, I don't know, but in the future we might get linear algebra from Guy's paper, which is something we really want, and we have the paper about the fixed-capacity vector, which is something we really want too. I didn't think we were going to get a flat map, but — I should follow this more closely — someone is telling me that there is going to be a flat_map and a flat_set, so if you actually want that, there will be a reference implementation available to you, if you can pay for the
trade-off. Oh wait, I think we're also going to get dynamic bitsets — or I'm not sure that's done yet. Very good, thank you. No, for this talk I did not benchmark it against EASTL. I think there would be differences, because it's not a hundred percent conforming to the standard; for example, I'm pretty sure they don't handle exceptions, which will obviously yield some differences. I should definitely try to benchmark it and see what difference there is. And as I was saying, there is a need for a reference benchmark for most of the features we use, so we can hold our vendors accountable for quality of implementation, because blaming the committee is not going to help anybody. What will help is having benchmarks that say: here is the benchmark, and it shows that a bunch of other people's implementations are faster — like what Malte did, for example, for his hash map implementation. We still have a few minutes if anyone else has a question. Yes — a question about what tools you can use to learn about performance. When I research talks I usually use Quick Bench or something like it, because it's very practical for a micro-benchmark, but micro-benchmarks only show you so much, and I think that might be one of the issues: how do you reduce the case in your game to a micro-benchmark? That's where it gets tricky. Personally, when I have a huge performance bottleneck and I want to find it, I start with something like VTune — even the Microsoft-provided profiler is okay for getting some kind of ballpark number, but I think VTune gives you more data: you can see things like cache misses, or whether you are frontend- or backend-bound, and all that. That usually helps me. But at the end of the day, what would really help us is having more benchmarks with more data. It's not going to be one benchmark you can fit on a slide; there are going to be a bunch of tests.
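As a starting point, a micro-benchmark in the spirit of Quick Bench can be as simple as a chrono-based timing wrapper. This is a minimal sketch with invented names, not a substitute for a real benchmarking library, which would also handle warm-up, repetition, and keeping the optimizer from deleting the measured work:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Runs a callable once and returns the elapsed wall time in milliseconds.
template <typename F>
double time_ms(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Example workload: sum a vector of n ones with std::accumulate.
inline std::int64_t sum_demo(std::size_t n) {
    std::vector<std::int64_t> v(n, 1);
    return std::accumulate(v.begin(), v.end(), std::int64_t{0});
}
```

Usage: `double ms = time_ms([] { sum_demo(1'000'000); });` — and once the bottleneck is reduced to something this small, it can be shared and rerun by anyone.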
But that's the thing: when I researched this talk, I didn't find a lot of benchmarks, and that's where I think we need to be better. The best approach I can suggest, if it's container-based, is this: you go in with VTune or whatever and you find something — because you're going to have to prove where it comes from, right? Your profiler tells you there is a bottleneck somewhere in your code, and at some point you'll want someone to blame, and you'll end up saying: this is because this container is generating bad code, or whatever. You can probably take that and reduce it to a structure similar to what you're actually doing, and get something you can work on. You'll also make much better progress working on a small benchmark than rebuilding your whole game every time you change one container, because I don't know about you, but if I change one standard container in my game, I have to rebuild everything, and that's just not an effective way of iterating. So isolate your problem first, and once you've located it, the next step is to publish that benchmark. Does that answer your question? Perfect. And with that, I think that's all. Thank you. [Applause]
Info
Channel: CppCon
Views: 58,182
Keywords: Mathieu Ropert, CppCon 2019, Computer Science (Field), + C (Programming Language), Bash Films, conference video recording services, conference recording services, nationwide conference recording services, conference videography services, conference video recording, conference filming services, conference services, conference recording, event videographers, capture presentation slides, record presentation slides, event video recording, video services
Id: 6hC9IxqdDDw
Length: 57min 17sec (3437 seconds)
Published: Thu Oct 10 2019