CppCon 2018: Arthur O'Dwyer “An Allocator is a Handle to a Heap”

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
- So welcome everybody to An Allocator is a Handle to a Heap. I'm Arthur O'Dwyer, this is one of three talks that I'm giving this week and you should all come to all of them. In fact, this is the one you can walk out of because you can already see it online. I gave it at C++ Now earlier this year at Aspen. And there I had a 90 minute slot, here I have a 60 minute slot. And so I don't know how far we're gonna get. Maybe we're gonna get all the way, maybe we're not. We'll get farther if you don't ask questions but it's already been recorded once, so the stakes are very low. If you have a question, please do ask it. Like, please do, raise your hand, ask questions, whatever. And I reserve the right to steamroll over any questions that are leading us in an unproductive direction based on where my slides want to take us. But basically, yeah, you're getting the second draft, it's already been recorded once. So, an allocator is a handle to a heap, why do I say this? Let's start at the beginning. Oh, first we even have an outline. Yeah, we're gonna talk about this. We're gonna start at the beginning, we're gonna talk about what are objects, what are values. And I'm gonna tell you what is an allocator, what is a memory resource? We're gonna talk a little bit about what I call copy only type. So we're gonna talk about rebindable families. We're gonna talk about what else is in the C++ allocator model besides being a handle to a memory resource. And then we're gonna relate allocators to other parts of C++. So objects and values. C++, it's a multi-paradigm language but part of that is it's an object oriented language. What is an object? For that matter, we've talked about value semantics. What is a value? Well, I generally try to divide objects into two categories, a dichotomy between object types and value types. Any type that you make should be one or the other and not parts of both, you know. A value is something like int. I think Scott Myers says, do as the int's do. Be like an int. Int's are copyable, they're comparable, you can pass them around, copies are interchangeable. That's a value, value semantics. Objects, object types are more classical object oriented. They might be polymorphic, they're generally identified by their address. And if I have two objects at different addresses, generally they're not interchangeable. And the reason they're not interchangeable is because they have state, I'm mutating an object. So the interesting thing about objects, as I said, object has an address. You can have a pointer to an object, you can name an object with it's address. It has some sort of unique identifier or handle. They're all interchangeable terms and that handle, that address, that pointer, is itself a value type. So the name of an object is it's self value. I have the value, the pointer value, ampersand A, that I can class around the pointer. Copies of pointers are interchangeable. Pointer is a value type, but the object itself, not necessarily. Now in this case, the object itself is an int. And I just told you that int is a value type. This is where it gets confusing for me. That a C++ object is in part defined by it's in-memory representations. Even these things that we want to think of as value types, you know, an int has the value 42 but it also has a memory address because we're in C++. And so I can take it's address and maybe there are some ways in which two ints at different addresses may not be substitutable for each other. Vice versa, I can think about having object A of type int and I can think that it in some sense not only has a state which I can mutate, I can plus equals to an int to increase or decrease the value that it has. And I'm mutating it's state by doing that. But also in some sense changing it's value. And this gets very confusing if you let it. Or if I make it so. (laughs) So just know that this dichotomy does break down. And as you go lower into the quantum levels of C++. But at the very high level, distinguishing between value types and object types I think is important. And speaking of which, there is another talk on this subject. Juan Pedro Bonaparte Fuente is giving his talk, Most Valuable Values this Friday at 1:30 so if you wanna hear more about that, go there. And, where are we going here? Oh yeah, something else I wanted to say about this slide. Two interesting things going on on this slide with our A and our B. Number one, that we have this int that I'm calling an object, we already kind of covered that. But also that these two objects, A and B, have different C++ types. Alright, one is an int and one is a long but in some sense they have the same platonic value, 42. Which is represented by the value 42 in that little cloud. That there's this platonic idea of 42 that both of these concrete objects in some sense represent that value. But they do it with different bit patterns, right? And if there was a double, that would be a different kind of bit pattern. All representing this idea 42. So there's some sense in which the value is distinct as well from the object representation. And now we're just talking about integers. Let's talk about containers. What is a sequence container? Like a vector or a list. On the one hand, a container behaves like a value. If I have a vector of three ints, I can make a copy of it and that copied vector will compare equal. It will have all the same properties as the original. I can pass vectors around by value, return them by value, they behave a lot like values but at the same time, they have these details of their object representation that are very important to us as C++ programmers. The container is an object that holds and manages elements, which themselves are objects. So on the right hand side here, on the left we have the platonic ideals of clouds and whatnot. Platonic ideal looks a lot like Python, that's completely coincidental, I promise. And on the right hand side we have the more C++ view of the world where these are all objects and they have some sort of memory layout. And we have these mysterious symbols. We have these boxes, we have two sets of boxes. And maybe we know that the left hand box there, the object v of type vector int, that's probably allocated on the stack. But then we have those other two boxes, where did they come from? They're in memory somewhere, where did we allocate that memory from? And then this thing that looks like an arrow, a little box and arrow, what is that exactly? These are the questions we have and the answers are provided by the C++ allocator model. Where does the memory for these things come from? What is that thing that looks like a pointer? So std vector, the standard vector type for the stl, is parametrized not only on T but also on the allocator type A. That parameter gets a default which is why you don't usually see it in your code. But if you're here, it's probably because you're aware that allocators exist. If not, spoiler alert, allocators exist. So the allocator answers these questions. Where does the memory for V sub I come from? It comes from a call to the allocator to an allocate method. And what is the thing represented by the box and the arrow? It's an object of type pointer. The member type pointer that the allocator has. And we'll cover that more in several slides from now. The container holds an instance of the allocator type A within itself. So a container has an allocator and anything that can be funneled through that instance is funneled through the instance. So any time the container does an allocation, it's gonna ask the allocator, hey allocator, it's time to allocate some stuff. How do I do it? It's time for me to deallocate some stuff, how do I do it? It's time for me to have a pointer to some stuff, what is the object representation of that pointer? Things like that. And so it holds that instance of the allocator type within itself. What kind of state or what kind of innards can that allocator instance have? Well, prior to C++ 17, the only standard allocator type that existed was call standard allocator and it was stateless. It was an empty class type. It had no state at all. It would typically be optimized away through the empty base optimization. And so when people started experimenting with allocators and more fancy allocation strategies, they started realizing that it would actually be nice to have some data inside an allocator. Some state that it could carry around. And they didn't have any guidance for how to do this because the only allocator type of 214 was stateless. There was no guidance either in the standard document or in standard libraries as to how do you make a stateful allocator in the first place? And this would lead to people trying to implement stateful allocators something like this. This is something that at least up to a year ago, maybe fixed as of C++ Now, but I bet someone's gonna come up to me in the hall this week and say, I'm trying to do something like this. This is, I want to make an allocator, which I'm here calling bad, it's templated on T and it has a sort of buffer, a big buffer inside itself. And it has this allocate method that's just gonna bump the pointer. And I can maybe make a std vector of T comma this allocator and then it will have the buffer inside the vector itself. It'll be like I had a fixed capacity vector, right? That the vector and the buffer are right next to each other and this is gonna be great. And then they go to try and make a copy of the vector and all the pointers are pointing into the, it just doesn't work. Bryce? - [Man] You said the object that stabilizes rather than the thing the object points at. - Yes, we're gonna get to that. Bryce, real fast. - [Bryce] Why align at 16? - Why align at 16? Yeah, why not? (laughs) Maybe they shouldn't even do this. They are really bad at programming as we can see. (laughs) So yeah. So in 17 we got this thing called polymorphic allocator. Std pmr polymorphic allocator, where pmr stands for polymorphic memory resource. Yeah, I don't know. Nested name space is considered bad. But we have this thing called polymorphic allocator. And it is, we'll see it in slide 16, we'll see the actual code. But polymorphic allocator is basically a pointer to a std pmr memory resource. You take all that shared state, the big buffer that we had before. In fact, this is almost exactly struct bad, right? It's just, we've changed it's name. We're not calling it struct bad anymore. So you can tell it's okay. But we're not gonna call it an allocator. We gonna give it a new name for it, we're gonna call it a memory resource. And it's gonna be an object type. That means it has an inheritance relationship with something. It also has virtual methods and stuff although you don't see them here. And it lives at a specific place in memory, right? That specific place is where this big buffer data is going to live from now on. We have one K of memory allocated somewhere specific. We're gonna have pointers all over the place maybe pointing in to it to allocate an object. So it's important that it not move around. It doesn't have value semantics. It is an object type. It is immobile, yes. And we're gonna have this polymorphic allocator here way down at the bottom. We have std vector of int comma polymorphic allocator int. This vector will now contain inside itself a copy of the allocator, polymorphic allocator instance. That polymorphic allocator instance is basically just a pointer to the address of this trivial resource. So, (audience member off microphone) What is the integer argument of the size of the N here? (audience member off microphone) Oh, that is a typo but you can imagine it was the size of data, for example. Yeah. So here's a summary of the two kinds. We had struct bad two slides ago and now we have something very similar looking, we just changed its name. But struct bad was object like, right? It contained it's own mutable state and that made it a bad allocator. The source of memory was stored directly in the object. Allocate and deallocate, this is interesting, allocate and deallocate were non const because they had to mutate that state. When you called allocate on struct bad, it actually bumped up the used count and changed the values and data. And so this method had to be non const. With polymorphic allocator, allocating and deallocating using the allocator don't actually modify the value of the allocator. So they could actually be const, they aren't, oops, but they could've been. With struct bad, moving and copying the allocator, as I said, is likely to leave you with a bunch of dangling pointer bugs. With polymorphic allocator, because it's just the handle, it is safe to copy. It's even trivially copyable, possibly. (audience member off microphone) Am I going to get the meaning of the quality between two allocators? At least implicitly, yes. The meaning is exactly what you think if an allocator is a handle to a memory resource, then equality means pointer equality. They point to the same resource. (audience member off microphone) Yes. - [Man] That you don't need to actually pass the same object to everyone because you're splicing. - Yes, we'll get to that later. Yes. We will get to something like that, yes. And lastly, this is, the very last line of this, is the one caveat. With struct bad, it didn't work so good points don't really matter because it doesn't work. But there was no shared state, right? It was all contained within the allocator itself. And then you copied the dangling pointers and it didn't work so don't do it. But in the case of polymorphic allocator, it is actually, it does not own the memory resource. It just points to the memory resource. And so you have to think a little bit about who does own the memory resource? What is it's lifetime? What is the lifetime of that heap that I'm pointing to? So this clarifies our thinking about allocators. Even if you're not using C++ 17, and especially if you're not using pmr, polymorphic allocator because you may not, you can still take these lessons and use them to clarify your own thinking about allocators. Old style thinking, you might have said, an allocator object represents a source of memory. An allocator is the source of memory for things. This is not true, alright? Don't think that way. It is not itself the source of memory. It is a handle to a source of memory plus some other orthogonal pieces we'll talk about. And the source of memory itself is somewhere outside of the user of the allocator. Somewhere outside the container. So here we can have, we have a program that has multiple memory resources going on. Multiple heaps going at once. It has multiple allocators. Some allocators point to a heap and uniquely point to it. Others share the same heap, which means you might have to think about thread safety with respect to can multiple allocators allocate from the same heap? Some allocators belong to container, some don't. You can use an allocator without a container if you have some sort of generic algorithm that would like to get memory from somewhere, you can pass in an allocator to it maybe. So this clarifies your thinking. However, now that I've told you all about these allocators that are handles to heaps, you may be thinking, what about stateless allocators? What about std allocators? Std allocator is an empty class, how could it possibly be a handle to anything? It doesn't have any bits in it. Well, mathematics time. A stateless allocator is a handle to a source of memory plus some orthogonal pieces where the source of memory is a global singleton. For example, the new delete heap is global singleton. There is only one possible thing that std allocator could possible point to and that is the global heap. So how many bits of object representation do we need to represent exactly one thing? Well, we need log of one which is zero bits. (audience member off microphone) Doesn't need a pointer. Let's clarify this with some code. So this is a little bit of a digression but we'll circle back around to that idea of the log of one. But first, here's memory resource. This is the actual memory resource as it appears in C++ 17 with, you know, some name spaces around it. I removed some random no accept and whatnot. But you know. Memory resource is a base class. And it is polymorphic, it has virtual methods. However, other than the destructor, which is, of course, public, all of its virtual methods are private and they all begin with that do underscore. So do allocate, do the allocate, do is equal. And these are all private. They have public entry points, which don't have the do. So when you want to allocate something with the memory resource, you're gonna call the public allocate function, it's going to dispatch down to the, it's gonna do the virtual dispatch and call do allocate for you. This ensures that you can never have a derived class and say, you know what? I'm just gonna skip straight to the code that I know that this class has. I'm gonna say, my drive colon colon do allocate, that's private specifically so you can't do that. And I kind of like this. The other thing that this means is that when I derive from it, as I'm gonna do on the next slide, here's the actually new delete resource from the standard. Modulated, this is an implementation detail that I'm showing you. What's standard is this function down here called new delete resource. And it returns a pointer to a memory resource that's just a static singleton, alright. I have this one instance. And this name, the singleton new delete resource, this is a made up name that I made up. So somewhere off in the private implementation in the CPP file, it's gonna say singleton new delete resource inherits from memory resource publicly so everyone knows it's memory resource. And it overrides the three member functions which are private. So this is kind of cool 'cause I don't have to put public up here or anything, I can just open the class and override the three things and I'm done, I don't ever have to mention access specifiers anywhere. So I kinda like this pattern for that reason. So we override them to just call operator new and operator delete. And for is equal, we'll just say, what are we saying here? If it is in fact, this object then it's equal. Because there's only one object of this type. This instance down here. Does anyone have any questions about memory resource specifically before I go on? - [Man] Is this the place where old allocator before C++ 11 left off or has it gone someplace else? - Is this the place where old allocators before C++ 11 would hook up? - [Man] Where there's no concept of about having something that functions like in place of new... - No. - [Man] Or the template does that all, issues those things, you don't do it yourself. - The container is the thing that's gonna have the allocator in itself, polymorphic allocator within itself. The polymorphic allocator is going to call the public allocate function on the base class. That's going to do the virtual dispatch and eventually get to arch out. - [Man] Standard allocator traits, you call-- - Allocator traits, we'll come back to that. Alright. - [Man] Is hooked up to that. - So this is now polymorphic allocator. Oh, one more. - [Man] Do you use the ability to do EBO with polymorphic? - Can we do EBO, empty base optimization with polymorphic allocator? Essentially yes, because it's not empty. Right? But you don't lose any space to padding because there's no padding. It's empty. Yeah. - [Man] Is it the case with the polymorphic allocator you don't (audience member off microphone) - Yeah, let me go to the next slide which has the polymorphic allocator on it. But yes, if you're using polymorphic allocator, you yourself never need to implement an allocator type, you use polymorphic allocator which is a handle to a memory resource and you yourself implement a memory resource which is derived from standard PMR memory source. (audience member off microphone) Yes. (audience member off microphone) Right. It could still inherit from it but it wouldn't get any space savings out of doing that. And so it probably wouldn't. So here polymorphic allocator as promised, it has a pointer to a memory resource. And what else has it got? It has a constructor that can construct from a memory resource star. It has this rewinding constructor that will come back to. It has a default constructor that calls this global thread safe, get default resource function. I'm not a huge fan of this but just know that you can default construct a polymorphic allocator and it will go grab whatever your default resource is, which is usually the default new delete heap, new delete resource. There's a way to turn it back into a memory resource star by calling resource. And it has the allocate and deallocate function that every allocator needs. It also has this other function that we'll come back to. And lastly it has operator equal equal and of course it also has an upgrade or not equal where you can, again, allocate or, an allocator is a value type so you need to be able to copy it and you need to be able to compare copies for quality. Here we're comparing them for equality by just saying if they are pointing to the same resource, they are equal. Or more specifically, we're gonna delegate that to the resource. So we're gonna ask the resource, are you equal to this other resource? In general that will mean address equality. (audience member off microphone) Did we lose the capacity to create a fixed capacity vector? Essentially yes, but my argument is you don't want to do that. Or at least you don't want to try to abuse the allocator model to do that, right? A fixed capacity vector is a super useful type but you should make it it's own type. And once you do, you realize the memory always comes from within the object itself and you don't need an allocator at all. You know where the memory is, there's no question. (audience member off microphone) A solution where it has some fixed capacity but could go beyond that. Again, I would tend to argue, if you're doing a small pulse string optimization, for example, well, you could think about how do existing standard library vendors do small string optimization with std string? How do they do it with std NA listed function, they don't abuse allocators to do it, right? They have their own ways of doing it and you should copy those not try to innovate with allocators because it's not gonna work, it's gonna be bad. Okay, so this is polymorphic allocator and it contains a 64 bit memory resource star. And therefore, as the purple box says in the upper right corner, the stateful allocator of 64 bits can point to any of the 264th different memory resources. Yeah. (audience member off microphone) Can you do crazy hacks? Yes, but don't. Now I'm gonna change a little bit. I had a 64 bit quantity there that could point to any 264th different things, right? Now I'm gonna do, again, this is a crazy hack you shouldn't do but in principle it's less crazy than anything we were just discussing about fixed capacity vector. This is a one bit allocator. It's an allocator which is a handle to a memory resource but it only takes up eight bits. And we're actually gonna use those eight bits as an index into this atomic ref count, I'm not showing any of that code but I did it in practice and the link is not here. So you can't prove me wrong, I totally did it. (laughs) We've got this table of 256 different memory resource pointers and we're going to update it and as we make copies, we're gonna ref count the things and we're gonna find an empty index when we need to and we can then point to any 256 different memory resources at a time. If your program has fewer than 256 memory resources, we can get away with using one byte to represent the address of any one of them. So that's very cool. Okay, now suppose we only have one memory resource in our program. Then we know where it is and we don't need any bits. And then we have this, zero byte allocator with zero bits and it can represent any of one different memory resource. Which happens to be, in this case, new delete resource. 'Cause we chose that. We could choose something else too, it'd be a different type, right? A different behavior. But right. Cool. That is essentially std allocator. Question? (audience member off microphone) In the previous one, where were we setting the index? We were doing that in our, in that comment. That comment totally, the compiler will just do the right thing, right? (laughs) And you can see here, when we do the copy, we hide it all away in this function. And again in the def rec function. So yeah. (audience member off microphone) I heard this morning we can just use contracts and it'll just solve everything, so. Compilers will be able to parse that, no problem. Corollaries to the new way of thinking. Again, just the rehash. You're all in this room to learn that allocator types are just like pointers. They're handles to memory resources and therefore they should be copyable just like pointers. This was always true. Even pre-11, even pre-17, this was always true but now it's more obvious. Not only should they be copyable, they should be cheaply copyable. Not necessarily trivially but you're gonna be copying allocators a lot. The container is gonna be copying them behind your back, we'll see that in a minute. And memory resource types, contrary wise, they're gonna sit in one place and be object like and generally immobile. They're gonna have their state mutated by means of these allocators. Memory resource, for example, might sit in one place and have a buffer inside itself and allocate chunks out that buffer as needed. So allocators need to do as the pointers do in one more very subtle way that I did not know a year and a half ago. Notice that in our one byte allocator, I gave it a copy constructor and I did not give it a move constructor. Even thought it's doing this sort of ref counting thing, you'd think, oh, it'll be super fast to just copy it over, zero it out, just like shed butter does, right? Have it be efficiently moveable. It is not efficiently moveable and the reason for this is that allocators should be essentially copy-only types. When you move out of an allocator, the thing that's left behind in the source object, the moved from object, must preserve its value. This is very weird because we are not used to thinking like this when we're talking about move semantics. However, this is extremely important to have because down at the bottom here I have v1 and v2 and they both use some custom allocator with some state and I move out of v1 and I move everything out of v1, including the allocator. I moved out of the allocator and now a v1 is in this unspecified but valid state. (audience member off microphone) So you define equals delete? No you don't. You define equals to the copy constructor. Right, so I've moved out of the vector, now the vector must be in an unspecified but valid state, the allocator in practice will be in a moved from state. But then if I clear the vector and start pushing things back in again, that doesn't magically give me my allocator back, right? I don't know how to get an allocator so I better hang on to the one I had. That means that when I move out of an allocator, it better still have the same value and be usable because someone's gonna try and push back in the vector again. Bryce? (audience member off microphone) Is this post condition completely unique in the standard library? Other than for trivial types, yes. But for example, for a pointer. If I move out of a pointer, it preserves it's value. If I move out of an empty class, it preserves its value such as it is. - [Man] So you can say that this is some trivial type trivially a different-- - Yeah, it doesn't need to be literally trivial but it needs to do the thing it says on the slide. So just be aware, if you're implementing an allocator and it needs to do some weird ref counting kind of thing, watch out for, don't get too clever with it. Really, just give it a copy constructor. But again, you should really be implementing your own allocator type too often because you're gonna be using polymorphic allocator. But if you need an allocator that's not polymorphic allocator and not standard allocator, you're going to have to implement it yourself because those are the only two standard allocators. (audience member off microphone) It is legal to use move from containers. You don't know what state their in but they're in some state where you can, you know, you can call anything with a precondition. In this case clear, right? (audience member off microphone) People say all kinds of things. This isn't that talk. Go to one of Howard's talks. He'll love it. (audience member off microphone) Could I use polymorphic allocator in C++ 11? Yes, absolutely. There's nothing, there's nothing that needs C++ 17 or even 14. - [Man] Without changing the vector. - Without changing vector, yeah. Correct. If you take literally from something like web C++, you will find out that there are error directives in the file that prevent you from doing this. But, except for the nanny state. Ah, yes. Okay, so we're not allowed to move allocators more cheaply than the copy because basically we have to copy. Except, I should say, the source object has to preserve it's value. If it does have other crap associated with it that's not part of it's value, like a cache of some... I don't really know what. But if it has some non-essential state, maybe that non-essential state will be moved over. That's okay. Non-essential state will be moved over. And the essential state will be copied. And I was thinking about this right after my last talk because it was brought to my attention. And it's interesting that C++ has this notion of copy and then move where move typically means steal all the guts from the other thing, right. From the argyle you can steal all the guts. And copy means don't steal any of the guts. But with allocators, that's a little bit changed where copy means don't steal anything, move means you can steal the inessential stuff but you need to leave enough of it to survive on it's own. Like steal one of the kidneys but not the other. Something like that. And for allocators we don't have an idiom or a way to tell the library, hey, I want to steal the whole thing. Destructive move might be something like that and you'll notice on my name tag here it says trivial relocatable. So ask me about that. But in general, there's many different kinds of stealing we might want to do and allocators have a different set of two kinds of stealing they can do than other types. (audience member off microphone) My single wide allocator. (audience member off microphone) Here in particular, I don't give it a move constructor and therefore it will copy because the copy instructor exists and the move doesn't. (audience member off microphone) Yes, however if I had given it a move constructor, then it would call the move constructor. Yep. When I move the vector, it will attempt to move the allocator. It will not fall back onto copy even though it, moving on, alright. So why do containers copy allocators in the first place? Because of something called rebinding. Allocators are rebindable family types. So the type to allocator for unfortunate historical reasons is baked into the allocator type. If I want to allocate into juries, I need to use a std allocator int or an allocator of int. And when I say allocate to, that means to something and because an allocator of int, it means two ints. This works great for std vector because I quite often need with std vector to allocate a continuous sequence of two ints or five ints or whatever. This only works for std vector. It does not help anybody else in the world, right? Because std list doesn't want to allocate T to begin with. It wants to allocate node of T, which is a T with a couple of pointers. Std map of takes this allocator of pair const K, V, it doesn't want to allocate that type at all, right? It'll never allocate an object of that type, it wants to allocate RB tree node, whatever, whatever. So we need a way of taking the allocator that we got, which is associated with one particular T that we almost certainly do not care about, and rebind it to the T we do care about. And so we have these family of related type, alloc of int, alloc of double, alloc of void star star, etc. An allocator value, which is representable in one of these types, must be representable in all of these types. Any type that I want to allocate, I should be able to step one, get that new type, alloc of some type, and then I should be able to take the value that I have and convert it in some way, preserving it's value, convert it to this other type. How do we express a value preserving conversion in C++? Cast, static cast, right. So, right, so rebinding is useful when we have, we have this foo, or possibly a foo of U and we need to get a foo of T. So there are basically two ways to do it, the stl prefers this rebinding strategy where we have this function, or we have this class called, class template called allocator traits. Standard allocator traits. We specialize it for our type or we use it for the default specialization. Normally you don't have to mess with it, you shouldn't mess with it. And it has this member type def rebind alloc of U is the type that we want. The alternative is we could do something like this where we take a template template parameter and then we can just instantiate it to substitute in U. This is not used in the stl, not even for stack and Q, for those of you that are thinking about those, no. So generally, know what rebinding is and how to use that. Bryce, I'm gonna stop taking questions from you. You asked all these in Aspen. (audience member off microphone) That is true and I thought I had a slide on that, maybe I got rid of it. But yes, template class dot dot dot is quite possibly what you want there. 'Cause otherwise it's not gonna work for some things. Yeah. (audience member off microphone) Why doesn't the allocator just deal in bytes? Well, memory resources do deal in bytes. They deal in, but they also deal in not only in bytes for size, also alignment. Which the original, going back to C++ 03, they didn't have to worry about alignment because you got the T, you just use whatever the alignment of the T was, right? So and there may possibly be other things you might wanna consider such as, well, we'll see the construction is piped through there but that doesn't really matter. But what pointer type do you return? Do you just return void star? No, you have to be type safe and return T star. So then you have to know what T is so. Some allocators are rebindable family types. There are many other kinds of rebindable families in C++. Allocator types are one example, which we just saw. But also pointer types. So for example, T star. I can take a T star and use pointer traits to rebind it to U star. And I can cast, static cast between T star and U star. Or if I have fancy pointers such boost offset pointer and so on, then I can use pointer traits as well to cast between those. And smart pointer types, unfortunately there is no smart pointer traits. But you can still convert or rebind a smart pointer like shared putter of int. I have a shared putter of int but I'd really like a shared putter of double. How do I get from where I am to that? I can use reinterpret pointer cast. Any ideas that we have a generic way of doing this so that you can do it with std shared putter, you can do it with boost shared putter, you can do it with weak putter, you can use it with your own types. But these are all rebindable families. Right. I'm gonna move on. Each rebindable family has a prototype or representative of the equivalence class. So for example, with the void pointer type in pointer and some more pointer families, allocator families also have a proto allocator type. Except the asterisks is they don't really but they will as soon as either executors or networking gets into the standard. They we will have this proto allocator concept. Or I can say alloc of void. Now alloc of void here, alloc is an allocator, then this is not really an allocator. Because it couldn't possibly do allocate. I can't say allocate two voids please. Deallocate two voids. That's not gonna work. But what I can do is I can use this to communicate a value, right. Because an allocator is a value type. I communicate which memory resource I'd like to use please through one of these alloc of void. The person on the other hand, the container or whatever, can rebind that type using allocator traits. Say, rebind alloc to int or whatever they need. And then cast the value. And now they have an allocator for the appropriate type and the appropriate value. (audience member off microphone) Yes, it has all the same state as an alloc of int or an alloc of double but it just has a sort of curtailed interface. So yes. And if you're looking, if you're designing your own thing where you don't have to be constrained by the history of C++, please do use this sort of proto allocator or representative concept. Please don't use alloc of std byte. Don't do that. Or alloc of char or alloc of int star star star. So if we were design STL today, in fact, we use proto allocator type on our interfaces to save on substantiations. Rather than making up all of these different alloc of int in the case of vector. In the case of list, we have alloc of int but it secretly going to rebind it to alloc of node. From map, we have to make up alloc of pair, constant int and then it's gonna rebind it to node. We're making all of these different substantiations, most of them never get used. It would be better to just have one proto allocator and then have list of map make the one's that they're gonna use. And we never try to make the intermediate ones like pair, constant, and int that's never gonna get used. (audience member off microphone) Let me see if I understand that. In fact, are you saying, are we circling back around to memory resource talk? Yeah, could we just say, why don't I just pass a pointer to a memory resource or something like that. (audience member off microphone) Yes. (audience member off microphone) Yes, why don't I just have one type, one class type with many member function templates rather than a class template with member functions? I suspect the historical answer has to do with poor support for member function templates in the 90s. I don't know. But yeah, it does seem better. (audience member off microphone) With the proto allocator type, the container must rebind its allocator before any operation so it would look something like that. And why don't we just get rid of all this rebinding entirely and just go with, you know, memory resource star? Well, the reason is that an allocator is not merely a pointer to a memory resource, it's a whole bunch of other stuff too. And I have 18 minutes to tell you what that stuff is. An allocator is more than the source of memory. So when I do an allocation like this, this is the, almost the correct way to do an allocation except that I have lied in that here this needs to be a non const value so the result of the static cast is not gonna cut it. But, and that's because the allocate number function is not const, which it should've been in the first place. Here the result of allocate is going to of type allocator traits of alloc T colon colon pointer. Which is typically just T star but it could be something fancier such as an offset pointer or some other very fancy pointer type. And that means it's completely up to the allocator to decide how it's pointers are represented. Again, typically they will just be T star. But if you wanted to say, I'm allocating out of GPU memory or shared memory or some weird thing where the associative array, memory, where the format of a pointer needs to be represented in memory in some very specific way. This is where you could hook in to do that with the pointer member type def. And the container is going to use that pointer type def for all of it's pointers. Problem is, it's pointers is really vague. Who made this pointer? Container uses pointer for all allocations. So here's std list. We have a node as promised and we have all these pointers running around, right? And we have this heap here. And the idea is any pointer that points to something on the heap has better be colored red. So the first interesting thing to note is that we have two pointers here that are stored outside of our heap as I've drawn it here. This list object is presumably on the staff or something. And all of it's allocations are taking place in the red heap. But that means it has to have two red pointers pointing to things on the heap that are not themselves stored on the heap. So that's a little weird. I need whatever my red pointer type is, my pointer type def here, it needs to be able to represent, or it needs to be able to be stored outside the heap if this is going to work. And then also, these two pointers, the green ones, are stored inside the heap but actually, because of the centennial nodes stored within list object itself, they point to places outside the heap. So but they have to be the same type, right? M prev here, this is our red pointer pointing to something on the heap. This is pointing at something outside the heap. They need to be the same type because they're both the same field. And so this is very strange, right? I have red pointers that live outside the heap, I have green pointers that live in the heap. And so this essentially means that fancy pointers, for every fancy pointer type I pick needs to have the same range as a raw pointer. I need to be able to take one of these fancy pointer m next's and turn in into a reference to this node by dereferencing it. And of course if I can do that, I can take it's address and then I have a raw pointer. And vice versa, if I have a raw pointer to this m internal node, I have to be able to get that as a fancy pointer. So fancy pointers and native pointers must be interconvertible at the same range of values. Does this mean that they are exactly the same and there is no reason to ever use fancy pointers for anything? Honestly, if you walked out of here thinking that, I wouldn't complain too much. But I have two reasons the answer is no and this is also covered, by the way, in my committee paper from about a year ago, P0773 toward meaningful fancy pointers. So if you wanted to go read that, that'd be cool. So are they the same type? No. Just because they have the same range of values, remember all the way back to the beginning where we had A as an int and B as a long and they were both value 42 but they had different memory representations. So the type also involves object representation. For example, boost offset putter as part as the boost interprocess library. It has the exact same range, it can address any of memory but it does so with a different bit pattern then a native pointer would. Bob Steagall has a thing called synthetic pointers. He gave a couple of talks last year and also at C++ Now and also I think this year. And he calls them synthetic pointers. Again, same range as a native pointer but different memory representation. Different behavior and run time. Different performance behavior, I should say. And the other reason is besides object representation is that the fancy type might be augmented with extra data. So it might actually be bigger, fatter. It might have metadata that's used by the deallocator, for example, so that it could carry information about where the allocation came from to enable the deallocator to work faster. So like, this came from slab five. So the deallocator can look it up quickly. It might carry something like infinity for GPU memory. It may carry array bounds like all these fat pointers. So that they metadata is not used necessarily when deallocating but it's used when doing pointer arithmetic and dereferencing. These cases are supported only incosistantly and possibly accidentally by vendors. So be very pessimistic if you're going to try to use this kind of thing. But I do think it would be nice if more vendors supported this kind of thing. And if the standard explicitly said, this is why we have the pointer type def so you can do this kind of thing because if you're limited to only using native pointers, why did we bother making all this machinery? (audience member off microphone) Can fancy pointers provide us any measure of type safety given that allocator traits construction take a raw pointer? Allocator construct should take a raw pointer because it doesn't actually care about the pointer type it just needs an address and as we said, they have to have the same range as the convertible. - [Man] Doesn't that imply a necessity for implicit conversion though? - For implicit conversion, no. When the container takes its fancy pointer and passes it to construct, it needs to do the explicit conversion. The explicit cast or static cast, something like that because the fancy pointer might not support implicit conversion. Yeah. In fact, you ask the people who are actually getting their papers approved and they all say they shouldn't use static cast. It should should use std colon colon pointer to or possibly address of for to address but static cast is the right answer. The same value in two different types. How do I get from one to the other? I do a static cast. So a C++ allocator is a run time source of memory, handle to a memory resource. And it's also decides the pointer type so it's a runtime source memory and it also decides that pointer type. And also, we haven't really talked about this, but it's besides whether the source of memory should move with the container value or stick with the container object. Whether this container that I'm building, this instantiation or specialization of std vector, am I going to be using it primarily as a value type or I'm going to be moving it around a lot, returning it from functions and I'd really like it to move very efficiently. Or am I going to doing, using assignment operator but if I have a one heap over here, one heap over here, associated each with a vector and I say I wanna copy this vector into there. Does this mean that I want this vector to get a pointer into that heap? Or does it mean I wanna copy all the data elements over to this heap? I call this the stickiness of the allocator and you can control that via these traits. And lastly, it's a runtime decider of how container subobjects should be constructed through the allocator traits construct. So that you can have a vector of vectors of things and it can pass down an allocator all the way down. To say, by the way, my elements should, even if your in placing some other thing, I really also want to add an allocator on to the back of that so it can, so that all of my subobjects will come in out of the same heap. So what's a handle to a heap plus some orthogonal pieces? We might separate some of these pieces. So there are only these two allocators in the standard but you can make up allocator adapters or things like that. You could take a nonsticky allocator adaptor that took in some kind of allocator type and put out an allocator type that was almost the same but just had different stickiness. Or one that took in some kind of allocator maybe a handle to some kind of heap and then said, I want a balanced check allocator that will return fat pointers instead of native pointers. And I don't wanna change where the source of memory comes from, I just wanna change the kind of fancy pointer that's getting returned. Eight minutes, eh we're doing alright. We're gonna switch gears. We're gonna go even more speculative here. Fun. How are analogous handle types handled in other areas of C++? So I've been talking about how an allocator is a handle to a memory resource and how this clarifies our thinking about how to deal with allocators, particularly in relation to containers. Can we take that intuition and apply it to anything else in C++ today? Well, an allocator is a cheaply copyable handle to a memory source. An iterator works very similarly, not quite the same but an iterator is a cheaply copyable handle to a thing, essentially a memory address but represented maybe a little bit differently. To a container contents plus some other bits, some orthogonal bits like iteration direction or a value category for reverse iterator is like a handle to the contents but also goes the other direction. And move iterator is handle to the contents but when you dereference it, it's gonna move out of it. And incidentally, I promised to talk about the difference between facades and adapters. I think of it this way, which is how I think boost think of it. It there is any system at all, it is this. A facade is providing the, the complete set of operations, the complete, I just wanna say facade again but that won't help so. The complete set of operations, the standard zoo of all the operators. So, for example, I could just implement the primitive functions next and dereference and then from that it would generate prefix plus plus and post fix plus plus and minus minus and prefix minus minus and star and arrow and just get all of those right and I'd would only have to do maybe five different primitive operations. And then I just ask the facade to generate the rest in terms of my primitive operations. An adaptor, I pass in some existing complete iterator type that already has the whole zoo of operations and I just say I would like these certain operations to behave differently. For example, standard reverse iterator is an adaptor in that sense and it says, do everything this guy does but just change the order of plus plus and minus minus. So that is an adaptor. What else we got going here? We're getting close to the end. Oh yeah, executors. Coming soon, in fact, I think they just had a meeting about this. Executors are essentially cheaply copyable handles to execution contexts. Possibly coming in C++ 20, 23? Executors lite in 2020. Okay. So these executors proposal is saying that an execution context is a program object, they're using specifically the term object because it connotes object type. It's gonna live in one place and basically be immobile and it's gonna represent, in some sense, a thread pool, something like that. And then if you wanna use the execution context, you don't use it directly, just like you don't use a memory resource directly. You wrap up a handle to it in something called an executor and you pass that around to things that need to get threads. (audience member off microphone) Executors are to execution what allocators are to allocation, yes. I agree with that but only if you think as I think about what are allocators to allocation anyway. (laughs) An allocator is a handle to a heap. Now the interesting thing about std executor is it's actually type erase just like std function. I believe that's the plan. So you can put any kind of capital executor or anything that satisfies that concept into a std executor. This is not true of polymorphic allocator. Polymorphic allocator says polymorphic right in the name but it doesn't do runtime polymorphism in the way we've learned to expect from std function and std any. We've been spoiled. It's talking about old school classical polymorphism where you must inherit from std pmr memory resource in order to have a polymorphic allocator point to them. We could actually imaging a truly polymorphic runtime polymorphic allocator that would look something like this. Resource adapter here, that's something you can write yourself. It was somehow omitted from C++ 17 but I think it might be coming, who knows. But it's something that can take any old allocator and wrap it up in something derived from memory resource. And so then you have a path to take your existing allocators, turn them into memory resources and then have polymorphic allocators coming out the other end. So if that sounds useful, that is totally possible. As long as you have some way of learning about or thinking about the lifetime of that shared state. Who owns that resource adaptor? Who owns that resource? That was the trade off from that table earlier. And so here I'm just wrapping it up in a shared putter the way we're used to. With that, questions 'cause we got about two minutes left. Yeah, use the mic if you can. - [Man] So I think I understood it but I may have missed something. One of the most, I don't know, the most popular use cases of allocator in C++ is to have big memory resources. Like eight bytes or 16 bytes, whatever. - Yeah, pooled resources. - [Man] Pooled, okay. But bounce to certain sites in order to deal with the fragmentation, right? So this, as far as I can see, doesn't help me at all with it, in fact it even makes it worse for the cases for list or map because I don't even have access to internal structure in size. - You don't know what in sizes to use because-- - Yeah, it makes it worse because I need to provide the memory resource but I don't know which resource to provide and it hides away the type before this. Back in the old school, I could have global memory resources and then in the allocator constructor, I could chose based on the size of global list of pools of different sizes so. - Yeah, so I think I understand the question. And I would say, just think more compositionally in that this pool resource, the reason why we called that pool resource was because in standard in 17 they do call it. There's a synchronized pool resource and an unsynchronized pool resource. And those are slab allocators in the type that you're used to with buckets of different sizes. And the idea is that entire thing is one memory resource. And then when you ask it for N bytes, it dispatches to figure out what pool to get it from and goes to that pool. It's not very configurable, there's basically no knobs on the standard one. I wouldn't recommend using the standard one. But again, it really clarifies your thinking and gives you a blueprint for if you're going to implement that kind of thing using memory resource, just go look at the web C++ implementation and you know-- - [Man] The memory resource needs to provide all the types of bins with it. I cannot have a memory resource with the single type. - A pool resource needs to provide all of those different bins. But you could have another resource. There's one called monotonic buffer resource that's just a huge chunk of memory and just allocates out of it like the trivial resource we saw at the beginning. And doesn't care how much you ask for, it's just gonna give you that much and it just doesn't even care, right? So it doesn't need to know about different bucket sizes and all of that. It's just very, very simple and you can go in between different strategies. But they're all expressible in this memory resource model. - [Man] Hi, slide 16. Is there a reason why it's using the pointer and not a reference? - Is there a reason to use a pointer and not a reference? Good question. I would say yes. Stylistically, we are not passing in a memory resource, we're passing in a pointer to a memory resource. If we took a reference, a reference is stylistically it looks like we're passing in the memory resource. And the fact that we're taking resource as sort of an optimization. But here, no, we really care about the address of that resource. We need that address to be the salient thing that we're going to keep, we're gonna keep a pointer to it. And you need to know when you're calling this constructor that the address matters. Don't pass us a temporary, don't pass us something and then later move out of it. Give us an address of a thing and we're gonna remember that address and use it later. Does that make sense? - [Man] Okay, fair enough. Stylistic decision. - It can't be null, right? - [Man] That's exactly my point. If you wanna make sure it's not null then make it a reference. - If you wanna make sure it's not null then make it a not null T star. Yeah, don't misuse references as non null pointers. (audience member off microphone) - [Man] Thank you. - Cool we're at the break. Thank you all. (applause)
Info
Channel: CppCon
Views: 5,552
Rating: undefined out of 5
Keywords: Arthur O'Dwyer, CppCon 2018, Computer Science (Field), + C (Programming Language), Bash Films, conference video recording services, conference recording services, nationwide conference recording services, conference videography services, conference video recording, conference filming services, conference services, conference recording, conference live streaming, event videographers, capture presentation slides, record presentation slides, event video recording
Id: IejdKidUwIg
Channel Id: undefined
Length: 63min 10sec (3790 seconds)
Published: Sat Oct 27 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.