CppCon 2019: Robert Leahy “The Networking TS in Practice: Patterns for Real World Problems”

Reddit Comments

All of this seems... awfully complex, really. What benefits does this rather extreme complexity bring compared to more conventional socket code?

👍︎︎ 12 👤︎︎ u/johannes1971 📅︎︎ Oct 09 2019 🗫︎ replies

What is the reason for all this complexity? Why not just expose a basic Socket API??

👍︎︎ 3 👤︎︎ u/frog_pow 📅︎︎ Oct 10 2019 🗫︎ replies

28:44 - "We've implemented that thing, that people complained was so complicated to implement in the networking TS."

...and it only took around 500 lines of non-trivial code distributed over 7 slides and 4 interacting structs, and that version still had undefined behavior due to data races when used with the default execution context.

I didn't actually use Asio myself so far, but based on this talk it looks like the complaints were justified.

👍︎︎ 3 👤︎︎ u/HotlLava 📅︎︎ Oct 10 2019 🗫︎ replies

I have one doubt about all this. The round-robin approach is not horrible, but you can easily end up with one executor/thread handling three long-lived, very busy connections at 100% CPU while another io_context, which has in total received the same number of connections, is doing nothing because its connections were short-lived.

Sean Parent has convinced me I should be using a work-stealing task system. But since a socket is associated with an io_context at construction time, and can't be changed... I can't move one of those three sockets and put it in the free io_context/thread, can I? Doesn't hardcoding the socket to the io_context at construction time make work stealing impossible?

👍︎︎ 2 👤︎︎ u/BlueDwarf82 📅︎︎ Oct 10 2019 🗫︎ replies
Captions
When I was in university I had a certain experience that happened to me over and over again, primarily anchored in math class. It went something like this: show up to the lecture, and the professor would introduce something new, a formula, some tactic for approaching a problem, introduced on the whiteboard and explored through example problems, built up and knocked down repeatedly. I would understand all of these examples, I would understand all of these principles, but then would come exam time. I would crack open the exam and I would see a problem which practically cried out to be solved with those principles and practices I learned in the lectures, and so I'd dive right in. But two or three steps later something would happen: there'd be a set of factors, and they didn't quite cancel the way they should; there'd be a set of terms, but they didn't just nestle themselves into the formula that I had rushed in armed with. What I was learning through those math exams is something that's equally applicable to software engineering as well: problems in the real world don't line themselves up to fall on the sword of techniques that you pulled out of academia.

Last year I came to CppCon and I gave a talk on the platonic principles of the Networking TS: the things that it brings to us, the power it gives us to write asynchronous I/O code. Over the course of that talk I took on the mantle of my professors: I built up an example, on a projector this time, not a whiteboard, and knocked it down repeatedly, and behind the veil was the fact that that example had been tailored to demonstrate exactly those principles, and not the other way around. So, please hold your questions to the end, because this year I'm back to take everything that was built up last year, the Networking TS, and see how we can apply it to problems that actually occur in the real world, the kind of problems that fight back when you try to solve them.

But first let's talk about the Networking TS, because when I gave my talk last year we were hopeful the Networking TS would be in C++20, but that was not meant to be. Now we're looking down the barrel of C++23, which perhaps raises the question of why give this talk now: why not wait two or three years until, once again, the Networking TS looms large over the horizon? The answer, of course, is that Boost.Asio is the prior art on which the Networking TS is based, and Boost.Asio is forward compatible with the Networking TS. So if you like anything you see here today, you can just download Boost.Asio and you're off to the races as soon as you change a simple namespace. In fact, to take this one step deeper: if you wait until the last slide of this talk you'll find a link to a GitHub repo, and that repo contains fully worked, unit-tested versions of every example on these slides and then some. Except there's no std::net in any of those examples, because they were all built, compiled, and run with Boost.Asio, and when I moved them out of it and onto these slides I just changed the namespaces.

Now, I've talked a bit about the problem that I developed last year, but not everyone was here last year, and not everyone remembers, so let's talk about that problem. It was a composed asynchronous operation, which means that it took two lower-level asynchronous operations and layered some higher-level logic on top of them. The operation was named async_wait_then_write, and what it does is right in the name: it waits for some amount of time, and then it writes some bytes.
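Here is a minimal sketch of that series composition; this is not the speaker's slide code. It skips the completion-token and associator boilerplate the talk discusses later, uses Boost.Asio (the slides spell the namespace std::net), and strings the two operations together with a lambda as the intermediate completion handler:

```cpp
#include <boost/asio.hpp>
#include <chrono>
#include <utility>

namespace net = boost::asio;

// Wait on the timer, then write the buffers; handler(ec, bytes) fires once
// both steps are done (or the wait fails). The caller must keep stream,
// timer, and the memory behind buffers alive until completion.
template <class AsyncWriteStream, class ConstBufferSequence, class Handler>
void async_wait_then_write(AsyncWriteStream& stream, net::steady_timer& timer,
                           std::chrono::milliseconds delay,
                           ConstBufferSequence buffers, Handler handler) {
  timer.expires_after(delay);
  timer.async_wait(
      // Intermediate completion handler: its only job is to string the wait
      // and the write together.
      [&stream, buffers, h = std::move(handler)](
          boost::system::error_code ec) mutable {
        if (ec) { h(ec, 0); return; }       // the wait failed; report and stop
        net::async_write(stream, buffers,   // itself a composed operation
                         std::move(h));     // the user's final handler
      });
}
```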
Last year I used this exact sequence diagram to lay out what this operation did, and we're going to walk through it now. On the left side, as we enter the sequence diagram, we see this initiator: this is the person who kicks off the asynchronous work, who gets our operation started. Notice immediately that they have a very, very short lifetime, and this is endemic to asynchronous operations, because when you initiate an asynchronous operation you call an initiating function, it sets up some work to be done in the background, and then the initiating function is done. Your caller continues, unimpeded by the fact that the I/O they set in motion may not yet be complete; hence the very short lifetime.

Since this operation waits and then writes, the first thing we do is kick off a wait on a timer, and that continues through time until it completes, and then we get something called a post against an executor. What could that possibly be? Well, think back to when I talked a moment ago about what an asynchronous operation is: I said that the work continues in the background, and people who want to ask a lot of questions will look at that and say: well, what do you mean, background? Where is that? Whose thread is that? Who schedules those? How many of those threads are there? The answer to that question, as with so many grand questions, is: it depends. Because it depends on the executor that's being used. An executor is a shallow handle to an execution context, and an execution context is just something where callable objects can be run. It could be one thread, it could be a pool of threads, it could be something that none of us have even thought of yet. And the way you customize that is you form an association with something called a completion handler. A completion handler is a callable object which you inject into an operation when you initiate it, and by invoking this object the operation reports that the operation is complete. So when the initiator initiated the operation, they gave us a completion handler, and that completion handler carried with it information about the executor that we would use to run background tasks.

Now the wait is done, and so we need to run a background task, so we post to the executor. At some time thereafter the execution context services that post and invokes something called an intermediate completion handler, which layers another word on top of "completion handler", and that's because an intermediate completion handler is part of a composed asynchronous operation. A composed asynchronous operation consists of taking lower-level operations, stringing them together, and implementing some higher-level logic. In this case, when we finish waiting we're not done the whole operation: we need to turn around and then start writing. So rather than going off to the final completion handler and saying everything is done when it's not, our operation synthesizes an intermediate completion handler whose only purpose is to string these parts together. And sure enough, we turn around and kick off an async_write on a socket. We've waited, now we write. But then something curious happens: we see that inside async_write we have these async_write_somes proceeding in a loop, and that's because async_write, used in this context, serves to demonstrate something beautiful about the Networking TS and composed asynchronous operations: a composed asynchronous operation doesn't just compose primitive operations like waiting; it can compose operations which are themselves, transitively, composed asynchronous operations.
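A minimal sketch, assuming Boost.Asio and reusing the net alias from the sketch above, of the association machinery just described: a composed operation asks the user's completion handler which executor to run background work on, falling back to the system executor when the handler carries no explicit association.

```cpp
// Run a piece of background work on whatever executor the user's completion
// handler is associated with; `work` is any nullary callable.
template <class Handler, class Work>
void post_via_handler(const Handler& handler, Work work) {
  auto ex = net::get_associated_executor(handler);  // shallow handle
  net::post(ex, std::move(work));                   // serviced later by the context
}
```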
This provides you the avenue by which you build small operations and then expand on them continually, until eventually you've implemented your entire application. async_write, in this case, calls async_write_some in a loop until it either succeeds in writing all of the bytes or encounters an error. At any rate, at some point in the future it will complete, there'll be a post to an executor, and at this point we will have waited and then written: all the work will be done. And so we'll call the completion handler for our operation, we'll hand asynchronous control flow back to the person who initiated this operation in the first place, and let them know: you're done.

Now, I promised a real-world snag. But if you look at the sequence diagram, look at what it's doing, and project it out into the principle, you can see that we don't have to compose waiting and writing: we can compose any two operations. And we don't have to compose two operations: we could compose three. And we don't have to compose three different operations: we could just close the loop, like async_write does, and compose in a loop. We can generalize what I'm saying: this pattern can be used to compose any number of asynchronous operations, in any way, provided we do so in series, borrowing some terminology from electrical engineering. What we're saying is that the composition is unrestrained, except that we do one thing, and then another, and then another, and so on and so forth. But if you take this and rush headlong out into the real world, it's going to hit you right in the face all of a sudden, isn't it? This isn't the only way you want to glue asynchronous operations together. You can't always solve your higher-level problem by doing one thing and then doing another thing. Sometimes you have to do things at the same time; you have to coordinate between them.

Let's think about those two operations that we composed on the previous slide: wait and write. We composed them in series, one after another. But suppose that we composed them in parallel, at the same time. That would be the building block of something that is widely bemoaned as difficult to implement in the Networking TS: writing with a timeout. Let's see how we might take a stab at implementing async_write_with_timeout. We notice at once that this sequence diagram looks very different from the one we just looked at: our initiator kicks off two operations, and then both of those operations overlap and interleave with each other, confusingly. We follow along the branches of asynchronous control flow and we discover: aha, the wait finishes first. That calls an intermediate completion handler which doesn't do anything: that branch of asynchronous control flow just ends. Because, if we think about it, we can't call the user's final completion handler at this point. As I said earlier, the invocation of the user's final completion handler is a promise; it says to the user: all of the work is done. And all you have to do is look, even perfunctorily, at the sequence diagram, and you can see that all the work is not done: there's still a write going on somewhere. So this intermediate completion handler can't do anything; it has to just end, resting assured that the higher-level logic of the composed asynchronous operation will indeed invoke the user's final completion handler when that write wraps up. But we've run into another real-world snag here, because if you think about this sequence diagram, and you think about a timeout operation, you realize something.
This sequence diagram is the failure case for a timeout. If your wait, your timeout, ends first, that means that writing took too long and you timed out. And if this were the canonical sequence diagram, the only one possible, we wouldn't write this function, because that would be an operation that always failed, and I don't know about you, but when I'm faced with an operation that always fails, I prefer to just write no code; it's a little bit more efficient. No, there's no reason to believe this sequence diagram is any more compelling than this one, except that the operations complete in the opposite order. Now we see that it's the write operation that does nothing when it completes, whereas the wait operation comes in and picks up all of the pieces. What this means is that our intermediate completion handlers don't have a definite role; they have to dynamically pick up a role based on whether they are first or second to run. We've introduced an ordering problem, implicit in the parallel composition. And what this means, more deeply, is that we need some way for those two intermediate completion handlers to communicate with each other. You need some way for one to inform the other: I've already finished, so you need to pick up the pieces. We need a shared state.

And then, as we iterate through thinking about this problem deeper and deeper, we realize that we need a shared state for more than just this. Because when the write finishes first, that's our success condition, and the write is going to tell us how many bytes we wrote, and surely we want to turn around and give that to our user. But because we finished first, we need to wait to tell our user that things are done, so we need somewhere to put that value for the wait operation to pick up. And then, continuing our thinking through of this process, we encounter an issue if we crack the Networking TS open: the Networking TS doesn't say that completion handlers have to be copyable. Which means that we can't use the strategy we used in series composition, where we put the completion handler inside the intermediate completion handler, because now there are two intermediate completion handlers, and so the question becomes: well, inside which one? So instead we pick one canonical place to put this completion handler: in the shared state, the channel through which our intermediate completion handlers communicate with each other.

Let's take all of this that we've built up and try to ratify it in some code. We'll start at the beginning. This is the initiating function for async_write_with_timeout. We accept a stream to write to, the sequence of buffers we're going to actually write, the timer we're going to wait on to implement that timeout, and the duration of that timeout. And then we accept the completion token, which may seem mysterious at first, but it's part of the guarantees and patterns the Networking TS passes through. For the time being we can just pretend that a completion token is a completion handler, and we'll ignore the related boilerplate; if you're really curious about what it means, there's a section in last year's talk entitled "universal asynchronous model" which covers the purpose of these tokens and what you can do with them. The first two lines are again just boilerplate, so I've elided them and we'll skip over them. Then we have three types whose definition I'm going to leave until later: our shared state and our two intermediate completion handlers. They only become relevant when the operation is completing, and so it seemed appropriate to pull them out and put them somewhere else, since right now we're just talking about the beginning.
Line 80 is more Networking TS boilerplate and we can just hop over it. Then we grab ourselves an allocator to use to allocate the shared state, and we allocate the shared state on lines 82 and 83. The following three lines are us just getting things in motion: we want to write, so we call async_write; we want to wait, so we set the timeout on our timer and we call async_wait; and we lead out with more Networking TS boilerplate. But as we reach the bottom of the slide there's an unanswered question, something that I have glossed over that I probably shouldn't have, and that is line 81. I snap my fingers and suddenly I have an allocator. Where did that allocator come from? Who decided what that allocator is? This is an extension of what I talked about earlier when I talked about executors. I said that completion handler types carry with them, implicitly, an association between their type and a strategy for acquiring an executor. In this way, users' completion handlers, as they pass through layers of asynchronous operations, keep their execution environment customized and in accordance with their requirements. The Networking TS provides exactly the same machinery for allocators, and so all we're doing on this line is going out to the Networking TS, going out to its associator machinery, and saying: I need an allocator; here's the user's final completion handler. And sure enough, I get back an allocator, and I can use it to allocate my state.

All the Networking TS asks when we do this is that we're very careful to make sure that if we allocate memory with this allocator, we deallocate all of it before we call the user's final completion handler. But when you think about it, that's not really such a big ask, is it? If you allocate memory to do some work, and then you wait to call the user's final completion handler until you're done the work, which you have to do, because that's what calling the final completion handler means, well, you should be done with that memory, and you should have deallocated it. And this provides a really nice guarantee for the user, because it means that the user can provide you with something like a monotonic buffer resource, and when they get asynchronous control back they can rest easy knowing that there's nothing in that monotonic buffer resource: I can just reuse that buffer, I can clear it as a no-op, or I can just let it go away. I don't need to do any checking, I don't need to do any validation. The guarantees of the Networking TS just make it so.

Now that we've got our operation off the ground, let's look at what happens as we try to bring it to completion. Let's take a look at our shared state, which is just a bag of values. We keep the write stream and the timer, references to them, and then of course we have to store the user's final completion handler somewhere, as we discussed, so there it is on line 13. Then we store the two values we're going to produce when we finish: the error code and the number of bytes transferred. And then we have a value called outstanding, which we initialize to two. This is the number of outstanding operations; we set it to two because we have a wait and a write, and the sum of those two things is two.
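A hedged reconstruction of what was just walked through, reusing the Boost.Asio alias from the earlier sketches. The names, and the use of std::allocate_shared, are illustrative rather than the slide code, which drives the TS's associated-allocator protocol directly; write_op and wait_op are the two intermediate completion handlers sketched a little further down.

```cpp
#include <memory>

// The "bag of values" the two intermediate completion handlers share.
template <class AsyncWriteStream, class Handler>
struct write_with_timeout_state {
  AsyncWriteStream& stream;            // what we write to
  net::steady_timer& timer;            // what implements the timeout
  Handler handler;                     // the user's final completion handler
  boost::system::error_code ec;        // produced by whichever handler runs first
  std::size_t bytes_transferred = 0;   // ditto
  int outstanding = 2;                 // one wait plus one write

  write_with_timeout_state(AsyncWriteStream& s, net::steady_timer& t, Handler h)
      : stream(s), timer(t), handler(std::move(h)) {}
};

// Initiating function: allocate the shared state with the handler's
// associated allocator, set both operations in motion, and return.
template <class AsyncWriteStream, class ConstBufferSequence, class Handler>
void async_write_with_timeout(AsyncWriteStream& stream, ConstBufferSequence buffers,
                              net::steady_timer& timer,
                              std::chrono::milliseconds timeout, Handler handler) {
  using state_type = write_with_timeout_state<AsyncWriteStream, Handler>;
  auto alloc = net::get_associated_allocator(handler);
  auto state =
      std::allocate_shared<state_type>(alloc, stream, timer, std::move(handler));
  net::async_write(stream, buffers, write_op<state_type>{state});  // start writing
  timer.expires_after(timeout);
  timer.async_wait(wait_op<state_type>{state});                    // start waiting
}
```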
Looking at our intermediate completion handlers, we're faced suddenly with an entire slide of boilerplate code: we need somewhere to store the shared state, and then this allocator type and executor type, this get_executor and get_allocator. Those are part of the associator machinery of the Networking TS. As I said before, the associations are with the user's completion handler type, and those associations are maintained as that completion handler type passes down through layers of asynchronous operations. But we took that user's final completion handler and we stuck it in our shared state, and then we invented two intermediate completion handler types and passed those through to our asynchronous operations. Which means that now the associator machinery of the Networking TS doesn't see the user's final completion handler; it sees our types. So we need to write this boilerplate to make sure that our types behave the same way as the user's final completion handler when you ask for an executor or an allocator. Getting down to the bottom of the slide, we have a constructor, of course, and then we also make sure that we're move-constructible, so move-constructible and nothing else, because the Networking TS is silent on assignability, and we already know that completion handlers don't have to be copyable.

With all that boring boilerplate out of the way, let's see what we actually do when a write completes, when the function call operator of this object is actually invoked. The very first thing we need to do is disambiguate which of the two cases we laid out earlier we are in: are we the first or the last operation to complete? So we decrement the outstanding count, since the write just completed and there's one less outstanding operation, and then we check that value. If it's not zero, that means the wait is out there somewhere still, and it's going to come in and clean up after us. So we just propagate the values that we generated, and then we're very careful to throw away our handle to the shared pointer, because if we don't do this, and this operation happens to live long enough, we won't clean up all of the memory we allocated with the user's allocation strategy, and we would violate those promises we made to the user. Now, on the other, perhaps more interesting, hand: if we find out that there are no more pieces of outstanding work, we actually need to complete our operation. That involves deallocating the state, and some of the values we need are resident in the state, so to avoid a use-after-free bug we pick them out onto the stack, we reset the shared pointer, thereby destroying the object, and then we call the user's final completion handler. And this wraps up the operation, except for the fact that we haven't seen what happens when our wait expires. So here is another slide of boilerplate, exactly the same boilerplate with some find-and-replace done on it, and then we get to the interesting part of our timeout handler. We decrement the outstanding count; we ferret away the appropriate value if we find that the write hasn't completed yet, which is an error code indicating timed-out, because that's what a timeout means: your wait completed before the thing you were trying to time out. And then we're done, unless of course we were the last ones to complete, and then we have to pick some values out of our shared state and wrap things up.
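Here is a hedged sketch of those two intermediate completion handlers. It omits the get_executor/get_allocator forwarding boilerplate just described, assumes the two handlers never run concurrently (the strand discussion later in the talk is what justifies that), and already includes the cancel() calls the talk only adds a few minutes from now; State is the write_with_timeout_state sketched above.

```cpp
template <class State>
struct write_op {
  std::shared_ptr<State> state;

  void operator()(boost::system::error_code ec, std::size_t n) {
    State& s = *state;
    if (--s.outstanding != 0) {
      // First to finish: record our results for the wait to pick up, stop
      // the timer (the cancellation added later in the talk), and drop our
      // handle so the allocator promise can be kept.
      s.ec = ec;
      s.bytes_transferred = n;
      s.timer.cancel();
      state.reset();
      return;
    }
    // Last to finish: pull values onto the stack, destroy the state, and
    // only then invoke the user's final completion handler.
    auto handler = std::move(s.handler);
    auto final_ec = s.ec ? s.ec : ec;  // an error stored earlier wins
    state.reset();
    handler(final_ec, n);
  }
};

template <class State>
struct wait_op {
  std::shared_ptr<State> state;

  void operator()(boost::system::error_code) {
    State& s = *state;
    if (--s.outstanding != 0) {
      // The write is still running, so this is a genuine timeout: ferret
      // away the error and cancel the write (made reliable by the sticky
      // decorator introduced later).
      s.ec = make_error_code(boost::system::errc::timed_out);
      s.stream.cancel();
      state.reset();
      return;
    }
    // The write already finished and stored its results; hand them over.
    auto handler = std::move(s.handler);
    auto ec = s.ec;
    auto n = s.bytes_transferred;
    state.reset();
    handler(ec, n);
  }
};
```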
Great, we can compose in parallel, and we implemented a timeout operation. Or did we? Because in a timeout operation, when the timeout expires, the work you're timing out doesn't just keep going, but that's exactly what this operation does: when the wait operation expires, it doesn't stop you from writing; it just sits around and waits until you're done. That's the opposite of what a timeout operation does. And similarly, with a timeout operation, if you write really fast, you're not going to just sit around and wait for the timeout to expire, are you? But that's exactly what this operation does. So it's not a timeout operation, because we've run into a problem with our model of asynchronous operations. Our model is one where we initiate an operation, and then the operation continues until it itself encounters a failure or success case. But we need another option, because we have another completion condition that we need to layer on from the outside. We need to be able to tell our write: I don't need you anymore, because you timed out. And we need a way to tell our wait operation: I don't need you anymore, because the write is already done. What we need is a way to cancel asynchronous operations.

Fortunately, the Networking TS provides exactly such a member function on each and every one of its I/O objects. It takes no arguments; you invoke it, and if that I/O object has any outstanding asynchronous operations, they end immediately, with an error code which you can use to disambiguate cancellation from actual failure and actual success. So this wasn't really a real-world snag, was it? This was more like "I didn't read the documentation". Well, maybe. Personally, I've learned to distrust the real world, and when a solution seems too perfect I look at it sideways. So let's look at a sequence diagram of what might happen when our timeout expires. On the left here you can see that at some point in time we actually kicked off our write operation, and we know that our write operation is a composed asynchronous operation: it's composed of one or more calls to async_write_some. So there is such a call, and it proceeds in the background, and it writes some bytes and it completes. There's a post to an executor; the intermediate completion handler of async_write runs. It looks at the values it receives and it sees that it succeeded: it wrote some of the bytes, but it didn't write all of them. So, in accordance with the higher-level logic that it layers on top of async_write_some, it goes back in and writes some more. And as it's going on writing some more, our timeout operation expires. That intermediate completion handler runs, it calls cancel on our stream, and it finds that async_write_some operation outstanding. So async_write_some completes, there's a post to an executor, and sure, it didn't write all the bytes; but when it calls the intermediate completion handler of async_write, async_write discovers: yeah, I didn't write all the bytes, but this operation failed, so I'm done. And then our write op runs, finds it's the last outstanding operation, and completes everything. But that's still exactly what we want; we haven't found any real-world snag here yet.

But the timeout that woke up and cancelled us, that was something the user provided to us. And I can just imagine that I have some user who provides a longer timeout. What that would mean is that cancel would happen a little bit later, and maybe, when that cancel happens, that second call to async_write_some is already done. Maybe there was a post to the executor, but the underlying execution context was just really busy and hadn't gotten around to servicing it yet. And so when cancel comes along, it misses: there's nothing to cancel. Our timeout operation winds up and thinks everything is fine, and then async_write wakes up, sees that it wrote some of the bytes but not all of them yet, and goes right back to writing, and the write just continues until it completes naturally, which is exactly what we set out to avoid at the beginning.
What we were doing, when we thought we could just cancel async_write, was assuming that async_write was an indivisible asynchronous operation operating on the socket. That was wrong, and the incorrectness of that assumption is reified in this sequence diagram. So maybe we go and look at how async_write is actually defined. It's defined in terms of a concept, not a socket or a file or a buffer: a concept called AsyncWriteStream. A few slides ago we saw AsyncWriteStream; I named one of my template parameters that. So we get an AsyncWriteStream and we pass it to async_write. But there's no rule, there's no law, that says that I have to pass through exactly the AsyncWriteStream I got. Maybe I could decorate it. I could decorate it to avoid this problem, so that when cancel is called, not only does it cancel all of the pending operations, but the higher-level logic I'm going to layer on top in my decorator is that cancel actually blocks future initiations. So when we miss, and async_write loops around and tries to initiate a new operation, that operation just fails immediately. And when it fails, the higher-level logic of async_write will bring async_write to a close, and that will call our write op, and that will bring our entire operation to a close.

So let's write ourselves a sticky-cancel AsyncWriteStream. We just store a reference to whatever we're decorating (we were storing a reference anyway, so this is of no consequence), and then we have our sticky bit: whether or not cancel has been called. It starts out as false. We provide a constructor, and we pass through one of the required member functions of our underlying object, get_executor; it's required by the AsyncWriteStream concept. Then we provide cancel, with exactly the same signature as all of the I/O objects in the Networking TS, and we just call through to the underlying object, but only after we set that sticky bit to true. Now, an AsyncWriteStream isn't really any good at all if you can't write to it, so we need to implement async_write_some on this type. There's a bit of Networking TS boilerplate here, but the important thing to take away is on lines 18, 19, and 20: if we haven't been cancelled yet, we just pass through to the underlying object; we act exactly like the user's stream does. But if we have been cancelled, then we fall through to the bottom of this function, and there's some boilerplate where we get an executor and get an allocator and form a lambda, and then post that lambda directly to the executor, and then fulfill our obligation to return whatever we deduce from our completion token. But the important thing is on line 26, in that lambda that we scheduled for execution as soon as possible: we take the user's final completion handler and we invoke it, and we give it an error code that says "you were cancelled", and we also tell it that it wrote no bytes. Thereby we've closed off that sharp edge that we left earlier, where possibly, maybe, sometimes, we could just keep on writing for no reason. Because if that's going to happen, if we missed with our cancel, the sticky bit will be set to true, and then when async_write tries to initiate by calling this, it's just going to fail immediately, which means that our operation will be wrapped up.
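A hedged reconstruction of that decorator, simplified from the slide code: it takes a plain handler and returns void rather than deducing a return type from a completion token, and it reuses the Boost.Asio alias from the earlier sketches.

```cpp
template <class AsyncWriteStream>
class sticky_cancel_async_write_stream {
  AsyncWriteStream& next_;  // the decorated stream; a reference, as on the slides
  bool cancelled_ = false;  // the sticky bit

public:
  explicit sticky_cancel_async_write_stream(AsyncWriteStream& next) : next_(next) {}

  // Required by the AsyncWriteStream concept: pass through to the real stream.
  auto get_executor() { return next_.get_executor(); }

  // Same signature as the TS's I/O objects, but cancellation is sticky: it
  // also blocks every future initiation.
  void cancel() {
    cancelled_ = true;
    next_.cancel();
  }

  template <class ConstBufferSequence, class Handler>
  void async_write_some(const ConstBufferSequence& buffers, Handler handler) {
    if (!cancelled_) {
      next_.async_write_some(buffers, std::move(handler));  // behave like next_
      return;
    }
    // Already cancelled: fail immediately (but never inline) with
    // operation_aborted and zero bytes written.
    auto ex = net::get_associated_executor(handler, next_.get_executor());
    net::post(ex, [h = std::move(handler)]() mutable {
      boost::system::error_code ec = net::error::operation_aborted;
      h(ec, 0);
    });
  }
};
```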
So now all we need to do is actually use all of this. Let's go back and modify the operation that we wrote. First, we're going to go into our state, and instead of storing a reference to the user's AsyncWriteStream, we're going to wrap it up and store a sticky-cancel AsyncWriteStream. Then we need to actually implement our cancellation logic. So we go into our write operation, and when we finish writing we tell the timer: I don't need your wait operation anymore; please finish it immediately. And then we go into our timeout operation and we do exactly the same thing for the stream, except we're not cancelling the stream: we're cancelling the sticky-cancel AsyncWriteStream, which means that even if we miss here, the operation will still bring itself to a close. Now we've actually composed in parallel and implemented a timeout operation. We've implemented that thing that people complained was so complicated to implement in the Networking TS.

But in so doing, I had to say to you "parallel composition", and that involves saying a dirty word, doesn't it: the word "parallel". Because when you hear the word parallel, all of a sudden the specter of undefined behavior comes up, of happens-before relationships and data races. How does parallel composition work in the face of this? Back in those halcyon days of series composition, we had this really nice guarantee, where an operation would run and complete, and then it would start another operation. This kind of causality is embedded in physics, not just computer science, which means that even if the underlying execution context had, say, four threads, there's only ever one piece of work for it to do at any one time, because of this logical, causal relationship between the operations. So we could just throw our hands in the air and say: oh well, I'm not going to worry about parallelism. That was exactly the approach I advocated for last year. But now we've put two operations in action at the same time, so if that execution context has two threads, then it's possible that both those operations complete and both of their intermediate completion handlers get scheduled for simultaneous execution. Then they both go out and try to access that shared state, and that's a data race, that's undefined behavior, and now my program is doing who-knows-what.

So there is a solution to this embedded in the Networking TS. There's an executor type called strand. You layer it on top of another executor, and then the strand gives you the strand guarantee, which is: if you submit many pieces of work to me, I'll only ever allow one to run at a time. Regardless of the properties of the execution context associated with the executor it decorates, everything appears single-threaded. Which, platonically, is the solution. But the title of this talk says "real world", not "platonic domain", and in the real world we might have a user who knows their application is single-threaded, who knows they're meeting the strand guarantee in some other way. If we go and just blithely layer a strand on top of their executor, we're adding a bunch of synchronization, queues, and locks when they didn't need them. We're violating that fundamental principle of C++, which is: you only pay for what you use.
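A minimal sketch of what opting in to the strand guarantee can look like from the caller's side, assuming Boost.Asio and the async_write_with_timeout sketched above: a user on a multi-threaded io_context binds their handler to a strand, and the operation's intermediate handlers inherit that executor through the associator boilerplate described earlier (which the sketches above omit).

```cpp
void start_guarded_write(net::io_context& ctx, net::ip::tcp::socket& socket,
                         net::steady_timer& timer, net::const_buffer data) {
  // make_strand wraps the context's executor; work posted through the strand
  // never runs concurrently, even if ctx.run() is called on many threads.
  auto strand = net::make_strand(ctx);
  async_write_with_timeout(
      socket, data, timer, std::chrono::seconds(5),
      net::bind_executor(strand,
                         [](boost::system::error_code ec, std::size_t n) {
                           // Runs on the strand, so the shared state is
                           // never touched from two threads at once.
                         }));
}
```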
So that brings us around to another approach to solving this problem, and it's pretty much a cop-out: we just document, in the documentation of our operation, that when you execute this operation you must provide a completion handler which is associated with an executor that doesn't allow work submitted to it to run in parallel. And this cop-out actually has prior art: basic_stream, from Boost.Beast (a library which implements HTTP using Boost.Asio / the Networking TS), does exactly this. And if we look deeper at basic_stream, to see whether this really is a motivating example of what I should do, we'll find that basic_stream's shtick, the thing it does, is that it wraps a stream and then adds timeouts. So through the whole first part of this talk we basically looped around and arrived at the same conclusion about parallelism that Vinnie Falco did, so we're in good company.

But once again, the title of this talk said "the real world", and if you just write something for your user and some documentation and run off, you'll be fine for like two weeks, and then you're the user, and now it's your problem again. So as we step up from the level of an individual operation to a higher level of our application: how can we manage parallelism in the Networking TS? What approaches can we use to work around the fact that single-threaded code is really beautiful? It's really easy to write single-threaded code; it's really easy to think about single-threaded code. Single-threaded code has no locks; it's efficient and correct by default. But as with everything that's too good to be true, the real world has to come along and hit us in the face, and that hit in the face comes from ARM, AMD, and Intel: they just put a lot of cores in your CPU, so if you don't use them, you're not leveraging the whole machine, and not leveraging the whole machine is something that C++ developers don't like to do.

So we loop around reflexively, and we go back to what people have been doing for ages: we go back to an execution context (the Networking TS provides one, called io_context) which has many threads that can execute work. You submit work to the execution context, and the execution context distributes it among many threads. And then, at the highest level of your application, you're left trying to associate executors with things. You're left saying: okay, these two operations need to run on that strand, and these three need to run on that strand, these four on that strand, and so on and so forth. Do you really trust yourself to get that right? You're not going to mess that up and have a latent bug that takes years to find? But let's assume that you're perfect: are you really going to be able to write down the structure that you've concocted, communicate it to the next person who has to maintain that code, and then have them, in turn, maintain it perfectly? Probably not. So maybe there's a different approach. Maybe we can flip this on its head: maybe, instead of having one execution context with many threads, we could have many execution contexts, each with one thread. Then everything we distributed to any one of those execution contexts would appear single-threaded; we'd get all the beautiful benefits of single-threaded code. The only problem is that now we're left holding the bag: whereas before we just kind of dumped a bag of work on an execution context and it figured out where to run it, now we're the ones figuring out where things need to run. But that's a blessing and a curse, because it means that we can put things in certain places and know they always run single-threaded.
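A hedged sketch of that "many contexts, one thread each" layout, with illustrative names: each io_context is serviced by exactly one thread, so anything pinned to one context runs effectively single-threaded.

```cpp
#include <thread>
#include <vector>

struct worker_pool {
  // unique_ptr keeps each io_context's address stable inside the vector.
  std::vector<std::unique_ptr<net::io_context>> contexts;
  std::vector<net::executor_work_guard<net::io_context::executor_type>> guards;
  std::vector<std::thread> threads;

  explicit worker_pool(std::size_t n) {
    for (std::size_t i = 0; i != n; ++i) {
      contexts.push_back(std::make_unique<net::io_context>());
      // The work guard stops run() from returning while the context is idle.
      guards.push_back(net::make_work_guard(*contexts.back()));
      threads.emplace_back([&ctx = *contexts.back()] { ctx.run(); });
    }
  }

  ~worker_pool() {
    guards.clear();                    // let every run() wind down
    for (auto& t : threads) t.join();
  }
};
```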
And when we think about this, many problems in the real world actually have a way of distributing work that just falls right out of them. It may take many forms, but let's think about a TCP server. A TCP server repeatedly accepts incoming connections, and most of the management of a TCP connection (reading and parsing bytes, managing timeouts, sending messages) is all single-threaded; there's an implicit single-threading there. But then there's also an implicit parallelism in the problem, because you accept many connections, and so each time you accept a connection you can just distribute it across your pool of workers. Now, your program probably has some global state; it has some connection manager so you can actually ask it who's connected. But you can factor that out into an independent component, add synchronization to that component, build confidence in that component, and use that component. Everything else you write now gets to leverage the efficient, correct-by-default guarantees of single-threaded code. All we have to do now is actually implement that round robin. The class you use in the Networking TS to accept TCP connections is basic_socket_acceptor, and it provides an overload which does exactly this: we can call async_accept, give it an io_context and a completion token, and when it pops a socket out the other end, that socket is going to be associated with the io_context you provided, not the io_context associated with the acceptor. So now we just need to implement the logic on top of that.

Let's write some code. We're going to start with the intermediate completion handler, the code that actually gets run when we end our accept. We store our acceptor, and a triple of iterators, because we want the beginning and the end of our pool as well as where we're up to in our round robin. Then of course we have our completion handler, and this Networking TS boilerplate that I've hand-waved away before, which leads us to implementing the function call operator, the actual meaty part of this class. All we do in here is bump that iterator, and if we get to the end we loop it around to the beginning, and then we pass it through, along with all the synthesized values, to the user. Instead of just receiving a socket and an error code, the user receives a socket, an error code, and an iterator to the next executor they should use in round-robinning their work. That takes care of what we do when we finish accepting, so let's actually get started accepting. We write the initiating function. It contains quite a bit of Networking TS boilerplate, but the important thing to take away is that it creates an intermediate completion handler and then just kicks off async_accept with that intermediate completion handler and the result of dereferencing the current iterator: it dereferences it, gets an executor, gets the context associated with that executor, and passes that in. Then, when this completes, since we already dereferenced and used that iterator, it's going to get bumped, wrapped around, and passed through to the user, and the user can handle the connection however they want. They can do whatever they want with the socket, spin it off, start some asynchronous processing on it, and then they can call us again: they can accept the next connection.
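A hedged sketch of that round-robin accept, assuming the worker_pool above (so the iterators walk a vector of unique_ptr<io_context>); the slide version deduces its return type from a completion token, which this sketch skips.

```cpp
// Accept one connection, binding the new socket to the io_context *current
// points at, then hand the user the advanced (and wrapped) iterator along
// with the usual error code and socket.
template <class Iterator, class Handler>
void async_accept_round_robin(net::ip::tcp::acceptor& acceptor, Iterator current,
                              Iterator begin, Iterator end, Handler handler) {
  net::io_context& next_ctx = **current;  // the context the socket will live on
  acceptor.async_accept(
      next_ctx,
      [current, begin, end, h = std::move(handler)](
          boost::system::error_code ec, net::ip::tcp::socket socket) mutable {
        auto next = std::next(current);
        if (next == end) next = begin;   // wrap the round robin around
        h(ec, std::move(socket), next);  // the iterator is a synthesized value
      });
}
```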
And when I lay that formulation out, no one questions it. It's very unlikely that anyone says: well, you know, in my workflow I would really like to accept just one connection, but still round-robin execution contexts. That doesn't really make any sense, does it? The way we've set up the round robin here pretty much implies that we're going to accept a lot of connections. If we take a step back and look at the model of work the Networking TS has given us, we encounter a model that doesn't match the model that we have in the real world. Because at the very highest level of our applications we don't say: okay, I'd like to accept one connection, I'd like to write fifteen bytes, I'd like to parse one message and wait for five seconds. We say something like: I would like to accept every single connection, and I'd like to send all the messages, receive all of them, and wait for every single one of the timeouts. The Networking TS presents us with a model which is great for building up the building blocks of our application, wherein we execute a finite amount of work and then succeed or fail and we're done. But as we approach the highest level of our applications, as we bubble up to main, we enter a different domain entirely. We enter a domain where the pieces of work that we're trying to initiate are unbounded, where they go until they can't go anymore. Because we don't want to just accept one connection; we want to accept every single connection. And when you get a connection and you want to handle it, you don't want to just read the next message and process it; you probably want to read the next message and process it, and then the one after that, and so on and so forth. In some sense, both of these tasks just go on until they can't go on any further.

Which leads us to a conclusion for modeling our software at the highest level using the Networking TS: maybe some of our operations don't need to be able to succeed. Because success implies that you're done and there were no problems, and a lot of networking tasks, well, they do finish, but they finish because there was a problem. They finish because the connection was aborted; they finish because the client timed out; or, in the case of accept, the only really plausible example I can come up with is that they fail because the networking stack of the operating system went down. At any rate, these operations conceptually don't report success. And armed with the knowledge that this models our problems better than the model of the Networking TS, we go out and look and see: hey, there's nothing that actually prevents me from writing an operation that never succeeds. It's not a requirement of the Networking TS that the amount of work I kick off with an initiating function be bounded, or that it be possible for success to be reported. So let's leverage that very, very small composed asynchronous operation we wrote just last slide, let's close the loop, and let's write a powerful operation that eliminates the boilerplate of writing a TCP server once and for all.

Here's the intermediate completion handler. Note that we store kind of the values you would have expected from the previous slides: we store the acceptor that we're going to accept on, but now we don't need to store the current iterator, so we store a pair of iterators rather than a triple, and that's because we're going to rely on the operation we just wrote to manage that current iterator for us. Then we have this after_accept object. It's a function object which can be invoked with one argument, and we use it so that we don't consume responsibilities we don't want: we just want to accept the connections; we don't really care what you do once they're accepted. So we have this customization point, so the user can inject the logic for actually handling each incoming connection. And then of course we have that lonely completion handler, which we hope we'll never have to invoke. We have this Networking TS boilerplate, which we can pretty much just hand-wave away, and then we have an initiate member function, which kicks off the next iteration of our asynchronous loop. It takes in the current iterator, and then notice how the last thing it does is move from itself.
This is a fairly common idiom in the Networking TS, and Vinnie Falco (and perhaps others) calls it "stack ripping". At any rate, if we move from ourselves and also try to reference values resident in ourselves, that's implementation-defined behavior, and we prefer to avoid that, so we spend the three preceding lines picking objects out of ourselves, then we kick off the call and we move ourselves into that call, which means that we, ourselves, are going to be the ones to handle the completion. And there's what that looks like: we're invoked by the operation we just wrote, so we take in three values, an error code, a stream, and an iterator, and then we have the sole completion condition for this operation. If the error code we get in is truthy, that means that we failed at accepting, and so only in that case do we invoke the user's final completion handler. Look as you may, you won't find another invocation of it. Assuming, of course, we succeeded, which accept operations overwhelmingly do, we're going to spin off that socket into that customization point, because we don't really care what happens once we accept the connection; the responsibility of this operation is just to accept all of them. Then, once that's been handled, we close the loop: we call initiate on ourselves, with the current iterator that was provided by the operation we're composing. That takes care of every single iteration except the first one, and in the case of the first one we're going to rely on the initiating function to get us going. We take in all the values you would expect (the acceptor, beginning and end iterators, that after_accept customization point, and the completion token), we lay out the Networking TS boilerplate, we create an instance of our intermediate completion handler, and then we just call initiate with the begin iterator. Once we fall off the end of this function and return, we're off in the background, round-robinning over execution contexts, distributing work, accepting every single connection. We've managed to model the archetypal TCP server. We don't need any boilerplate in main; we don't need to copy/paste from one network server application to another just to get things going. We have just one function; we call it with the right arguments, and it basically executes our application for us.
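A hedged sketch of that loop, built on the async_accept_round_robin sketch above; names are illustrative, and the completion-token plumbing is again reduced to a plain handler.

```cpp
template <class Iterator, class AfterAccept, class Handler>
struct accept_all_op {
  net::ip::tcp::acceptor& acceptor;
  Iterator begin, end;       // the pool; the current position travels in the call
  AfterAccept after_accept;  // customization point: what to do with each socket
  Handler handler;           // only ever invoked on failure

  // Kick off the next iteration of the asynchronous loop.
  void initiate(Iterator current) {
    // Pick what we need out of *this before moving from *this (stack ripping).
    auto& a = acceptor;
    auto b = begin;
    auto e = end;
    async_accept_round_robin(a, current, b, e, std::move(*this));
  }

  // Invoked by async_accept_round_robin with its synthesized iterator.
  void operator()(boost::system::error_code ec, net::ip::tcp::socket socket,
                  Iterator next) {
    if (ec) {                         // the sole completion condition
      handler(ec);
      return;
    }
    after_accept(std::move(socket));  // hand the connection off
    initiate(next);                   // close the loop
  }
};

// Initiating function: create the handler and start at the beginning.
template <class Iterator, class AfterAccept, class Handler>
void async_accept_all(net::ip::tcp::acceptor& acceptor, Iterator begin,
                      Iterator end, AfterAccept after_accept, Handler handler) {
  accept_all_op<Iterator, AfterAccept, Handler>{
      acceptor, begin, end, std::move(after_accept), std::move(handler)}
      .initiate(begin);
}
```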
So, as you leave this talk, what should you leave with? The first thing is that while the Networking TS provides a framework with a lot of great guarantees, and while the real world appears to fight back against all such frameworks, if you take a step back and invest a little bit of time you can reconcile the two. And if you do that, not only will you work around the sharp edges and solve problems in the real world here today, but you'll be synthesizing reusable components that solve those same problems, and which can be tested and reused for all time: the end of copy/paste programming. The second thing we looked at was a class of problems that were not amenable to your archetypal series composition, that required that you bring asynchronous operations together and allow them to execute simultaneously. We looked at the snags of that, with respect to the bookkeeping that you have to perform and with respect to parallelism itself. And then, regarding parallelism, we saw strategies for managing it: we saw how, when you're writing low-level components, you kick that can down the road, and then, once you arrive at the highest level of your application, we have powerful strategies for dealing with it and injecting it and passing it down to all the levels of our application. And lastly, arriving at the very highest layer of our application, we saw a strategy for closing the loop, for modeling the kinds of things we see in the real world directly: the kinds of operations which perform an unbounded amount of work, which continue forever once put into motion, unless they fail. And that gave us the possibility that, rather than our main function being a bunch of repetitive glue code that closes loops, et cetera, we might be able to implement our main function as just one call to one initiating function, for one highly reusable, highly tested asynchronous operation. Are there any questions? There are a lot.

[Audience question, paraphrased from hard-to-hear audio: why do we need so much complexity compared to, say, the proactor-pattern code we were writing in 2011 or 2012? When everything goes well it's fine, but when things go badly, debugging working proactor code is very hard.]

Well, there are a couple of problems layered in there. One is: why do we need this much complexity? And one of the things that falls out of that is that a lot of the complexity in the Networking TS is actually glue code, which is something that you can mitigate through strategies that I considered putting on these slides, but it just made them blow up, because you'd have to bootstrap yourself from first principles, basically. So you can eliminate a lot of that dead glue code that way. The other issue is that the Networking TS, being a TS, going into the standard, and trying to be a framework, is basically trying not just to solve today's problem today; it's trying to solve the problems people will think of in the future. For example, if you could constrain yourself, when writing Networking TS applications, to basically using the dispatch method that Node.js uses, always single-threaded, we wouldn't have to worry about most of this stuff, right? Because we would know that things never run in parallel, we'd know that nothing else is running when we're running, and we wouldn't have any data races. But it's when you throw in the possibility that the user knows something you don't, and is letting your stuff run in parallel, that you end up with all this complexity. And then, when it comes to timeouts: one of the reasons why timeouts themselves are so complex is that, again, the Networking TS is trying to provide a cross-platform abstraction for timeouts. While most platforms, for blocking writes and reads, have a parameter you can set, kind of behind the scenes, that allows you to time them out, asynchronous operations don't really provide that. As far as I know (I could be wrong, because I haven't written completion-port code in a long, long time) you can't begin an overlapped send and tell it to time out.

[Audience question: Thanks for the talk. I have a question about coroutines and this executor model: how well do they cooperate? I've seen a few talks about coroutines, with co_await and co_return coming as a std feature. How will these two models cooperate or live together in std?]
Right, so a lot of the boilerplate, kind of the meat of that, is covered in... well, last year I talked about the universal asynchronous model, and the reason I called it that is that it's the title of a paper that Chris [Kohlhoff] wrote a long time ago about the completion mechanism in the Networking TS. Basically, the stuff that I hand-waved away today, the completion tokens: what that does is provide a framework where, instead of getting a callable object which you invoke when you're done, you get in this token, and there's a customization point on that token where you split it apart and you get an actual completion handler and the return value for your initiating function. Now, most of the time people just pass in a function, and so the completion handler is just that function and the return type is void. But you can do other, way more complicated things. You can transparently transform any compliant initiating function and the corresponding operation: you can just pass in a token that, in the case of Boost, is boost::asio::use_future. You pass that in, and all of a sudden, instead of returning void, that operation returns a future. And I believe (I don't know exactly what version it went in, but at least since Boost 1.70) Asio has a token called use_awaitable: you pass it in, and all of a sudden you can use coroutines on the outside of that asynchronous thing.

The deal with executors: an executor is just a way of getting the code somewhere to run. As we've experienced with std::async, one of the things (there are a lot of them) that's wrong with std::async is that when you call std::async and launch something, you have no idea where it's going. Where are those threads? Whose threads are they? Who's managing them? You have no idea. And a lot of times users want to actually know where their stuff is executing, for good reason. So all an executor is, at least in the conception of the Networking TS (because there's also the executors proposal itself, and they're trying to be reconciled together; that's one of the reasons that networking didn't get into C++20), all you're doing there is providing the context that you run in. I come from HFT, and in HFT the thing that you care about is what core all of this is going on; not just how many threads you have or what overhead there is, but: I want to pin this to this core. And if you use something like std::async, how do you pin something to a core when you don't know what thread it's running on? Whereas when you just provide this context to execute in, you can pin that thread to a core, and then you know that all the work is going on on that core. That's something I do in applications we write at work every day: I take an io_context (in older versions of Boost it's called io_service), I put a thread in it, I pin it to a core, and then I know exactly where it's running. So I think they solve two different things: coroutines are how we inform you that the operation is done, so you can integrate it into code that basically looks synchronous, and the executor gives you the behind-the-scenes context that actually gets that work done. (Okay, thank you.)
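A minimal sketch of that token flexibility, assuming Boost.Asio (use_awaitable needs C++20 coroutine support): the same async_wait completes into a std::future or a coroutine depending purely on the token passed.

```cpp
#include <boost/asio/awaitable.hpp>
#include <boost/asio/use_awaitable.hpp>
#include <boost/asio/use_future.hpp>
#include <future>

// Token = use_future: the initiating function now returns a std::future.
std::future<void> wait_via_future(net::steady_timer& timer) {
  timer.expires_after(std::chrono::seconds(1));
  return timer.async_wait(net::use_future);  // the future rethrows on error
}

// Token = use_awaitable: the wait becomes co_await-able inside a coroutine.
net::awaitable<void> wait_via_coroutine(net::steady_timer& timer) {
  timer.expires_after(std::chrono::seconds(1));
  co_await timer.async_wait(net::use_awaitable);
}
```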
[Audience: And I have another question: will there be TLS support in the std library?]

I'm not sure what you're asking... Boost.Asio has TLS... oh, that's right: I don't think that's in the Networking TS right now, so for that you have to use Boost, which I don't consider to be particularly onerous, but maybe in the future they'll get it. And the thing about the Networking TS itself is that, while it's kind of a pain, you're not limited to just the types the TS gives you, because as long as you understand the behind-the-scenes machinery you can actually glue in and write your own I/O objects. So either, one, you can just keep using Boost if you need TLS, and that's a compelling option; or, two, you can write your own TLS stream using OpenSSL, which from the looks of it is what Boost.Asio does; or, three, someone's going to take the one from Boost and port it so you can just download it as, like, a header-only library or something like that. But for now they don't have TLS, and a lot of things that are in Asio that are really cool and really useful aren't in the Networking TS. Like, there's a class for basically performing asynchronous I/O where the I/O you perform is receiving a signal; that's not in the TS either. (Okay, thank you.)

[Audience: So, you were doing your move-from-this. How do you stop it? Say the user wants to stop accepting because they got a shutdown, and they want to finish off their current connections. You're moving your acceptor around, so now you've probably lost any reference to it. Do you have a shared pointer somewhere? How are you managing that? Or do you expect the user to throw an exception in after_accept?]

So the question was related to what you do with the acceptor: you're off in the background accepting all these connections, and now you want to gracefully shut your application down, but (and I think this was the important takeaway from your question) you want to make sure you finish out all the connections. What you can do in that case, barring parallelism (because parallelism adds a layer of complexity onto these problems): I have my single-threaded execution context, say the one that the acceptor is associated with, and I need to do something to the acceptor, but maybe something in that execution context is using the acceptor, so that would be a data race. So what you do is you take your operation, you put it in a lambda, and you post it into the execution context. And note that we're not moving the acceptor around; we're just holding on to a reference to it, so the acceptor itself stays in main. So what you can do is post yourself into the execution context, which is single-threaded, so you know that no one's racing with you on the acceptor, and then you can just call close on the acceptor. When you call close on the acceptor, it actually goes and closes the underlying file-system handle, and it's basically like cancelling: all of a sudden the pending operations on the acceptor will fail, which means that the whole operation we wrote fails, which means that your completion handler gets called, and then you can do whatever you want. But because we spun off every connection with after_accept, they're unaffected, unless the logic of your application dictates otherwise. Because you can do something like throwing from after_accept, if you want: if you get a connection and you know the connection is from a magic IP that means "shut my application down", you can just throw, and then that percolates all the way up and is handled however the executor handles it, and the executors in the TS handle exceptions by throwing their hands in the air and just letting them pass through. (Okay, thank you.)
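A minimal sketch of that shutdown recipe, with illustrative names: post the close onto the acceptor's own single-threaded context so nothing races, which fails the pending accept with operation_aborted and unwinds the accept-everything loop.

```cpp
void request_shutdown(net::io_context& acceptor_ctx,
                      net::ip::tcp::acceptor& acceptor) {
  net::post(acceptor_ctx, [&acceptor] {
    // Runs on the acceptor's (single-threaded) context: no data race.
    boost::system::error_code ignored;
    acceptor.close(ignored);  // pending async_accept fails; the loop completes
  });
}
```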
[Audience question, paraphrased from hard-to-hear audio: about the threads and executors: what about an embedded environment, for example with high-frequency connections opening and closing, and the memory consumption?]

So, I've never had a reason to profile the built-in execution context. I work in HFT, so for all the stuff that's really needy we just go off and spin, so I've never really encountered a situation where that was a problem. We do have some libraries that use the Networking TS for something vaguely resembling trading operations, and it's fast enough there. But this is the other great part about the Networking TS: if the TS provides you with something and your vendor's implementation of it is not up to snuff, most of the stuff throughout the entire TS (and this kind of comes back to your complexity question earlier) doesn't really care about what the type is; it cares about the concept. So if you get really upset with the built-in execution context, you can write your own. I've actually written my own execution context, and it can dispatch work however it wants; it can have whatever limitations it wants. And that's where a lot of the complexity and boilerplate of the TS comes from: they don't want to pin you down. If you want to execute on this executor, go for it; if you want to do this, go for it; if you want to complete this way, go for it. That leads to a lot of boilerplate, but it leads to a lot of power.

[Audience: You may have answered this already, I came in late, but other than the namespace, how close is Boost.Asio to the networking proposal?]

In terms of my exploration of what I understand is the current TS: I pulled it from, I forget where I pulled it from, but I tried to find the latest TS, and I consulted it when I was writing all the examples, which you can find right there, fully worked (there are even more than were used in the slides, because I developed like six examples I didn't end up using; if you download them, they run and execute). When I was looking at documentation to try to figure out exactly what I needed to do, I always just read the TS. And in my experience, as best I've been able to tell (there could be one or two issues here), Asio is a strict superset of the Networking TS: it has everything the TS has, as far as I can tell, and then way more, especially if you're working with, say, signals. The other example is Windows handles: Asio has all these classes for asynchronously dealing with Windows handles, because its backend is completion ports, and so that's easy, but the TS doesn't have any of that. So there are no conflicts with the Networking TS, at least to my understanding, and at least right now, because the TS might continue to evolve, and Asio is trying to track the TS, basically, for now. But the sentiment I've gotten from talking to several people who have been involved in the standardization is that people are pretty happy with the Networking TS; the work now is trying to get the backend stuff to play well with executors, so we don't end up with, like, two ways of doing the same thing, which would be a tragedy, since they're both on the standards track at the same time.
Thanks. That's all the questions. Thank you very much. [Applause]
Info
Channel: CppCon
Views: 6,408
Keywords: Robert Leahy, CppCon 2019, Computer Science (Field), + C (Programming Language), Bash Films, conference video recording services, conference recording services, nationwide conference recording services, conference videography services, conference video recording, conference filming services, conference services, conference recording, event videographers, capture presentation slides, record presentation slides, event video recording, video services
Id: 3wy1OPqNZZ8
Length: 58min 35sec (3515 seconds)
Published: Wed Oct 09 2019