CppCon 2016: Kenny Kerr & James McNellis “Putting Coroutines to Work with the Windows Runtime”

Captions
Hi folks, thanks for coming. I think we'll begin. My name is Kenny Kerr; I work on the Windows team at Microsoft. This is James McNellis; he's from the Visual C++ team. Yesterday we had a talk introducing C++/WinRT, our new standard C++ language projection for the Windows Runtime, and today we're going to discuss how we're using coroutines to turn all the asynchronous behavior in the Windows Runtime into a far more effective programming model, and also taking everything that's concurrent about Windows, the thread pool and so on, and making it accessible to coroutines. James?

Yeah, so how many people here were in our talk yesterday introducing C++/WinRT? All right, good, a fair number. How many people attended at least one of the coroutine talks today? And keep your hands up if you've been to two of them. Wow, so we've got a lot of people here for a third round. You survived the suspension points for lunch and the keynote and the breaks. Incredible. So it's reliable. It is, yes. Just a quick note before we begin: I know a lot of you will have seen this, but we, the Visual C++ team, are very interested in your feedback and in learning a bit about how you use C++, so we have this survey, and if you fill it out I'm told that you can win an Xbox One S. I was told I cannot win the Xbox One S, so I hope you do.

All right, so we're going to jump right in. Briefly, this is what it looks like trying to produce async operations today. You'll have a function that returns, for example, a std::future, and within that you would use std::async to go and schedule some work on a background thread, run all of your expensive computational work on that background thread, and then return the result through that future. Consuming async operations looks a bit like this: in our consume function we'll print out that we're about to start the operation, then go and start the operation, but the only way we have to use the future after that is the synchronous, blocking get. So we'll do that, and now we'll be blocking this thread too (so much for asynchrony), and then we'll print out the result.

If we want to do a continuation today, we have to do something like this: our consume_async would actually have to start its own async operation to run on another background thread, and it would call produce_async and do the blocking wait there. So now we've got three threads, but at least we've freed up the original thread, and we're just blocking some other thread from making progress. Ideally the solution here, I think, is future.then, which leads to some, you know, beautiful code like this. This is actually very simple code; for those of you who were in my talk this morning, we showed the example that tried to do a loop asynchronously. Anyway, this is not particularly pretty, and it's also not particularly efficient because of how future.then has to actually be implemented.

Async with C++/WinRT and coroutines looks a bit nicer. This is effectively how your code would look if these were synchronous APIs, right? You get the file, you open it, you decode it as a bitmap, etc., whatever else this program is doing. (Yes, it doesn't have to be a plain return; it would have to be a co_return. Thank you for the clarification. The specification of what the keywords look like, and whether you need a plain return or a co_return, has changed over time, and Visual C++ still supports some of the older syntaxes as well as the new one; definitely use the new one. This one slipped through in the three hundred slides we've been working with. Sorry about that.)

And so if we look back at our simple example, our continuation looks much simpler: it looks effectively like the blocking wait, except now we have a co_await that's non-blocking.
Additionally, we don't actually have to use std::future for this; we could use another type for encapsulating work. For example, in the Microsoft implementation we also have the concurrency::task type, which is part of the PPL library. This has some extra efficiency improvements over std::future, especially on the Windows platform, where we can utilize some knowledge that we have about the underlying platform and thread context. And yes, we think we have something better still, specifically when you're using the Windows Runtime, but we'll get to that in a moment as we show you how that all works.

So first, we want to make sure we cover the scenario where you just don't want any concurrency at this point. Perhaps you want to just block: you're running a console app, a unit test, whatever it might be. So we do provide a .get method, which is very analogous to std::future's get, so if you do want to just block and wait for that result, we allow you to do that. Essentially what we do (if you were here for our talk yesterday you'll understand how all of the interfaces are basically aggregated together at compile time), conceptually, is add a get method to all of the async interfaces. So you can make that call, and it'll return the result of the GetResults method if it's an IAsyncOperation, or it'll just return void if it's an IAsyncAction.

Just as a matter of course, the implementation happens to be very efficient. To begin with, we check whether the async operation has in fact already completed; in that case we certainly don't want to suspend or block the calling thread, we simply return and you can carry on. On the other hand, if we do need to wait, we take advantage of the operating system's locking and condition-variable primitives. This is just a standard condition-variable concurrency pattern, but it's provided by the Windows operating system with what we call a slim reader/writer lock. Essentially, the Completed handler, which is part of the WinRT async pattern, is provided with a lambda which will acquire the lock, set the completed variable, and then wake the calling thread. Meanwhile, on the OS thread that's blocking, we also acquire that lock; we then atomically sleep and release it, so that the Completed handler can acquire the lock and set that variable. Eventually the blocked thread will be woken from its sleep and return, and so you can keep going.

So that's our blocking wait. Just as a matter of course, it's very efficient; if you need it, it's there. But really we want you to go from this produce().get() to a co_await produce(). This is a cooperative wait, essentially the same thing; you can see the code is very similar, but this is the model that allows us to do a lot more concurrency, and that's really the focus of this talk. James?

All right, so we're going to talk a little bit about asynchronous operations in the Windows Runtime. Starting in Windows 8, when the Windows Runtime was made a part of the OS, one of the philosophies used to guide it was that any operation that could take longer than you'd want to block the UI thread for would be an asynchronous operation. So all of the I/O is asynchronous, all of the heavy parsing operations are asynchronous, and certainly any network operations are asynchronous; there are a lot of asynchronous functions. They have some base interfaces that provide access to these, and one of them is IAsyncInfo. The two key pieces here are that it allows you to get the status (it will tell you if the operation has completed, if it's still in progress, if it's been started), and it also allows you to cancel the operation, or request cancellation, if it's in progress. If you have one of these interfaces you can use it like any other COM interface: for example, if you get an IAsyncInfo from somewhere, you can call get_Status in order to get the status, and then you'll have to release the IAsyncInfo pointer when you're done with it.
In C++/WinRT we project this into a much simpler, much more C++-friendly form, and so here's the IAsyncInfo definition from the C++/WinRT projection. One thing that's obviously notable is that the Status function now just returns you the async status; you don't have to pass it through an out parameter. That's, I think, the fundamental thing there. Here's an example of using this: if you get an IAsyncInfo, there's no pointer involved, you just get the object, you can call Status on it, and when the IAsyncInfo object goes out of scope it automatically takes care of the reference counting and the release.

There are four different kinds of asynchronous things in the Windows Runtime, and they fall into two categories. The first is IAsyncAction, which is used to package up a task that returns no result; you can see that in its GetResults function, which returns void. Tasks that return results are wrapped up in IAsyncOperation<T>, and that's a template because it can return various things depending on the API; it has a GetResults function that returns the actual thing. Additionally, there are two interfaces that allow progress reporting, and each of these works just like the similarly named interface except they have two extra functions that let you get and set a progress-reporting handler, so that the producer of the API can tell you how much progress it's made and how much work it thinks is left to do, and you can consume that and print it out or use it as you see fit in your application. (Yes, it's the progress type; that's the type that gets passed into the handler, and we'll see that in a bit; we have an example showing how they work.)

Here's an example of the difference between them: we have, for example, GetFileAsync. Obviously it's going to get a file, so it has to return you the StorageFile, and so that's an IAsyncOperation. Whereas a file rename operation doesn't actually return you anything that you can use, and so that's an IAsyncAction.

Here's an example showing how progress reporting works. This first version doesn't use progress reporting: we go and open up an RSS feed, and then we iterate through the items and print out the results. But that function actually returns an IAsyncOperationWithProgress, so we can hook up a progress handler if we'd like, and here's how we do that. Instead of co_awaiting directly on the result of RetrieveFeedAsync, we take the IAsyncOperationWithProgress and store it in this local async variable; we then sign up for the progress notifications by calling the Progress setter, passing it a function that prints the progress, and then we actually do the co_await. By doing this we end up with something that looks like this, where it prints out the progress, letting us know how far it's gotten reading the RSS feed, and then prints out the results at the end with the loop that we had before.

On the other side, in the production of this progress information, here is a little structure we can use for progress reporting: we're just going to report the number of bytes retrieved and the number of bytes that we expect to retrieve. Within the function we'll get a progress-reporting token by co_awaiting this get_progress_token that we'll see a bit later; then in our loop we'll download the next chunk, or do whatever we need to do, and then we'll just invoke the token like a normal function.

Cancellation is implemented in the same way: at the beginning of our function we can get this cancellation token, which acts as a canceled predicate. We can query it, and it will return true if cancellation of the operation has been requested, false otherwise. So in our loop we can just check every time through: have we been canceled? If so, then okay, we'll be polite and we'll stop executing.
In our application, we'll start this operation off, then wait a couple of seconds to let it tick a few times, and then we'll cancel it. That cancellation feeds into the async operation, the operation stops, and this would print out something a bit like this; it might vary based on thread scheduling, but it would look about like this. (Yes, we'll actually show that, yeah, for sure.)

Right, so the first thing we want to do is make the WinRT async types awaitable. Imagine a coroutine called produce; now we co_await on that, we eventually get a result back, and we want to print it out. What does the compiler actually want from us? The first thing it does is take the IAsyncOperation that the produce function returned and check whether the operation is actually ready, whether it's completed. In that case we simply skip the suspension and go straight to await_resume, which will return the value or throw an exception, depending on what happened with that operation. If it was not in fact ready, it calls the await_suspend function, and that's where we can use the Completed handler to set up a callback that will execute once the operation completes, so that we can resume the coroutine at that point; and then suspension occurs.

Another option is to use operator co_await. In this scenario you might imagine you have some type, like IAsyncAction, that doesn't in fact have these three methods, and you want to adapt it to work in that fashion. It might be because you can't add those methods to somebody else's type, or you don't want to, and so on; it's up to you. So what you do is define a type that adapts it, and then define an operator co_await that takes the IAsyncOperation and wraps it in an await adapter, so that the compiler then sees something that it can actually await on. One of the other advantages of using the adapter is that we can reuse one adapter for many different types if necessary. That's right: we have four async interfaces in the Windows Runtime, and so we can have one await adapter that deals with all of them.

So, await_ready: what does that look like for all four of these WinRT types? We simply call the Status method, which returns the async status and tells us whether it's completed; if it is, we certainly don't have to suspend. Not ready? Well, this is where it gets a little more interesting. In await_suspend, we want to make sure that we capture the calling thread context, so we call the CoGetObjectContext function. It's not a great name for the function; we're really capturing the thread context rather than an object context. But the IContextCallback interface it gives us allows us to call another stateless lambda at some future time to throw work back onto that originating context. So here we have the first lambda, which is passed to the Completed handler; when that's finally called, it signals that the async operation or action is completed, and we call ContextCallback to resume the coroutine back on the calling context. This is very important: if you're running a UI app, you want to make sure that after your suspension point you're back on the UI thread, so you can update your buttons and whatever else you're doing in your application.

When resumed, await_resume is called, and we want to call GetResults at this point. We know the operation has completed, either successfully or with a failure, and in both of those cases we want to make sure we get the result back. If it's one of the IAsyncOperation interfaces we'll actually have a return value; if it's an IAsyncAction it returns void; but thanks to generic programming we can use auto here. It's one of the few places you really want to use auto, but certainly this is a great example of that.
But the other scenario is that the operation actually failed, and there was perhaps an exception thrown within that other coroutine you awaited, and there are two cases there. One is that you called the .get method I introduced earlier, and this is what you might call synchronous exception handling: the get method, as you would expect with std::future, will actually throw the exception if a failure occurred, and you can naturally deal with it in a catch block. What might be less obvious is that you can do the same thing with co_await. Now this is not synchronous exception handling, it's cooperative exception handling, but it's really the same thing we saw on the previous slide, because the await_resume function is where the exception is thrown, and at that point you're back on the calling context. So it's just a regular exception being raised; nothing exciting there. But it means you can handle the exception within a coroutine, or you can allow it to propagate up through many, many other coroutines and handle it wherever you'd like. It's a very natural programming model either way. James?

All right, so what do we have to do in order to make these Windows Runtime async types usable as coroutine types? The first thing to note is what makes something a coroutine type to begin with. For example, for this IAsyncAction type we have to go and implement certain things in order for it to be usable as a coroutine type, and so we're going to specialize coroutine_traits for it. This is a partial specialization, so it'll work for any coroutine that returns an IAsyncAction, and we have similar specializations for the other three interfaces.

If we look at the coroutine life cycle, just as a reminder from the previous talks: the first thing that happens is that the promise is constructed as part of the coroutine frame; we then reach the initial suspend point, which asks, should we proceed or should we suspend at this point; we then get the return object by calling get_return_object, which gives us the actual IAsyncAction to return to our caller; we run until whatever the next suspension point is; we complete, either by calling return_void or by setting the appropriate exception; and then we determine whether we can destroy the coroutine frame or whether we should leave it around.

So here's an example of what an implementation of IAsyncAction has to look like. As we said before, we have to implement two different interfaces: there's IAsyncAction, and then there's the basic IAsyncInfo. This is not a complete implementation, just a demonstration of the functions that have to be implemented: for IAsyncAction we have to support the Completed handler, and for IAsyncInfo we have to implement the status reporting, the cancellation support, and other things.

One way we can implement the promise type for this is to store the IAsyncAction as a member of the promise type; we'll create it and hold a reference as part of the promise, and then in get_return_object we'll return another IAsyncAction that refers to that same async action. In final_suspend we can then just return suspend_never: the promise type will release its reference, but the caller will still have its reference if it needs it, so the object won't be destroyed. There's one problem with this, and if you were in Gor's talk he talked about it: we need two allocations to do this. We have to allocate the promise as part of the coroutine frame, which is going to end up on the heap, and then we also have to call this make_async, which is going to allocate an instance of the async object on the heap as well. Ideally, what we'd like to do is coalesce these into a single allocation and just allocate the async object as part of the promise.
And so here's what we might want to do: we have our promise type actually implement those two interfaces and itself be the async type. This allows us to have only one allocation; we have the promise implementation and the WinRT implementation all as part of the same class. We run into some lifetime issues with this, however. The problem is that a lifetime could look like this: the WinRT async object could in fact have its last reference released before the coroutine has come to an end, and only later will the final_suspend function be called. That's a bit of a problem, or at least that's one scenario we need to deal with. Another lifetime scenario is that the async object outlives the coroutine: first final_suspend is called, and only later is the final COM reference released. So this is called a condition race... oh, sorry, race condition. (I told you they'd find it fun.) That's right.

So we need to take a closer look at two things: COM's reference counting, and the life cycle of the promise type, or the coroutine frame. First, here's a typical implementation of COM reference counting. We have an AddRef and a Release; we're using a std::atomic; we increment the reference count in AddRef, we decrement it in the Release method, and if the remaining reference count is zero we can delete the object. That could be a problem, though, because we can't actually delete the promise type; that needs to be destroyed separately, because we didn't new it up in the first place.

Looking at the promise type, we have two options so far. The first is suspend_never, which basically says: we're going to destroy the coroutine automatically, go ahead and do that for us. But that won't work, because there might be outstanding COM references to that same instance. Another option is to say suspend_always, which essentially says: the coroutine will be destroyed manually, we'll take care of it. But that also won't work, because we might not in fact have any outstanding COM references, so there's no one left to destroy it.

So what we need to do is override the Release method provided by the implements class template to do slightly more interesting work. The first thing we do is hoist the decrement operation up into a separate function, so that we can reuse it a little later on; if the count reaches zero, we can safely destroy the coroutine, because we know there are no outstanding references. We also have to do a little more work on the other side: in the promise type we need a conditional final suspend. Instead of suspending always or never, we basically say: first release the self-reference, the reference that the promise type holds to itself, and then, if the reference count reached zero, we know the frame can be destroyed automatically, so yes, go ahead and do that for us; otherwise, no, wait, we'll take care of that ourselves a little later on. By doing these things you get an incredibly efficient implementation, reducing the number of allocations, and it works really well. Any questions about that? Makes sense? All right, great, excellent.

So let's have a look at how we implement progress and cancellation. Here's a coroutine that uses both the progress and cancellation tokens. First we get the cancellation predicate and the report-progress function; then we enter a loop, and while we're not canceled, we stop running for one second (I was going to say we sleep for one second, but we won't do that, and we'll see why in a bit), and then we report our progress each iteration. (Yes, we're about to see that. The question was why we co_await to get the tokens, and we were just about to show that.) Then, on the consumer side, we do something similar to what we've seen previously.
We'll call produce and store the IAsyncAction; we'll attach our progress handler, which is just going to print out the number of times we've iterated; we'll then wait for three seconds, and then we'll cancel the operation. What this is going to do is print out one, two, three, and then we'll cancel the background operation; it will notice that it's been canceled, and then it will stop running.

So there are three different kinds of things we can pass to co_await. The first is the obvious one: if you have a class that implements await_ready, await_suspend, and await_resume appropriately, then you can just co_await it, and those functions will be called as you would expect. In the Visual Studio IDE, if you're using that, the co_await operator will show up as blue, or whatever your normal keyword color is. You can also, however, use the co_await operator to transform a particular type into something that is awaitable. For example, say we have a value struct here; it doesn't have any functionality on it, but we might want to be able to await it, and so we can overload the co_await operator to take a value and return one of those pass-through objects from the previous slide, carrying the actual number that we want to provide back to our caller. (And yes, this is standard; the question was whether this is an MSVC extension, and it is not. Everything... well, okay, I won't say that.) In the Visual Studio editor this may show up as teal, to tell you that it's an overloaded operator; I actually didn't know the IDE did that, because I have custom colors, but Kenny enlightened me to this fact.

The third kind of thing you can await on is via a function called await_transform. Inside of your promise type you can define overloads of this function, await_transform, and what this does is allow you to customize, for a particular promise type, the behavior when you await different types of things. For example, we're going to use this to allow awaiting on a tag object, like get_progress_token, and actually extract state from the promise using that. Here's how we do it: we define a type named get_progress_token_t (this is just a tag type; you would never name it in your own code), and then we define an instance of it named get_progress_token; this is the instance that we saw in the earlier coroutine example. Then we overload await_transform to take an instance of this tag, and it returns a progress type constructed with a pointer to the promise type. This allows us to access the promise from within the coroutine, because, if you remember, the promise is allocated as part of the coroutine frame and you don't actually get to see it from within the coroutine; it's all behind the scenes unless you do this.

Our progress type is implemented quite simply: it just stores that pointer to the promise. Its await_ready returns true, because we don't actually want to suspend; we just want to return the progress token back to the caller. await_resume then just returns the progress token, and then we overload the function call operator so that you can actually call it with the current progress, and it will call the set-progress function on the promise, which does the appropriate thing for the WinRT interfaces. The key here is that it's a lightweight awaitable type: it doesn't have a whole lot of state, it's not going to cause any allocation, it doesn't suspend, it returns this progress-type function object, and inside that operator we just notify the listener. We do the same thing for the cancellation token: it stores the pointer to the promise, await_ready returns true so that we don't actually suspend, and await_resume returns a copy of the object.
Then, inside the function call operator, we just ask the status: have I been canceled yet? Any questions about that, or how that works? No questions; is anybody still awake? Okay, there are a few people awake. Good, great.

So, we've talked a lot about how to consume coroutines, certainly in the previous talks as well, but at some point you're going to run out of things to await on, and so you want to create some awaitable things that you can actually build and do some real work with yourself, and that's where the thread pool comes in. Let's begin, though, with this function. It returns an IAsyncAction, but this is not a great coroutine, because it won't actually compile: without any co_await or co_return it's treated as just a regular function, and the compiler demands something to return. One option is to co_await something that you can actually await, an existing awaitable type; in this case we're calling an operator co_await for that std::chrono duration, but it's a valid option. Option two is to forward an existing implementation: if you already have an implementation of IAsyncAction, forwarding it is just a regular function. But nobody wants to actually implement those interfaces; that's kind of a chore. So we want to integrate the Windows thread pool in such a way that we can get that work done for us, very efficiently and very simply.

The Windows thread pool has four types of objects that you can submit to get work onto the thread pool. There's the work object, which is basically: here's a callback, put it on the thread pool and execute it at the earliest convenience. Another is the wait object, which basically says: call me back on the thread pool when a certain kernel object becomes signaled. Timers are another example, where you can say: run this callback after a certain timeout, or at a certain time in the future. And then there's I/O, obviously the thing the Windows thread pool was built on and built for; it does incredibly scalable I/O on the Windows thread pool.

So the first example is the Windows thread pool work object. The example here is: you have this really great coroutine, and it does a whole lot of work; there's just one little problem with it. It lacks a suspension point. Because of that, it never actually gets to the point where it returns to the caller so that the caller is free to do some other work concurrently. How do we solve that? Well, first we need an awaitable type. One option is to write an awaitable type like this that never suspends; it's not very good, though, because it doesn't actually introduce any suspension. And then we have this coroutine, which is certainly valid, but it's faulty because it calls Sleep, and again it doesn't suspend, so you're actually going to block: the call at the bottom of your main function blocks for however long it is, 5,000 milliseconds, and this assertion is actually guaranteed to succeed. The status will always be completed, because by the time it gets there, it's already finished. So that's a bit of a drag.

How do we get work onto the thread pool? It must be really, really hard; I think this is going to be like thirty slides' worth. (I thought so, yeah; brace yourselves.) And there it is: a resume_background awaitable type. await_ready says no, I'm not ready, and so it's going to suspend. The await_suspend function is then called, which submits the work to the thread pool; that's the TrySubmitThreadpoolCallback function. (There's another, more complicated API which allows you to create a work object and submit it multiple times, but in this case this is all we need.) The stateless lambda there then takes that coroutine handle and resumes it, which gets your work onto the thread pool in a very simple manner. It's almost as if the thread pool were designed for coroutines. (Almost, yeah, almost: that void* callback context slots right into the coroutine handle.)
That's it. So we take our original function that does all that computationally expensive work, and all we need to do to get it onto the thread pool is to co_await resume_background. Prior to the call you're still on the calling thread's context, but amazingly, after the co_await you're living on the thread pool, and just like that you can move work onto the thread pool very efficiently and very scalably: one function, two different threads. There we go. It is pretty cool; it's worth pausing for. So let's look at another example so that this really sinks in. Here we have a produce function; again it's a coroutine. It co_awaits suspend_never and then sleeps for five seconds, and that's an OS sleep: the actual operating system thread will go to sleep. In the main function we call produce twice, so we have two IAsyncActions, and there will be a five-second delay here and another one over here. That's not a lot of concurrency. What we want to do is co_await resume_background; now we have a lot of concurrency, because those delays are combined down there. To visualize that a little better, you can imagine there are two threads starting these two different versions of the application. On the left-hand side you can see suspend_never, and you can see that it printed out the thread ID at all of those points through the code; it never switches threads, it's always one thread, and there are two individual delays. For resume_background you can see the initial main thread running, but at some point you get a five-second delay and then you see those two threads from the thread pool kicking in to do the actual work in the background, and then the main thread resumes. Let's look at one final example of that. If you were here for our talk yesterday, you would have seen our very cool example application: we took an image with some text in it, the user would pick the image, and it would go to the application
where, in a background thread, in the coroutine, we would actually do the work of running some OCR on that image, finding the text, and then, returning to the example application, we would display it on the screen. That involved essentially these two coroutines. There was a foreground async, which used the FileOpenPicker, the thing that pops up the little window where you can select a file, and when it got the StorageFile it would pass it to the background async. The background async did the work, and it used co_await resume_background to get the work onto the background thread. So it starts off requiring the UI thread here, because you're interacting with UI resources; we then offload the work to the thread pool, and then we're back on the UI thread over here. Thanks to coroutines this isn't too hard, but I think we can do it a little bit better. Maybe we could put the functions together? Yeah, you think we can do it in one function? I hope so. Okay, well, let's see what we can do. Here we have what I call a thread_context. In its constructor we get the calling thread's context and capture it, storing it in a local variable, the IContextCallback variable right there, and then later on, when that coroutine resumes, it gets the coroutine handle and resumes the coroutine on the original calling context. How would we use this? Well, here's an example. First we start off on the UI thread and use the FileOpenPicker; we then capture the UI context, immediately switch to the thread pool, do some work on the thread pool, and then we can return to the UI thread simply by co_awaiting on that captured context, and we finish the job there. This is going to make my OnClick handlers a lot simpler, I think. Yeah, what do you think?
So are there any questions about this? This actually works, yeah. Oh, yes? [Audience] Are these things part of the header, the library that you generate today, or is this separate code? Yes, that's a good question. So yes, this is part of the base library. As you saw yesterday, there is a compiler that generates the projection based on metadata, but that's backed by a base library which has all of this stuff in it, so you get this for free with the base library; it's included in the projection. Yes? [Audience] What if an exception is thrown while you're on the background thread? That's fine. What ends up happening is, wherever the exception is thrown, it'll be caught at the next coroutine boundary and propagate up the stack. So within here you could certainly wrap that, like I showed earlier, with a try/catch block and catch it in here, or you could allow it to propagate up, and the IAsyncAction, which is the return type, will be the thing that actually holds on to that exception pointer. Whenever you wait on that guy, you'll get the exception again. So it'll propagate up in whatever way you want for your application, whether that's catching it here, catching it later, or allowing your application to terminate. Anyone else? All right. I do want to note that while this example used resume_background with the Windows thread pool, it would work with practically any thread pool, since most thread pools either let you pass a C++ function object of some kind, or a function pointer and a void star, and that's all you really need in order to implement this. Yep. All right, so as we said, work is not the only thing you can put on the Windows thread pool; you can also put wait objects on the thread pool. Here's an example of a program that does an inefficient wait. We have a producer and a consumer, and in each of those we're going to resume on a background thread; in the producer we're going to do some potentially expensive work and then we're going to set an event to
its signaled state. In the consumer we're also going to resume onto a background thread, and then we're going to wait; we're going to wait for that event to be signaled. This is inefficient because, while we have moved the wait onto a background thread, we're still just sitting on that thread not doing any work. Ideally we'd like the thread pool to coalesce all the waits, and then, as objects become signaled, it will run the appropriate callbacks. To support this, we can implement a resume_on_signal. We have two constructors here: one takes the handle of the object to be waited for; the other takes the handle to the object and then a timeout. In our await_ready we just check whether the object is already in its signaled state, and if it is, then we don't need to go through the suspension. In await_suspend we create the thread pool wait object and submit it to the thread pool, and this is what will actually tell the thread pool: when this object becomes signaled, go and schedule my continuation, which is going to be the resumption of the coroutine. And then in await_resume we actually return a bool, which tells you if the wait succeeded or if it timed out; we return true if it succeeded and false otherwise. So here's our inefficient wait again; we can convert this into an efficient wait by changing that co_await on the resume_background into a co_await of resume_on_signal. This will now basically stop running that coroutine and schedule the rest of it on the thread pool, but the thread pool won't actually run it until the event becomes signaled, so we're not sitting on a thread just spinning, or, well, just sleeping, I guess. Yeah. And here's how it would look with a timeout: we can await a resume_on_signal with a 500-millisecond timeout, and if it succeeds, then it will return true and we can use the state or do whatever it was we were going to do, and if it fails, then it timed out and we do whatever we need to do there.
Great. Another example is the thread pool timer. In this scenario you might want to do some increasing delay, perhaps for a network retransmission, and so we start off on the thread pool and then do this increasing delay while we wait for some packets to arrive, perhaps. But this is not very efficient, for the same reason: you're hogging a thread pool thread, you're sleeping, and that's not an efficient way to do that kind of waiting. So what we can do is introduce a resume_after awaitable type. Here we have another duration, which is just a standard chrono duration; that's just a type alias in the Windows runtime. await_ready checks: well, is there any time left that I should actually wait? If not, we'll resume immediately; but if there is, we'll create a thread pool timer object and submit it to the thread pool, and that callback runs, resuming the coroutine on the thread pool when that time has elapsed. Very straightforward. So we go from our inefficient timer to a far more efficient timer using resume_after. Now, there was a question earlier about how we do that one-second co_await, and that's simple: what we have is the operator co_await, and you should notice by now we have basically the conversion which turns the duration into an awaitable type using the resume_after type I just showed you, and TimeSpan is just a type alias for a std::chrono::duration with the appropriate period so that it matches the OS TimeSpan. Yep. So you can now literally write something very simple: I want to co_await for one second over here, perhaps 500 milliseconds over there, and it's a very natural way to write your code if you have to do that sort of work, and it's very efficient; it runs on the thread pool.
The other kind of thing we can put on the thread pool is I/O, and we're not going to go into this example in quite as much detail as the others, but here, for example, is a coroutine that opens up a file on your hard disk, filename.txt; I'm sure it's got something very important, very secretive in it. Yes. And then in a loop we're going to read chunks of data from it, and we're going to do so asynchronously. That read function is actually not particularly complex; the only thing we wanted to point out here is that basically all we have to do is call ReadFile with the appropriate parameters for it to do asynchronous I/O, or overlapped I/O as it's called in that API's documentation. All of the work that we're doing here is just to convert the parameters into the right form and then do the appropriate error handling. And, not coincidentally, the OVERLAPPED structure over there is actually allocated as part of the coroutine frame, which is an incredibly efficient way of handling this, and we take care of that lifetime for you through coroutines. Cool. So with all of those, we should start writing a lot of coroutines that call sleep, yeah? You think? Is it going to perform? Is any of this going to perform? I don't know; std::future is part of the standard library, it's got to be good, right? It's got to be good, yeah. Okay, let's see. So we wrote this little benchmark here, and all it does is three nested coroutines: coroutine one calls coroutine two and then waits on it; coroutine two calls coroutine three and does the same; and then coroutine three just moves to a background thread and doesn't actually do anything. What we're really trying to measure here is the actual overhead of our IAsyncAction implementation, compared to std::future or the PPL task, which is the task implementation that most people are using for Windows Runtime programming today. And we have a few little benchmarks; do you want to describe the benchmarks? Sure. So we have three of them. The first, "get all", as you can see, runs them all in series and waits for each one, so
it calls one and then waits for that to finish, and then calls the next one, and so on, for however many iterations there are. Then "wait all" does essentially the same thing, but in a loop: first it builds up a vector of the right size, then it calls one for each element to actually create that coroutine, and finally, once it's created them all and got them all started, it waits on them all cooperatively using that wait-all function, blocking at the end once it knows they're done. Then there's one more, "run all": again it's going to do them one at a time, but using co_await, and then it's going to wait at the end of them using the get method. They're all essentially doing the same thing; at the end of it you will have had however many iterations running and completed, but they test various nuances of the implementation. Yeah. So I guess just two points: we don't have all of the fancy enhancements that Gor presented in his talk, and we're adding the overhead of COM to this. Does anybody want to take a guess, anybody I haven't talked to already about this, as to how much more or less efficient it is than std::future? All right, nobody. So here's the graph. These are the results for a million iterations. In yellow we have std::future; the brownish one, I don't know what color that is, mustard, yeah, mustard, Dijon, is the PPL task; and the red one is the IAsyncAction. Lower is better on this graph, and the measurement is in seconds. So you can see there's a 40x difference in that middle bar chart between std::future and IAsyncAction. [Audience] Yes, that's in fact what happens. So the question was: with the optimizations that Gor described previously, this should have been reduced down to a single return statement; maybe you can answer? Yeah, no, because here we're actually building something that can work with the Windows runtime, so there's an actual
runtime object here that you can get a reference to. This isn't optimized away in that way, and we don't want it to be optimized away, because we've got to actually give you more features here: we give you something that you can actually hand to another language. This is ABI-friendly; you can give it to C# and they can consume it. I was going to say, the real reason that you can't do that optimization is that it has this co_await resume_background, which forces the work onto the background thread so that it can do that await, and yes, that await is not going to actually do any work, but the transition to the background thread is still essential here, so it can't be inlined. So yeah, I mean, I guess this is okay. Okay, yeah. [Audience] Could you restate? Yeah, so the overhead from std::future comes in a few places. The first is that when it goes to run a continuation, it has to start a new thread to run that continuation, because it doesn't have the appropriate context information to know whether it's appropriate to run its continuation on the same thread, whereas we can take advantage of the COM context information: if I'm already on a background thread and I need to continue running on a background thread, then I don't need to do any thread transition; I can just run the continuation there. And the same thing goes if you're already on the UI thread and you need to run a continuation on the UI thread: you can run the continuation there. We only have to do the thread transition if it's necessary. Right, and we're also using the thread pool, not actual real threads, so we don't have to start up three million threads, which is what it did. [inaudible] Our coroutines all ran on the Windows thread pool, but it's the overhead of everything else the others add on top of that that we're seeing here. So, great. Well, thanks for coming. There's some more information at moderncpp.com; you can email me if you want more information about C++/WinRT or anything else, and
you can reach us on Twitter. And yeah, we've had coroutine-con going today, so we've had three awesome coroutine talks; unfortunately, you've either attended them all or you've missed them. Additionally, Kenny and I gave a talk yesterday at 11 a.m., "Embracing Standard C++ for the Windows Runtime", in which we gave an introduction to the C++/WinRT library and described the architecture of how it works and how it tries to do what it does efficiently, so we highly recommend that for when it's available on YouTube soon. And his boss wants us to mention that link over there; please do visit it, that would make them very happy. Yes, and when my boss is happy, then I'm happy. Yeah. And with that, that is the end, so are there any questions? Oh no, yes, Gor? That's right, yeah. So in the first example we had the COM object explicitly allocated within the promise type; what Gor is now saying is that because we've combined them together, and we actually have the promise type implementing those interfaces, there's one allocation, and the compiler says, hey, I don't even have to do that allocation, so that doesn't even occur. Yes, Gor? Yes, yes, RAII, that's the trick, right. Any other questions? [Audience] Is this available now? It is not available now; my boss tells me it should be available in a few weeks. And all of these examples, with the library in development, work with the released version of Visual Studio 2015 and the latest released version of the Windows SDK. Yeah, and C++/WinRT works on Clang as well, and as soon as we get Clang support for coroutines, this will work too. Yeah. So the question was: we said this could be implemented on any type of thread pool, so resume_background can be implemented for any type of thread pool, and the question is, would it be as efficient on other thread pools? That's fundamentally a question of how efficient the thread pool is; it will be as efficient as, say, passing any other
function to the thread pool. Windows just happens to have a very efficient thread pool model that works for us, but I'm sure the other operating systems have something that you could rely on as well. Any other questions? All right, one more from Gor, only if you're not going to complain. [Audience] ...so that at every suspension point it would automatically... I'm sorry? Yes, so the question was: could Gor ask very nicely for us to implement a type like IAsyncAction with automatic cancellation in the library, one that would automatically do the cancellation for you? And my question is: are you saying that at every suspend point it would automatically check "am I cancelled?" and then do the cancellation? Yes, yes, that is possible; we can certainly do that implicitly, so that's a great idea, and for Gor we'll do anything. Yes, well, he's done so much for us. Yeah, that's a great idea. Yes, yes, yep, it can be a member function too. All right, we've got time for one more. Yeah, so the question is: if we have a function that has a lot of co_awaits in it, like the example at the beginning that was just co_await one thing, co_await another thing, co_await another thing, what is the performance overhead if we just replace those with gets? So if we make it synchronous, the performance overhead is effectively that you've now turned that asynchronous program into a synchronous program, so that function will sit on the thread basically doing nothing while it's waiting for those synchronous operations to complete. That is generally not going to be more efficient if you're dealing with, for example, I/O, which almost all of those functions dealt with, because now you've just got a thread that's sitting there not doing any work, whereas the OS could be using it for something else. So, all right, well, thank you everyone for coming. Oh, Gor, you're subtle. Well, we support both Visual C++ and Clang. What is nodiscard? It's like _Check_return_ in SAL, yes. Yes, sure. So the
question was whether we'd be willing to annotate these functions so that you actually have to await them and not just ignore them or just call get on them, and yes, we would absolutely consider doing that. Yeah, so the focus of yesterday's talk was showing you how all of this is done in standard C++, and we're all about embracing standard C++ for this language projection, so as the language improves, we want to improve; as new features are added to the compilers that we can rely on, we certainly want to take advantage of them and make them available to you. All right, and with that, thank you all for coming.
Info
Channel: CppCon
Views: 8,980
Keywords: Kenny Kerr, James McNellis, CppCon 2016, Computer Science (Field), + C (Programming Language), Bash Films, conference video recording services, conference recording services, nationwide conference recording services, conference videography services, conference video recording, conference filming services, conference services, conference recording, conference live streaming, event videographers, capture presentation slides, record presentation slides, event video recording
Id: v0SjumbIips
Length: 54min 40sec (3280 seconds)
Published: Tue Oct 04 2016