Explore structured concurrency in Swift - WWDC2021 - 10134

Captions
[Music] Hi, I'm Kavon, and I'll be joined by my colleague Joe later on. Swift 5.5 introduces a new way to write concurrent programs using a concept called structured concurrency. The ideas behind structured concurrency are based on structured programming, which is so intuitive that you rarely think about it, but thinking about it will help you understand structured concurrency. So let's dive in.

In the early days of computing, programs were hard to read because they were written as a sequence of instructions where control flow was allowed to jump all over the place. You don't see that today because languages use structured programming to make control flow more uniform. For example, the if-then statement uses structured control flow. It specifies that a nested block of code is only conditionally executed while moving from top to bottom. In Swift, that block also respects static scoping, meaning that names are only visible if they are defined in an enclosing block. This also means that the lifetime of any variables defined in a block will end when leaving the block. So structured programming with static scope makes control flow and variable lifetime easy to understand. More generally, structured control flow can be sequenced and nested together naturally. This lets you read your entire program top to bottom.

So those are the fundamentals of structured programming. As you can imagine, it is easy to take for granted because it is so intuitive for us today. But today's programs feature asynchronous and concurrent code, and they have not been able to use structured programming to make that code easier to write. First, let's consider how structured programming makes asynchronous code simpler. Say that you need to fetch a bunch of images from the internet and resize them to be thumbnails sequentially. This code does that work asynchronously, taking in a collection of strings that identify the images. You'll notice this function does not return a value when called. That's because the function passes its
result or an error to a completion handler it was given. This pattern allows the caller to receive an answer at a later time. As a consequence of that pattern, this function cannot use structured control flow for error handling. That's because it only makes sense to handle errors thrown out of a function, not into one. This pattern also prevents you from using a loop to process each thumbnail. Recursion is required because the code that runs after the function completes must be nested within the handler.

Now let's take a look at the previous code, but rewritten to use the new async/await syntax, which is based on structured programming. I've dropped the completion handler argument from the function. Instead, it is annotated with async and throws in its type signature. It also returns a value instead of nothing. In the body of the function, I use await to say that an asynchronous action happens, and no nesting is required for the code that runs after that action. This means that I can now loop over the thumbnails to process them sequentially. I can also throw and catch errors, and the compiler will check that I didn't forget. For an in-depth look at async/await, check out the session Meet async/await in Swift.

So this code is great, but what if you're producing thumbnails for thousands of images? Processing each thumbnail one at a time is no longer ideal. Plus, what if each thumbnail's dimensions must be downloaded from another URL instead of being a fixed size? Now there is an opportunity to add some concurrency, so multiple downloads can happen in parallel. You can create additional tasks to add concurrency to a program. Tasks are a new feature in Swift that work hand in hand with async functions. A task provides a fresh execution context to run asynchronous code. Each task runs concurrently with respect to other execution contexts. They will be automatically scheduled to run in parallel when it is safe and efficient to do so. Because tasks are deeply integrated into Swift, the compiler can help prevent
some concurrency bugs. Also, keep in mind that calling an async function does not create a new task for the call. You create tasks explicitly. There are a few different flavors of tasks in Swift because structured concurrency is about the balance between flexibility and simplicity. So for the remainder of this session, Joe and I will introduce and discuss each kind of task to help you understand their trade-offs.

Let's start with the simplest of these tasks, which is created with a new syntactic form called an async let binding. To help you understand this new syntactic form, I want to first break down the evaluation of an ordinary let binding. There are two parts: the initializer expression on the right side of the equals and the variable's name on the left. There may be other statements before or after the let, so I'll include those here too. Once Swift reaches a let binding, its initializer will be evaluated to produce a value. In this example, that means downloading data from a URL, which could take a while. After the data has been downloaded, Swift will bind that value to the variable name before proceeding to the statements that follow. Notice that there is only one flow of execution here, as traced by the arrows through each step.

Since the download could take a while, you want the program to start downloading the data and keep doing other work until the data is actually needed. To achieve this, you can just add the word async in front of an existing let binding. This turns it into a concurrent binding called an async let. The evaluation of a concurrent binding is quite different from a sequential one, so let's learn how it works. I will start just at the point before encountering the binding. To evaluate a concurrent binding, Swift will first create a new child task, which is a subtask of the one that created it. Because every task represents an execution context for your program, two arrows will simultaneously come out of this step. The first arrow is for the child task, which will immediately
begin downloading the data. The second arrow is for the parent task, which will immediately bind the variable result to a placeholder value. This parent task is the same one that was executing the preceding statements. While the data is being downloaded concurrently by the child, the parent task continues to execute the statements that follow the concurrent binding. But upon reaching an expression that needs the actual value of the result, the parent will await the completion of the child task, which will fulfill the placeholder for result. In this example, our call to URLSession could also throw an error. This means that awaiting the result might give us an error, so I need to write try to take care of it. And don't worry, reading the value of result again will not recompute its value.

Now that you've seen how async let works, you can use it to add concurrency to the thumbnail fetching code. I factored a piece of the previous code that fetches a single image into its own function. This new function here is also downloading data from two different URLs: one for the full-sized image itself, and the other for metadata, which contains the optimal thumbnail size. Notice that with a sequential binding, you write try await on the right side of the let, because that's where an error or suspension would be observed. To make both downloads happen concurrently, you write async in front of both of these lets. Since the downloads are now happening in child tasks, you no longer write try await on the right side of the concurrent binding. Those effects are only observed by the parent task when using the variables that are concurrently bound, so you write try await before the expressions reading the metadata and the image data. Also notice that using these concurrently bound variables does not require a method call or any other changes. Those variables have the same type that they did in a sequential binding.

Now, these child tasks I've been talking about are actually part of a hierarchy called a task
tree. This tree is not just an implementation detail. It's an important part of structured concurrency. It influences the attributes of your tasks, like cancellation, priority, and task-local variables. Whenever you make a call from one async function to another, the same task is used to execute the call, so the function fetchOneThumbnail inherits all attributes of that task. When creating a new structured task, like with async let, it becomes the child of the task that the current function is running on. Tasks are not the child of a specific function, but their lifetime may be scoped to it.

The tree is made up of links between each parent and its child tasks. A link enforces a rule that says a parent task can only finish its work if all of its child tasks have finished. This rule holds even in the face of abnormal control flow, which would prevent a child task from being awaited. For example, in this code I first await the metadata task before the image data task. If the first awaited task finishes by throwing an error, the fetchOneThumbnail function must immediately exit by throwing that error. But what will happen to the task performing the second download? During the abnormal exit, Swift will automatically mark the unawaited task as cancelled and then wait for it to finish before exiting the function. Marking a task as cancelled does not stop the task. It simply informs the task that its results are no longer needed. In fact, when a task is cancelled, all subtasks that are descendants of that task will be automatically cancelled too. So if the implementation of URLSession creates its own structured tasks to download the image, those tasks will be marked for cancellation. The function fetchOneThumbnail finally exits by throwing the error once all of the structured tasks it created, directly or indirectly, have finished. This guarantee is fundamental to structured concurrency. It prevents you from accidentally leaking tasks by helping you manage their lifetimes, much like how ARC automatically
manages the lifetime of memory.

So far, I have given you an overview of how cancellation propagates, but when does the task finally stop? If the task is in the middle of an important transaction or has open network connections, it would be incorrect to just halt the task. That's why task cancellation in Swift is cooperative. Your code must check for cancellation explicitly and wind down execution in whatever way is appropriate. You can check the cancellation status of the current task from any function, whether it is async or not. This means that you should implement your APIs with cancellation in mind, especially if they involve long-running computations. Your users may call into your code from a task that can be cancelled, and they will expect the computation to stop as soon as possible.

To see how simple it is to use cooperative cancellation, let's go back to the thumbnail fetching example. Here I have rewritten the original function that was given all of the thumbnails to fetch so that it uses the fetchOneThumbnail function instead. If this function was called within a task that was cancelled, we don't want to hold up our application by creating useless thumbnails, so I can just add a call to checkCancellation at the start of each loop iteration. This call only throws an error if the current task has been cancelled. You can also obtain the cancellation status of the current task as a Boolean value if that is more appropriate for your code. Notice that in this version of the function, I'm returning a partial result: a dictionary with only some of the thumbnails requested. When doing this, you must ensure that your API clearly states that a partial result may be returned. Otherwise, task cancellation could trigger a fatal error for your users, because their code requires a complete result even during cancellation.

So far, you've seen that async let provides a lightweight syntax for adding concurrency to your program while capturing the essence of structured programming. The next kind of task I
want to tell you about is called a group task. Group tasks offer more flexibility than async let without giving up all the nice properties of structured concurrency. As we saw earlier, async let works well when there's a fixed amount of concurrency available. Let's consider both functions that I discussed earlier. For each thumbnail ID in the loop, we call fetchOneThumbnail to process it, which creates exactly two child tasks. Even if we inline the body of that function into this loop, the amount of concurrency will not change. async let is scoped like a variable binding. That means the two child tasks must complete before the next loop iteration begins. But what if we want this loop to kick off tasks to fetch all of the thumbnails concurrently? Then the amount of concurrency is not known statically, because it depends on the number of IDs in the array.

The right tool for this situation is a task group. A task group is a form of structured concurrency that is designed to provide a dynamic amount of concurrency. You can introduce a task group by calling the withThrowingTaskGroup function. This function gives you a scoped group object to create child tasks that are allowed to throw errors. Tasks added to a group cannot outlive the scope of the block in which the group is defined. Since I have placed the entire for loop inside of the block, I can now create a dynamic number of tasks using the group. You create child tasks in a group by invoking its async method. Once added to a group, child tasks begin executing immediately and in any order. When the group object goes out of scope, the completion of all tasks within it will be implicitly awaited. This is a consequence of the task tree rule I described earlier, because group tasks are structured too. At this point, we've already achieved the concurrency that we wanted: one task for each call to fetchOneThumbnail, which itself will create two more tasks using async let. That's another nice property of structured concurrency. You can use async let within
group tasks or create task groups within async let tasks, and the levels of concurrency in the tree compose naturally.

Now, this code is not quite ready to run. If we tried to run it, the compiler would helpfully alert us to a data race issue. The problem is that we're trying to insert a thumbnail into a single dictionary from each child task. This is a common mistake: when increasing the amount of concurrency in your program, data races are accidentally created. This dictionary cannot handle more than one access at a time, and if two child tasks tried to insert thumbnails simultaneously, that could cause a crash or data corruption. In the past, you had to investigate those bugs yourself, but Swift provides static checking to prevent those bugs from happening in the first place. Whenever you create a new task, the work that the task performs is within a new closure type called a @Sendable closure. The body of a @Sendable closure is restricted from capturing mutable variables in its lexical context, because those variables could be modified after the task is launched. This means that the values you capture in a task must be safe to share, for example because they are value types like Int and String, or because they are objects designed to be accessed from multiple threads, like actors and classes that implement their own synchronization. We have a whole session dedicated to this topic called Protect mutable state with Swift actors, so I encourage you to check it out.

To avoid the data race in our example, you can have each child task return a value. This design gives the parent task the sole responsibility of processing the results. In this case, I specified that each child task must return a tuple containing the String ID and UIImage for the thumbnail. Then, inside each child task, instead of writing to the dictionary directly, I have them return the key-value tuple for the parent to process. The parent task can iterate through the results from each child task using the new for await loop. The for await loop
obtains the results from the child tasks in order of completion. Because this loop runs sequentially, the parent task can safely add each key-value pair to the dictionary. This is just one example of using the for await loop to access an asynchronous sequence of values. If your own type conforms to the AsyncSequence protocol, then you can use for await to iterate through it too. You can find out more in the Meet AsyncSequence session.

While task groups are a form of structured concurrency, there is a small difference in how the task tree rule is implemented for group tasks versus async let tasks. Suppose when iterating through the results of this group, I encounter a child task that completed with an error. Because that error is thrown out of the group's block, all tasks in the group will be implicitly cancelled and then awaited. This works just like async let. The difference comes when your group goes out of scope through a normal exit from the block. Then cancellation is not implicit. This behavior makes it easier for you to express the fork-join pattern using a task group, because the jobs will only be awaited, not cancelled. You can also manually cancel all tasks before exiting the block using the group's cancelAll method. Keep in mind that no matter how you cancel a task, cancellation automatically propagates down the tree. async let and group tasks are the two kinds of tasks that provide scoped structured tasks in Swift. Now I'll hand things off to Joe, who will tell you about unstructured tasks.

Thanks, Kavon. Hi, I'm Joe. Kavon showed you how structured concurrency simplifies error propagation, cancellation, and other bookkeeping when you add concurrency to a program with a clear hierarchy to the tasks. But we know that you don't always have a hierarchy when you're adding tasks to your program. Swift also provides unstructured task APIs, which give you a lot more flexibility at the expense of needing a lot more manual management. There are a lot of situations where a task might not fall into a clear
hierarchy. Most obviously, you might not have a parent task at all if you're trying to launch a task to do async computation from non-async code. Alternatively, the lifetime you want for a task might not fit the confines of a single scope or even a single function. You may, for instance, want to start a task in response to a method call that puts an object into an active state and then cancel its execution in response to a different method call that deactivates the object. This comes up a lot when implementing delegate objects in AppKit and UIKit. UI work has to happen on the main thread, and as the Swift actors session discusses, Swift ensures this by declaring UI classes to belong to the main actor.

Let's say we have a collection view and we can't yet use the collection view data source APIs. Instead, we want to use our fetchThumbnails function we just wrote to grab thumbnails from the network as the items in the collection view are displayed. However, the delegate method is not async, so we can't just await a call to an async function. We need to start a task for that, but that task is really an extension of the work we started in response to the delegate action. We want this new task to still run on the main actor with UI priority. We just don't want to bound the lifetime of the task to the scope of the single delegate method. For situations like this, Swift allows us to construct an unstructured task. Let's move that asynchronous part of the code into a closure and pass that closure to construct an async task.

Now here's what happens at run time. When we reach the point of creating the task, Swift will schedule it to run on the same actor as the originating scope, which is the main actor in this case. Meanwhile, control returns immediately to the caller. The thumbnail task will run on the main thread when there's an opening to do so, without immediately blocking the main thread on the delegate method. Constructing tasks this way gives us a halfway point between structured and unstructured
code. A directly constructed task still inherits the actor, if any, of its launch context, and it also inherits the priority and other traits of the origin task, just like a group task or an async let would. However, the new task is unscoped. Its lifetime is not bound by the scope of where it was launched. The origin doesn't even need to be async. We can create an unscoped task anywhere. In trade for all of this flexibility, we must also manually manage the things that structured concurrency would have handled automatically. Cancellation and errors won't automatically propagate, and the task's result will not be implicitly awaited unless we take explicit action to do so.

So we kicked off a task to fetch thumbnails when the collection view item is displayed, and we should also cancel that task if the item is scrolled out of view before the thumbnails are ready. Since we're working with an unscoped task, that cancellation isn't automatic. Let's implement it now. After we construct the task, let's save the value we get. We can put this value into a dictionary keyed by the row index when we create the task, so that we can use it later to cancel that task. We should also remove it from the dictionary once the task finishes, so we don't try to cancel a task if it's already finished. Note here that we can access the same dictionary inside and outside of that async task without getting a data race flagged by the compiler. Our delegate class is bound to the main actor, and the new task inherits that, so they'll never run together in parallel. We can safely access the stored properties of main actor-bound classes inside this task without worrying about data races. Meanwhile, if our delegate is later told that the same table row has been removed from display, then we can call the cancel method on the value to cancel the task.

So now we've seen how we can create unstructured tasks that run independent of a scope while still inheriting traits from the task's originating context. But sometimes you don't want to
inherit anything from your originating context. For maximum flexibility, Swift provides detached tasks. Like the name suggests, detached tasks are independent from their context. They're still unstructured tasks. Their lifetimes are not bound to their originating scope. But detached tasks don't pick anything else up from their originating scope either. By default, they aren't constrained to the same actor and don't have to run at the same priority as where they were launched. Detached tasks run independently with generic defaults for things like priority, but they can also be launched with optional parameters to control how and where the new task gets executed.

Let's say that after we fetch thumbnails from the server, we want to write them to a local disk cache so we don't hit the network again if we try to fetch them later. That caching doesn't need to happen on the main actor, and even if we cancel fetching all the thumbnails, it's still helpful to cache any thumbnails we did fetch. So let's kick off caching by using a detached task. When we detach a task, we also get a lot more flexibility in setting up how that new task executes. Caching should happen at a lower priority that doesn't interfere with the main UI, and we can specify background priority when we detach this new task.

Let's plan ahead for a moment now. What should we do in the future if we have multiple background tasks we want to perform on our thumbnails? We could detach more background tasks, but we could also utilize structured concurrency inside of our detached task. We can combine all the different kinds of tasks together to exploit each of their strengths. Instead of detaching an independent task for every background job, we can set up a task group and spawn each background job as a child task into that group. There are a number of benefits of doing so. If we do need to cancel a background task in the future, using a task group means we can cancel all of the child tasks just by cancelling that top-level detached task. That
cancellation will then propagate to the child tasks automatically, and we don't need to keep track of an array of handles. Furthermore, child tasks automatically inherit the priority of their parent. To keep all of this work in the background, we only need to background the detached task, and that will automatically propagate to all of its child tasks, so you don't need to worry about forgetting to transitively set background priority and accidentally starving UI work.

At this point, we've seen all the primary forms of tasks there are in Swift. async let allows for a fixed number of child tasks to be spawned as variable bindings, with automatic management of cancellation and error propagation if the binding goes out of scope. When we need a dynamic number of child tasks that are still bounded to a scope, we can move up to task groups. If we need to break off some work that isn't well scoped, but which is still related to its originating task, we can construct unstructured tasks, but we need to manually manage those. And for maximum flexibility, we also have detached tasks, which are manually managed tasks that don't inherit anything from their origin.

Tasks and structured concurrency are just one part of the suite of concurrency features Swift supports. Be sure to check out all these other great talks to see how it fits in with the rest of the language. Meet async/await in Swift gives you more details about async functions, which give us the structured basis for writing concurrent code. Actors provide data isolation to create concurrent systems that are safe from data races. See the Protect mutable state with Swift actors session to learn more about how. We saw for await loops on task groups, and those are just one example of AsyncSequence, which provides a standard interface for working with asynchronous streams of data. The Meet AsyncSequence session goes deeper into the available APIs for working with sequences. Tasks integrate with the core OS to achieve low overhead and high scalability, and
the Swift concurrency: Behind the scenes session gives more technical details about how that's accomplished. All these features come together to make writing concurrent code in Swift easy and safe, letting you write code that gets the most out of your devices while still focusing on the interesting parts of your app, thinking less about the mechanics of managing concurrent tasks or the worries of potential bugs caused by multi-threading. Thank you for watching. I hope you enjoy the rest of the conference. [Music]
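
The captions describe the on-screen code without showing it. As a rough sketch of the two structured forms discussed (async let for a fixed pair of child tasks, and a throwing task group for a dynamic number of them), something like the following should work. The Thumbnail type, the ThumbnailError enum, and the two simulated download functions are hypothetical stand-ins, not the session's code; they do no networking so the example runs anywhere. The function names fetchOneThumbnail and fetchThumbnails follow the talk, but their signatures here are assumptions. The sketch calls addTask, which is the released standard library name for what the talk calls the group's async method.

```swift
// Hypothetical value type standing in for the session's UIImage thumbnails.
struct Thumbnail: Equatable, Sendable {
    let id: String
    let width: Int
    let byteCount: Int
}

// Hypothetical error used only by this sketch.
enum ThumbnailError: Error {
    case emptyID
}

// Simulated "image download": stands in for a URLSession call, does no real I/O.
func downloadImageBytes(for id: String) async throws -> [UInt8] {
    if id.isEmpty { throw ThumbnailError.emptyID }
    try await Task.sleep(nanoseconds: 1_000_000) // pretend the download takes time
    return Array(id.utf8)
}

// Simulated "metadata download" that yields the desired thumbnail width.
func downloadThumbnailWidth(for id: String) async throws -> Int {
    if id.isEmpty { throw ThumbnailError.emptyID }
    try await Task.sleep(nanoseconds: 1_000_000)
    return 10 * id.count
}

// async let: exactly two child tasks, both scoped to this function.
func fetchOneThumbnail(withID id: String) async throws -> Thumbnail {
    async let bytes = downloadImageBytes(for: id)
    async let width = downloadThumbnailWidth(for: id)
    // try and await are written where the concurrently bound values are read.
    let (imageBytes, thumbnailWidth) = try await (bytes, width)
    return Thumbnail(id: id, width: thumbnailWidth, byteCount: imageBytes.count)
}

// Task group: one child task per ID, so the amount of concurrency is dynamic.
func fetchThumbnails(for ids: [String]) async throws -> [String: Thumbnail] {
    var thumbnails: [String: Thumbnail] = [:]
    try await withThrowingTaskGroup(of: (String, Thumbnail).self) { group in
        for id in ids {
            group.addTask {
                // Cooperative cancellation: stop early if the work is no longer needed.
                try Task.checkCancellation()
                let thumbnail = try await fetchOneThumbnail(withID: id)
                // Return the pair instead of writing to the shared dictionary.
                return (id, thumbnail)
            }
        }
        // Only the parent task mutates the dictionary, one result at a time.
        for try await (id, thumbnail) in group {
            thumbnails[id] = thumbnail
        }
    }
    return thumbnails
}
```

From an async context, a call such as try await fetchThumbnails(for: ["cat", "puppy"]) would return a dictionary with one entry per ID. Passing an empty ID makes one child task throw, which exits the group's block with that error and cancels the remaining children, matching the error behavior described in the talk.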
Info
Channel: iHTC boy
Views: 131
Rating: 5 out of 5
Id: lGPdyK0mrgE
Length: 27min 54sec (1674 seconds)
Published: Sat Jun 12 2021