Async for loops in Python

Captions
Hello and welcome, everyone. I'm James Murphy, and today we're talking about asynchronous iteration in Python, otherwise known as how to use the async for loop. First, I'll teach you the syntax of an async for loop and how to use one through one of its main applications: writing an asynchronous web application. In order to see it work, we're also going to have to write a client for that web app. Then I'll finish up by showing you the nitty-gritty details of how an asynchronous iterator actually works in Python. I'll show you two different ways of writing an asynchronous iterator: using an asynchronous generator, and using the low-level protocol for it. So let's get started.

An async for loop in Python looks and feels a lot like a regular for loop. If I had "for x in y: do some stuff", then that means I'm going to loop over the collection y, get each element x out of it, and then do stuff with that x in the loop. Same deal if I have "async for x in y": this means that I'm going to asynchronously loop over the elements of y. The main difference in purpose, though, is the expectation that in order to get the next x out of the collection y, I might need to wait.

Why would I need to wait? Well, the most obvious applications are waiting on I/O operations: waiting on bytes that you're getting from the network, or waiting on bytes that you're reading from disk. If I'm waiting on receiving a file over the network, I can't really control how long it takes the next chunk to get to me. The file might be really big and require many packets, or the client might just be slow and take a while to send them. So while I could in theory wait for each packet and process them one at a time, that would be really wasteful: if this for loop involved a portion where I just slept until a packet was available, then I wouldn't be able to handle any other requests while that waiting was happening. Well, that's kind of the whole purpose of async: I can wait on as many things as I
want to at the same time. So an async for loop is just like a for loop, except it's got the idea of waiting on the next element built into the syntax. That means that, in theory, I could process as many other requests as I want while I'm waiting on packets for this one.

Next up, let's write a simple asynchronous web app that takes advantage of this. "An asynchronous web app?" you say. "Surely we're going to be using FastAPI." Don't get me wrong, FastAPI is a great library (let me know if you want me to make a FastAPI tutorial), but for this video I'd like to use something else. Did you know that FastAPI is kind of just a wrapper around another, much simpler library? If I go to the definition of this FastAPI class, you see that it actually inherits from this Starlette object. And let's take a look at some of the imports in FastAPI: a lot of these imports are also coming from this Starlette library. A lot of the main types of objects from FastAPI are actually Starlette objects or wrappers around Starlette objects. But sadly, while FastAPI has nearly 70,000 stars on GitHub, poor little Starlette only has 9.3k. I think Starlette deserves a lot more attention for itself, so I'm going to use Starlette. It also helps you learn FastAPI, since pretty much everything in FastAPI is a wrapper around a Starlette thing.

Both Starlette and FastAPI allow you to write ASGI applications. That's A-S-G-I, which stands for Asynchronous Server Gateway Interface, not to be confused with ASCII, the character encoding. The main benefit of Starlette is how dang simple it is. All you need to do is create a list of routes to help Starlette map requests for certain paths to certain async functions. The example we're going to be building is just going to echo back the SHA-256 hash of whatever file the user uploads. So whenever we get a POST request to the root of our application, it's just going to call this function, compute_sha256, and pass in this request object. Now, what we're not going to do is store an arbitrarily large uploaded file
in memory and then compute the SHA-256 hash. Attempting to store arbitrarily large files provided by a user is a great way to crash your production servers. But the great thing about many hashing algorithms is that you don't need the whole file in memory in order to get the hash. You can operate on the bytes chunk by chunk as they come in and then throw them out. In that sense, SHA-256 is what's called an online algorithm. So that's what we'll do: we'll write this online_sha256 function, which is going to asynchronously compute the hash as the bytes come in. We'll understand a little bit more about what this stream object actually is later, but for now just know that it's something that we can use an async for loop on.

Given a stream of bytes, how do we compute the sum? Let's create the object that knows how to do the actual complex hashing part. Then we'll asynchronously loop over all the chunks in the stream. As each chunk comes in, we feed it into the hasher so it can update its current value for the hash. The stream will end when the client tells us there are no more bytes to process, at which point we return the current value of the hash. We just await that value and then return it as a plain text response. Believe it or not, this tiny amount of code is actually a fully functioning ASGI application that will compute the SHA-256 hash of an arbitrarily large file.

In order to see our application actually work, we're going to have to run it, which we can do using Uvicorn. Uvicorn is an ASGI web server, so it's like an Apache web server or an nginx web server, but it's specifically for running these kinds of ASGI applications. It also happens to be a project under the same Encode organization that writes Starlette. Let's go ahead and run our application. Looks like it started up just fine, and it's running on port 5000, on localhost of course. But to actually see it work, we of course need a client to send us a file to hash. So, as is almost always the case when we write a server, we're going
to need to write at least a test client in order to make sure that our server works. We certainly don't have to, but let's make an async client as well. We'll be using the httpx library, and would you look at that, it's another library written under this Encode organization. We'll start by creating an asynchronous client. We'll use the client to post some data to the URL that the server is listening on. Don't forget to await the response, because this is an asynchronous client. Once the response is ready, we'll read the response body as bytes. Typically hashes are printed in hex notation, so we'll use the hex method, a built-in method of bytes objects, to convert from a bytes object to a hex string representation.

For testing purposes, let's just put in some fake file data, which we supply using another async def function. Because this function is marked async and it also has yields in it, this is actually an async generator. This is just like a normal generator, except you're allowed to await things inside of it. We're going to take advantage of that by sending "hello world", but with some fake lag in between. "hello world" is small enough that it would normally be sent in a single chunk, but because we put this sleep in there, we can actually force it to be sent in two separate chunks. Yielding an empty bytes object is the way that we tell our client that we have no more data left to send.

So let's scroll down here and go ahead and run the client. When we run the client, we do get a response from the server that looks a lot like a SHA-256 sum. And from the server's point of view, we can see that in the process of computing that hash, it received three separate chunks: hello, world, and empty bytes. In case you're wondering, we do in fact get the correct hash that we would be expecting if we just manually passed "hello world" into the SHA-256 function. Because, you know, you can hash any sequence of bytes; like, instead of the user's password, you might hash their username instead. This is
what unit tests are for.

Getting back to asynchronous for loops: how does that stream object work? How do we create something that can be used with an asynchronous for loop? Well, I already stuck it in there. This fake_file_data that we were using in the asynchronous client code is actually an asynchronous generator, which is an asynchronous iterator. Just like a normal generator is an extremely easy way to write an iterator that you can use with a normal for loop, an async generator is an extremely easy way to write an asynchronous iterator that can be used with an asynchronous for loop. And just like with normal generators, async generators are typically much simpler, and they're what you should prefer over trying to implement the asynchronous iterator protocol.

Let's see how to write our own async iterator to implement rate limiting. Suppose we have some API. In this case we're just doing some fake sleeping and then multiplying by two, but imagine it's a real API. We have a bunch of tasks that we want to send to the API that need to be done in order, so we just loop over them and await the results. But when we run this, although we do wait for the result from the previous call before sending the next one, we're still spamming calls as fast as we possibly can. We sent out all 10 of our requests in about 1 second. But let's just say, for the purposes of this example, that the API has a rate limit: they ask us to send no more than five requests per second. To do that, we'll write this async generator called await_rate_limited. We'll use an async for loop, pass in our awaitables, and tell it what the rate is: how many items per second it's allowed to process.

So let's implement it. First, the reciprocal of the rate is the maximum amount of time that we need to wait. We'll do a normal for loop, looping through the awaitables, awaiting and yielding each one in turn. Let's also compute how much time we spent waiting. But if we've already spent some time waiting for the server to respond, then we don't
need to add that as additional time onto our sleep duration. We can actually subtract it off, so that the total amount of time between requests stays at approximately that one-over-rate amount. This max with zero here is for the case that the server takes a long time to respond and we've already waited our entire required sleep duration, so we don't need to wait at all. As soon as we get our response, we can go ahead immediately with the next API call. And of course, don't forget to use asyncio.sleep instead of time.sleep. And there we go. Let's test it out. This time around, we see that it took about 2 seconds to complete, and that's because we were waiting even though we didn't have to. We were being, you know, nice users of the API and not spamming them too much, staying within their five-requests-per-second limit.

Instead of using an async generator, the other way to do it is utilizing the actual async iterator protocol. Firstly, though, let's go over the normal iterator protocol. When you do a normal for loop, for x in y, Python calls iter(y), which calls y's __iter__ method. Any object that has this __iter__ method is called an iterable, and its sole purpose is to return an iterator. Python then repeatedly calls next on the iterator, which calls the iterator's __next__ method. It continues calling next to get elements from the iterator until the iterator raises a StopIteration exception. The for loop handles all these calls and catching the StopIteration for you. So the iterable is a thing you can iterate over, like a list, and the iterator models the process of visiting each element. For a list, you could imagine an iterator keeps track of the index of the current element.

In terms of actually defining this with classes, there are two protocols: one for the iterable and one for the iterator. The iterable just needs an __iter__ that returns an iterator, and the iterator needs a __next__ that returns the next element or raises StopIteration. But Python made a somewhat
controversial additional choice. They said: we don't want people to have to type "for x in iter(y)". That would be annoying; we want it to automatically call iter. But on the other hand, if someone does manually call it = iter(y), we also want to allow people to iterate over that iterator using a for loop. So, in what will cause confusion for Python students for the rest of time, they required that all iterators are also iterables that return self, meaning an iterator must also have this __iter__ that just returns self. Now we don't have to write "for x in iter(y)"; we just write "for x in y". And if we do happen to have an iterator, we can also write "for x in it". Great, so I snuck a whole lesson on normal iterators into this video.

Now how about async ones? Luckily, it's very nearly the same. To do "async for x in y", first Python calls aiter on y to get an asynchronous iterator. Then it repeatedly calls and awaits anext on the iterator. This continues until eventually the iterator raises a StopAsyncIteration exception. Once again, the async for loop hides all these calls and catches the StopAsyncIteration for you. To implement this with classes, we need two new protocols: the async iterable, which just has an __aiter__ that returns an async iterator, and the async iterator, which has an async function __anext__ to get the next element or raise StopAsyncIteration. Once again, Python wanted to make all async iterators async iterables, so an async iterator must also have an __aiter__ that just returns self. Note that this is just a normal function, not an async function. The only asynchronous part is waiting on the next element, so __anext__ is the only async function here. And that's all you need to know about the low-level protocol.

Let's see how to apply it to the rate limiting example. Here's how you could write the exact same very simplistic rate limiting that we had with the async generator, using the actual async iterator protocol. It's a class, and on construction we take the awaitables and
the rate. We get an iterator to the awaitables, which corresponds to this normal for loop inside the async generator. We compute the max sleep duration, and we use this variable to help us compute the time difference between two subsequent calls to our __anext__ function. __aiter__ always just returns self. The wait_if_needed function encapsulates the logic of waiting however long we need to wait in order to comply with our rate limit. Then getting the next element just means waiting if we need to, updating our last iteration time, and trying to get the next awaitable. Again, this corresponds to a normal for loop, and it's going to throw a normal StopIteration if there are no more awaitables. We have to convert that to a StopAsyncIteration; otherwise we're going to get a RuntimeError. If not, we have an awaitable to await, so we await it and return the result.

Obviously this is way more complicated than the simple async generator that we used here, and it accomplishes exactly the same goal. We can use it in the exact same way: I could literally replace this with the capitalized AwaitRateLimited and it would work just the same. So if async generators work, then why would you ever want to do this? The only reasons you might prefer the class way rather than the generator way are the same reasons that you would normally prefer a class over a function. You would prefer the class approach if you had a lot of state or invariants that you need to keep track of. If you have a lot of operations that modify that state and need to maintain those invariants, then you could factor those out into methods on the class, or just use methods on a class in general to help you organize the process of doing the iteration if it happens to be extremely complex. Now, if your iteration is really that complex, maybe you should be rethinking the way that you're doing things in the first place. But if it really is just that complex, then maybe a class
might be the way.

Anyway, thanks for watching. Check out my mCoding website, where I offer consulting services. As always, thank you to my patrons and donors. Don't forget to slap that like button an odd number of times, and I'll see you in the next one.
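
The captions describe code that is shown on screen but not captured in the text. As a rough, standard-library-only sketch of the three ideas narrated above (hashing a stream with async for, rate limiting with an async generator, and the same rate limiting with the __aiter__/__anext__ protocol), something like the following could work. The names fake_file_data, online_sha256, await_rate_limited, AwaitRateLimited, and wait_if_needed follow the narration; the function bodies, the fake_api helper, the sleep durations, and the use of time.perf_counter are assumptions, and the Starlette, Uvicorn, and httpx parts are deliberately left out.

```python
import asyncio
import hashlib
import time


async def fake_file_data():
    """Async generator standing in for an upload: two chunks with fake lag."""
    yield b"hello "
    await asyncio.sleep(0.05)  # assumed delay; pretends the network is slow
    yield b"world"
    yield b""  # the narration ends the body with an empty bytes object


async def online_sha256(stream):
    """Hash an async stream chunk by chunk, never holding the whole body."""
    hasher = hashlib.sha256()
    async for chunk in stream:  # may suspend here while the next chunk arrives
        hasher.update(chunk)
    return hasher.digest()


async def fake_api(x):
    """Hypothetical API call: sleep briefly, then multiply by two."""
    await asyncio.sleep(0.01)
    return 2 * x


async def await_rate_limited(awaitables, rate):
    """Async generator: await items in order, at most `rate` per second."""
    max_sleep_duration = 1 / rate
    for aw in awaitables:
        start = time.perf_counter()
        yield await aw
        elapsed = time.perf_counter() - start
        # Time already spent waiting counts toward the pause; never negative.
        await asyncio.sleep(max(0.0, max_sleep_duration - elapsed))


class AwaitRateLimited:
    """The same rate limiting written against the async iterator protocol."""

    def __init__(self, awaitables, rate):
        self._awaitables = iter(awaitables)
        self._max_sleep_duration = 1 / rate
        self._last_iteration_time = None

    def __aiter__(self):  # a plain method, not async: just return self
        return self

    async def _wait_if_needed(self):
        if self._last_iteration_time is not None:
            elapsed = time.perf_counter() - self._last_iteration_time
            await asyncio.sleep(max(0.0, self._max_sleep_duration - elapsed))

    async def __anext__(self):
        await self._wait_if_needed()
        self._last_iteration_time = time.perf_counter()
        try:
            aw = next(self._awaitables)
        except StopIteration:
            # A StopIteration escaping a coroutine becomes a RuntimeError,
            # so translate it into the async equivalent.
            raise StopAsyncIteration from None
        return await aw


async def main():
    digest = await online_sha256(fake_file_data())
    tasks = (fake_api(i) for i in range(4))  # lazy, so each one gets awaited
    from_generator = [r async for r in await_rate_limited(tasks, rate=100)]
    tasks = (fake_api(i) for i in range(4))
    from_class = [r async for r in AwaitRateLimited(tasks, rate=100)]
    return digest.hex(), from_generator, from_class


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both rate limiters are consumed the same way with async for, which is the point made in the video: the class version trades brevity for a place to hang extra state and helper methods.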
Info
Channel: mCoding
Views: 59,293
Keywords: async, python
Id: dEZKySL3M9c
Length: 16min 36sec (996 seconds)
Published: Mon Apr 01 2024