Python Generators Explained

Video Statistics and Information

Captions
[Music] Hello everybody, and welcome to another YouTube video. In today's video I'm going to be discussing generators in Python. Generators are highly related to iterators and would be considered a more advanced or expert-level feature of the Python programming language. That said, they're not overly complicated. In fact, I think I can give you a decent explanation that most of you, regardless of your Python knowledge, can understand. With that said, let's dive into the video after a quick word from our sponsor.
Before we get started, I need to thank HiCounselor for sponsoring this video. HiCounselor's job search accelerator program combines AI-powered technology and professional mentorship from top tech companies to help recent graduates and young professionals get a lot more interviews and fast-track them to landing their dream jobs in tech in six weeks on average. HiCounselor has no upfront enrollment fees, and their program is completely free until you land a job. Admitted candidates get help with their resume, LinkedIn, networking, and much more. HiCounselor also uses a proprietary tool that does professional networking on behalf of their candidates. This, in combination with the unique network of over 250 hiring partners, means candidates get 10 times more interviews on average and spend less time searching the web for jobs. Multiple job offers are common amongst HiCounselor candidates, and the program has a 100% placement success rate. You can submit an application for free through the HiCounselor website or attend an upcoming information session to learn more. Check out HiCounselor from the link in the description today.
All right, so let's go ahead and dive into the video. The first thing that I need to do before we can really talk in depth about generators is describe what an iterator is and talk to you about how iterators work in Python. Really, an iterator is an object that allows you to loop through a sequence of numbers, or some sequence of data,
some sequence of something, without having to store all of the different items in that sequence in memory. I'll give you an example of what I mean in a second, but a generator is really just a new form of syntax that was introduced in Python version 3 that allows you to create an iterator without having to go through a more tedious and less elegant process. So generators and iterators are pretty much the same thing; generators just use a new syntax and are much easier to create than your standard iterators, which we'll see here in a second. I'm going to start by explaining iterators, taking a deep dive on those, and then in the later half of the video we will get into generators and how to use the new syntax to create more elegant iterators, essentially.
All right, so let me start by introducing a problem to you. Let's say we didn't have the function range and we didn't have a while loop. Okay, no while loop and no range function, and we wanted to loop through the numbers one to ten: one, two, three, four, five, six, seven, eight, nine, ten. Well, without using the range function, without using a while loop, and without coming up with something super creative, just simply thinking about this, what we would probably do is the following. We might create some list, and then in this list we could store our numbers: 7, 8, 9, and 10. Then what I could do is just loop through this list. I could say for element in x, and then I could just print the element, and quite simply this is going to loop through all of these values. We're looping through the numbers 1 to 10.
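The list-based loop described above can be sketched as follows. This is a minimal illustration, not the exact code from the video; the variable name numbers is an assumption (the video calls its list x).

```python
# Store every number up front, then loop over the stored list.
# "numbers" is a hypothetical name chosen for this sketch.
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for element in numbers:
    # Each element is read back out of the list that already holds all ten values.
    print(element)
```

All ten values exist in memory before the loop starts, which is the cost the rest of the video works to avoid.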
Now, the problem with this is that I'm storing these 10 numbers in memory in a list, and for me to be able to loop through them I need to store all of them in a data structure. That's not very efficient. With 10 numbers this is fine, but you can imagine if I wanted to loop through 100,000 numbers, a million numbers, a billion numbers, it really doesn't make any sense at all for me to store the entire sequence of numbers in memory, especially if all I'm going to do is process them one at a time and I don't need to know what all of these are as I loop through the numbers. So it doesn't make any sense for me to store this in a list, and that is why, of course, we use something like the range function.
So I say for, I guess I won't say element in this case, I can say i, for i in range, and then I do something like 1 to 11, and then I print element. Here I get the same thing. Oops, sorry, I'm going to print i, not element. Here I get the exact same results, looping through the range of 1 to 11, except when I use the range function I don't actually need to store all of the numbers 1 through 10 in a data structure. So if I were to look at the size of this data structure versus the size of the range function, we should see that this data structure has a larger size than the range function, because the range function doesn't need to store all of these numbers. In fact, if I want to do that, I can. I'm going to import sys and I'm going to say print sys.getsizeof(x). We can see that that is giving us a value of 136. Then I'm just going to take my range function here; I'm going to print sys.getsizeof, and then the range function. Does that look good? Okay, and we see that we get a size of 48 for the range function. Obviously this doesn't prove exactly what I'm saying; I'm just trying to show you that this uses more space than the range function, because the range function doesn't need to create this list for us to be able to loop through it, and that is because
the range function returns to us an iterator. Okay, so now we can really talk about what an iterator is. An iterator, as I described before, is really something that allows us to loop through a sequence of numbers or of some data without having to store them all. So the range function is a great example of an iterator, because it lets us loop through all these numbers without having to store them in some data structure.
I'm going to show you one more example of an iterator in Python that you may not know is an iterator, and just describe briefly how that works, because it will help our understanding. I'm going to use this function called map. If you don't know what it does, don't worry; once I write the code here I'll explain exactly what it does. I'm going to say i to the exponent 2, and then pass x. So what the map function does is it takes some data structure, in this case we have a list, and it maps all of the values in this data structure to some function. Pretty much, it applies this function to all of the values in the data structure, and then it actually returns to us an iterator that contains all of the results of this function called on all of these values. Essentially, what this is doing is generating the squares of all these values, so if I were to look at this as a list I would have 1, 4, 9, 16, so on and so forth, because I'm applying this function right here, which is just generating the square of all the values from the list.
Now the thing is, this map function here doesn't actually store a new list that has all of the results of these function calls. This function here is actually a generator, or an iterator, that allows us to iterate through all of these results, so all of the results of these values passed to this function, without actually storing them. I understand that's a little bit confusing, but if you look at y, if we print out y here, we see this is a map object. This actually isn't something that's storing all of the values; it's not a list, right? Now,
if I were to look at the list representation, so I print out the list, we see that we do get all of the values that we expect, but this is not the default behavior. It's not storing this list; it only creates this list for us when we call this list function. That's a common misconception, so let me show you what I mean and how this is actually an iterator. If I go something like for i in y, and then I print i, we see that we get all of the values that we expect. The thing is, we only generated these values once we started looping through this map object; they weren't already generated and stored in memory. When I say for i in y, on the first iteration it sends this 1 to this function and gives me the result, then it sends this 4, or sorry, this 2 to the function and gives me the result, sends the 3 to the function, gives me the result. It's doing it while I'm looping through it; it didn't do it before and store all of it in memory.
To give you one last proof here, I'm going to print sys.sizeof, or I guess this is getsizeof, and then of y, and let's see what we get. We get a size of 48 again. If you compare that to, let's say, the list of y, you'll see that the list of y is much larger than simply y, because when we do the list it actually generates the list for us. Hopefully this is making sense. The whole point is that we didn't have to store the sequence in memory; we could generate the sequence as we looped through it, and that is really what an iterator is.
Okay, so now I want to dive into how an iterator actually works and how you create an iterator. Really, what a for loop is doing is calling a special method on all of these iterator objects, so this map object, this range object, that gives us the next item, or the next piece of data, in the sequence that we're looping through. There's actually this function here, and it's called next, and what this does is give us the next value in the iteration through an iterator. So if I go here and I print the next of y, you
can see that we get 1, right? And then if I print this again we get 4, and if I print this, oops, again we get 9, and again, oops, I did not mean to create a new file there, we get 16. Then notice here, I've gone through y four times, and now if I start looping through y, for i in y, and I print i, we're going to start looping at 25, because that was the next value in the iteration. Notice I didn't have 1, 4, 9, 16 and then those repeating again and then looping through the rest of the values; it continued right at 25. Just to make this even more clear, I'm going to say print "for loop starts". Okay, and notice we have 1, 4, 9, 16, the for loop starts, and then we continue through the rest of the iteration.
Hopefully this is making sense, but really what a for loop is doing is calling this next function on your iterator object, and that is returning to you i, which is going to be the next value in the sequence. So when I've already looped through the first four values in the sequence and then I go to the for loop here, well, the for loop is just going to continue to call next until eventually there are no more items in the sequence. That's why it starts after we've already looped through the values, if that makes sense, or it starts where we left off, looping through the rest of the sequence. Hopefully I'm articulating that okay. That is what an iterator does, and that's what a for loop does. An iterator has this next method on it, and when we use this next function, we get access to the next value in the sequence.
Now, the same thing works when I do this: y.__next__(). This is a dunder method, a double underscore method; some of you may be familiar with that. The next function is the exact same as me calling __next__ on an object, and just to show this to you, if I run this, you see we get the same thing. Okay, nice, so that is how that
works. Now what I'm going to do for you is write a for loop, or write what the actual implementation of a for loop is, sorry, so you can see how it actually works on an iterator. So this is our for loop, but if I were to write this for loop and peel away the disguise here where it says for, this is what it would look like. I say while True, and then I'm going to try to get the next value in my iterator, so I'm going to say value equals next, and this is going to be of y. Then I'm going to except; the exception here is StopIteration, and I'll describe this in a second. Then I'm going to break, and I'll just print done, and then here I will print the value. Okay, so this for loop and this while loop are actually identical. Let me get rid of this, though, because we don't need this anymore.
Okay, again, this for loop and this while loop are identical. We are looping through this y iterator. In this case I'm saying for i in y, and what this for loop actually does is what I've written here. We have an infinite loop, so we have while True, because we don't know how many items are in the sequence, and then what we do is try to get the next value from our iterator. We keep calling this next function, or in the same way we could say y.__next__(), but the next function is the convention. Then we print out the value, and then we except StopIteration. At any point in time when there are no more values in the sequence or in the iterator, the iterator is going to raise an exception. This exception is StopIteration, and in that case we end the while loop and we're done. So just keep in mind this for loop and this while loop are literally the exact same thing.
All right, so let me delete this for loop here and just run this for you, and you can see that this works. We print out all the values, and as soon as we find the StopIteration error we print done. Now let me show you what happens if I don't have this except here. So if
I just do this, I just try to get the next value from my iterator, and I run this, notice we get an exception and it just says StopIteration. That's obviously why I excepted the StopIteration error, because you need to do that, otherwise your program is going to crash, and that is exactly what a for loop does. Okay, so there you go, that is the basics of how you loop through an iterator.
Now let me show you doing the same thing with the range function. Let's say x is equal to, in lowercase, x is equal to range, and then we have 1, 11. That's the range we want to loop through. Let's start by just printing x. Notice when we print x we actually just get a string representation of the range function. This is a string; it just prints this out for us. Now watch what happens, though, if I try to call next on x. If I do this, notice we get an error: it says range object is not an iterator. The reason it's not an iterator is because for us to actually get the iterator from the range object, we need to call another special method, which is called iter. There is this iter method, or this iter function, sorry, that returns an iterator from an object, and there's also the __iter__ dunder method. What this iter method does is the exact same thing as this iter function. If you call the iter method, it's literally identical to calling the iter function, just like the next function and the next dunder method.
The whole point of me saying this to you, though, is that if we want to get the iterator from x we need to call iter like that. So if I call iter there, now this is actually going to work. If I get the next of iter x, we don't get an error. I didn't print out what that was; I can print it, and you can see that that is now correct. There you go, so that is how that works. The whole thing is, when I do something like for i in x, what actually happens is we're saying for i in the iter of x, and then it calls the
iter method, gets the iterator, and then calls the next method on the iterator object that we got from the object we're looping through. Hopefully that makes sense, but the whole point is you start by calling that iter method. That iter method returns to you the iterator, which has a next method on it, and then you call the next method on the iterator. It's a little bit confusing, because on map you don't need to do that and on range you do need to do that. The whole point is that there are two methods that make up an iterator: one is the iter method, the next is the next method.
So now, what I'm going to end off the explanation of iterators by doing is showing you how you make your own iterator using the old legacy syntax, not the generator syntax. The way you do this is you create a class. I'm going to make a class, and I'm just going to call this class Iter. You can define the init and any other methods in this class just like you would for any other class. So I'm going to say self, I'm just going to take some value n, and I'm going to say self.n is equal to n, and n will be like the maximum value I want to go to while I'm iterating.
All right, then if I want to make this an iterator I need to implement two methods. Those methods are iter and next. So I'm going to say def __iter__ like that, and all I'm going to do here is return myself, and I'm going to set self.current equal to zero. So when I call the iter method, I'm going to initialize something, this current variable that's going to store the current value of my iteration, and then I'm going to return myself. The reason I'm returning myself is because I'm not creating any special iterator object here; I'm just returning this object itself, the current instance that I'm calling this on, so that it can use that as the iterator. That might be a little bit confusing, but we'll keep going.
Now I'm going to implement the next method, so __next__, and inside of here what I'm going to do is say self.current plus equals 1, and then I'm going to return self.current. Now I need to change a few things. First, I'm going to make self.current actually equal to negative 1. And before I return self.current, I'm going to say if self.current is greater than or equal to self.n, then what I want to do is raise StopIteration. This is the way that you need to implement an iterator: inside of your next method you have some base case, or some exit case, that raises the StopIteration, and then you increment your iterator, or do whatever it is you're doing with your sequence, and then eventually you return the value. The reason I'm initializing current at negative one is because I start by adding one to whatever my current iterator value is, so negative one goes to zero. So the first value I return is zero, then one, then two, so on and so forth, and then eventually I raise the StopIteration error. It's not until I call this iter method again to get a new iterator that I reset my current value to negative one.
Okay, hopefully that makes sense, but let's just make the iterator now. Let's say x is equal to Iter, let's pass the value 5 in here, and now let's try to loop through this iterator. I'm going to go for, and I'll say i in x, I'm going to print i, and let's see if we get an error. No, we don't; we get 0, 1, 2, 3, 4.
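The class-based iterator just described can be sketched as below. This is a reconstruction from the spoken description, so treat the exact layout as an assumption; the class name Iter, the attribute names n and current, and the starting value of -1 follow what is said in the captions.

```python
class Iter:
    """Iterator that counts from 0 up to (but not including) n."""

    def __init__(self, n):
        self.n = n  # upper bound, exclusive

    def __iter__(self):
        # Reset the position each time a new iteration begins,
        # then hand back this same object as the iterator.
        self.current = -1
        return self

    def __next__(self):
        self.current += 1
        if self.current >= self.n:
            # Exit case: signal that the sequence is exhausted.
            raise StopIteration
        return self.current


x = Iter(5)
for i in x:
    print(i)  # prints 0, 1, 2, 3, 4 on separate lines
```

Because __iter__ resets current, looping over the same object a second time starts again from 0.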
There you go, so we've just created an iterator. We can now loop through this object because we have the iter method and because we have the next method. Now watch what happens if I remove the iter method, though. If I remove this, it says Iter object is not iterable. The reason it's not iterable is because it couldn't get the iterator from it, because we didn't implement the iter method. Now let's remove this next method and let's see what happens here: returned non-iterator of a type. So it returned something that didn't have the next method on it, which means it is not an iterator. Hopefully that is clear, but that's how that works.
Okay, so now let me just show you how we would manually call the next method on this. I could do something like print and then next of x, but when I do this, take a guess at what's going to happen. Notice I get an error: it says Iter object has no attribute current. The reason it has no attribute current is because I didn't create the iterator object first, or I didn't initialize the iterator by calling the iter method; I just called next. So I would need to say something like itr is equal to iter of x, and then I could call next on itr. Then notice I get 0, and if I call it again, so let's do it here, I get one, two, three, so on and so forth. Okay, so that is the old-school way of creating an iterator.
Now I'm going to introduce to you generators, which is the much more elegant and nice way of creating an iterator. The generator syntax is quite simple. You create a function, so in this case I'm just going to say def gen, take in any parameters that you want, in this case I'll take an n, and then what you do is use the yield keyword instead of the return keyword. I'm going to explain exactly how this works, but let me just make the exact same iterator I made before using the generator syntax. So let me say, actually I don't need self, I can just say, we'll use a for loop actually, for i in range, and then this will be n, and I'm going to yield n,
and I need to spell yield correctly. I'm going to yield that. Okay, so this is exactly the same as that class that I just created, and notice how much simpler this is. I'm sure this probably still doesn't make much sense to you, but let's just loop through it: for i in, and then we're going to say gen of 5 like that, print i, and notice that we get, oops, sorry, it's yielding n, that needs to yield i, my bad. We get the exact same thing that we got before.
The way that the generator works is, when the yield keyword is hit, it pauses the execution of the function and returns this value to whatever is iterating through this generator object. In this case I'm saying for i in gen(5), so I'm looping through my generator here, passing the value 5 to it, and inside of my generator I create this for loop, and in this for loop I'm looping up to n, so in this case 5, but not including 5, and I'm yielding i. As soon as I get through the first iteration of this for loop and i is equal to zero, I yield zero, and what that means is I immediately pause. I stop; all of the information about this function is saved, so it's stored in memory, and then I go to this for loop and I print whatever the value is that I yielded. Then what happens is the function, or this for loop, continues. We call next on this generator, which then has the for loop run again, so we get the value one, then we call next again, we get the value two, three, four, so on and so forth.
The important thing to understand is that when yield is hit, the function execution is paused. It's not terminated, it's not done, it's paused, and then we return to it and continue running whenever we see the next keyword again, or whenever we call the next method or function on this generator. To do this manually, I guess I'll just go with x is equal to gen of 5, and I could say next x; let's just print this like this. Okay, so if we do this manually, notice we get 0, 1, 2, 3.
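The generator version described above, after the fix from yielding n to yielding i, might look like the sketch below. The function name gen and the parameter n come from the captions; the comments are added for illustration.

```python
def gen(n):
    # Each time yield is reached, the function pauses and hands i to the caller.
    # The local state (i and the position in the loop) is kept until the next call.
    for i in range(n):
        yield i


# Looping over the generator calls next() behind the scenes.
for i in gen(5):
    print(i)  # prints 0, 1, 2, 3, 4 on separate lines

# Driving the generator manually with next().
x = gen(5)
print(next(x))  # 0
print(next(x))  # 1
print(next(x))  # 2
print(next(x))  # 3
```

Calling next(x) twice more would return 4 and then raise StopIteration, which is the same signal the class-based iterator raised by hand.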
So pretty much, this generator syntax makes the next method and the iter method implemented for us, so we don't have to manually write them inside of a class, and instead we just use the yield keyword. Now I'm just going to show you in a different way how we could do this. Let's say we take no parameters here. What I can actually do inside of this function is yield multiple times. I'm going to yield one, then I'm going to yield two, then I'm going to yield again, I've got to spell that correctly, 3, then I'm going to yield 4. Now when I do this, notice you get 1, 2, 3, 4, because what happens is we yield 1, we pause the execution of the function, we print the value, we yield 2, we pause the execution of the function, we print two, so on and so forth. If you want to make this even more clear, we're going to print pause one, pause two, pause three, and actually now that I think of it, hmm, yeah, okay, this is fine, we can do that. Let's run this, and notice we get one, pause one, two, pause two, three, pause three, and then finally we have four. The whole point is that we didn't need to generate this entire sequence of numbers to be able to loop through it; we were just returning it as we needed those values.
All right, so at this point in time you're probably asking yourself, what is the point of this? Maybe you understood the examples I went through: when you use the yield keyword in a function, that makes that function a generator, and as soon as you hit the yield keyword the execution of the function is paused, but the entire state of the function, all variables, everything inside of it, is saved. All of that's saved, but the execution is just paused, and it can be returned to when you call the next function. However, again, what is the use case of this? Really, the use case of a generator is that you can loop through a sequence or some large amount of data without needing to store all of it, and the situation in which you use a
generator is when you do not care about the data before or after something in an iteration; you only care about the current piece of data that you're looking at. A super simple example is you want to print out all of the numbers in a sequence. You don't need to know the previous or the next number in the sequence, or any of the other numbers in the sequence; you only need to know the current number in order to print it out. There are a lot of times when you're looping through some type of data structure where you don't care about anything before or after the current element, you just care about the current element, and if that's the case you use a generator. However, if you do care about stuff, I guess, sorry, before or after the current element, then you can't use a generator, because you need access to whatever's before and whatever's after. Maybe you need access to everything that's to the right in the list, or everything that's to the left in the list or sequence or whatever. In that case you can't use a generator, because the generator only gives you the current value. You could have a generator give you maybe the second, third, fourth value, but if you need access to the entire sequence when you're processing one element, you can't use a generator. Hopefully that's clear, but that's the differentiating factor on when you would use a generator or when you would use some data structure to loop through.
But I want to give you a very real example here of when you'd use a generator. Let's say you have a very large file, maybe a file that has billions of lines, just a ton of lines. It's going to take maybe a few days of processing power to actually go through and process this file, depending on what you're doing with it. And let's say, simply, in this file you want to look for whether a word exists. You don't really care where it exists; you just want to know if a word exists in this file. Well, the first way you could do this, find if the word exists in the
file, is you could read the entire file. This would be the standard way: you read it all into memory, and then you loop through the data structure you have that represents the file and you see if that word exists. That's the first way that you could do it. However, you don't need to know every single line in the file at one point in time. You only need to know the current line that you're looking at, and then you can determine with the current line if that line has the word that you're looking for in it. So you look at one line at a time; you don't care about all the lines before or all the lines after, you only care about the current line. If you find the word, great; if you don't, you move on to the next line.
In this situation you would use a generator, and what you could do is yield one row of the file at a time, so that rather than reading in the entire file and taking up maybe gigabytes of memory, you're only taking up a few megabytes or kilobytes or bytes in your memory to store one line at a time. Then, as you look through that line, you move on to the next line if it doesn't have the word, or maybe you stop the program if it does have the word. I'm just going to copy in a function here that shows you exactly how you would do that. This function is called csv_reader, and all you do is pass a file to this function. What it does is open the file and then yield one row of the file at a time, so this way you don't actually need to, what do you call it, read the entire file in at once. You just read one row, you can process the row, see if it has the word, and then continue on from there. So that is my example; hopefully that maybe cleared up when you would use a generator.
Sorry for the abrupt cut here, but I realized when I was editing this video I forgot to mention an important part of generators, which is generator comprehensions. There is actually another way to create a generator that does not involve creating a function. Previously, the way we made a generator
was we made a function and we just put the yield keyword in. As soon as you put the yield keyword in a function, it's a generator. However, another way to do this is with a comprehension. Let me just show you and say something like x is equal to, I'm going to put the syntax for a tuple, so open parenthesis, closing parenthesis, then I'm going to say i for i in range 10. Now this is a very stupid generator, you'd never make this, but I just made a generator, and to prove this to you I'm going to print x. When I do this, notice I get a generator object. So whenever you do a comprehension like this inside of the parentheses, this gives you a generator. Now I could print out the next of x, and when I do that I get 0. I could do this again, okay, I get 1, 2, 3. And then I could loop through x; I could say for j in x, print j, and we get the same thing. That's all I want to show you. That's another way you can make a generator, with a comprehension like this. Of course, you would probably do something that makes more sense and is a bit more advanced, but that's another way to go about doing it.
All right, so with that said, I am going to end the video here. I hope you guys enjoyed. If you did, make sure to leave a like and subscribe to the channel. I will see you in another YouTube video. [Music]
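As a recap of the last two ideas in the captions, here is a hedged sketch of the generator comprehension and of a row-at-a-time file reader. The body of csv_reader is an assumption, since the pasted function is not shown in the captions; contains_word, the sample rows, and the temporary file are hypothetical additions for the demo.

```python
import os
import tempfile


def csv_reader(file_name):
    # Assumed body: open the file and yield one row at a time,
    # so only the current row is held in memory.
    with open(file_name) as f:
        for row in f:
            yield row


def contains_word(file_name, word):
    # Hypothetical helper. The generator expression inside any()
    # stops pulling rows as soon as one of them contains the word.
    return any(word in row for row in csv_reader(file_name))


# Generator comprehension: parentheses instead of square brackets.
x = (i for i in range(10))
print(next(x))  # 0
print(next(x))  # 1

# Small demo file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("id,name\n1,python\n2,generator\n")
    demo_path = tmp.name

print(contains_word(demo_path, "generator"))  # True
print(contains_word(demo_path, "iterator"))   # False
os.remove(demo_path)
```

Iterating over the open file object already reads lazily, so the generator function adds no extra buffering of its own; it just packages that behavior behind a reusable name.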
Info
Channel: Tech With Tim
Views: 38,309
Rating: 4.961165 out of 5
Keywords: tech with tim, python, generators, iterators, generators explained, python generators, python iterators, iterators explained, next(), iter(), legacy iterators, creating generators, generator use case, generator comprehensions, python iterators and generators, iterators python, what are iterators, what are generators, python programming, learning python, python coding, python generators explained, generators in python, python generator yield, python yield, python yield generator
Id: u3T7hmLthUU
Length: 28min 36sec (1716 seconds)
Published: Tue Aug 03 2021