Genetic Algorithms in Python - Evolution For Optimization

Video Statistics and Information

Captions
What is going on guys, welcome back. In this video today we're going to learn about genetic algorithms and how to implement them in Python, so let us get right into it.

Genetic algorithms are so-called metaheuristics that are used to solve optimization problems with an evolutionary approach. So when you have an optimization problem, instead of solving it with an exact algorithm, a heuristic, an approximation algorithm, or even with machine learning, we use an evolutionary approach to find good solutions. That's the idea behind genetic algorithms.

To not keep this too abstract, I'm going to give you an example right away. We're going to use a very simplistic, trivial example just to illustrate the concept, and this concept can then be applied to real problems or specific use cases. Let's say I have a list with a couple of slots, and the values in the list can be either zero or one. Any combination of zeros and ones is a valid solution, and what I'm trying to optimize for is the maximum number of ones. Again, this is very trivial and simplistic, just to illustrate the idea: my goal is to get only ones, or to maximize the number of ones in the list. Of course, since this is such a trivial example, the easiest way to do that is to just set all of them to one, but you can imagine this being something else, for example the set cover problem.

In the set cover problem you have a universe of values, something like {1, 2, 3, 4, 5, 6, 7, 8, 9}, and you have different sets: one set could be {1, 3}, another could be {4, 5, 8}, another {1, 8}, another {1, 7, 8}, and so on, so that some combination of them ends up covering the whole universe. The question here, and this is now a more realistic problem, is: which sets should I pick to cover the whole universe while keeping the number of sets used to a minimum? For example, if I also have the set {2, 4, 5, 6, 7, 8, 9}, then I can cover the whole universe with {1, 3} and {2, 4, 5, 6, 7, 8, 9}, so with two sets. Of course I can do the same thing with other sets, maybe a set containing just 1 and another containing just 3, which would also be a valid solution to the set cover problem, but it uses three sets instead of two. That is a real optimization problem, where you try to keep the number of sets low while covering the whole universe. The list of zeros and ones is a very simplistic example just to illustrate the concept, but I wanted to show you that there are real use cases for this genetic algorithm stuff.

The problem we're looking at here is called the One Max problem. It's considered the hello world problem for genetic algorithms because it's so simple.
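To make the set cover example above a little more concrete, here is a minimal Python sketch (not code from the video) that checks whether a chosen combination of sets covers the whole universe; the sets are the ones mentioned above, and the helper name covers_universe is just made up for illustration:

```python
# Minimal sketch of the set cover example described above (not from the video).
universe = {1, 2, 3, 4, 5, 6, 7, 8, 9}
sets = [
    {1, 3},
    {4, 5, 8},
    {1, 8},
    {1, 7, 8},
    {2, 4, 5, 6, 7, 8, 9},
]

def covers_universe(chosen):
    """Return True if the union of the chosen sets equals the universe."""
    covered = set().union(*chosen) if chosen else set()
    return covered == universe

# Two sets are enough: {1, 3} together with {2, 4, 5, 6, 7, 8, 9}.
print(covers_universe([sets[0], sets[4]]))  # True
# A single small set is not.
print(covers_universe([sets[0]]))           # False
```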
Of course, for One Max it doesn't make sense to use a genetic algorithm, because all you have to do is take the length of the list and fill it up with ones. That's all. But how does a genetic algorithm find solutions to this problem?

There are basically three parts to it. You need a representation, and then you have three operations: reproduction, mutation, and selection. These are the important pieces of a genetic algorithm. You have a population of different solutions, and each solution has a fitness value. A fitness value just says how good a solution is. In the case of the One Max problem the fitness is very simple: the sum of the numbers, or in other words the number of ones in the list. So a solution with four ones has a fitness of four, and a solution with seven ones has a fitness of seven. In the case of the set cover problem, you might want to use the number of sets used as the fitness if you only allow solutions that cover the whole universe; otherwise you could also somehow include in the fitness formula what percentage of the universe is covered by the solution. The important thing is to come up with some fitness function. For the One Max problem, you just count the number of ones and that's your fitness value.

The next thing is reproduction: you need to somehow combine solutions, because remember, we're talking about an evolutionary algorithm here. We want to take two solutions and make them reproduce. There are different ways to do that. You could take individual bits from the individual solutions at random, or you can do something like a crossover. For a crossover you choose some index, the crossover point, and to combine the two solutions into a new one you take the first part of the first solution up to that point, for example 0 0 1, and fill up the rest with the remaining part of the second solution. In the example on screen the child again had a fitness of seven, but it could improve or become worse.

You also want a mechanism for mutation, which in this case basically just means flipping bits at random. With a certain chance we get some mutations, and if flipping a single zero to a one is the only mutation, it raises the fitness score to eight.

So that is the basic idea of how you approach good solutions. How do the solutions get better overall? By using a selection method that makes sense, for example picking just the best solutions, or using something like roulette wheel selection. There are different ways to decide which individuals you want to keep; the worst individuals are not going to survive, they won't make it to the next generation. That is a very rough description of how genetic algorithms work.
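As a quick illustration of the crossover and mutation ideas just described, here is a small sketch. The exact bits shown in the video aren't recoverable from the captions, so the genomes and the crossover point here are just illustrative values:

```python
import random

# Two example parents (illustrative values, not the ones from the video).
parent_a = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]   # fitness 5
parent_b = [1, 1, 0, 0, 1, 1, 1, 1, 1, 0]   # fitness 7

# One-point crossover: first part of A, second part of B.
point = 3
child = parent_a[:point] + parent_b[point:]
print(child, "fitness:", sum(child))

# Bit-flip mutation: each position flips with a small probability.
mutated = [1 - bit if random.random() < 0.1 else bit for bit in child]
print(mutated, "fitness:", sum(mutated))
```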
Now we're going to implement this in plain Python, without any packages. We'll implement the One Max problem so that you can see how it works; you can then invest some thought of your own, and I can also make additional videos on how to apply this to actual problems.

Let's start by importing random and defining some constants. First a population size, which can be 100, meaning we start with 100 solutions. They can be random; you can initialize them intelligently if you want, but you just need some initial solutions. Then we want a genome length, which is just the length of the individual solutions, let's say 20, meaning we have 20 positions to fill with zeros or ones. Then we want a mutation rate: remember, a mutation is flipping a bit, so this is the probability of a mutation happening, and in our case it's going to be 1 percent, 0.01. We also have a crossover rate, which says how likely it is that we do a crossover instead of just returning the parents: if I select parents P1 and P2, there is a chance I just return them without any mating, without any reproduction, and otherwise I do the crossover. We want a 70 percent crossover rate here, and of course you can tweak these values. The last parameter is the number of generations; I want to do 200 generations, which means the whole process is repeated 200 times.

Next we define a function called random_genome. You don't have to make this a function if you don't want to, I just like to keep things modular. It just returns a random genome, no reproduction, no mutation, just a random genome as an initialization. It accepts a length argument and returns a random integer between 0 and 1 for each position in range(length), so just a random list of zeros and ones of the given length, which is of course going to be our genome length.

Then we also want an initialization function for the population, init_population, which takes the population size and the genome length as parameters. Again, feel free to combine these functions or to not use functions at all and just write the code down; I just want to keep it structured. It returns population_size many genomes, so random_genome(genome_length) for each element in range(population_size). So the first function generates bits, this one generates actual genomes, and our initial population is initialized completely randomly; it doesn't have any intelligence in it yet, just random zeros and ones.

Our fitness function is very simple: we get a genome, and the fitness of this genome is just the sum of the genome. If you have more values than just zero and one, say two, three, four, and you still want to optimize for the number of ones, you might want to count the ones instead; but in our case counting the ones and taking the sum is the same thing, so we just return the sum of the genome as our fitness.
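Here is a sketch of that setup, reconstructed from the narration. The constant names and helper names follow what is said above, but the original source file isn't visible in the captions, so treat this as an approximation rather than the video's exact code:

```python
import random

# Constants as described above.
POPULATION_SIZE = 100   # number of solutions per generation
GENOME_LENGTH = 20      # length of each individual solution
MUTATION_RATE = 0.01    # probability of flipping any single bit
CROSSOVER_RATE = 0.7    # probability of actually performing a crossover
GENERATIONS = 200       # how many times the whole process is repeated

def random_genome(length):
    """A random list of zeros and ones of the given length."""
    return [random.randint(0, 1) for _ in range(length)]

def init_population(population_size, genome_length):
    """A population of randomly initialized genomes."""
    return [random_genome(genome_length) for _ in range(population_size)]

def fitness(genome):
    """For One Max, the fitness is simply the number of ones."""
    return sum(genome)
```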
Now we get to some more interesting stuff: selecting a parent. Let's define a function select_parent, which takes the full population and the fitness values (let's call them fitness_values, that sounds more professional than "fitnesses"). We compute the total fitness as the sum of the fitness values, so the total amount of fitness that is available. Then we randomly pick a reference point: of course you can use a different selection algorithm if you want, but what I'm going to do is use random.uniform to pick something between zero and the total fitness. Then we set current = 0 and start a loop over each individual in the population together with its corresponding fitness value, zipping the population and the fitness values so we can iterate over both. We increase current by the fitness value, and once this running total becomes larger than the pick we made above, we return that individual as the parent. This is one selection mechanism, essentially a roulette wheel: we make a random pick, and once the accumulated fitness reaches the picked value, we return that parent. Individuals with higher fitness take up a larger share of the range, so they are more likely to be selected.

Now where it gets really interesting is crossover and mutation. Crossover, as I explained, is our reproduction method: we take two parents, pick a point, and then take the first part of one parent and the second part of the other. You can call this crossover or reproduction, whatever you prefer. The function takes parent1 and parent2. We have our crossover rate, which means we're not always going to do the crossover: we check if random.random() is less than the mutation rate, and if it isn't, we just return parent1 and parent2 unchanged. If it is, we pick a crossover point, a random integer between 1 and the length of the parents (parent1 has the same length as parent2, so we don't need to pass the genome length again). Then we return parent1 up to this crossover point plus parent2 from this point to the end, and the other way around: parent2 up to the crossover point plus parent1 from the crossover point to the end. So we're returning both combinations.
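A sketch of these two functions as described, again reconstructed from the narration. The crossover check is written against the crossover rate, which is the corrected version the video arrives at later (it initially compares against the mutation rate by mistake), and the final return in select_parent is a small guard added here for floating point edge cases that the video doesn't mention:

```python
def select_parent(population, fitness_values):
    """Roulette wheel selection: individuals are picked with probability
    proportional to their fitness."""
    total_fitness = sum(fitness_values)
    pick = random.uniform(0, total_fitness)
    current = 0
    for individual, fitness_value in zip(population, fitness_values):
        current += fitness_value
        if current > pick:
            return individual
    return population[-1]  # guard for edge cases (e.g. all-zero fitness)

def crossover(parent1, parent2):
    """One-point crossover with probability CROSSOVER_RATE, otherwise the
    parents are returned unchanged."""
    if random.random() < CROSSOVER_RATE:
        crossover_point = random.randint(1, len(parent1) - 1)
        return (parent1[:crossover_point] + parent2[crossover_point:],
                parent2[:crossover_point] + parent1[crossover_point:])
    return parent1, parent2
```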
Then we also want to implement the mutation, which in our case is just flipping a bit. We define mutate(genome), and for i in range(len(genome)), so for each bit, we check if random.random() is less than the mutation rate; for each bit there's a 1 percent chance we flip it. If that chance happens, we set the bit we're currently looking at to one if it was zero before, and to zero otherwise. Actually, there's a more mathematical way to do it: I can take the absolute value of genome[i] minus one. If the bit is zero it becomes one, because |0 - 1| = 1, and if it's one it becomes zero, because |1 - 1| = 0 (plain 1 - genome[i] would work just as well). At the end, once we're done with all the mutations, we return the genome.
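A sketch of the mutation function as described; abs(genome[i] - 1) flips the bit in place, and 1 - genome[i] would be an equivalent alternative:

```python
def mutate(genome):
    """Flip each bit independently with probability MUTATION_RATE."""
    for i in range(len(genome)):
        if random.random() < MUTATION_RATE:
            genome[i] = abs(genome[i] - 1)  # 0 -> 1, 1 -> 0
    return genome
```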
Alright, that is everything we need to implement the big function: the genetic algorithm itself. genetic_algorithm is going to be the following function; it doesn't take parameters because it just uses the constants from the top. First, population is set to init_population with the population size and genome length we defined as constants. Then we go through the generations of the process: for generation in range(GENERATIONS), we calculate all the fitness values, so fitness_values is fitness(genome) for each genome in the population; for each genome we calculate the fitness score and keep track of it in a list. Then we build a new population, the population of the next generation, which starts as an empty list. We loop for _ in range(POPULATION_SIZE // 2), because we always take two parents, so we only need to do it for half the population size, hence the floor division by two. We select two parents: parent1 is select_parent(population, fitness_values), and parent2 is the same thing. Then we get the offspring, the future genomes: they are either going to be the two parents themselves or the crossover result, so offspring1, offspring2 = crossover(parent1, parent2). We extend the new population with the offspring, and of course we also need to mutate them first, so we call mutate on offspring1 and offspring2; maybe we flip some bits. Then we set population = new_population, because otherwise every iteration would keep working on the initial population, so we need to update it to the new one.

Then we compute the fitness values again, and the best fitness is max(fitness_values), which we print as a status message, generation number and best fitness, just so we can keep track. After the loop, the best index is the index of the best fitness value, so fitness_values.index(max(fitness_values)). (You might worry that fitness_values could be referenced before assignment here, but it won't be, because GENERATIONS is not zero.) The best solution is then the individual, the genome, at that index in the population. We print the best solution and its fitness; here I calculate the fitness again, although I could of course just take it from the fitness values list, but that's fine. Finally, if __name__ == "__main__", we run the genetic algorithm. That is it.
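Putting it all together, here is a sketch of the main loop as described, once more reconstructed from the narration rather than copied from the video's source:

```python
def genetic_algorithm():
    population = init_population(POPULATION_SIZE, GENOME_LENGTH)

    for generation in range(GENERATIONS):
        fitness_values = [fitness(genome) for genome in population]

        # Build the next generation: each iteration produces two offspring.
        new_population = []
        for _ in range(POPULATION_SIZE // 2):
            parent1 = select_parent(population, fitness_values)
            parent2 = select_parent(population, fitness_values)
            offspring1, offspring2 = crossover(parent1, parent2)
            new_population.extend([mutate(offspring1), mutate(offspring2)])

        population = new_population

        # Status message for this generation.
        fitness_values = [fitness(genome) for genome in population]
        best_fitness = max(fitness_values)
        print(f"Generation {generation}: Best Fitness = {best_fitness}")

    best_index = fitness_values.index(max(fitness_values))
    best_solution = population[best_index]
    print("Best Solution:", best_solution)
    print("Best Fitness:", fitness(best_solution))

if __name__ == "__main__":
    genetic_algorithm()
```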
Now let's run it. After cleaning up a small formatting issue near init_population, I can run the script, and you can see that something doesn't seem right. Let me double check... I think I figured out what the problem is: in our crossover we're checking whether the random number is less than the mutation rate, but we should check whether it's less than the crossover rate; otherwise we almost never get any reproduction. So let's run this again, and there you go. I actually increased the number of generations to 2,000, and here you can see it found the optimal solution. If I run this a couple of times, sometimes it only gets to 19 or 18, but you can see it improves over time, because it starts out around 17. We can also increase the genome length, say to 50, and in that case the progress is slower; even if I increase the number of generations again it doesn't reach the best solution in this case, and increasing the mutation rate seems even worse. Maybe reducing the crossover rate helps, or doing both; you can play around with that. It doesn't seem to help much here, but this is just the hello world example of genetic algorithms.

To go back to the beginning: what you actually want to do is take an optimization problem. Good examples are, as I said, the set cover problem, graph coloring, or the traveling salesman problem (TSP), also the knapsack problem, basically any optimization problem that you can somehow represent as a genome. All you have to do is think about how you're going to represent your genome, what your genome is going to be, and then come up with an idea for the fitness function. For example, as I said, for set cover you might have the individual sets and decide which ones you choose: I choose set one, I don't choose set two, I choose set three, I don't choose sets four and five, but I choose set six, and that covers the whole universe, so I'm only using three sets. The fitness could be the number of available sets minus the number of sets actually used, so in this case 6 - 3 = 3; if I used four sets instead, it would be two, because you want to minimize the number of sets. Or for graph coloring, maybe each index represents a node in the graph and the value at that position says which color you're using, so the genome could look like 1 2 1 2 2 2 3 1 1 1; in this case the fitness would be three, because we used three colors for the graph, and in the graph coloring problem you want to minimize the number of colors used. That's just another idea (see the short sketch below). So yeah, this is how you can work with genetic algorithms in Python.

That's it for today's video. I hope you enjoyed it and learned something; if so, let me know by hitting the like button and leaving a comment in the comment section down below. And of course, don't forget to subscribe to this channel and hit the notification bell to not miss a single future video, for free. Other than that, thank you very much for watching, see you in the next video, and bye.
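As a small follow-up to the graph coloring idea above, here is a minimal sketch of that representation and a fitness in the spirit described (number of colors used, lower is better). The edge list and the conflict penalty are illustrative assumptions of mine; a real implementation needs something like the penalty to rule out invalid colorings, which the video doesn't go into, and with the fitness-proportional selection above you would additionally convert this into a score to maximize:

```python
# Hypothetical graph: edges between node indices (illustrative only).
edges = [(0, 1), (1, 2), (2, 3), (3, 6), (6, 7), (5, 9)]

# Genome from the example above: position = node, value = color.
coloring = [1, 2, 1, 2, 2, 2, 3, 1, 1, 1]

def coloring_fitness(genome, edges):
    """Lower is better: number of distinct colors used, plus a large
    penalty for every edge whose endpoints share a color."""
    colors_used = len(set(genome))
    conflicts = sum(1 for a, b in edges if genome[a] == genome[b])
    return colors_used + 10 * conflicts

print(coloring_fitness(coloring, edges))  # 3: three colors, no conflicts
```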
Info
Channel: NeuralNine
Views: 12,278
Keywords: genetic algorithm, genetic algorithms, GA, python, genetic algorithm python, python GA, one max problem, set cover problem, python genetic algorithm, evolution algorithm
Id: CRtZ-APJEKI
Length: 26min 10sec (1570 seconds)
Published: Tue Apr 02 2024