A.I. teaches itself to drive in Trackmania

Video Statistics and Information

Reddit Comments

Very cool!

Upvotes: 2 | User: u/Sirthomaken | Date: Nov 25 2020

Great video.... Thanks!!!

Upvotes: 1 | User: u/MarceloBravari | Date: Nov 25 2020

quite interesting

Upvotes: 1 | User: u/marcnyc3 | Date: Nov 25 2020

Ok so what is the solution then?

Upvotes: 1 | User: u/BabylonianGM | Date: Feb 02 2021
Captions
This car is driven by me in Trackmania, the racing game. But this one is controlled by artificial intelligence. As you can see, this AI still has a long way to go before it beats humans. In fact, this AI probably can't go faster than me, because it was built to mimic my driving style. In order to do that, I'm using supervised learning, which is a subcategory of machine learning, and with this method the algorithm isn't really able to do everything it wants. But what if we could create a new AI that begins with zero game knowledge and learns everything on its own? This can be done with reinforcement learning, and in this video I'm gonna show you an AI working that way. After training my AI for many hours, I'll compare its best run with a good human run.

[Music]

In my videos about supervised learning, many of you have suggested that I should make my AI evolve. Successive generations, mutation, crossover, selection, fitness function: all these terms refer to a reinforcement learning algorithm called the genetic algorithm. It's inspired by biology and natural selection, and I'm gonna quickly explain how it works. I wasn't surprised to see comments about the genetic algorithm, because there are already a lot of videos on YouTube about this machine learning method. It's often used to teach an AI how to move around and avoid obstacles in games, which is very useful if you want your AI to drive a car.

So how can we do that in Trackmania? Well, at the start, the method is similar to supervised learning. My AI knows absolutely nothing about game physics. It doesn't know how to drive a car, and actually it doesn't even know what a car is or what fast means. I'm just gonna give it basic information about its environment. The AI gets this information in real time in the form of numbers. The AI is also able to perform three different actions: turn left, turn right, and accelerate. Finally, the AI has a neural network to make the link between what it knows and what it decides to do. Now I need to find the optimal
structure and connections of the neural network so that the AI can drive, and this is where the two major learning methods become different. In supervised learning, I could show examples of my driving style to the AI. For example, I could teach it that in a case like this, I will turn right. So I need to collect a lot of data so that the AI can use and generalize this knowledge.

Now let's finally speak about reinforcement learning and the genetic algorithm. This time I'm not giving any prior knowledge to the AI. I just give it a basic neural network with a random structure and random connections between neurons. This is the initialization step. As you might expect, the AI does not perform very well, but that's okay. I'm just gonna initialize all the AIs and select the best ones. So here I have a population of several individuals, and most importantly I have diversity, because as I said, each initialization is randomly generated.

[Music]

Now let's begin our long training process.

[Music]

I'm gonna use a population of 100 individuals. This population represents the first generation, and each AI has 13 seconds to drive on this map. It's pretty easy to see that some AIs are performing better than others, but how can we measure that? I need to define a fitness function. It's something that allows us to quantify how well an individual is performing in its environment. I'm gonna use the distance traveled by the car as my fitness function. Thanks to that, I can make a selection. I'm gonna keep the best individuals in the next generation and eliminate the worst ones. I also need to create new individuals, with the hope that they will be even better. In order to do that, I can make crossovers between the best individuals. I am also adding mutations to randomly modify some individuals. I am doing this until I have 100 new AIs ready to compete in the next generation. With selection, crossovers, and mutations, the genetic algorithm broadly works in the same way as natural selection.

[Music]

After only four
generations, we can already see some evolution. The AI is even able to reach the first checkpoint. Several strategies emerged during this process: the zigzag strategy, the wall strategy, or even the circle strategy. Let's continue the evolution to see how far the AI can go.

[Music]

After 5 hours and 12 generations, the AI is finally able to reach the second checkpoint within 13 seconds. It's still far from a good run, but evolution is known to be a long and slow process. So far I've used the distance traveled by the car as a way to identify the best AIs. Instead, I'd like to use the time recorded at each of these checkpoints. The problem with distance is that it favors some useless strategies, such as the circle strategy. It also favors some inefficient trajectories, so it feels more natural to use checkpoint times instead.

[Music] [Applause] [Music]

After more than 40 generations, the AI is becoming quite efficient at the start but still doesn't know the rest of the track, so I increased the timer to let the AI discover more of the map.

[Music]

Now the AI is discovering new turns for the first time. As you can see, the AI is struggling and always hitting the same walls, but it is still able to go through some of these new curves. The AI was able to generalize. The ability to generalize is very interesting in machine learning: it means that the AI has acquired knowledge during its training and is able to apply it in a new, similar but unknown environment.

[Music]

Between generations 30 and 60, the AI hasn't made much progress. Here are two graphs showing the best time achieved by each generation for the first and second checkpoints. The AI is progressing, which is a good sign, but this progress is getting slower and slower. Perhaps this is because the AI is approaching the limits of this map. I could compare these checkpoint times with those of my personal best as a reference, but it would be even better to know the true limits of this map. This is why I asked Trabadia to help me. Trabadia is
known for his precision skills and his legendary record on A01, a famous campaign map in Trackmania Nations Forever. Here is the run he drove on my map.

[Music]

Let's compare his checkpoints with those of the AI. As you can see, the AI still has room for improvement, but it's learning very slowly, and I figured it will probably not improve much on the beginning of the track, or it will take too many generations. So I decided to let the AI drive on the whole map until generation 100.

[Music]

After 100 generations, 10,000 runs, and many hours, I stopped the algorithm. The AI is still struggling in some turns, and it will probably take hundreds more generations to fix this. Unfortunately, the fact that I have to simulate the runs one by one makes learning very slow. Anyway, the AI was able to drive this map in 23.48 seconds during generation 99.

[Music]

Now, do you remember the supervised learning run from the beginning of the video? Well, the genetic algorithm at least defeated this previous AI. But I still believe that supervised learning is more effective than the genetic algorithm in this game. It's way faster to collect thousands of data points myself and then use them to train the AI. In addition, it is possible to collect data on large and diverse maps and thus show the AI many possible situations. The AI is therefore better able to generalize. So if I compare the two AIs on this map, this map, or this map, the one trained by supervised learning will win. I could also try some other reinforcement learning methods, such as Q-learning, but that will be for another time. Until then, you can enjoy these clips I've made combining runs from the last 30 generations.

[Music]
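
The selection, crossover, and mutation loop described in the captions can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the video: each individual is a flat list of weights instead of a neural network driving a car, `fitness` is a toy stand-in for "distance traveled", and `N_WEIGHTS`, `N_ELITES`, `MUTATION_RATE`, and `MUTATION_SCALE` are hypothetical values. Only the population size of 100 comes from the captions.

```python
import random

POPULATION_SIZE = 100   # from the captions: 100 individuals per generation
N_WEIGHTS = 8           # assumption: size of the toy "network"
N_ELITES = 10           # assumption: best individuals kept unchanged
MUTATION_RATE = 0.1     # assumption: chance that a single weight is mutated
MUTATION_SCALE = 0.3    # assumption: standard deviation of the mutation noise


def random_individual(rng):
    """Initialization step: an individual with random weights."""
    return [rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)]


def fitness(individual):
    """Toy stand-in for 'distance traveled' (higher is better, maximum is 0)."""
    return -sum((w - 0.5) ** 2 for w in individual)


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each weight is taken from one parent at random."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(individual, rng):
    """Return a copy in which some weights have Gaussian noise added."""
    return [
        w + rng.gauss(0.0, MUTATION_SCALE) if rng.random() < MUTATION_RATE else w
        for w in individual
    ]


def next_generation(population, rng):
    """Selection, then crossover and mutation until the population is full."""
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:N_ELITES]          # selection: keep the best, drop the rest
    children = []
    while len(elites) + len(children) < POPULATION_SIZE:
        parent_a, parent_b = rng.sample(elites, 2)
        children.append(mutate(crossover(parent_a, parent_b, rng), rng))
    return elites + children


def evolve(generations, seed=0):
    """Run the loop and record the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(POPULATION_SIZE)]
    best_per_generation = []
    for _ in range(generations):
        best_per_generation.append(max(fitness(ind) for ind in population))
        population = next_generation(population, rng)
    return population, best_per_generation


if __name__ == "__main__":
    _, history = evolve(generations=20)
    print("best fitness, first generation:", history[0])
    print("best fitness, last generation:", history[-1])
```

Because the elites are carried over unchanged, the best fitness in this sketch never gets worse from one generation to the next. In the video's setting, each fitness evaluation would be a full run in the game, which is why the captions describe training as slow.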
Info
Channel: Yosh
Views: 3,332,584
Rating: 4.9039645 out of 5
Id: a8Bo2DHrrow
Length: 15min 3sec (903 seconds)
Published: Fri Nov 13 2020