Each of these cars is controlled by an Artificial
Intelligence in the racing game Trackmania. Right now, this AI is attempting something
particularly tricky: driving on pipes. I've designed this AI to learn from scratch, without any previous knowledge of the game. So
at first, the AI can't keep its balance for long. But this is part of the plan. Because
this computer program is designed to learn from its mistakes and
improve itself over time. So with enough training, can it come
up with better strategies than humans? On these unstable pipes, this question
might be particularly interesting. So to answer that, the AI is gonna attempt to beat the human World Record on three
challenging tracks that I've selected. Starting with the easiest
one: a simple straight pipe. The rules are simple. The AI can use
four different actions. At first, without any experience, it's
just using them randomly. If we want the AI to make progress, it needs a target. So for each action
the AI takes, I'm gonna give it a reward. The faster it progresses along the pipe, the
higher the reward, as long as it doesn't fall off. Now, the AI's only goal will be to predict
the actions that add up to the most rewards. And it's gonna learn this through a
process called Reinforcement Learning. Basically, the AI is gonna play over
and over again. In each attempt, it can try new things, and gather new experience. Over time, this experience is used to reinforce
the AI to select actions leading to more reward. Through this trial and error process, the AI should gradually learn to go
faster and keep its balance on the pipe.
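As a rough sketch of the reward idea described above (not the code used in the video): each step is rewarded by the distance gained along the pipe, and a fall ends the run with no further reward. The function names, the progress values and the choice of a zero reward on falling are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def step_reward(prev_progress: float, progress: float, fell_off: bool) -> float:
    # Hypothetical rule: the reward equals the distance gained along the pipe
    # during this step; a step that ends off the pipe earns nothing.
    if fell_off:
        return 0.0
    return progress - prev_progress


def episode_return(progress: Sequence[float], fall_step: Optional[int] = None) -> float:
    # Sum the step rewards until the run ends. `fall_step` is the index of the
    # step at which the car falls (None if it never falls).
    total = 0.0
    for i in range(1, len(progress)):
        fell = fall_step is not None and i >= fall_step
        total += step_reward(progress[i - 1], progress[i], fell)
        if fell:
            break
    return total


# Three made-up runs: slow but safe, fast and safe, fast but falling early.
slow_safe = episode_return([0, 2, 4, 6, 8])
fast_safe = episode_return([0, 4, 8, 12, 16])
fast_fall = episode_return([0, 4, 8, 12, 16], fall_step=2)
print(slow_safe, fast_safe, fast_fall)
```

With a reward shaped like this, going faster only pays off if the car stays on the pipe, which is the trade-off the AI has to learn.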
After 12 hours of driving,
the AI is already quite fast. It may look easy, but I can assure you, it's not. I've tried to challenge it myself, and even
though I've been playing this game for years, I just couldn't keep up with its pace. The AI's driving already looks quite inhuman. To select its actions, the AI uses
real time observations of the game. A few numbers that sum up everything
it needs to know, such as its speed, position and orientation on the pipe. About every tenth of a second, the AI interprets these numbers with
something called a neural network. Basically, that's the AI's brain. Its job is to
predict the optimal action in a given situation. So the AI's strategy depends on how
this neural network is configured. This is where reinforcement
learning operates behind the scenes, by gradually tuning the network configuration. For now, this training process is not over. The AI can probably reach a faster pace,
and it's not super consistent anyway. Let's see how far it can go. This isn't the first AI I've
trained in Trackmania. I've already got some promising results before. But today, there's something
I can say for the first time. The level of this AI is definitely inhuman. This is the human record on this track.
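The decision step described a little earlier, where a few numbers are fed to a neural network that picks an action, might look roughly like this. This is only a toy sketch: the action labels, the three-number observation, the layer sizes and the random weights are all made up, and the real network is trained rather than random.

```python
import math
import random

# Hypothetical labels: the video only says there are four actions.
ACTIONS = ["accelerate", "accelerate_left", "accelerate_right", "brake"]


def make_layer(n_in, n_out, rng):
    # One row of weights per output unit. Training would tune these numbers.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def forward(layer, inputs, activation):
    # Weighted sum of the inputs for each unit, followed by an activation.
    return [activation(sum(w * x for w, x in zip(row, inputs))) for row in layer]


def choose_action(observation, hidden_layer, output_layer):
    hidden = forward(hidden_layer, observation, math.tanh)
    scores = forward(output_layer, hidden, lambda v: v)
    # Pick the action with the highest score.
    best = max(range(len(scores)), key=scores.__getitem__)
    return ACTIONS[best], scores


rng = random.Random(0)
hidden_layer = make_layer(3, 8, rng)
output_layer = make_layer(8, 4, rng)

# Made-up, normalized observation: speed, lateral position, roll angle.
observation = [0.6, -0.1, 0.05]
action, scores = choose_action(observation, hidden_layer, output_layer)
print(action)
```

The same observation always gives the same action here, which matches the idea of a fixed decision-making process once training is stopped.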
Compared to that, the AI's pace is just absurd. But driving fast is only useful if you
can reach the finish at the end. And if you haven't noticed yet, this track is quite long. That's partly why the world record keeps
a very low speed compared to the AI. Going any faster would be quite risky, as
a single mistake is enough to end a run. In fact, that's something that
started to worry me about the AI. Can it really maintain this pace to the
finish line without falling off the pipe? Because I've observed this AI attempt the track
many times, and it's not super consistent. All these cars are controlled by
the exact same version of the AI, yet the outcome is very different
from one attempt to another. It's kinda strange, the AI can repeat the
same cyclic pattern dozens of times without any problem, but suddenly its car starts to
deviate slightly, and that's the end of the run. I have no idea what it's doing wrong. I mean,
I just can't drive at that level myself. We could guess that the AI favors its
pace over its consistency. But for some mysterious reason, it can't optimize both. But that's okay, we can save
that issue for later, because the AI still got several promising attempts. Like this one. A pretty good time, right? This looks very promising for the upcoming tracks. But to be honest, this time is still quite
far from optimal. And I know that because... This isn't the only AI I've trained on this map. Here you can see the start of a second training
session using the exact same training process. I've run several of these out of curiosity,
and it gave me quite disconcerting results. Just like before, these new AIs
eventually managed to complete the track. But they didn't end up with
the exact same strategy. It looks similar, with the
same kind of cyclic pattern. But when you look at their actions
closely it's not exactly the same. In particular, they use a
different method for slowing down. And the two new strategies are actually
better, both achieved a faster time. So if the AI can find different strategies, there might be a faster one
that it hasn't yet discovered. But I can hardly understand what's happening here.
How could I know if it's possible to do better? Well, I think there's one thing I could try. As I said, it appears that all these
AIs intentionally slow down the car. Is that really necessary? If I retrain the AI one final time from scratch, but I force it to always accelerate
and never brake, can it drive faster? If the only thing the AI can control is its
steering angle, can it still finish this track? Here's the best the AI could come up with. At first, the AI can't slow down, so it
inevitably builds up speed. Soon enough, it reaches a faster pace than all previous AIs. But then, something interesting
happens. Its speed starts to stabilize. After a closer look, I think it's because the
car regularly loses contact with the pipe. With these little jumps, the AI still
found a way to control its speed, so it should be able to cross the entire pipe. Like all the previous AIs, it's not super consistent. But again,
these mistakes were not too frequent. Not enough to prevent the AI from
finishing the map one final time. So finally, it turns out that this track
belongs to the full speed category. Actually, it's surprising that the
AI didn't find it on its own. We might never know how close we are to the limit. Maybe the AI can still reduce
its air time and go even faster. But we'll stop here for this
track. This was just a warm up, and it's time to move on to the
serious stuff, on a more complex map. A map with a challenging world record, which by itself was enough to
motivate me to make this video. A record held by a player the AI
recently faced, but couldn't beat. A player named Wirtual. If you don't know him, Wirtual is a
highly experienced Trackmania player, and also a well-known streamer. I can remember it was while
watching one of his livestreams, two years ago, that I first thought
of making an AI drive on a pipe. That night, Wirtual was attempting to beat
the world record on this giant pipe maze. And after a few hours, he finally succeeded,
with an impressive time under 20 minutes. As for me, I managed to get my AI
working a few months later. But it wasn't fast enough to beat experienced players. Since then, I've made quite
a few improvements to the AI. A few months ago, for the first time, it was able
to completely dominate me on regular road tracks. However, this wasn't enough against Wirtual, who after another extensive playing
session, managed to beat the AI. But the AI looks stronger on pipes. Today, it might have a chance of a small revenge. Overall, the training method remains
the same as for the first track. The main difference is that the AI needs
additional information about the track layout. For instance, this new input provides
the distance to the next corner, and this one the direction of that corner. So far, the AI hasn't been able to
go that far in the map. Once again, it seems to prioritize pace over consistency. Its driving style is fairly aggressive.
It's able to overtake the record in the first few corners, but it never gets very far. This time, consistency might be more important. Okay it looks like the AI has no intention of playing any safer. It's just
driving faster than before. But it's not that bad, it
can go further and further. Actually the AI isn't really exploring the
map as it goes. For the clarity of the video, I'm only showing attempts from the start block. But in reality, the AI starts from a random
location on each new training attempt. This way, it can practice all possible scenarios, without focusing excessively
on the first few turns. And if you look closely, there's one area
where I made the AI spawn more frequently. Right before the finish. This particular section is quite different
from the rest. To reach the finish line, you need to build up speed
and jump from the last corner. It's a tricky jump, even for experienced players. But for the AI, the difficulty wasn't the main issue. It was more a problem
of understanding what to do. To guide the AI, I had to
adapt the reward signal a bit. When the AI enters the finish
area, it knows it thanks to this input. From there, it's rewarded based on
how close it gets to the finish, regardless of whether it's
following the path or not. And if it ever crosses the finish line,
it receives a massive bonus reward. With that, the AI quickly understood that jumping would bring more rewards. But it
wasn't jumping from the right spot. It took the AI many hours to rectify its approach. From there, it started to look interesting. And after many attempts, the
AI got its first success. It then continued to improve its approach. Eventually, it became quite consistent. Now there is a good chance that if the
AI ever reaches this area in a real run, it could conclude it with a successful jump. During this time, I've kept an eye on its training on the rest of the map. Its
driving looks insane now. All this time, the AI has
continually improved its pace. Now we can stop the training and
keep the final version of this AI. If this one maintains that pace up to
the finish, it could set an insane time. We just have to repeat what
we did on the first track: make it attempt the track many times
and see how far it can push the limit. Among all these attempts, the
AI didn't reach the finish once. Most of the time, the AI can
survive for one or several minutes. But for some reason, it always
ends up falling off the pipe. That sounds familiar. I could train the AI longer, but I don't
think it's gonna make any difference. Its consistency has remained almost
unchanged for many hours now. If we want the AI to beat the
human record, that's a problem. But if the AI understands so
well how to go fast on a pipe, why doesn't it also understand
how to avoid falling off it? Why do these mistakes even happen? I mean, it's strange to observe
this behavior in a robot. It's not like a human, who would have a
lapse of attention after some time. Why is it that even a robot can't
consistently repeat the same strategy without failing? These are the questions that have
obsessed me over the last few months. I've conducted dozens of training
sessions, with various training settings. It never fixed the problem. I've tried to modify the reward signal, to
further punish the AI when it falls off. It didn't fix the problem. Then I've tried to increase its action frequency. Maybe its reaction time is too slow
to recover from small accidents. It didn't fix the problem. Honestly I think I'm a bit lost here. But I still have one thing to investigate. For that, I need to tell you about a
small detail I haven't mentioned yet. I said that all these cars are controlled by the
exact same AI. And you might have been wondering... Why is there so much disorder
among these different attempts? Why are these runs even different? Since the AI has a fixed decision-making process, all its runs should be identical, as
Trackmania's physics are deterministic. The same action in the same state of the
game will always have the same consequences. But to counter this, I'm using a small trick. In the first tenth of a second, the AI
normally decides to go straight ahead. Instead, I'm forcing it to turn very slightly,
using a different steering value for each run. In the next tenth of a second,
the AI takes back control. This initial perturbation is so tiny
that it's not even visible on screen. However, after a few seconds, you can see that the actions and trajectories in
each run become desynchronized. To the point of generating
completely different runs. With this simple trick, we can get the
same AI to drive many different runs. What's surprising about these runs is that there doesn't seem to be any obvious
pattern to the AI's falls. It's almost as if these mistakes occur at random. But what I find most disconcerting is how even the slightest change in one single action
can yield widely diverging outcomes. And it's not specific to the
start of the track. This kind of perturbation can be generated anywhere. Like here. Just by applying any
tiny random change to one action. Every time, a seemingly insignificant change in one decision will have
massive consequences later on. This gave me an idea. Let's take a look at one of the AI's attempts, right around the spot where
it ends up falling off. What if we generate the same type of perturbation
on this run a few seconds before the fall? Right here. Now it's getting really disturbing.
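The effect of this perturbation trick can be reproduced with a toy deterministic system. The sketch below does not simulate Trackmania: it uses the logistic map as a hypothetical stand-in for "fixed AI plus deterministic physics", and the nudge size of 1e-9 is an arbitrary choice. Two runs that differ only by that tiny initial nudge stay together for a few steps and then drift apart completely, while repeating a run with the same nudge reproduces it exactly.

```python
def simulate(first_nudge: float, steps: int = 100, start: float = 0.3):
    # Deterministic toy dynamics (logistic map). Only the starting state is
    # nudged; every later step follows the exact same rule.
    x = start + first_nudge
    trajectory = [x]
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
        trajectory.append(x)
    return trajectory


reference = simulate(0.0)
nudged = simulate(1e-9)

# Gap between the two runs at each step.
gaps = [abs(a - b) for a, b in zip(reference, nudged)]

# First step at which the two runs are visibly different.
first_visible = next((i for i, g in enumerate(gaps) if g > 0.01), None)

print("gap after 5 steps:", gaps[5])
print("largest gap:", max(gaps))
print("first visibly different step:", first_visible)
```

Nothing in this toy is random, yet the outcome after a few dozen steps depends entirely on a change far too small to see at the start.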
Almost every time, the mistake disappears. So if the AI had changed its steering
by almost any tiny amount here, it would not have failed in the next corner. I've repeated the same experiment
a little closer to the fall point. Here. This time, the outcome is a little more uncertain. What's even more disconcerting is that
there doesn't seem to be any apparent logic between the AI's action here, and
whether or not it falls 2 seconds later. The outcome looks completely random. So is it the AI that plays randomly here ? Let's repeat this experiment one final time. But this time, after the perturbation, each car will be forced to maintain
the same actions as the reference run. Well even with consistent actions,
the outcome still varies a lot. It fluctuates between different patterns. In a
way that once again just seems unpredictable. It can't just be the AI that's inconsistent. Instead, it might be the game's
physics that behave randomly. But Trackmania is deterministic,
so... it's not random. Right? Well, if you ask any experienced player, they will probably tell you that there are quite
a few situations where this game feels random. Situations where you try to
repeat the same lines and actions, but the game just seems to react differently. Situations where the game's
behavior seems unpredictable. There's one track from the official campaign
that's particularly known for that: E03. This track is a nightmare. On almost every
jump, you can feel that sense of randomness. Especially at the landing,
where the car's behavior is highly sensitive to any small change in its state. Even for professional players, it's
super hard to predict these behaviors. Such irregularities are commonly referred to as bugs, and they constitute a severe
obstacle to consistency. As far as I know, no one has ever
managed to complete a run without landing bugs on E03, 15 years after its release. So is it the same thing that limits
the AI's consistency on pipes, despite its insane driving skills? I'm not sure we can call those
bugs, but I feel there's a fair amount of randomness whenever a car
drives over a pipe in Trackmania. At high speeds, these complex behaviors
might become particularly punishing. Unrecoverable for anyone, even an AI. But this is just a guess,
there might be other reasons. This AI is far from perfect. Maybe these pipes
are not random, just too complex for the AI. We might get a better answer with the final level, when we get there. But for now,
let's refocus on our initial target. Wirtual's record. So how can we beat that ? Actually, I've
no idea how to fix the AI's mistakes. But what I do know, is that even if the AI makes
mistakes, it did have a few promising attempts. Sometimes completing more than half of the map. So maybe the AI just needs a bit more luck ? if we make the AI drive the map many more
times, it should end up getting a lucky attempt. One that reaches the finish line,
and finally beats the human record. In the world's most competitive racing game, driving a perfect speedrun often
implies taking risky lines. Trajectories where disastrous
mistakes can be inevitable. Apparently, this is no different for our AI,
which often loses its balance in corners. In fact, based on its numerous attempts, the AI
has about a 97.3% chance of passing a corner. In a way, that's a pretty good number. But this infamous track contains many corners. And the probability of completing all of
them without failing becomes pretty low. But it's possible, the AI just needs one good
run. One single run where everything goes well. A run which would finally tell us how far
the AI can push the limit on this map. And finally, after more than a thousand
additional attempts, the AI got this run. 8 minutes faster than the human
record. That's pretty strong! And the AI still has some room
for improvement. Already in the first corners, its run was well
behind some of its other attempts. With this new success, I was already thinking
about putting the AI on the final track. But before going any further, I think
Wirtual deserves a second chance. Honestly, we could argue that the AI
had obvious advantages in this battle. Compared to a human, it doesn't risk
losing its focus on this long track. It can make quick decisions,
which is ideal on pipes. And it can't lose its patience either. Even
if it fails, it can just try again and again. So I've been thinking about
ways of penalizing the AI, to try to make the competition
with humans a bit more even. And I came up with this question: What if we force the AI to drive backwards? So I drove the first
few seconds of a run myself, to make the car land on the pipe backwards. From there, the AI is gonna take control. I'm gonna start a fresh training
session with this new starting point, and we'll see if the AI can still beat humans. Even backwards, the AI keeps
very good control on the pipe. I was quite amazed when I first saw this. Again, the AI's efficiency
and precision look inhuman. But still, the AI is not very intelligent.
because it could have done better. It could have cheated. Although it's difficult, it's actually
possible to turn around anywhere on the pipe. But I'm not too surprised that the
AI didn't discover any of this. For that to happen, the AI would need to perform a precise sequence of actions by chance,
without immediate positive feedback. The payoff would only come a long time after. This is quite unlikely to happen,
without clear and guided indications. Anyway, it's a good thing the AI
didn't cheat. Driving backwards was supposed to be a handicap for the whole race. Now it's time to find out. In these
conditions, can the AI still beat Wirtual? OK, so the AI is definitely faster
than the record pace. Even backwards. But it looks like we have a major problem: is
it still possible to reach the finish line? During its training, the AI never
cleared the jump. But over time, it made good progress. It even
got a few very promising attempts. It's at this stage of the training
that the AI stopped improving on the rest of the track. Yet I still have
the feeling that it can do better here. So let's try to continue training, but with
even more focus on the jump. From now on, the AI will spend 90% of its
training in the finish area. Ok now we have confirmation that it's possible. But let's keep training a little
more, the AI is still hesitant. And actually I think it's quite fun to watch.
Seeing the AI gradually master the jump, the different strategies it puts in place.. Here for example, the AI has turned
its car over by accident. Now, it just has to go forwards to the left. But the AI doesn't understand this new situation. During its training, it was punished again and
again every time it tried to drive forwards. So now, it's just trying to go backwards as usual. Until it's able to finish. After a while, the AI's strategy has stabilized. At this point, it's still unable to complete the
jump consistently, only about 20% of the time. That's quite low compared with the previous AI. But on the rest of the map, it's surprisingly consistent. It makes almost no mistakes
compared with the one driving forwards. So this time, it didn't take the AI that many
tries to get everything right in a single run. Honestly, I wasn't expecting such a good time. It appears that driving backwards wasn't
such a big disadvantage for the AI. So again, I've been wondering
how to handicap the AI further. What if we force the AI to drive upside down? No, I'm kidding, let's move on
to something more interesting. I've tried though! But let's forget that, and
finally focus on the final level. This is by far the hardest one. And probably the most random. This track was released 12
years ago. And since then, only one person has ever finished
it: a player named Unnamed. Most of the track consists of the same
repetitive jump from one pipe to another. So initially, to simplify training, we're gonna
focus on that part, making the AI start from here. Again, I had to slightly adapt the AI's
inputs for this track. But overall, the training method remains the same. After 100 hours of play, the AI's
behavior doesn't look super promising. Let's take a closer look at the world record. As you can see, Unnamed is constantly
maneuvering at moderate speed, to ensure he lands correctly after each jump. That's quite different from
what the AI chose to do. It's maintaining speed to jump in one go,
without using the intermediate platforms. Obviously, the AI strategy seems faster.
But it can't pass these jumps consistently. If the AI never tried the world record strategy, I think it's for the same reason that it never
flipped its car when it was driving backwards. For the same reason I had to force
it not to brake on the first level. When you see all this, this
artificial intelligence doesn't seem so intelligent. It seems to lack creativity. There might be a link with the
AI's falls. We've observed them on every level, and I said it's
because of the game's randomness. But maybe the AI just lacks enough
creativity to deal with that? Still on the last level,
even if it's not consistent, I believe the AI found the fastest strategy. So to better understand its mistakes, I've
tried to play the same way for about an hour. And I couldn't do much better than the
AI. Maybe some other players could, I don't know. But I'm not sure
creativity is an issue here. Actually, the one thing I remember about this
experience is just how random these jumps feel. On this track, the pipes
look more random than ever. For the final time, is it really
because Trackmania is random, or are we missing something? Here is one last experiment I ran. Let's say we wait at the start of this
simple track, without pressing anything. During this time, the car remains stationary. But not entirely. With external tools, we can
observe imperceptible variations in the car state. So if we start to accelerate at different times, the car will start with small
differences in initial conditions. Extremely small differences.
Yet it's enough to induce a completely different behavior on the pipe. A behavior that looks as
random as the roll of a die. Like for example, if you wait exactly 7.65s
before accelerating, you always reach the finish. Here are the same runs again, visualized
as if they were starting at the same time. Can we predict anything in this mess? Can we
really predict how a car is gonna land on a pipe? It just looks like complete chaos. Since I've been confronted with these
irregular behaviors in the game, I've been wondering if there's
a connection with chaos. It's a field I don't know very well, so I'd
be interested to have your opinion on this. Basically, chaos theory deals with
deterministic systems where small differences in initial conditions can
lead to vastly different outcomes. That sounds pretty close to
what we've observed so far. And this theory states that
the deterministic nature of these systems does not make them predictable. So if there is chaos in this, even if both
the AI and the game are deterministic, the consequences of certain actions
could be impossible to predict. The whole point of this AI is to
make these kinds of predictions. So maybe that's the reason some
of its mistakes seem inevitable. I won't go any further on that. Again,
it's totally outside what I know. But if there are any specialists
in this field watching this, please don't hesitate to contact me. I'd like to understand this better,
maybe talk about it in a future video. What I know for sure is that if
we want this AI to finish the track, it's gonna need to be very lucky. I've tried to train the AI longer, but I haven't observed any progress. As if
it had given up trying to understand the map. Actually, I've tested way too many things on this track already. I think I'm
getting tired of these pipes. I don't want to dream about these pipes anymore. And this video is getting quite long too,
it already took 5 months to get there. So I'm gonna keep the AI as it
is, and hope it's good enough. I just gave it a bit of extra
training to practice the start. It caused additional problems, but anyway
it's time to test it on the whole map. Once again, it's gonna play it many times. And we just have to hope
it gets one lucky attempt. That's it, the AI did it! And with that, it's done: it managed to break
the human world record on each of the three levels. A big thanks to all my Patreon
members, who helped finance this video. Making this kind of project takes a lot
of time, so any support on Patreon is a great reward for me, and it will help me
to spend more time on upcoming videos. I'd like to end with a shoutout to
the players mentioned in this video. The tracks I chose were quite specific and
repetitive, clearly in favor of the AI. They didn't fully showcase the
incredible skill, patience, adaptability and intelligence
of such Trackmania players.