Unity ML-Agents 1.0 - Training your first A.I

Video Statistics and Information

Captions
Right now is the best time to get started with machine learning in Unity. ML-Agents 1.0 just released, so the syntax you learn from now on will probably not change drastically in the future, and the installation process has never been easier. No matter your experience level, with this video you will be able to build your first AI, I promise. Let's get going.

I assume you have installed Unity already. You can use pretty much any Unity version you like, as long as it is at least 2018.4, so you probably already have a compatible version. We will start by creating a new empty 3D project. Open up the Package Manager window, make sure preview packages are enabled, and install ML-Agents. Yes, ML-Agents can be installed via the Package Manager, which is great news. Since we are going to start with the basic example provided by Unity, we have to download their repository as well; the examples are not included in the package. Unity, if you are listening: having the samples optionally included in the Package Manager, like in the case of the VFX Graph, would be neat. Just clone the ML-Agents repository with your preferred method. I am downloading it straight from GitHub; check out the description for the link. Decompress and open the downloaded repo, open its Assets folder, and simply drag the ML-Agents folder into the editor.

Now, under Examples, you can find a 3DBall scene. If we open it up and press play, we can see it is already working with a machine learning model pre-trained by Unity. This is nice. The simplicity of this framework really is its biggest strength. As you can see, the blue agents are balancing the ball pretty much perfectly. This behavior is not hard-coded, it is learned. Don't worry, we will train our own version of this shortly.

Now let's dive in a bit deeper and check out how this thing works. There are 12 agents in the scene. They are pretty much identical and are just used to speed up and stabilize training. If we open one, we can see it contains a child game
object called Agent. This is where the magic happens. The two important scripts on this game object are the Behavior Parameters script and the Ball 3D Agent script.

In the ML-Agents framework, agents are the actors, and the behavior they are linked to determines how they act. An agent requires a behavior to function, but it is also possible for multiple agents to be linked to the same behavior, which comes in very handy in many situations. The agent is responsible for collecting observations, executing actions, and assigning rewards. The behavior, on the other hand, receives the collected observations and rewards and is responsible for determining the action to execute. There are many ways of setting up your agents and many more ways to train them, which all drastically change the outcome, but more on that later.

Let's start with what we see in the Behavior Parameters script: there are vector observations and vector actions. Don't be confused by the language used here; it is a bit different from Unity's usual terminology, because they adopted the wording of the machine learning community. A vector, in the Unity way of thinking, is just a list or array of floats. The space size determines the length of the list, so a vector observation with a space size of 8 is just a float array that can fit 8 elements. We can also stack these observations, meaning that we wait until a number of observations have accumulated before we send them off to the neural network. This can be useful because data from multiple time frames can be used to infer new information. What do I mean? For example, a single position just captures one point in time, but two positions from different points in time can be used to infer things like direction, velocity, or acceleration. This is what the stacked vectors parameter describes: the number of observations that are collected before they are sent off. It's a simple slider with a big impact, so it's really important to know what it is doing.

The vector action has two types, continuous and discrete. Again, in Unity
terms, continuous is basically a float, with numbers ranging from minus 1 to 1 and everything in between, up to a certain precision, and discrete is an integer. What you choose here just depends on the type of game you have. Discrete numbers are used to encode distinct decisions, like 1 meaning jump and 2 meaning go left, and so on. With continuous numbers, the decisions are more fine-grained, for example the force you use to throw a ball.

If we quickly jump into the Ball 3D Agent script, we can see the same numbers reflected there. For our observations we have a space size of eight: we are adding two angles, which are both floats, plus a position and a velocity, which are both Vector3 objects, so containing three floats each, making it eight floats in total. The process of selecting the observations is very important, so don't ever rush it.

Next, the model and behavior type are specified. In this case there is already a model attached, because Unity has trained one; you can leave this empty until training is done if you are training your own model. There are three types of behavior, and this is really important. First we have Heuristic, which is the classic way AI in games works: programmers think of ways the AI should behave and hard-code them in. It can work very well, but it has problems adapting in ever-changing and complex environments, and of course machine learning is also more fun. That brings us to the next type, the Learning behavior. This is what we are after: this is when the AI is being trained using machine learning. During training, a neural network model gets generated. To use this generated model after training is finished, the last behavior type is used, which is called Inference, where the learned model is applied but no longer changed, meaning the AI won't learn. If we choose Default, it will basically try to use the Learning behavior; if we don't have the external Python training process attached, like right now, it tries to use the model attached to it for inference. If no model is
specified, it will fall back to using Heuristic. The Team ID is only relevant if you want to use the same behavior on multiple agents playing against each other, like you can see in the soccer example.

Let's go into the Ball 3D Agent script. We won't cover this script in full detail, because the specific implementation is not really the important part here. We will focus on the Initialize, CollectObservations, OnActionReceived, and OnEpisodeBegin methods. These are all inherited from the Agent class. If you create your own agent, you would also let your script inherit from Agent. It is also possible to use the base Agent class by itself, but inheriting is my preferred way; you can check out the Basic example if you want to see a way of doing that.

The first method to override is Initialize. It is called when the game object gets enabled; if you are familiar with Unity, this happens in between the Awake and Start functions. Here you usually find and connect references and set parameters, but in the background more is going on. We have to take a look into the base class for that: you can see the agent connects to its behavior, to possible sensors, and to the Academy. The Academy is another very important component of the ML-Agents framework, besides agents and behaviors, and some of you may have been wondering where it is. In previous versions the Academy was attached to a game object, but now it is a globally accessible singleton. The Academy controls the training process for all agents; it communicates with the external Python process and ensures that all agents stay in sync. Additionally, global parameters can be set there and are accessible by all agents.

The next method is CollectObservations, and it is very easy to use: everything your AI needs to make a useful decision is collected here. Next, the OnActionReceived method is where you transform the action parameters into concrete actions. Everything your AI does in terms of moving, jumping, and shooting, whatever, is
located here. Lastly, we have the OnEpisodeBegin method. An episode lasts until the objective has either been achieved or failed and the environment needs to be reset; in this case, an episode lasts until the ball has fallen.

This pretty much concludes everything you need to know for training your first AI, so let us begin the training process. Now we need to install Python and the Unity ML-Agents Python package. We will just use our CPU for training for now, because it is easier to set up and it's enough for this case. First, download and install Python; again, you can find the link in the description. Then, once Python is installed, all you have to do is a simple pip3 install mlagents and you are done, no more installing. Now change directory into the previously downloaded repo and execute the training command; again, you can find this command in the description. Wait for a few seconds, and if everything went as expected, the message "Start training by pressing the Play button in the Unity Editor" should appear, and this is exactly what we do. You can see the AI is training. Congratulations, right now you are training your own AI.

As you can see, getting started with the ML-Agents framework is really easy. Of course, there are complex concepts you need to grasp to master this framework, but that is for another video. If this was helpful to you, please subscribe. I'm trying to make more videos; I know I have been slacking off in the last few months, but I'm trying my best to get started again. Please give me feedback and tell me what you want to see. Peace. [Music]
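To make the agent lifecycle from the captions concrete, here is a minimal C# sketch of a custom agent against the ML-Agents 1.0 API. This is not the actual Ball3DAgent source: the class name, the exact reward values, the tilt speed, and the fall threshold are illustrative assumptions, but the four overridden methods (Initialize, CollectObservations, OnActionReceived, OnEpisodeBegin) and the eight-float observation layout match what the video describes.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

// Hypothetical balancing agent, loosely modeled on the 3DBall example.
// Behavior Parameters: vector observation space size 8, 2 continuous actions.
public class BalanceAgent : Agent
{
    public GameObject ball;   // assigned in the Inspector
    Rigidbody ballRb;

    // Called once when the GameObject is enabled (between Awake and Start):
    // find and connect references here.
    public override void Initialize()
    {
        ballRb = ball.GetComponent<Rigidbody>();
    }

    // Space size 8: two rotation angles + relative ball position (3 floats)
    // + ball velocity (3 floats).
    public override void CollectObservations(VectorSensor sensor)
    {
        sensor.AddObservation(transform.rotation.z);
        sensor.AddObservation(transform.rotation.x);
        sensor.AddObservation(ball.transform.position - transform.position);
        sensor.AddObservation(ballRb.velocity);
    }

    // Two continuous actions, each in [-1, 1], interpreted as tilt commands.
    public override void OnActionReceived(float[] vectorAction)
    {
        float tiltZ = Mathf.Clamp(vectorAction[0], -1f, 1f);
        float tiltX = Mathf.Clamp(vectorAction[1], -1f, 1f);
        transform.Rotate(Vector3.forward, tiltZ * 2f);
        transform.Rotate(Vector3.right, tiltX * 2f);

        if (ball.transform.position.y < transform.position.y - 1f)
        {
            SetReward(-1f);   // ball fell off: punish and end the episode
            EndEpisode();
        }
        else
        {
            SetReward(0.1f);  // small reward for every step the ball stays up
        }
    }

    // The episode ended (ball fell), so reset the environment.
    public override void OnEpisodeBegin()
    {
        transform.rotation = Quaternion.identity;
        ball.transform.position = transform.position + new Vector3(0f, 1.5f, 0f);
        ballRb.velocity = Vector3.zero;
    }
}
```

Note that later ML-Agents releases changed this signature (OnActionReceived takes an ActionBuffers struct instead of a float array), so this sketch applies to the 1.0 package the video covers.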
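The Academy singleton mentioned in the captions can be reached from any script, with no scene object required. A small sketch, assuming the 1.0 package API; the parameter key "scale" is a made-up example of a global value the external Python process could set:

```csharp
using UnityEngine;
using Unity.MLAgents;

public class AcademyExample : MonoBehaviour
{
    void Start()
    {
        // The Academy is lazily created on first access; since 1.0 it is a
        // globally accessible singleton instead of a scene component.
        Academy academy = Academy.Instance;

        // Global parameters set from the Python side (e.g. for curriculum
        // learning) are readable by every agent. "scale" is a hypothetical key;
        // the second argument is the default if the key was never set.
        float scale = academy.EnvironmentParameters.GetWithDefault("scale", 1.0f);
        Debug.Log($"Current scale parameter: {scale}");
    }
}
```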
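The exact commands are left to the video description, but for the 1.0 release they look roughly like this. The run ID is an arbitrary label you choose, and the config path assumes the repository layout at the time of the video:

```shell
# Install the ML-Agents Python package (Python 3.6+ at the time of the video)
pip3 install mlagents

# From the root of the cloned ml-agents repository, start a training run.
# --run-id is just a name for this run; results are stored under it.
mlagents-learn config/trainer_config.yaml --run-id=3DBall_first_run

# When "Start training by pressing the Play button in the Unity Editor"
# appears, press Play in the editor to begin training.
```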
Info
Channel: Sebastian Schuchmann
Views: 112,830
Keywords: ai, artificial intelligence, ai video game, machine learning, neural networks, unity3d, tas, bot, game bot, ai learns, game engine, machine intelligence, robots, reinforcement learning, deep reinforcement learning, ppo, mlagent, unity, ml-agents, ml-agents 1.0
Id: _9aPZH6pyA8
Length: 11min 55sec (715 seconds)
Published: Sun May 10 2020