Training a Cascade Classifier - OpenCV Object Detection in Games #8

Captions
in this opencv tutorial i'm going to show you how to train and use a cascade classifier for object detection in video games opencv supports a variety of machine learning algorithms and while deep neural networks have received a lot of attention lately haar cascade classifiers remain the easiest to get started with and they can still produce good results i'll briefly go over some theory but this is mostly going to be an implementation tutorial and we'll be doing everything entirely on your desktop windows computer it was hard for me to find information online about how to do this so hopefully the method i'm going to show you will simplify things for you hey i'm ben and this is learn code by gaming this is part 8 in a series all about using opencv to detect objects in video games if you missed anything there's a link to the full playlist in the description while we don't have time in this video for a full explanation on how machine learning works just in case this is a new topic for you let me give you a short summary normally when we program something like when we write a function for example we expect certain inputs like the parameters in a function and then at the end we'll end up with some output like the return part of a function and in the middle to get from the input to the output we write some sort of logic if statements loops all that stuff with machine learning it's exactly the same except that middle part is replaced by a machine learning model so with machine learning you're not writing any of your own logic anymore instead you're trusting this mysterious dark jumble of multi-dimensional calculus to transform your inputs into your desired output and at first your model won't know how to do what you want it to do its output will be no better than random guesses to get the output we want we must first train our model we do that by showing it lots of input examples and for each example we tell the model what we want the output to be once it's seen enough 
examples a well-trained model should be able to predict what the output should be given some new set of inputs so if you've ever been confused before be confused no more that's the super summarized version of how all machine learning works in our case our input is going to be screenshot images from the video game we're playing and the output we want is a list of rectangles that identify the objects that we're trying to detect and fortunately for us opencv's cascade classifiers are designed to do exactly that opencv actually comes with two types of cascade classifiers haar and lbp haar is generally more accurate while lbp is generally faster to train and if you're not already aware training these models can take a really long time like hours or days are not uncommon i'm going to be focusing on haar but you can switch to lbp pretty easily if you want to give that a try the way a haar classifier works is it looks for features inside an image very much like the orb feature detection that we covered in the last video and it looks for these features in different layers so at the top layer it'll be looking at large features that span nearly the whole image window down to the bottom layer where it's looking at really fine details this makes the end model fast enough to detect objects in real time because it can quickly reject areas of the image that fail to match features in the topmost layers and then it can spend more time analyzing areas of the image that are good candidates by studying those finer details alright so hopefully you have a general understanding now of how machine learning works and what makes a haar cascade classifier unique the great part is the code for all of this is really straightforward i don't think you'll have any trouble understanding it the art of doing this well isn't so much in the code it's more in gathering the data to train your model with to get good results with this you need quality data and lots of it the more the better now we need two types 
of data we need the positive images which are the images that contain the objects that we're trying to detect and we need negative images which will be screenshots from the game that don't contain our object at all the machine learning algorithm needs to see both what is the object and what is not the object in order for it to learn and you want to get examples of your object in as many different conditions as possible so every possible lighting situation different positions all of that in this series i've been looking for limestone deposits in albion online and the day night cycle has been giving me lots of problems so i want to make sure to capture screenshots of limestone from all different times of the day the quality of your data set is going to have a huge impact on how good your final detector is so keep that in mind so the first step is getting our positive and negative images so let's write some code that'll make that really easy on us so picking up from where we left off last time i'm here in the main file and i'm going to remove all of the vision stuff from our last few tutorials just to give us a nice clean workspace for our classifier so as you can see we still have our window capture and we're going to use that to get a screenshot and then show that and as always we'll output our fps and then you'll press q to quit next i want to create some hotkeys to save these screenshots on demand and i want to save the positive and negative images in different folders so let's go ahead and create those folders now let's write some code for saving these screenshots to either the positive or negative folder using a hotkey and down here we're already listening to q key presses on the keyboard to quit our program so let's just use the same strategy to listen to a couple other quick keys to save screenshots to either the positive or negative folder so here's the code i came up with to do that we're going to save the result of waitKey to a variable and as before 
if that key that we pressed was q it'll go ahead and destroy all the windows we have and then the break will break it out of our infinite loop so it'll exit the program but if we press f we'll go ahead and save that screenshot that we're on to the positive folder or if we press d we'll go ahead and save that screenshot to the negative folder and to save those image files we'll just go ahead and use imwrite from opencv and we'll pass in the screenshot we got from window capture as the data to save and for the file path we'll go ahead and save to the positive or negative folder and for the file name we'll go ahead and use the loop time value just to make sure all these file names are unique so now when we have the game running and then we go ahead and run our script and then if we give our opencv window focus and we press f that should save a screenshot to our positive folder and if i look in that positive folder i can see that screenshot saved there or if i move to an area of the game that doesn't have any limestone i can go back to our opencv window and press d and that will save a screenshot to our negative folder and you can see that appears in our negative folder here and now all we need to do is run around inside our game and take lots of positive and negative images so i spent about an hour doing this six and a half hours later and i got a little bit over 100 images in my positive folder and a little over 100 images in my negative folder too and if you're working on this project for yourself you might want to get a lot more samples than that but i was curious just how far we could get with just 100 samples of each and you can see from the samples i got i made sure to get images both from the day and the night time part of the day night cycle so after you gather your screenshots it would be a good idea to review them and make sure to check for any mistakes and to make sure that you did get a good variety of images but if you don't like the results you're 
seeing you can always come back later get more samples and then train your model again now that you have all your raw data let me show you opencv's official tutorial on how to use cascade classifiers so from opencv's tutorial page go ahead and scroll down to object detection and in here you'll find two tutorials the first one is for how to use a cascade classifier and the one below is for training your classifier and since we'll be making our own classifier instead of using one of the pre-built ones that comes with opencv you'll want to check out the training tutorial and if you read this article you'll find that not only do we need those images that we captured we also need some text files to tell the trainer where to find those images and for the negative images the structure of this file is really simple we just need to create a text file and on each line we'll put a path to one of those negative images so let's write a quick utility function to do that for us so i created a new file for this i called it cascadeutils.py and then i've written a short function here to generate that file so the file will be called neg.txt and it'll go ahead and create that file if it doesn't exist or if it does already exist it'll overwrite what's there and then we'll loop over all of the file names in the negative folder and then for each one of those files we'll go ahead and write a line in our negative text file and it'll just write the path to each one of those files and to run that function i'll just do it manually so in my terminal i'll just go ahead and run python and then in the python terminal i'll import that function and once it's imported i can just run it and once it's done i can just exit out of python and if we look in our file directory you should now see this neg.txt file and the contents of it it should have the paths to all of those images in your negative folder now for our positive images we need a similar description file but it's not quite as simple you can 
see from the documentation not only do you need the path to those images you also need how many objects that you're looking for appear in that image and then you need the bounding rectangles for each one of the objects that appears in each image and doing this manually would be an insane amount of work so luckily opencv has a tool to help us do it there's this opencv_annotation program that will allow us to go through all of the images in our folder and it'll display each one of those images one at a time and allow us to draw rectangles around each one of the objects that we want to detect and when we're done doing all that it will output the positive text file that we need and the trick here is where do we find this opencv_annotation program because if you were to just type opencv_annotation in your terminal and run that you're probably going to get command not found and we installed opencv using pip for this project and if you go digging around in those files you won't find this program there either so what i recommend you do is you go to this installation in windows page on the opencv website and then go down to the installation using prebuilt libraries and click on the link to sourceforge and i've been using opencv version 4.2 on this project but opencv version 4 no longer has this annotation tool that we need as i understand it the maintainers of opencv haven't gotten around to porting these command line programs to opencv version 4 yet because they've been more focused on deep neural networks which is forgivable because there's only so many resources to go around but if you want these command line programs you'll have to go back to version 3.4 to get them unfortunately we can keep using version 4.2 on our main project the cascade classifiers we create will still work with that version we just don't have access to the command line tools that we need the latest version that does have these tools right now is version 3.4 and we want to get the latest version of 
3.4 so when you get to the sourceforge page go to home and then sort by modified date and here you'll see the latest version of 3.4 is 3.4.11 so click on that and in here you'll want to download the exe file for windows once you have that file downloaded go ahead and run it it's basically just a self-extracting zip folder and once you pick a folder to extract it to it'll save there in an opencv folder and once you have this folder we can go ahead and find the command line programs that we need and you'll find those programs in build slash x64 slash vc15 slash bin so here's that full path and in the x64 folder you might find a vc14 folder as well and that'll also have these programs and they probably both work equally as well i just like sticking with the newer version these folders are for different versions of visual studio but that's not important and we'll be using a few of these command line programs today so first we'll be using this opencv_annotation program and that'll help us mark the objects in our positive images and output that text file that we need and then next we'll be using this opencv_createsamples program and this will use the text files that we made to create a vector file that will be used by the cascade trainer and then finally we'll be using this opencv_traincascade program in order to actually train our model so we'll be using the annotations program first to create our positives file and the way you do that is in your project terminal you'll want to type in that full path to the annotations program and then it takes two arguments first it needs to know where to save that positives file so we do that using the annotations keyword and then i'm going to set it to pos.txt and then it needs to know where to find those positive images that you want to mark up so to do that we use the images keyword and then we tell it what folder to look in and if you forget how to do that i'll save all the commands that i'm using to the cascade utils folder so you can find 
that up on github and then when you run this it'll open up an opencv window with your first image in it and here you want to draw rectangles around all of the objects you want to detect so first you'll click once for the upper left-hand corner of the rectangle you want and then you'll drag it until it encloses your object and then once you have the rectangle how you like you'll click again on the lower right hand corner and that'll leave you with this red rectangle and then you press c on your keyboard to confirm and this will turn the rectangle green and if you've drawn a rectangle around an object that you shouldn't have or you don't like how you drew the rectangle and the rectangle's still red you can simply click elsewhere to create a new rectangle but if you've already confirmed a rectangle that you don't like and you want to undo it you can press d and that'll remove your last confirmed rectangle and once you've drawn a rectangle around all of the objects in your image that you want to use for detection you click n to move to the next image and then you just repeat the process and here's an example of an image with multiple objects that we want to detect and as you're drawing these it's okay if these rectangles kind of overlap each other and don't feel like you have to draw a rectangle around every object in your image that is the object you're looking for for example on this image i've got a limestone deposit up at the top here and another one kind of hiding behind this tree but if i don't think the image in this box is going to make for good training i can go ahead and leave it out and that won't cause any problems for you so use your best judgment whether an object that looks like this should be included in your data set or not and then you can press escape to exit or it'll exit automatically once you've been through all of your images and once you've been through all your images you should end up with a positive text file that looks like this if you've 
taken the time to gather 100 positive images you should really take the time to go ahead and mark them all up correctly and again this can take you a little bit it took me about an hour to go through all of mine but the more attentive you are here the better your final results will be and if you forget what any of those commands are they are in the official documentation that i showed you earlier and then i've also put them here as comments in the cascade utils file and then one thing i almost forgot to mention in your positive text output file you want to go ahead and review that to make sure there's no issues with it and one of the problems you're likely to find is that your slashes are in the wrong direction so as you can see in our negative file we have all forward slashes and the output from annotations creates backslashes and this is going to become a problem in our next step so go ahead and change all those slashes to be forward slashes and you can just use find and replace to quickly do that all right so the next thing we need to do is we need to create a vector file from our positives data file and to do that we'll be using the opencv_createsamples program and the first thing this program wants to know is where our positives data file is and that argument name is info and then we just give it the name of that file and i can give it just the file name instead of the full path because i've already cd'd into this folder that contains that file next it wants to know the width and the height of the detection window size that you want to use so here you have some options i've seen that both 20 and 24 are pretty common and you won't be able to detect any images smaller than the size you give it here so here i've got 24 pixels by 24 pixels will be the smallest object that i can find and as you make these values larger it's going to take a larger window size so it might improve your detection a little bit but it's also going to make training take a lot longer and it's going to 
make your detector itself a little bit slower too but this is a value you should definitely play around with a little bit and after that we've got the number of vectors that you want to create and for this you should just put some number larger than the total number of rectangles that you drew on all of your positive images because we have a small enough data set that we don't want to throw out any of those vectors that you drew and as long as this number is larger it'll include all of your rectangles so if you drew 250 boxes around objects in your annotation step if i put this to a thousand it's going to include all 250 and it doesn't matter if you put this to 2000 or 10000 it'll just include that 250 and the final thing it wants is where to save its output so the file name or path where to save that and that argument name is vec and i'm just going to save mine to pos.vec when you run it you should get some output that looks like this and at the bottom you'll see that it's created 253 samples so that means that i drew 253 rectangles around objects in all of my positive images and if you run it and you get an error that looks like this that means that opencv_createsamples couldn't find or couldn't open your images so in that case you'll want to go back to your positive text file and make sure that all of your paths are correct so it should look something more like this and again in the cascade utils file i've got that command that i ran in case you forget it and if it ran successfully you should end up with this positive vector file if you try to open it you'll see that it's binary you can open it anyway and you'll get a bunch of unreadable characters like this but this is good this is what we expect all right so now we're finally ready to train our model first we need to create another folder to hold all of the output from our training i'm just calling my folder cascade and the name of this command line program is opencv_traincascade and the first thing we tell it is the 
folder where we want it to save all of the data so that argument keyword is data and i've just given it the cascade folder next it wants to know where to find the vector file with all of the positive sample vectors and that argument keyword is vec and in the previous step i named that file pos.vec next it wants to know where to find all of the negative images and that argument keyword is bg for background we just give it the name of the text file where we saved all of those image file paths next we tell it the width and the height of the detection window and this must be the same as the values we used in the previous step where we created the vector file next we tell the trainer how many positive and how many negative samples we want to use for the training and this numPos argument needs to be less than the number of rectangles that we drew so that's less than the number of samples that was output by opencv_createsamples and for numPos if you were to use the value that's exactly the number of samples you have or just a few less you're likely to run into an error and that error would look something like this cannot get new positive sample and if you get this error when you're training then you need to either lower the numPos value or you can try lowering the minHitRate value and for the numNeg this actually doesn't need to be less than the number of negative samples that you have so even though i only have 103 negative images it'd be perfectly fine for me to put something like 1000 here and that's because for your negative images the trainer is just going to look through those and it's going to cut out small samples from each one to create the negatives so it's a little up for debate what the best number is to put here i found a lot of advice online that said you should put this as half the number of positive samples that you have but i've also found advice that says the best practice is to put this at twice the number of positive samples that 
you have so in this case that'd be 400 so this is a number that you can definitely play around with to see what gives you the best results for our initial test i'll leave this at 100 as a 2:1 ratio seems to be the most common advice and the final thing we need to tell the trainer is how many stages to train so the more stages you train the longer the training will take and also the better your model will get at detecting the images in your positive samples so initially you might think the bigger the better here but as we'll see that can actually lead to overtraining but 10 stages is a good number to get started with when you only have 200 positive samples so let's go ahead and run this and see what happens all right so it took 31 seconds to train 10 stages which is pretty fast and if you wanted to train more stages from here you could just run the same command again but increase the number of stages so here if i increase the numStages to 12 it would pick up where it left off at 10 stages and just train those last two stages so that's pretty nice but before we do anything like that let's first take a look at the output we got so if we take a look in our cascade folder you should see a bunch of files in there now and the cascade.xml file is the model itself so if you wanted to save this model you'd want to copy this file and put it somewhere for safekeeping and the rest of these files are things that the trainer uses so we can pick up training where it left off and the output that we get in the terminal also tells us some interesting information so let's take a look at that so for each stage of the training you're going to get a table output like this and hr stands for hit rate fa stands for false alarm and n is the weak layer number so layers 1 and 2 those are your outermost layers and in this example layer 9 is the lowest layer with the finest amount of detail and kind of what we're looking for is we want this final layer to have the smallest false alarm rate 
as possible without also overtraining and one possible indication of overtraining is if you look at the negative acceptance ratio which is the number right here and this number actually looks pretty good but if you see something with more than four zeros after the decimal point that can be a sign that you've overtrained so this seems like a good spot to explain what's actually going on during the training so when you're training your trainer will show your model a random image and that image will either be one of the objects that you drew a box around or it'll be a random cutout from one of your negative images and then the model will try to predict whether this image is from the negative or the positive pile and once that prediction has been made your trainer will then tell the model whether that prediction was correct or incorrect and the model will then make adjustments to itself based on whether it was right or wrong and then this cycle just repeats thousands of times so it's sort of like learning with flashcards when you're a kid the model is seeing these images it's trying to classify them and over time it gets better and better so now that we've trained our model let's go ahead and use it for some object detection so back in our main script the first thing we want to do is we want to load the trained model and to do that we just use opencv's CascadeClassifier class and we initialize it with the path of the model that we trained so remember that was the cascade.xml file inside of the cascade folder and then we're going to save this trained model object to a variable i called mine cascade limestone and then to do the object detection is really simple this classifier object will have a method on it called detectMultiScale and all you need to do is give it an image to look at so we'll give it a screenshot from our game and then using the model that we trained this method will look through that screenshot and it will return a list of rectangles denoting all the 
objects it found that it was trained to look for and all we need to do now is to draw those rectangles on an output image to see what we get and we've already written a method to do that in our vision class so let's go ahead and reuse that so i'll go ahead and initialize a vision object with no needle image because we don't have any needle that we're searching for and of course i need to make sure that our vision class is still imported and our vision class isn't currently set up to accept none for the needle image path it'll throw an error when it tries to read that image in so we can fix that quickly just by saying only do this needle image stuff if we actually get a needle image so i just added this if statement here to the constructor so now loading this vision class without a needle image should be fine and now we can still use that draw rectangles method from the vision class to draw the rectangles on our screenshot so we call draw rectangles we give it the screenshot and we give it the rectangles to draw and it'll give us an output image so instead of showing the unprocessed screenshot we'll just go ahead and output the detection image now we can test this out and see what we got so i'm just going to run main.py and here's the output from our cascade classifier and as you can see right now it thinks that a whole bunch of things are limestone deposits so we need to do a better job of training our model but it is at least drawing boxes around some of the actual limestone deposits so that's a good sign so let's go ahead and train our model for two more stages all right we're up to 12 stages let's see what sort of difference this makes you can see the results are still about the same so now it becomes a game of adjusting the parameters we use with opencv_traincascade in order to get the most out of our training data and by the way it can be a good idea to train your models in a separate terminal window outside of your ide so here i've got git bash open instead of 
doing this in vs code and that's just because these trainings can take a while sometimes and i'd rather my terminal be occupied instead of vs code and when you're training a new model you need to either delete all of the files out of your cascade folder or you need to save this model to a new folder so you would change it to a different folder by changing the data parameter here but i've just been clearing out the cascade folder and if you do this and you want to save that model before you delete it just remember to go ahead and copy that file somewhere else and you can find all of the arguments that are available to you in the official documentation that i showed you earlier so this is just on the cascade classifier training page and i'm down at the cascade training section and the first arguments you're going to want to add are these two the precalcValBufSize and precalcIdxBufSize arguments will just give more memory to your process of training your model and that'll just help your model train faster and the values that you use here are going to depend on the specs of your pc i have 16 gigs of ram in my pc so i've decided to dedicate six gigs worth of that ram to the one precalc buffer size and another six gigs to the other precalc buffer size so i've got 12 gigs out of my 16 gigs of ram available for my classifier and this will just help you train your models faster two more arguments that are good to play with are these two the minHitRate and also the maxFalseAlarmRate so if you set your maxFalseAlarmRate to something like 0.3 that means it'll keep adding layers during your training steps until it reaches this 30% false alarm rate so i'm talking about this false alarm right here in the output table it'll keep adding more layers until this final number is below 0.3 and the minHitRate argument will do a similar thing just for your hit rate so if i put this at 0.999 it will only add layers to this as long as the hit rate remains above 0.999 other things to play around with are 
your number of negative samples so we tried 100 already i think 200 would be a good one to try so our number of negative and positive are equal also 400 would be worth trying so we have twice the amount of negative as positive samples and an extreme like 1000 would be worth trying too and you might find your best value somewhere in between like maybe it's two to three so 200 positive 300 negative and different parameters will work best in different situations so you need to play around with these and find out what works best for your project if you have too many false positives the best things to do are to add more negative samples or you can try training for more stages if you have valid objects that are being missed that means you overfit your data so try reducing the number of stages overtraining or overfitting means that you trained your classifier to only recognize images that are exactly like the ones in your positive folder so if you do that it's not going to be able to generalize to slightly different images so as you adjust these parameters you're playing a game where you're trying not to get too many false positives but you also don't want too many misses either and eventually you'll hit a limit of what can be achieved with your data set and the only way to improve from there is with better and more training data and to test how good my model is i've just been manually booting up the game to see what my results look like and i know this is a really subjective measurement but it'll give you some sense for what parameters work better than others so the things i've tried are messing with the number of negative samples reducing the max false alarm rate increasing the min hit rate and stage bracketing so as long as i was getting too many false positives i kept increasing the number of stages and then once i started having a lot of misses on valid objects that's when i would reduce the number of stages and i'm pretty happy with the results i ended up with you 
know as we move through the day night cycle it still does a pretty good job of detecting these limestone deposits some things i didn't try are using a different value for the width and height for the detector window since training has been so fast for me i think that we could probably increase this a little bit another thing i thought about trying is if there's specific false positives that i'm getting a lot maybe i could add those to my pool of negative images and that could potentially help our model learn that those aren't valid objects and here's the arguments i ended up using on my best model so i had 200 positive samples and then i used 1000 negative samples i ran that for 12 stages and set the max false alarm rate to 0.4 and the min hit rate to 0.999 i decided that i'd rather lean towards more false positives rather than more misses because in the bot script that i'm going to write i think i'll be able to hover over some of these potential object detections and then when you do that you get a little tool tip so i'll be able to use that to verify whether what i think is a limestone deposit actually is so in this example we're seeing an incorrect detection here it thinks that this rough stone is actually limestone we can see i can make my bot actually hover over that position and then i get this little tool tip that'll tell me that's not the correct spot but there's only a few false positives here so it won't be overwhelming for my bot to do this and that should be everything you need to know to get started with cascade classifiers as you saw programming wise it's not too complicated but gathering up all that data can be tedious and as you get more data the training will take longer and longer that's the name of the game in data science i think a project like what you saw here is a really good place to get started with machine learning because you get an understanding of the practical applications first and then if you're interested in diving deeper into the 
theoretical stuff you'll have some context for understanding the overall picture of what you're trying to achieve so hopefully this was helpful and let me know in the comments if you want to see more machine learning content as far as the rest of this opencv series goes i didn't know when i started this that i'd be spending all summer on it so i'm a little eager to get it wrapped up but before i do i want to make one more video for you guys where i'll combine the object detection techniques that we've been talking about with automated mouse and keyboard inputs to make a resource gathering bot it's going to be a rather unpolished bot but i'll give you enough so you can get started on your own automation project so i'll see you then
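The captions describe a short utility in cascadeutils.py that writes neg.txt, one negative image path per line, overwriting any existing file. The code itself is not in the transcript, so what follows is only a minimal sketch of that idea. The function name, its parameters, and the sorting of file names are assumptions made for illustration, not the author's actual code.

```python
import os


def generate_negative_description_file(negative_dir="negative", output_path="neg.txt"):
    """Write one image path per line and return how many lines were written.

    Hypothetical helper: the name and parameters are assumptions, not taken
    from the video. Opening with mode "w" creates the file or overwrites it.
    """
    # Use forward slashes, since the captions note that backslashes in the
    # description files cause problems in the next step.
    base = negative_dir.rstrip("/\\").replace("\\", "/")
    count = 0
    with open(output_path, "w") as f:
        # Sorting is only for a predictable order; it is not required.
        for filename in sorted(os.listdir(negative_dir)):
            f.write(base + "/" + filename + "\n")
            count += 1
    return count
```

Run from the project folder, a call such as generate_negative_description_file("negative", "neg.txt") would produce the background file that is later passed to the trainer with the bg argument.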
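Two manual steps mentioned in the captions are reviewing pos.txt for backslashes and picking a vector count larger than the number of rectangles drawn. Both can be scripted. This is a rough sketch with hypothetical helper names, and it assumes each line of pos.txt has the form "path count x y w h ..." with no spaces in the path.

```python
def normalize_annotation_lines(lines):
    """Replace backslashes with forward slashes in each annotation line."""
    return [line.replace("\\", "/") for line in lines]


def count_annotated_objects(lines):
    """Sum the per-image object counts (the second field of each line).

    Assumes image paths contain no spaces; blank lines are skipped.
    """
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            total += int(fields[1])
    return total


def fix_annotation_file(path="pos.txt"):
    """Rewrite the file in place with forward slashes; return the object total."""
    with open(path) as f:
        lines = f.read().splitlines()
    fixed = normalize_annotation_lines(lines)
    with open(path, "w") as f:
        f.write("\n".join(fixed) + "\n")
    return count_annotated_objects(fixed)
```

The returned total is the figure the captions compare against: the vector count given to opencv_createsamples should be larger than it, and the positive sample count given to the trainer should be smaller.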
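To keep track of the trainer arguments discussed in the captions, the command line can be assembled in Python before it is run. The sketch below only builds the argument list. The function name is hypothetical, the executable is assumed to be reachable as opencv_traincascade (in the video it lives under the extracted OpenCV 3.4 build folder), and the default values mirror the best model described in the captions. The buffer sizes are given in megabytes, so 6000 is an approximation of the six gigs mentioned.

```python
def build_traincascade_args(
    data_dir="cascade/",
    vec_file="pos.vec",
    bg_file="neg.txt",
    width=24,
    height=24,
    num_pos=200,
    num_neg=1000,
    num_stages=12,
    max_false_alarm_rate=0.4,
    min_hit_rate=0.999,
    precalc_buf_mb=6000,
    executable="opencv_traincascade",  # assumption: on PATH or a full path
):
    """Return the opencv_traincascade command as a list of strings."""
    return [
        executable,
        "-data", data_dir,
        "-vec", vec_file,
        "-bg", bg_file,
        # Must match the -w and -h used when the vector file was created.
        "-w", str(width),
        "-h", str(height),
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-numStages", str(num_stages),
        "-maxFalseAlarmRate", str(max_false_alarm_rate),
        "-minHitRate", str(min_hit_rate),
        "-precalcValBufSize", str(precalc_buf_mb),
        "-precalcIdxBufSize", str(precalc_buf_mb),
    ]
```

The resulting list could be passed to subprocess.run, or joined with spaces and pasted into a terminal, which is closer to how the video runs it.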
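The per-stage limits discussed in the captions compound across stages. OpenCV's training documentation gives the rough estimates that the overall false alarm rate is about the per-stage maximum raised to the number of stages, and the overall hit rate is about the per-stage minimum raised to the number of stages. A small sketch of that arithmetic, using the values from the best model in the captions (the function name is made up for illustration):

```python
def estimate_overall_rates(min_hit_rate, max_false_alarm_rate, num_stages):
    """Estimate whole-cascade hit and false alarm rates from per-stage limits.

    These are upper-level estimates only; real results depend on the data.
    """
    overall_hit_rate = min_hit_rate ** num_stages
    overall_false_alarm_rate = max_false_alarm_rate ** num_stages
    return overall_hit_rate, overall_false_alarm_rate


hit, false_alarm = estimate_overall_rates(0.999, 0.4, 12)
print(f"estimated overall hit rate: {hit:.4f}")
print(f"estimated overall false alarm rate: {false_alarm:.2e}")
```

With 12 stages this gives an estimated hit rate a little above 0.988 and a false alarm rate around 1.7e-05, which helps explain why adding stages cuts false positives but also starts to cost valid detections.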
Info
Channel: Learn Code By Gaming
Views: 52,698
Keywords: opencv, python, cascade classifier, train cascade classifier opencv python, haar cascade explained, opencv machine learning, train opencv model, opencv windows tutorial, opencv object detection, windows haar, haar training, machine learning project in python step-by-step, computer vision projects, opencv cascade classifier training tutorial, opencv computer vision, opencv course, how to opencv in python, opencv detection rectangle, opencv tracking, machine learning game ai
Id: XrCAvs9AePM
Length: 32min 29sec (1949 seconds)
Published: Sat Aug 22 2020