Creating a YOLOv3 Custom Dataset | Quick and Easy | 9,000,000+ Images

Captions
[Music]

Hey everyone, and welcome to today's video. It's The AI Guy here, and today I'm going to show you a quick and easy way to generate a custom dataset of hundreds or thousands of images that already come with labels and are annotated with bounding boxes. All the hard work is done for you. By the end of the video you'll have your own custom dataset gathered and ready to be trained on by YOLO version 3, so let's get right into it.

First things first, I'm going to show you where we're getting all these images from. Go ahead and Google "Open Images Dataset", then go down and get version 5. Open Images Dataset is an open-source dataset of over 9 million images that has been put together by Google. Click Explore and it'll automatically bring you to a category. For me it was watermelon, which was just random. Go over to Segmentation and click on Detection; that'll show bounding boxes. If you go to Options you can click the first box so that you only see watermelon boxes and not other boxes like plant. You can go ahead and look through this category. This is a great starting point if you're trying to learn how to build a custom YOLOv3 object detector.

This dataset that Google has put together has over 600 classes. Just like watermelon, you can scroll down and see that this list is huge. If we click on another one, say Bee, we have hundreds or thousands of images of bees that already come with labels and bounding boxes around them. This is going to save you so much time putting together a dataset if you can just find a class here that you're going to work with.

Now I'm going to show you how we actually get these images downloaded and how we can work with them. Luckily for us, an amazing toolkit was built that allows us to actually download the
images of a certain class, or of any number of classes that we desire. Just a disclaimer: I was not the one who created this toolkit. This is my repository, theAIGuysCode, but I've forked it from the original toolkit, and what I've done is add a Python script that converts the labels and annotations from the format this toolkit generates into the format that YOLO version 3 takes in for its training. This lets you have a fully ready-to-go dataset for YOLO version 3 using this toolkit.

First things first, go over to my repository: it's theAIGuysCode, and then OIDv4_ToolKit. I'll put a link to this repo in the description of this video. Once you're here, click on Clone or download and copy the clone URL, or download the code as a zip. Then, if you're on Windows, open up PowerShell or Command Prompt, or if you're on Mac or Linux, open up your shell. Go to whatever folder you want to download the code into. I'm in my repos folder on the C: drive, where I put all my code. I'm going to type git clone and then right-click to paste, so we're cloning the repository with all the code. Perfect, it's quick. Then we just cd into it.

I'll mention that Python is a requirement for this demo, so if you don't have Python installed, go ahead and do that. Your first command is going to be pip install -r requirements.txt. This makes sure you have all the libraries, so that you're ready to run the commands. I already have all of them downloaded, so you'll see that my requirements are all satisfied, but yours might take a little while as it downloads them for you.

Now we're ready to go ahead and get our classes. I recommend going back to Open Images and
searching: pause the video and take some time to search for one of the classes that interests you, or that you might want to make a custom detector with. Find one of them, then start back here when you're done.

Once you have the class you want, go back to your shell and type the command python main.py downloader, and then --classes. This is where you put your class. I'm going to do balloons, so I just write in Balloon, and if you put a space you can keep writing as many classes as you want. I'll download images for two classes: Balloon and Airplane.

One thing to note here: if you're looking for a class like bell pepper, which is two words with a space in between, don't write Bell pepper in the shell, because it's going to think that's two different classes. Write Bell_pepper instead. Just replace the space with an underscore.

Okay, perfect. Next add --type_csv. This lets you choose where you're getting the images from within the dataset, so it's looking for the subset: either the training data, or the validation or test data. The training data is where the majority of the images are, because it's the largest percentage of the dataset, so I recommend always keeping this as --type_csv train, because that lets you download more images if you need them. Then add --limit, which is how many images of each class you want to download. For this example I'll download 400 images of each, but I could easily make this 2,000 or however many images I want; you can put a high amount. So we're just going to go ahead and run that command. So
when you first run this command, you'll see the OIDv4 downloader banner pop up on the screen, which means it's running properly. There are two CSV files that you need to download, so type capital Y and Enter, then capital Y and Enter again, and it'll grab both of those CSVs. They work as an index that lets the toolkit find and download the proper images. We'll let this download and then come back.

The CSVs are done downloading, and now you can see, in the pink text, that it's getting the balloon images first. It's downloading from the training images, and it says it has found over 3,000 images that are labeled and annotated for balloons. So you can see how easily you could get thousands of images of balloons if that was what you were doing your detection on. But I've limited it to 400, so it's going to download 400 images and then do the same for airplanes as well. I'll come back when I have all the images downloaded, because it's a lot of images and might take a couple of minutes.

When the images are done downloading, open up a file explorer and go to the toolkit's OID folder, then Dataset, and then train, validation, or test, depending on what you chose for --type_csv. I chose train. Then you'll see a folder for each class you chose. I have Airplane, with my 400 images of airplanes, and there should also be a Label folder. In there is a txt file with the annotations for each of the images. If we open one, we can see that this image has six airplanes in it, and it gives the coordinates. But this is not the format we want. This format gives you the x minimum and y minimum, which is the top-left coordinate, and then the x max and y max, which is the bottom-right coordinate. So
if we open up the repository on GitHub, we can see that this image shows it clearly. The format it returns gives us the top-left coordinate and the bottom-right coordinate, as well as the label in written form, so it says Airplane.

Now we want to convert this into a format that YOLO accepts for training, so we can train a custom detector. Go back to your root directory, where you should have a classes.txt file. Open that up and write one class per line, for whichever classes you decided to download. I did Airplane and Balloon. I'll save that and then I can close it.

Now we're ready to go back to our shell, and all we have to do is run python convert_annotations.py. Press Enter and it'll convert the annotations for each of those txt files. We'll wait for this to finish and then go and check it out.

It looks like Airplane is done, so we'll go back to our file explorer and go down to OID, Dataset, train, Airplane. Now we should see a txt file for each image in the folder where the images are. The ones in the Label folder are still there, but now there are also ones in the folder above. If we open one up, we can see that it's now in a different format. It has converted our label to a number, which goes back to classes.txt: Airplane was on the first line, the zeroth line, so it gets a 0, and a balloon would be a 1. That's how YOLO identifies its classes. It has also converted the coordinates into the normalized format YOLOv3 uses. So it's done it for you: now all of your images come with a txt file that is in the YOLO
version 3 format and ready to be trained on.

One thing I'll mention: at this point you'd want to move all of these images and files into your darknet folder, wherever you have darknet installed. If you don't have darknet installed already, I recommend checking my previous video, where I go through the whole installation and download process. That'll get you set up and ready to go ahead with custom detection. I'm not going to do the custom detection and all of that jazz in this video. I have another video planned that will be released in the next week or two, and that will go over how to set up custom detection. If you want to see that video, make sure you subscribe to the channel, like this video, and leave a comment down below saying "custom detection". That'll push me to get it finished.

One extra thing that helps to know: you can add --multiclasses and set it to 1 when you're running the downloader command and getting the classes. With --multiclasses 1, it downloads all of the classes you've chosen into one common folder. So it'll have the airplane and balloon images inside one folder, and that way, when you do the conversion, you have all the images you want for your custom detector and all the annotations in one place, easy to transfer over.

So that's that. I have done the custom detection for balloon and airplane, so I'll run the command I showed you last time, but pointing to my custom object data file, my custom YOLO config, and my custom weights. If I run this on balloon.jpg, which I took from Google Images, it'll run the detection on it, and you can see that it found the balloon. It's not perfect, because I didn't train it to the full extent or on as many images as I could have. To save time, I only ran it for about a quarter of
the amount of time I should have, but you can see it still works. I'll also show you one on an airplane, and this works too.

If you want to see that video, I'm going to stress it again: subscribe to the channel, drop a like, and comment down below saying "custom detection" to get me to finish it up. I'll go through step by step how I got to this point: setting up the custom detection and training it. So if you want to see that video, I urge you to drop a like on this one.

In the meantime, you can go over to the repository I showed you for darknet in my previous video, AlexeyAB's darknet. If you press Ctrl+F and look for "how to train on custom objects", he gives you some insights there on what to do. That's what my next video will be based on: how to do custom detection and set it up so that you can train and have a custom object detector for YOLOv3 up and running.

That's it for this video. I really hope you enjoyed it, and I'll see you in the next video.

[Music]
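
The conversion step described in the captions (top-left and bottom-right corners plus a written label, turned into a class number plus normalized coordinates) can be sketched roughly as below. This is not the repository's convert_annotations.py. It is a minimal illustration that assumes each toolkit label line looks like `name xmin ymin xmax ymax` with pixel coordinates, and that the image width and height are already known. The function name `to_yolo_line` and the sample values are hypothetical.

```python
def to_yolo_line(label_line, classes, img_w, img_h):
    """Convert one 'name xmin ymin xmax ymax' line (pixels, assumed layout)
    into a YOLO-style 'class_id x_center y_center width height' line."""
    # Split from the right so a class name containing spaces stays intact.
    name, xmin, ymin, xmax, ymax = label_line.rsplit(maxsplit=4)
    xmin, ymin, xmax, ymax = map(float, (xmin, ymin, xmax, ymax))

    # The class id is the zero-based line position in classes.txt.
    class_id = classes.index(name)

    # YOLO expects the box center and size, each divided by the image size.
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: classes as they might be listed in classes.txt.
classes = ["Airplane", "Balloon"]
print(to_yolo_line("Balloon 100 50 300 250", classes, img_w=400, img_h=500))
# prints: 1 0.500000 0.300000 0.500000 0.400000
```

Balloon sits on the second line of the class list, so it gets id 1, matching the zero-based numbering mentioned in the video. All four values end up between 0 and 1 because each is divided by the image width or height.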
Info
Channel: The AI Guy
Views: 46,793
Rating: 4.9567566 out of 5
Keywords: yolov3, yolov3 custom detection, object detection, computer vision, darknet, google, custom dataset, open images dataset, yolov3 object detection, deep learning, machine learning, artificial intelligence, yolo, custom object detection, machine vision
Id: _4A9inxGqRM
Length: 15min 54sec (954 seconds)
Published: Mon Jan 06 2020