How to Train YOLOv7 On a Custom Dataset

Captions
Hey there, this is Jacob from Roboflow, here today to show you how to train YOLOv7 on your own custom dataset. Before we dive into getting our hands dirty with training, I'm going to first review what's new in YOLOv7.

You may have seen other YOLO networks, like YOLOv5, which are earlier iterations of the YOLO family of networks used to detect objects in images. YOLOv7 is the latest evolution of these networks, pushing the models to be more accurate and inference to be faster. In this graph you can see that YOLOv7 is pushing the frontier of how fast you can run inference versus how good the predictions you get back are, relative to the target you're trying to model.

A little more on what's new in YOLOv7. Here we can see an example of what YOLOv7 is primarily used to do, which is to detect objects in images: bounding boxes with associated confidences, in this case the confidences with which YOLOv7 is detecting horses in these images. For your own use case, you can think about what task you want to model, what objects you want detected in your images, and what you want to train YOLOv7 to see.

As for what's new about the network itself: as I said, things are getting faster and more accurate, and the way the authors did this is by improving how the neural network is configured. In particular, the backbone is assembled in efficient ways that aggregate features from your images, and there are some clever model scaling techniques: when the network is made smaller or larger, different layers are changed to accommodate those sizes. Another technique used is re-parameterization, where different sets of weights are tracked and averaged to make a model that's a little more robust, and then
there's also auxiliary head tuning: rather than only tuning at the end of the network, you also tune a sort of shadow prediction head about halfway up, which lets you propagate gradients through the network a bit more efficiently.

Now let's dive into the codebase. This is the YOLOv7 repo, which you can find on GitHub. You can see it's laid out with train.py and detect.py, and those are the most important files: train.py is what we'll use today to train the neural network, and detect.py is what you use to run inference once your network is trained.

Importantly, you're here today to learn how to train a custom YOLOv7 object detection model, and the most important ingredient is a dataset of representative images. I'll show a quick example of how you can find data on Roboflow and how you can construct your own. If you're just exploring, you can go to Roboflow Universe at universe.roboflow.com and search for datasets. For example, I was wondering what it might be like to model aerial sheep, so you could look there, see what the community has done, and check whether there are example datasets you can pull from. Of course, if you have your own industrial use case, you'll need to bring your own data to construct a dataset that's particular to the thing you're trying to train. This is a very important concept, especially for production apps that will be using YOLOv7.

Here's an example of a hard hat dataset in Roboflow. Most importantly, you'll have your images: you upload images to your dataset, and then on each image you draw bounding boxes around the things you want to detect. So here, maybe
we want to detect whether people are wearing a helmet or not, so we can look at these annotations. You'll need to draw these boxes to be able to train YOLOv7. The way I like to think about the annotation process is that you're almost training the network with each new box you draw: the more data you have, and the more robust the custom dataset you've assembled, the better your performance will be in pretty much all cases. If we missed something, say we also wanted to annotate a tree, we can do that here, passing through our images and annotating them.

After you're done annotating, you generate versions. This is where you choose some standardization (preprocessing) steps for your images, and you can also generate more images in Roboflow from the initial ones you annotated, which is a good way to get a bunch of new image variations just from the base annotations you made.

Once you're done with that, you can export your data. We export into all kinds of formats; notably, the YOLOv7 PyTorch format is of particular interest here, and it gives you a link to the dataset zip as well as some Python code you can use to download it. Today I'm going to use the cash counter dataset on Universe. This dataset is assembled to detect different forms of cash and coins. If we wanted to get this dataset, we'd go to its page, hit download, choose YOLOv7 PyTorch, and then show the download code; that gives us the Python snippet we'll bring over into the notebook.

Now I'll dive into the Colab training notebook. If you're not familiar with Google Colab, it's essentially a place where you can get a free GPU, courtesy of Google, where you can prototype things, and you can
share it easily. So I'll post the link to this notebook below; if you're following along at home, you can have the code up on one side and the video on the other, and do this with your own dataset.

The first thing this code does is clone the repo, and then we install the requirements. This installs a bunch of things like PyTorch that are necessary for the neural network to run on the Colab machine, and sets up the whole environment. You can see that I've already run these cells, so I'll just buzz through them one by one.

After that, we install the roboflow package to download the dataset. You'll need to bring your API key, which gets auto-generated by the website when you use universe.roboflow.com or app.roboflow.com to assemble your own dataset. The dataset gets downloaded here, and you can see that all of our cash counter images have come down: inside the cash counter folder there are train, valid, and test folders. During training we'll iterate through the train images and check how well we're doing on the validation images. The validation images aren't seen by the network during training; we just use them and their annotations to get a sense of how well we're doing mid-training.

The next step is to download the YOLOv7 pre-trained weights. This is the yolov7.pt file that's already pre-trained on the COCO dataset, a large dataset of roughly 200,000 images that's often used to train and evaluate object detection models. The YOLOv7 authors have published those weights, and we pull them down so we can start from a basis of some modeling knowledge rather than randomly initialized weights. Now we get to the actual training cell.
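As a quick aside on what actually came down in those train, valid, and test folders: the YOLOv7 PyTorch export uses YOLO-style labels, one .txt file per image with one line per box in the form `class x_center y_center width height`, all normalized to the image size. Here's a minimal sketch of decoding one such line back into pixel corner coordinates (the label line and image size are made up for illustration):

```python
# Decode one YOLO-format label line: "class x_center y_center width height",
# where the four box values are fractions of the image width/height.
def parse_yolo_label(line, img_w, img_h):
    cls, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Convert center/size to (x1, y1, x2, y2) pixel corners.
    return int(cls), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)

cls_id, box = parse_yolo_label("0 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
print(cls_id, box)  # 0 (240.0, 120.0, 400.0, 360.0)
```

The class indices in these files map to the class names listed in the export's data.yaml file.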
You invoke train.py with the python command, and you need to run it from inside the yolov7 folder. There are a few important things you may want to set in this command. If you really want to dig in, you can open the yaml and try changing the network architecture a little, but for the most part I recommend taking that as given. Then you can choose your batch size (here I chose 16) and the number of epochs. You probably want many more epochs than one; I only did one here to show things flowing through the pipes, but your network learns more the more epochs you run, so I'd recommend at least 100 to 200 epochs to train your neural network. Then we point to the dataset location, which tells it where the actual data to learn from lives, and pass the weights that we just downloaded, so it initializes from a sensible starting point.

There are more parameters; I'll link a blog post below that covers a lot of what I'm saying in writing, and there are more options you can get into, experimenting with hyperparameters like the learning rate and other factors that change how the YOLOv7 network is trained. A lot of what comes out of the box is what's typically been researched or found by the community to work best, but that said, you can sometimes find slightly better training results by tweaking these things, so it can be interesting to dive in and see what you find.

Once training kicks off, we can see the model get initialized: the convolutional layers get formed, and down here we can see the YOLO detection layer formed at the end.
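For reference, the invocation is along the lines of `python train.py --epochs 100 --batch-size 16 --data <dataset>/data.yaml --weights yolov7.pt`; check `python train.py --help` in the repo version you cloned for the exact flag names. To make the epoch/batch relationship concrete, here is a quick sketch with made-up numbers:

```python
import math

# One epoch is one full pass over the training set, split into batches;
# each batch that goes through the network is one iteration.
def iterations_per_epoch(n_images, batch_size):
    return math.ceil(n_images / batch_size)

# Hypothetical dataset: 500 training images, batch size 16.
print(iterations_per_epoch(500, 16))        # 32 iterations per epoch
print(iterations_per_epoch(500, 16) * 100)  # 3200 iterations over 100 epochs
```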
Our images will pass through this network and come out as predictions from that detection layer. Once training kicks off, you'll see your data get loaded into the data loader, prepped to be iterated through, and with each epoch it cycles through your images batch by batch, each batch being one iteration. It runs through the iterations of your data and backpropagates through the network, and you get scores back for how the network is doing on each of your classes, plus the overall mAP score. The mAP at 0.5 is the main one to watch: you want to see it climbing and climbing, and the highest it can reach is 1.0. Obviously, the more epochs you train for, the better the fit you'll get.

After that, you can take a peek at the network doing real detections. We run python detect.py and pass the weights we just trained: this best.pt is the model as it stood at the end of training, frozen there, and we run inference with those weights. You can also adjust the confidence threshold; one way to think about it is that a lower confidence threshold leads to more predictions and a higher one leads to fewer. We run this over all the test images and display the results. We only trained for one epoch here, so it's not doing all that well; when you do your own run with many more epochs, this will look a lot better. But even after one epoch we've already started to get the notion of quarters, where objects are, and what those objects are, which is pretty cool to see from such a quick run. After that, you can save your weights if you want to take this further.
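About that mAP at 0.5: the 0.5 is an intersection-over-union (IoU) threshold, meaning a predicted box only counts as a correct detection if it overlaps a ground-truth box by at least 50%. A minimal IoU sketch, with made-up (x1, y1, x2, y2) boxes:

```python
# IoU of two axis-aligned boxes given as (x1, y1, x2, y2):
# intersection area divided by union area, ranging from 0 to 1.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A prediction shifted 10 px off a 100x100 ground-truth box:
print(round(iou((0, 0, 100, 100), (10, 10, 110, 110)), 3))  # 0.681, a match at the 0.5 threshold
```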
If you want to turn this into an application, you can take your weights, export them, and then, for example, build a Docker container with all the dependencies needed to run them in PyTorch. You can also look into converting the model to ONNX or TensorRT, and off to other destinations where you can run your model. You really want to make sure that training looked good, and then you have a conversion path that gets your model to the place where you want it to run.

Now, one important topic before you go off the deep end and deploy your model to production: you want to think of the dataset you've trained on as a living, breathing thing. Inevitably, once you deploy your YOLOv7 model, you'll find new use cases where you'll want to tweak the dataset and teach the network new things. An important concept here is called active learning, where you upload new data that you've encountered in the wild back to your dataset and continue to iterate on it. This shows an example of using our Python SDK to re-upload data to your project: you access the same project you were in with your API key, and then you can upload data to it. You can run an inference and then write some rules around it; here, the rules are about the confidence of your prediction. If the confidence isn't where you want it, or falls within some interval you're watching, you can upload those images back to your project programmatically. You'll see them pop back up in your dataset, and then you can export and go through the same pipeline again.

Roboflow is all about building those MLOps pipelines, so if you're using YOLOv7 at production capacity, do give us a look. I hope you enjoyed this tutorial on training YOLOv7.
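The confidence-interval rule described above can be sketched as a plain function; the thresholds and decision logic here are illustrative, not the Roboflow SDK's actual API:

```python
# Decide whether a frame is worth re-uploading for annotation.
# Borderline predictions are the most informative ones to label next.
def should_upload(confidences, low=0.4, high=0.6):
    if not confidences:
        return True  # nothing detected at all: worth a human look
    # Upload if any prediction falls inside the uncertain band.
    return any(low <= c <= high for c in confidences)

print(should_upload([0.95, 0.88]))  # False: model is confident everywhere
print(should_upload([0.92, 0.45]))  # True: one borderline detection
```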
It's an exciting time to be involved in computer vision, living at the forefront of what's possible and watching these new neural networks evolve. I hope you enjoyed this video, and we'll see you in the next one.
Info
Channel: Roboflow
Views: 56,420
Id: 5nsmXLyDaU4
Length: 14min 26sec (866 seconds)
Published: Mon Sep 12 2022