What's New in YOLOv8 | Model Deep Dive

Video Statistics and Information

Captions
Hey there, this is Jacob from Roboflow, here today to talk about YOLOv8. Here you can see an example of YOLOv8 inferring on NBA basketball players, which shows how you can train YOLOv8 on a custom dataset to detect objects of interest. YOLOv8 is the latest iteration of the popular object detection model YOLO, and it dropped yesterday. We'll dive into the history of YOLO and how YOLOv8 was made, look at some initial evaluation of how YOLOv8 is doing against previous models, and then talk about what you can do now that YOLOv8 is out and what we recommend as next steps with the model.

So how did YOLO grow into YOLOv8? Originally the YOLO architecture was published in a C repository called Darknet, maintained by Joseph Redmon, its founder, who worked on it during his PhD at the University of Washington. That grew into a project with a lot of open source contributors continuously developing the YOLO model. Around the YOLOv3 era, Glenn Jocher started shadowing YOLOv3 in PyTorch; some of you may remember the YOLOv3 PyTorch repo, where the PyTorch training was starting to replicate the YOLOv3 Darknet training. It became clear that the PyTorch training had actually started to surpass it, and so Ultralytics, which is Glenn's company, released YOLOv5, essentially the PyTorch version of YOLOv3 with some tweaks and additional modeling work. From there the community started developing on it; PyTorch made it very convenient for people to work on the model, Python is a lot more approachable than C, and so the YOLOv5 repo grew in popularity and the community kept building on it.

A lot of other models were released off of this. Some branched directly out of the repo, essentially as forks, like Scaled-YOLOv4, YOLOR, and YOLOv7; those came from researchers Wong Kin-Yiu and Alexey Bochkovskiy and were pretty much the same repository with tweaks to the network and to some of the training routines. Others emerged in PyTorch completely external to the YOLOv5 repo, like YOLOX and YOLOv6. Meanwhile Ultralytics still had the YOLOv5 repo, where the main object detection community was working, even though these newer versions were pushing state of the art a little further; they were a lot harder to use, so they were a little less used. Then Ultralytics put a lot of work over the last half of this year into researching YOLOv8, which they launched yesterday and which pushes some of the state-of-the-art metrics.

Let's dive into what has changed in the model architecture. The YOLO model has a backbone, a series of convolutional layers that downsample the image to progressively smaller resolutions; those features are passed through a neck, where they're pulled together, and then fed into a head, which makes the detections. The detections are driven by loss terms: traditional YOLO has a box loss, a class loss, and an objectness loss, and YOLOv8 iterates on this loss function and on the way boxes are decoded at the end of the network.
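As a rough illustration of that composite objective, here is a minimal sketch of how a classic YOLO-style detection loss can be put together. The weights and the specific loss functions below are placeholder assumptions for clarity, not Ultralytics' actual hyperparameters, and YOLOv8's real head reworks this formulation (in particular the objectness term), as discussed next.

```python
import torch.nn.functional as F

def yolo_style_loss(pred_boxes, true_boxes, pred_cls, true_cls, pred_obj, true_obj,
                    box_weight=0.05, cls_weight=0.5, obj_weight=1.0):
    """Illustrative composite detection loss: box + class + objectness.

    The three terms mirror the description above. Real implementations use
    IoU-based box losses (e.g. CIoU) and per-scale weighting; this sketch
    keeps each term as simple as possible.
    """
    # Box regression: how far predicted boxes are from their matched targets.
    box_loss = F.l1_loss(pred_boxes, true_boxes)

    # Classification: per-class scores against the ground-truth labels.
    cls_loss = F.binary_cross_entropy_with_logits(pred_cls, true_cls)

    # Objectness: does this prediction location contain an object at all?
    obj_loss = F.binary_cross_entropy_with_logits(pred_obj, true_obj)

    # Total loss is a weighted sum; the weights here are illustrative only.
    return box_weight * box_loss + cls_weight * cls_loss + obj_weight * obj_loss
```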
The biggest thing about YOLOv8, in my opinion, is that it is anchor-free. This means YOLOv8 does not predict from bounding box anchors, which is what earlier models did: you would have a set of anchor boxes and then predict an x, y offset from an anchor box. That is notoriously tricky, because what if your dataset has objects that are a lot skinnier, like manta rays, or really tall and thin, like giraffes? Then you'd need an algorithm to auto-fit those anchors to your training set, which gets complicated, and in practice a lot of models wouldn't bother; they were tuned on COCO and never touched their anchor boxes. YOLOv8 doesn't have those small, medium, and large anchor boxes at all. You can see this in the Netron app; I recommend opening up the ONNX files for YOLOv5 and YOLOv8 and looking at the layers to see how things have changed. In the new model there's really just one main output head, whereas the YOLOv5 model has three different head sizes. That's probably the biggest change, and the loss obviously changed as a result of experimenting with it.

The other thing is a new convolutional building block: YOLOv8 changes the way the backbone and its convolutions are constructed, and you can dive into the PyTorch code if you want to look more into that. In the blog post we have a nice diff of where the research was done, so you can see the different ways the code was constructed.

The next big thing is augmentation-related. When you're training these models, you make small perturbations to the images each time you go through the epochs; that's known as online augmentation. One of those augmentations is called mosaic, where you take four images and stitch them together. It's a pretty good technique because it makes your network learn objects in different locations, under partial occlusion, and against slightly different backgrounds. But it's been found empirically that you should actually turn mosaic off before you finish training, because your validation set isn't mosaiced; at the end of the day it's just a convenient technique for the main body of training. This is a good example of something YOLOv5 and YOLOv8 have been very good at: paying attention to the practical things around the neural network, not just the architecture. Closing mosaic augmentation near the end of training is a pretty big thing to have built into the code repository.
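To make the mosaic idea concrete, here is a minimal NumPy sketch that stitches four equally sized images into a 2x2 canvas; the real augmentation in the Ultralytics trainers also rescales the tiles, jitters the stitch point, and remaps the bounding-box labels, none of which is shown here.

```python
import numpy as np

def simple_mosaic(images):
    """Stitch four HxWxC images of identical shape into one 2Hx2W mosaic."""
    assert len(images) == 4, "mosaic augmentation combines exactly four images"
    h, w, c = images[0].shape
    canvas = np.zeros((2 * h, 2 * w, c), dtype=images[0].dtype)
    canvas[:h, :w] = images[0]   # top-left tile
    canvas[:h, w:] = images[1]   # top-right tile
    canvas[h:, :w] = images[2]   # bottom-left tile
    canvas[h:, w:] = images[3]   # bottom-right tile
    return canvas
```

In the YOLOv8 trainer, turning mosaic off near the end of a run is exposed as a training argument (a `close_mosaic` setting that disables the augmentation for the final epochs, if I'm reading the release correctly), so the network finishes training on images that look like the validation set.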
good it's a good Benchmark released by Microsoft and it gives you a proxy of how things are doing so in terms of that the yellow V8 models improve on their blv5 predecessors by somewhere like 25 to 30 percent map for relative to model size and and speed of inference and so that that's pretty good um and it's more Stark for the smaller models that got improved when you get up to the larger models it's more like a five percent Improvement or so on on the map uh valve but this is uh considered to be state-of-the-art for object detection now so that's an exciting thing especially to have you know the code in in this repository to be getting those kind of numbers now here's another angle at things that we take at Robo flow so we have uh this Benchmark called riboflow 100 which is a set of 100 data sets from robovo Universe which has over 100 000 data sets of uh people uh like you and various computer vision practitioners working on their own data sets and so we can Circle this Benchmark to more accurately answer the question of how well is this model going to work on my own custom data set so Coco's a decent eval but it's a single data set and there's a lot of different ways that data manifests itself and there's different domains that your your data may may fall into so we evaluated this on rf100 and this is the result that really was the most exciting to me so you can see here we have the average scores for yellow V5 yellow V7 and yellow V8 so the interesting thing here to start off the bat is that yellow V7 actually did worse on average across these hundred data sets than yellow V5 did pointing to the fact that even though Yellow 5 might have pushed soda on Coco uh it still wasn't generalizing the Practical data sets as well as yellow V5 was but now yellow V8 really pulls up all the data sets um on how they do um relative to Yellow V5 which is means there's there's kind of the Stark and if you look at it on a day set by day basis there's the Stark Improvement on all data sets either getting pretty similar to what they got on yellow V5 or a market Improvement which is uh pretty impressive to see that kind of uh performance improved performance across the board and then another way to look at this is you can look at it broken down by uh imagery type so here you can see yellow V8 is still winning across the board of the RF 100 data sets um the last thing I want to talk about is a little bit about the ovate repository so this new repository is sort of going to be a place where people can collaborate on the network very similar to the way yellow V5 was and it's exciting to see how the community will rally around this I think the best case scenario is that a lot of people who are previously working on these Forks of yellow V5 will actually do their research in that repository and then hopefully those neural networks will get loaded in and I I know that Glenn NAU should have been working uh pretty steadily to make this repository very friendly to use so they have uh basically released a CLI which is similar to the way we used to use yellow V5 you can use it on the command line and there's different tasks so you can detect you can classify and you can segment and then you can train predictor valve and then in the python package you can do the same operations except natively in your own python code so you get both of these when you pip install ultrolytics and that's going to be a pretty nice thing for people to operate under as they use the new networks now the last thing I want to talk about is what 
Now, the last thing I want to talk about is what you should do now that you know a little bit more about YOLOv8 and you're ready to get started with it. The first thing we have is a "how to train YOLOv8 on a custom dataset" guide, which will be linked with the video; that's for getting hands-on with your own dataset, whether it's something like the basketball players we showed here or whatever your data happens to be. Another thing I think you should check out is Roboflow Universe for YOLOv8 models; you can see all the public projects people are building with YOLOv8, like tennis, robots, football players, all sorts of stuff. The last thing I want to point to is that you can actually deploy your YOLOv8 model on Roboflow now: you can train on a custom dataset, upload those weights back to Roboflow, and then you get all the Roboflow deployment options you may or may not be familiar with, like a hosted API that scales and edge Dockers optimized for various hardware types. It's a nice way to work on your dataset in Roboflow and deploy it, so you can create an active learning loop around YOLOv8 and build a production system that will take your model to new heights. Thanks for watching this video; if you liked it, please like and subscribe, and we'll see you in the next one.
Info
Channel: Roboflow
Views: 28,658
Id: x0HlrCjJDjs
Length: 11min 35sec (695 seconds)
Published: Thu Jan 12 2023