Real-Time Object Detection with YOLOv8 and Webcam: Step-by-step Tutorial

Video Statistics and Information

Captions
Hey guys, and welcome to a new video. In this video we're going to talk about the new YOLOv8 model. We're going to take a look at the GitHub repository, scroll through it, and see how we can actually use this YOLOv8 model. It was just released yesterday, it is really nice, and we're going to take a look at it. This is an improvement over the YOLOv7 model, and it is created by Ultralytics, the creators of YOLOv5 as well. YOLOv5 was a really crazy model; it has been used for a long time, and I'm still using it for some specific applications. I also have a couple of videos on the channel about the YOLOv5 model, and recently I created a course about the YOLOv7 model, so if you're interested in that, definitely go check it out. In it we focus on how the architecture actually works, we train a model, and then, most importantly, we cover the deployment of these trained YOLOv7 models. We can also use the YOLOv8 models directly inside that course: instead of the YOLOv7 model we just use the YOLOv8 model, and then the most important thing is to actually deploy the models. I show different techniques for exporting the model to different formats and using that format in our own Python script for deployment with ONNX, PyTorch, and OpenCV. Again, it is really useful to know how to deploy these deep learning models; the YOLOv7 and YOLOv8 models are trained in Google Colab. So we create a dataset, label it, train in Google Colab or on some other device, export the model in some framework, optimize the model, and then deploy it with the framework that is suitable for our application, depending on whether you're running, for example, on an Intel CPU or an Nvidia GPU, whether you have a TPU available, or whether you just want to optimize on CPU with ONNX, or you have TensorFlow, or you're running on edge devices. We have all these different possibilities for actually deploying our models.

Here we're going to take a look at the results from the YOLOv8 model, and then we're going to go into Python and see how we can do live inference with this new YOLOv8 model. We're going to try a couple of these models to see the results and how fast they actually run. I'm just going to say now that the results we're going to get are really awesome: a high number of frames per second. I'm going to show you that with the small model we're able to run at 200 frames per second on my Nvidia RTX 4090 graphics card. We can also run the extra large model, as you can see here; we're going to run that as well, and we will still get 50 frames per second. You'll still be able to run in real time on lower-end hardware or with fewer resources; even on a CPU you should be able to run these models with a high number of frames per second compared to some of the other models.

Here we can see the results: they are benchmarked against the COCO dataset, with the mean average precision over the IoU 0.5 to 0.95 interval. We have the YOLOv8 model in blue, and then YOLOv7 and YOLOv5. On this channel we've been using all these variations and versions of the YOLO models, but now we can see that this YOLOv8 model simply has a way higher mean average precision on the COCO dataset compared to some of the other models. It is only slightly better than the YOLOv7 model, but it is still better. We can see that the Nano model is way better than the YOLOv5 equivalent; compared to YOLOv6 it is not really that much better, whereas the small model has a lower number of parameters while still increasing the mean average precision. The larger models have fewer parameters than before, they're more accurate, and they can run faster, so it's basically a version improvement over the previous models.

Over to the right we can see the latency: how many frames per second we get and how long it takes to process these images. This is benchmarked on an A100 Nvidia GPU. I'm not really a fan of them benchmarking with this GPU, because it is not a GPU we use for deployment; it's not a common GPU in most systems. These are GPUs for cloud resources where you actually train your models, so it is a really good GPU for that, but not really for deploying on actual devices. But again, we can see the milliseconds per image, the processing time it takes to process these images, and we can see that we have really high mean accuracy and that it can process these images rather fast. Here we had around two milliseconds per image, which is around 500 frames per second, with this really crazy GPU, which is not really necessary. I don't think this is a good benchmark, but it doesn't really matter; it still shows the performance and the comparisons between the different models: higher accuracy and also lower inference time. Depending on the application, you can go for the Nano model, the large model, or the medium model; the model I've used the most is the small model, because we get really high accuracy while also having really low inference time, so it is basically just a really good model for a lot of common and standard AI, machine learning, and deep learning applications and projects. We can also see some documentation of how to install it.
We can basically just pip install it. I created a new Anaconda environment: I went into my Anaconda prompt and created a new environment with conda create, where you specify the name flag and then the name itself; I created one called yolov8 (I'm using Anaconda as my Python distribution). When you have created it, you can activate the environment with conda activate yolov8. With the environment active, we can pip install ultralytics, and we can also install all the requirements from the requirements text file, which will install all the dependencies on your computer if you don't already have them in your base installation, or if you just want a whole new environment.

Looking at the different files, they actually use a new structure: instead of calling Python scripts from the command line when we want to train, do predictions, and so on, we now have this yolo command for running the different tasks. We can set the task equal to detection, classification, or segmentation; these are the three tasks we can do with this new YOLOv8 model. We can also set a specific mode: training, prediction, or validation. We can also set the mode to export and then choose the format we want to export our model to. The export formats supported with this new YOLOv8 model are really nice: we can export to basically any format used right now as the main standard, both for deploying models optimized on CPUs and GPUs, like TensorRT, and TensorFlow and PyTorch in general, and also the common ONNX format.
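As a rough illustration of how the task/mode/format idea maps onto the Ultralytics Python API (this is my own sketch, not code shown in the video; it assumes `pip install ultralytics` and that the pretrained `yolov8s.pt` weights can be downloaded on first use):

```python
def export_yolov8(fmt: str = "onnx") -> str:
    """Load a pretrained YOLOv8 model and export it to a deployment format."""
    from ultralytics import YOLO  # deferred import: requires `pip install ultralytics`

    # The CLI modes map onto methods: train -> model.train(), predict -> model.predict(),
    # val -> model.val(), export -> model.export().
    model = YOLO("yolov8s.pt")      # small detection model; n/s/m/l/x variants exist
    return model.export(format=fmt)  # e.g. "onnx", "torchscript", "engine" (TensorRT)
```

Calling `export_yolov8()` should produce an ONNX file next to the weights file; the exact return value and the supported format names are best checked against the Ultralytics documentation for your installed version.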
With ONNX we can deploy the models in different ways, for example with OpenCV; it is basically a general format for doing inference with these neural networks, using the ONNX Runtime. There is also a short snippet of how to use it in Python; this is basically what I'm going to do, and then we're going to run it to see live inference on a webcam with this new YOLOv8 model. We can see the checkpoints, the image dimensions (640 by 640), the mean average precision, the parameters, and also the floating point operations per second.

Here we see the different integrations: for a dataset you can use Roboflow to label your own dataset, export it, and train your YOLOv8 model directly, as we've been doing with the YOLOv7 and YOLOv5 models. We can do training in notebooks, do logging of our training runs, deploy the models on platforms like Neural Magic, and export to TensorFlow, PyTorch, ONNX, CoreML for iOS, OpenVINO, and TensorRT, so all these different frameworks are supported, which is really cool. At the top we can get the Google Colab notebook, so we can open it up in Google Colab, train our own YOLOv8 model, and see how we can do segmentation, detection, and also classification. I'm going to create another video about that, so definitely hit subscribe and the bell notification under the video if you want to get notified when I upload new videos on the channel. In that video we're going to train these new YOLOv8 models on custom datasets, train them from scratch, play around with some of the different parameters, and export them; I'm going to show you how we can export these models and deploy them with different frameworks. But in this video we're just going to look at the inference results of this newly released YOLOv8 model,
because it has really nice performance, it runs really fast, and as we're going to see, it can run in real time.

Now I'm in Visual Studio Code, and we're going to take a look at how to use this in Python, so we can do it in a couple of lines of code instead of passing commands on the command line. Instead of passing an argument list to the command prompt, we can specify the arguments to our function: I have this predict call, we pass in the different input arguments, and then we can do predictions on our source. Here the source is the webcam; we can also run it on images, folders, and videos, but if you specify 0 as the source it will use the webcam. You can specify all the other parameters as well: here we're going to set show equal to true, so we can see the results of our inference. You can go into the Ultralytics documentation and see all the arguments you can pass in, but these are the most common ones just to get it up and running and see how the inference works. You'll get the outputs in the command prompt, or in the output terminal down at the bottom, where you'll see the results, the classes it is detecting, and how fast it processes these images. When you terminate the program, you can print the results and extract all the information: all the bounding boxes, classes, and so on of your detections. Up here you specify the model; first let's try the small model and see how the inference works with that. This predict call uses a detection predictor underneath, which contains some pre-processing steps and so on, so we might be able to go in, extract that information, and make some adjustments to these functions and so on.
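Those couple of lines might look like this (a hedged sketch of the Ultralytics Python API, not the exact script from the video; it assumes `pip install ultralytics` and a connected webcam):

```python
def predict_webcam(weights: str = "yolov8s.pt"):
    """Run live YOLOv8 inference on the default webcam and display the results."""
    from ultralytics import YOLO  # deferred import: requires `pip install ultralytics`

    model = YOLO(weights)  # swap in "yolov8n.pt" or "yolov8x.pt" to try other sizes
    # source=0 selects the default webcam; show=True opens a preview window with the
    # detections drawn on top. Images, folders, and video files also work as sources.
    return model.predict(source=0, show=True)
```

The returned results hold the bounding boxes, classes, and confidences, which is the information you see when printing the results after terminating the program.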
But again, I'm going to create another video where we deploy these models ourselves, because we just have these PyTorch models, or ONNX models, depending on the format we export to. We'll open up OpenCV, open the webcam, have our while loop running, read in our images, pass the images through our model, do some processing, and show the results in our own way, so we basically create everything ourselves instead of relying on this predict function, where it is kind of hard to extract the information and use it in our own project and application. We're definitely going to cover that in another video, together with actually training our own YOLOv8 models.

Now we're just going to run it and see the results. I have a webcam that should open up; we can see that we're running on the GPU, a GeForce RTX 4090, and we can see what it is actually detecting: a mouse, a person, a TV (which is actually my PC monitor), and a laptop, so these are correct predictions. Even when it can only see my hand, it takes it as a person with a really high confidence score. We can see all the different outputs: the image dimensions, the resolution of our images, how many objects we're detecting, what type of objects, and then the inference time over to the right. It's about five milliseconds, so 200 Hz, which corresponds to 200 frames per second; that is just really crazy inference time, and we're able to run it in real time. This is the small model, so it should run pretty fast. If we terminate it, we can try the Nano model, just to see how fast that runs. When we terminate the program, I just print the results, and then we can extract all the information about all the detections that we had.
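The do-it-yourself deployment described here, with our own while loop, could be sketched roughly like this (my own illustration, assuming `pip install ultralytics opencv-python`, a webcam, and that the `Results.plot()` helper for drawing detections is available in your Ultralytics version):

```python
def custom_inference_loop(weights: str = "yolov8s.pt"):
    """Own webcam loop: read frames with OpenCV, run YOLOv8, show annotated frames."""
    import cv2                    # deferred imports: require opencv-python
    from ultralytics import YOLO  # and ultralytics to be installed

    model = YOLO(weights)
    cap = cv2.VideoCapture(0)     # 0 -> default webcam
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = model(frame)                 # inference on a single frame
        annotated = results[0].plot()          # draw boxes and labels on the frame
        cv2.imshow("YOLOv8", annotated)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
            break
    cap.release()
    cv2.destroyAllWindows()
```

Owning the loop like this makes it easy to post-process the detections however your application needs, instead of working around the built-in predict function.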
Now we're going to run the Nano model, and then we'll take a look at the extra large model, because with the extra large model we get some really nice results; I'm just interested in seeing how fast the Nano actually runs. We can see it still runs at around four to five milliseconds, closer to four milliseconds of inference time, so this is actually even faster: above 200 frames per second when doing inference. As I move the webcam around, it only takes in around 30 images per second from the webcam, so it doesn't take up a lot of processing power to run these algorithms, and it is still really good at detecting all these different things. I can turn it around and see if we can detect some other things in the background: there is a chair, and we have a couch, kind of like a couch, a dog couch. These results are pretty nice.

Let's try the extra large model to end it off with; we just change the model name to x, and we get all the detections. This is really easy to get started with: you can basically just set it up with these lines of code, play around with it yourself, and try it out on your own laptop. Now we run it with the extra large model and see the results. First of all, let's see how many frames per second we get: we get 20 milliseconds per image, so that is around 50 frames per second. Even though we're running the largest YOLOv8 model here on my computer, we still get 50 frames per second, which is way over real time. If you have a much lower-end GPU than an RTX 4090, you'll still be able to run at 20-30 frames per second with this extra large model. But again, the extra large model is probably overkill for most applications and projects; you will be more than fine with the small model, and maybe even the Nano model once you fine-tune on your own dataset.
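As a quick sanity check on these numbers, frames per second is just the reciprocal of the per-image latency (a minimal sketch; the latencies below are the rough ones quoted in this video, not exact benchmarks):

```python
def fps_from_latency(ms_per_image: float) -> float:
    """Convert per-image inference latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image

# Rough latencies mentioned in the video:
print(fps_from_latency(5.0))   # small model on the RTX 4090 -> 200.0 FPS
print(fps_from_latency(4.0))   # nano model -> 250.0 FPS
print(fps_from_latency(20.0))  # extra large model -> 50.0 FPS
```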
Here we can see that it detects the TV correctly: we have a TV, we have a laptop, and if I move the mouse over here, in the background we have the keyboard, the mouse, a person, another mouse, and a TV, so these are just some really nice predictions. As you can see, it's a really great model; it was just recently released, we're getting some really nice results, and it's just an improvement over the other models.

So thanks for watching this video. Again, remember the subscribe button and the bell notification under the video, and also like this video if you like the content and want more in the future; it really helps me and the YouTube channel out in a massive way. I'm also currently doing these deep learning and computer vision tutorials, where we go over the basic theory of deep learning: how neural networks actually work, how we can create our own networks, the different parameters and how we can tune them, how we can train our own networks, how those parameters affect the neural network during training, and also how we can deploy the models for our own projects and applications. So if you're interested in those tutorials, I'll link to them up here, or else I'll see you next week, guys. Bye for now.
Info
Channel: Nicolai Nielsen
Views: 60,330
Keywords: yolov8, yolov8 neural network, yolov8 custom object detection, yolov8 object detection, yolov8 tutorial, object detection yolo, object detection pytorch, object detection python, opencv object detection, opencv yolov8, opencv python yolov8, object detector, object detection yolov8, opencv, yolov7 dataset, detect objects with yolov8, yolov8 opencv, opencv dnn, opencv neural networks, deploy yolov8 model, how to deploy yolov8, yolov8 vs yolov7, yolov8 vs yolov5
Id: IHbJcOex6dk
Length: 15min 22sec (922 seconds)
Published: Thu Jan 12 2023