Simple YOLOv8 Object Detection & Tracking with StrongSORT & ByteTrack

Captions
Hey guys, welcome to a new video. In this video we're going to take a look at the YOLOv8 model: we're going to do object detection and then apply some trackers on top of that. We're going to use the ByteTrack tracker and the StrongSORT algorithm to track our objects around in the image frame. We're also going to look at the re-identification step with our tracker: basically we're going to do object detection and object tracking, track our objects around in the image frame, and then, if our objects leave the frame and come back again, apply this re-identification step so we can keep the same ID and keep track of our objects even though they go out of the frame and come in again. This is actually really cool and really useful for real-life applications, so let's just jump into Visual Studio Code and get started.

So we're going to use Visual Studio Code, and we're going to go over how to set up this YOLOv8 model. You can also train your own custom YOLO model; all the things that I'm going to cover in this video I basically have courses and videos about here on the channel. I have a course where we go over how to use the trackers with the YOLOv8 model, how to implement the trackers from scratch, and how to implement the Kalman filter, because all these trackers are just building on top of the Kalman filter; we go over all the theory behind these trackers in that course, so definitely check it out if you're interested. For the YOLOv8 model, you can train it on your own custom dataset, and I also have a course where we deploy these YOLO models with ONNX, OpenCV, and all those different things, so if you're interested in that, definitely check it out. I also have videos here on the channel about how to set up this YOLOv8 model for custom object detection and custom instance segmentation.

But basically, here we're just going to import the different modules that we're going to need. We have PyTorch, NumPy, and OpenCV; we also have ultralytics, since we're going to use the YOLOv8 model, so we just import YOLO and can then specify which of the YOLOv8 models we want to use, for segmentation or just object detection. We also have os and yaml here, as I'll show you in just a second; there's a utility function that we're going to use, and one import we don't actually need in this file, so we're just going to delete that. We're going to use supervision to plot the detections that we get from our object detector, and also the tracked objects from our ByteTrack tracker and our StrongSORT tracker. Then from bytetrack.byte_tracker we just import the ByteTrack tracker, and we do the exact same thing for StrongSORT.
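As a rough sketch, the imports just described might look like the following. The module paths for the cloned ByteTrack and StrongSORT repositories are assumptions based on what's shown in the video, so adjust them to your local clone layout:

```python
import os
import time
from pathlib import Path

import cv2                  # OpenCV for capture, drawing, and display
import numpy as np
import torch
import supervision as sv    # annotators for detections and tracks
from ultralytics import YOLO

# Tracker imports from the cloned repositories (module paths are
# assumptions; match them to how you cloned the repos)
from bytetrack.byte_tracker import BYTETracker
from strongsort.strong_sort import StrongSORT
```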
Then I have some weight files over here on the left: this is StrongSORT, cloned from the StrongSORT GitHub repository, and the same thing for ByteTrack. This is basically just how to set them up: we have some matching functions, and we have the Kalman filter; in my course I show and go over all the theory behind the Kalman filter and how to implement it in Python. Again, all these trackers build on top of the Kalman filter, where we have an update step and a prediction step so we can predict and track the state of our objects, which is our bounding boxes. So we have the ByteTrack tracker and the Kalman filter, and if you go down to the DeepSORT algorithm, or StrongSORT, which is basically just building on top of the DeepSORT algorithm, we can see that we have this deep folder with our models, and we also have the re-identification model as well; these are basically the implementations for the models. We also have StrongSORT down here, the Python version, which is basically just setting up the whole StrongSORT tracker, with the update method and all those different things.

Going back again, we have the files over here on the left and all the weights that we're going to use, but here we're basically just going to go through the code and how to set it up. First of all we can set whether we want to save the video or not. We also have a utility class here, a YAML parser, so I'll just go into it and show you: this is basically just for opening up the config files for our tracker, so it's just a utility function that we're going to use.

Then we have this ObjectDetection class. This is the one from my GitHub and from my previous YOLOv8 videos, where we basically load in the YOLOv8 model from ultralytics, do some preprocessing, run the frame through the model, get the results out, and then post-process and extract the information. First of all we initialize our ObjectDetection class with a capture index, since we want to apply this on a live camera feed from the webcam at that capture index. We're using CUDA if it's available, or else we're going to use the CPU. I'm just going to close this one here and zoom out a bit so you can see what's going on. Back up at the top: I deleted the path here, but since we're going to use Path later on, I'll keep it and just delete these three.

Then we go down and load the model; this is basically just a function for loading the model, so I'll scroll down to it. We use the YOLO class, where we can specify the model that we want to use; I'm just going to change this to the small model. We can run both the segmentation model and the object detection model, and we can apply the trackers on top of either of them, and then we just return our model. Let me scroll up here again: we have loaded the model, we set up the class names dictionary, and we have the box annotator from supervision, which is how we set up our annotators to visualize the results and predictions that we get out of our model and our trackers.

Then we have these re-ID weights: these are the weights for our re-identification network, and we just specify a path to them. This network is specifically trained on a dataset for re-identifying objects that go out of and back into the frame. The idea behind the re-identification step is that while we do object detection and track our objects around, we also take the bounding boxes, crop out the image inside each bounding box, and use that to extract features for our specific objects. Say we want to detect and count persons walking around in the frame: we take the bounding box of a person, extract features with our re-identification network, which is pre-trained on a larger dataset, and then use those features in the matching step when we're tracking our objects. So if you go out of the frame and come in again, you will still keep the same ID.
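A minimal sketch of the class initialization and model loading just described, assuming the imports above; the re-ID weights filename is an assumption (an OSNet checkpoint is a common choice with StrongSORT), and the attribute names are illustrative:

```python
class ObjectDetection:
    def __init__(self, capture_index):
        self.capture_index = capture_index
        # Use CUDA if it's available, otherwise fall back to the CPU
        self.device = "cuda" if torch.cuda.is_available() else "cpu"

        self.model = self.load_model()
        self.CLASS_NAMES_DICT = self.model.model.names

        # Supervision box annotator for visualizing detections and tracks
        self.box_annotator = sv.BoxAnnotator(thickness=2)

        # Weights for the re-identification network (filename is an
        # assumption; any re-ID checkpoint supported by StrongSORT works)
        self.reid_weights = Path("weights/osnet_x0_25_msmt17.pt")

    def load_model(self):
        # Small YOLOv8 detection model; a "-seg" model also works here,
        # and the trackers can run on top of either
        model = YOLO("yolov8s.pt")
        model.fuse()
        return model
```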
That can be used for a lot of different kinds of applications: counting systems, attendance systems, and any case where you have objects that sometimes leave the frame and come back again but where you still want to keep track of the exact ID of each object.

So we have our re-identification weights; now we can go down and set up the tracker. If we choose the ByteTrack tracker, we basically just instantiate it here and set it as our tracker; or else, if we specify the StrongSORT algorithm, we set it up this way. First of all we have our tracker config, which is just a YAML file, the exact same one used for ByteTrack and also for StrongSORT. We have our YAML parser: we set it up as a config, merge from file, and basically just pass in the config. Now we can go in and set up our tracker: this is a class that we initialize, or instantiate, as self.tracker, and we set the different parameters for these trackers. Most of them are the defaults, and here we actually have a pretty good config, so we're just going to use the default parameters from the config. If you go inside one of them and check, here we have the config strongsort.yaml, where we can set the max age, the max distance, the max intersection-over-union distance, and so on. I've covered these different parameters in the course, but the most important ones are n_init, which is the number of detections in consecutive frames we want before we actually create a track, so before we create a track we want to detect the object in three consecutive frames; and max age, meaning that if we lose track of an object for more than 40 frames, we discard that track and don't keep track of that object any longer. There's also a max distance and all those different things, which are basically threshold values that we can play around with to tune our trackers. So this is really nice: we have this config file that we can load in, and we set up our ByteTrack and StrongSORT trackers here as well.

So that is how we can set up our tracker, and we also saw how we can set up our model. Then we have our predict function: we take in a frame, run it through the model, get the results out, and return the results for further processing and for visualizing. Next we have this method called draw_results. We have some lists that we're going to extract the information into, and a for loop running through all of our results: we get the class ID, we get the xyxy values, so the top-left corner and the bottom-right corner, and we append those to the lists together with the confidence scores and the class IDs. After running through them we have our detections: we set up the detections with supervision and then pass them into our box annotator, which visualizes our tracks and also the detections from our YOLOv8 model. So this is basically just taking the results, extracting the information, and using it in post-processing for visualization or applying it to our trackers later on. We then just return the frame and also the bounding boxes.
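Here is a hedged sketch of the tracker setup just described, assuming `reid_weights` and `device` come from the class above. `YamlParser` stands in for the config utility shown in the video, and the keyword arguments mirror the parameters named in strongsort.yaml; the exact constructor signatures depend on the cloned repositories, so treat this as illustrative:

```python
tracker_type = "strongsort"  # or "bytetrack"

if tracker_type == "bytetrack":
    # ByteTrack with its default config (constructor args are illustrative)
    tracker = BYTETracker()
else:
    # Load the tracker config from the same YAML file the repo ships with
    cfg = YamlParser()
    cfg.merge_from_file("strongsort/configs/strongsort.yaml")  # path assumed

    # In the class this would be stored as self.tracker
    tracker = StrongSORT(
        reid_weights,                  # re-identification network weights
        torch.device(device),
        fp16=False,
        max_dist=cfg.strongsort.max_dist,              # appearance distance threshold
        max_iou_distance=cfg.strongsort.max_iou_distance,
        max_age=cfg.strongsort.max_age,                # frames before a lost track is dropped
        n_init=cfg.strongsort.n_init,                  # consecutive detections before a track is confirmed
    )
```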
Now we have the __call__ method of this class. We set our capture index when we create a new instance of the class, we assert that our capture is actually open, and we set the resolution of our camera to 1280 by 720. If we want to save the video, we just create a video writer with OpenCV. Then we set our tracker to self.tracker, and if our tracker is actually using some neural networks and models, we do a warm-up before running inference with those models for the tracking part; that is just a good thing to do. So here we warm up our model, and we also create some variables for our outputs, for the current frame, and for the previous frame.

Now we have our standard while loop running: loading in images from the webcam, calling all these different methods, processing the results, doing detection, tracking, re-identification, and also visualizing the results. Let's just go through it line by line. First of all we start a counter, so we can time how many frames per second we get when running the object detector with the YOLOv8 model, the tracker, both ByteTrack and StrongSORT, and also the re-identification step. Then we read in a frame from the webcam, store it in the frame variable, and assert that we actually returned true and were able to load a frame from the webcam. Then we do the prediction and get the results back. When we have the results, we can go down and draw them with the drawing utility function we just saw above, so we visualize the results and get the frame back again.

Now we can go in and update our camera, meaning we do camera motion compensation: we use the previous frame and also the current frame, and inside our tracker we have a method called camera_update that compensates for the camera motion when moving around. That is actually really helpful when we're applying these trackers on top of our object detectors, and it makes everything way more robust. Then we have a for loop running through our results, and we call the update method of our tracker: we pass in the results, which are the detected bounding boxes, and we also pass in the frame. We then get the outputs from our tracker, which are the tracked bounding boxes. We can have another for loop running through all the tracked bounding boxes and extract the information: we get the first four values for the bounding box, the track ID, the class, and the confidence score. At the top-left corner of the bounding box we then draw the ID, and we can see if we're able to keep track of our objects in the frame. So it's actually a really simple setup: we read in the frame, run it through our object detector, the YOLOv8 model (and again, it's really easy to use once we have this class set up), pass the results through the tracker, and get the new tracked results out, and then we can do whatever we want with them.
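The loop body just described might look roughly like this; the layout of the detection tensor passed into `update` and of the tracker outputs (x1, y1, x2, y2, track ID, class, confidence) follows the video's description and is an assumption about the cloned repos' interfaces:

```python
cap = cv2.VideoCapture(self.capture_index)
assert cap.isOpened()
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

tracker = self.tracker  # local alias, as in the video
prev_frame = None
while True:
    start_time = time.perf_counter()   # start the FPS timer
    ret, frame = cap.read()
    assert ret

    results = self.predict(frame)
    frame, det_boxes = self.draw_results(frame, results)

    # Camera motion compensation between the previous and current frame
    # (StrongSORT exposes camera_update; ByteTrack does not need it)
    if prev_frame is not None and hasattr(tracker, "camera_update"):
        tracker.camera_update(prev_frame, frame)

    for result in results:
        # Detected boxes plus confidences and classes go into the tracker
        outputs = tracker.update(result.boxes.data.cpu(), frame)
        for output in outputs:
            x1, y1, x2, y2 = map(int, output[:4])
            track_id = int(output[4])
            # Draw the persistent track ID at the top-left corner of the box
            cv2.putText(frame, f"ID: {track_id}", (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)

    prev_frame = frame
    # FPS overlay, display, and exit handling follow (see the sketch below)
```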
We can just visualize them and see whether we're able to keep the same ID, which is the idea behind the object trackers and the re-identification network that we're using. When we have done that, we end the timer, so we can see how long it actually takes to both do predictions with YOLOv8 and run the frame through the tracker, and then we put the number of frames per second as text on the frame and show the frame with our detections. If we want to save the video, we write the frame out to the video writer that we opened up. If we hit Escape or Q on the keyboard at any time, we terminate the program and break out of the while loop: we release our output video if we're saving the video, and we also release our webcam and destroy all the windows that we opened up with OpenCV. So now we have implemented the whole class and we're ready to run object detection with our tracker: we create an instance of our ObjectDetection class, pass in the capture index of the camera or webcam we want to do live object detection on, and then call the call method on our detector, and it will run the program (a short sketch of this teardown and entry point follows after the demo below).

So now let's run the program and see how it works. This should open up my webcam; I'm just going to verify that we have the correct capture index. We're using the CPU, since I'm on my MacBook right now and don't have a GPU for running the inference. Here we can see that it's not able to open up my video capture, so let me just try another index; we're trying the first index now. Let's see if we're able to open it up: again we can see we're using the YOLOv8 small model, and now we're able to run it. We get the inference up here, around three to four frames per second when running inference on my CPU, and we get pretty nice detections. We can see that we're actually tracking the objects around in the image: it detects a person over here, we get some false predictions, which probably have low confidence scores, and it also detects the chair here and a bench. But the most important thing to look at is me as a person, with a high confidence score, and here we can see the ID that is assigned to me. Then I'll just move around, so we can see that it still keeps track of me. This is actually a pretty nice tracker: as you can see, we can move around, and I can even hold my hand up in front of the camera and it will still keep track of me, it will still assign ID 1 to me, and even when we lose track and come back, I still get assigned 1 as my ID. Okay, now we can see we've got ID 4 here. I can try to go a bit further back, then move out of the frame and come in again, and we can see if we get a new ID; we can see that it just tracks me around nicely in the frame. We only get around four frames per second, and this would be better if we were running on the GPU, but these are still really crazy results here on the CPU. All right, okay, so we can see that sometimes, when I'm going out of the frame, we actually lose track and it assigns a new ID to us.
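And here is the short sketch of the loop teardown and entry point mentioned above, continuing the loop from the previous sketch (window name and variable names are illustrative):

```python
    # End the timer, overlay the frames-per-second count, and show the frame
    end_time = time.perf_counter()
    fps = 1.0 / (end_time - start_time)
    cv2.putText(frame, f"FPS: {fps:.1f}", (20, 40),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow("YOLOv8 Tracking", frame)

    # Hitting Escape or Q terminates the program
    if cv2.waitKey(1) & 0xFF in (27, ord("q")):
        break

cap.release()
cv2.destroyAllWindows()

# Create the detector on a given webcam index and run it
detector = ObjectDetection(capture_index=1)
detector()
```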
But if we just go outside the frame for a couple of seconds, it is still able to keep track of us. It could be because of the number of frames per second, since we're running the neural network as we move through the frame, and it also depends on where we go out of the frame and how we look when going out, because we're matching between the features from the detected bounding boxes and the tracked bounding boxes in our frame. But as you can see, when I'm moving around it looks pretty good. So that was the ByteTrack tracker.

Now we're going to go up and specify that we want to use the StrongSORT algorithm instead; that should actually be a bit better, at least in my experience. Let's see if StrongSORT actually is better: we'll run the program and do the exact same test, so we can compare them side by side. It should open up my camera; right now we're loading in the weights, and now we can see I have ID 6 assigned to me. This is a bit slower: we get around two to three frames per second. Again, when we apply these trackers on top of our object detectors, they take up a lot of processing power, and if you want to run this in real-time applications you will definitely need to run it on the GPU. If you want to run on a CPU in real time, you might need to accept a trade-off between accuracy and speed, and then you'd probably go with the SORT algorithm, which can still run okay in real time on the CPU. So let's try some tests: I'm going to move around and see if we can keep ID 6. Let's try to hold it for a bit longer; still ID 6. Let's try even longer; still ID 6.
Still ID 6. You can see that this StrongSORT algorithm is working very, very well. It takes a bit more processing power, but the results here are just really awesome, and also way better compared to the ByteTrack tracker that we just tested out. Let me try to cover the camera and see if we can actually lose the ID. Okay, so we actually lost the ID, we got something like 122, but then we came back into the frame again and it re-identified me as 6, so we can see that it can actually go back, correct for it, and find the correct ID again. This is a really awesome tracker: you can just see the results here, they are very, very good. It takes up a lot of processing power, but you will still be able to run it reasonably fast on a decent CPU.

So these are some really nice results that we got from this object detection with tracking. The ByteTrack tracker and the StrongSORT tracker are some really good state-of-the-art trackers. We saw that the results with the StrongSORT algorithm were slightly better than the ByteTrack tracker: it was able to keep track and do re-identification even though I went out of the frame for several seconds and even though I was holding my hand up in front of the camera. So again, these are really great trackers: if you're doing object detection projects, try to put some trackers on top if it's suitable for your application, and if you have a GPU to run it on, that will also benefit you a lot. The re-identification element is useful if you want to create attendance systems, counting systems, and all those different things, so this is really useful for creating real-life applications with computer vision and AI.

Thank you guys for watching this video, and remember to hit the subscribe button and the bell notification under the video. Also like this video if you like the content and want more in the future. Again, I have all these courses covering YOLOv8, how you can train it on your own custom dataset and deploy it in different ways, and the theory behind all these trackers, so if you're interested there will be a link down in the description. I also have tutorials where I go over all the basics about computer vision, deep learning, and so on, and if you're interested in one of those I'll link to it up here. Or else I'll see you in the next one, bye for now.
Info
Channel: Nicolai Nielsen
Views: 15,262
Keywords: yolov8, yolov8 object tracking, kalman filter, kalman filter implementation, how to implement kalman filter, deport object tracking, kalman filter object tracking, sort object tracking, how to use deepsort, object tracking python, how to do object tracking, object tracking yolov8, yolov8 object detection, object tracking course, how to use object trackers, tracking in computer vision, yolov8 tracking, how to use yolov8, strongsort, bytetrack, strongsort yolov8, yolov8 bytetrack
Id: oDALtKbprHg
Length: 21min 43sec (1303 seconds)
Published: Fri Mar 17 2023