Object Tracking YOLOv8 and ByteTrack (Player Tracking and ByteTrack Algorithm Explained)

Video Statistics and Information

Captions
In this video I will show you how to track multiple objects using YOLOv8 and ByteTrack from Ultralytics, and explain how the ByteTrack algorithm works. I will start off by going over what ByteTrack multi-object tracking is, talk about how ByteTrack works, and finally go over a tracking soccer players example with YOLOv8 and ByteTrack. I'll be using a pre-trained YOLOv8 model, but if you would like to train on your own datasets you could go ahead and check out this video that I have down here. So by the end of this video we will see how we can track these soccer players in this video right here. [Music]

So what is ByteTrack multi-object tracking? ByteTrack is a multi-object tracking algorithm that allows you to track multiple objects in your scene. It can be used with an object detector of your choice; in our case we're using YOLOv8. For each object it tracks, it's going to have a bounding box and an ID associated with it. What truly sets ByteTrack apart is that, unlike the other trackers, this one actually takes the low threshold into account. Other trackers usually just look at the high threshold; here we're looking at both. Later on, when we discuss how the algorithm works, we'll see how it actually uses both thresholds in its decision making.

Here you can see the MOTA, which stands for multi-object tracking accuracy. This is the y-axis right here, MOTA. You can see that ByteTrack is at the top compared to all the other ones: it's the highest in MOTA, and compared on frames per second it's the best. But of course, for all applications you've got to run it on your own test case to see if it actually makes sense, because everyone tends to say their model is the best, so you've just got to test it out.

So how does ByteTrack work? There are several parameters in the ByteTrack
YAML file, which we'll take a look at later on in our code, but to really know what each one does it helps to understand the algorithm in more detail. Below is a summary of the ByteTrack algorithm through steps 1 through 7 down here.

The first thing you've got to do is choose a detector; in our case we're using YOLOv8. This detector gives us a detection score for each object, and this detection score will be used later on to compare against some of our thresholds. We're going to store each tracked object in T, where it's going to give us two pieces of information: the ID and the bounding box.

What we want to do is compare our detection score with the thresholds for each object. If the detection score, as you can see right here, is greater than the high threshold, then you want to store it in D high. If it's below that but still above the low threshold, we want to store it in D low. So based on the quality of the detection we're going to store it in either D high or D low and then use that later on.

Next up, we want to predict the new locations of the objects in T using the Kalman filter. The Kalman filter is a classical control method that's used for predictions, so this actually relies on that. I won't go into details here, but leave a comment if you're interested in that.

After we use the Kalman filter, we're going to go ahead and run our first association. In the first association we compare T and D high. Usually this comparison uses IoU, intersection over union, or Re-ID feature distances. It uses a matching method called the Hungarian algorithm, based on your match threshold. The match threshold is something that's set in your parameters file, which we'll take a look at later on. The unmatched results get stored in D remain and T remain, which we'll use later on.

Next up we have the second association. Here we're going to compare D low and our T
remain using IoU only. This time we're not going to use Re-ID, only IoU. Some of the reasons are explained in the paper, but it's mostly due to occlusion. We store the unmatched tracks in T re-remain, with an extra "re", so it's a remainder of the remainders.

Next up, the unmatched tracks get stored in T lost for up to some track buffer, which is the number of frames before discarding. Some of these parameters you'll see later on in the YAML file are the actual parameters you can set. The ones we covered here are the high threshold, the low threshold, the track buffer, and the match threshold, so pay attention to how this actually works.

Finally, we create new tracks in T: for each detection in D remain, if it's above the new track threshold, it's going to create a new track. That's pretty much the summary of the algorithm, and this is another parameter that we'll see in our YAML file later on.

Now let's go ahead and take a look at our tracking soccer players example with YOLOv8 and ByteTrack. Here we have our main function, with run detect object and run track object. Run detect object is good for when you want to really understand what's happening and the quality of your results. In this function all I'm doing is calling the predict function. Predict takes in the source, which is our video path, then show set to true, and then a line width. I'm setting the line width to one so I can see everything more clearly, but it's up to you what you want. If you run this you'll see just the
object detection only, not the tracking. This could be good if you just want to see what your confidence scores are; they could range from something like 0.7 down to something like 0.3. Depending on how good your object detection is, you may need to retrain it or play around with a different model, so this is a good benchmark to evaluate what's happening here.

Next, let's take a look at the next function that we're going to be running, run track object. Scrolling down a little bit, these are some of the main things that I'll be using. Inside run track object, the frame count is initialized to zero, the number of frames that we want to skip is n equals one, and we have an image scale that we can change. You can change the scale of your images if you want them to fit on screen or to zoom in or out; I just have that in there. Here we're starting a capture with cv2 VideoCapture, and from there we have our main while loop.

Inside the while loop we read in our frame; this is the image frame that we'll be using for most of our code. Then we have a new width and height. This is optional, but I have an option to resize if I want the frame to be smaller. And then we have the frame count skipping: based on the number of frames I want to skip, I can keep skipping ahead.

If we keep going, we see this part, the results. Here we actually call the main tracking function. The first thing it takes is a source, which we get directly from the frame. Then persist is set to true. You want to make sure you set this; it's pretty important, because if you don't set it to true your IDs are going to keep toggling because it's going to
treat each frame as a new frame, whereas persist will actually relate the frames together. So make sure that's set to true; don't forget that. And then finally we have the tracker, which is where you pass in your ByteTrack YAML file. That's the file that has all your parameters.

If we take a look, this is our ByteTrack YAML file. Here you can see the different things you can set. You have the tracker type, which right now is set to ByteTrack, but there are other options you could set as well. Then there are the high and low thresholds that we talked about in the algorithm: here the high threshold is set to 0.25 and the low threshold to 0.1. Then we have the new track threshold set to 0.6, a track buffer of 30 frames, and a match threshold of 0.8. Go ahead and play with these parameter values and see what works best for you. It's going to be a bit of trial and error, so you've just got to test it out and see what gives the best results for you.

Continuing on to the rest of the code: after we run the tracker, we take the results and extract our boxes. From these boxes we extract the IDs, because boxes contain the ID attribute. Then we put everything in the for loop here. Inside the for loop there are a bunch of different objects, each with a box and an ID. From these we can go ahead and use some of our cv2 tools to draw our own custom rectangle and then put our text for the ID. You can do this whatever way you want; this is just a way to customize your drawing on the screen. Finally, we run cv2.imshow on the frame, which shows you each frame along with the updated result. If I go ahead and run this, we can see the results in action.
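The tracking loop walked through above could look roughly like the sketch below. It is not the code shown in the video. The function names (`run_track_object`, `should_process`, `scaled_size`), the `yolov8n.pt` weights file, the window title, and the "press q to quit" handling are all assumptions made for illustration. The `track`, `persist`, and `tracker` arguments and the `boxes.id` attribute are the ones described in the captions.

```python
def should_process(frame_count: int, n_skip: int) -> bool:
    """Return True for every n_skip-th frame (n_skip=1 keeps every frame)."""
    return frame_count % max(n_skip, 1) == 0


def scaled_size(width: int, height: int, scale: float) -> tuple[int, int]:
    """New (width, height) after applying the optional image scale."""
    return max(1, int(width * scale)), max(1, int(height * scale))


def run_track_object(video_path, weights="yolov8n.pt",
                     tracker="bytetrack.yaml", n_skip=1, scale=1.0):
    # Heavy dependencies are imported here so the helpers above can be
    # used (and tested) without OpenCV or Ultralytics installed.
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed pre-trained weights
    cap = cv2.VideoCapture(video_path)
    frame_count = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame_count += 1
        if not should_process(frame_count, n_skip):
            continue
        if scale != 1.0:  # optional resize
            h, w = frame.shape[:2]
            frame = cv2.resize(frame, scaled_size(w, h, scale))

        # persist=True keeps track state between calls, so IDs stay stable.
        results = model.track(source=frame, persist=True, tracker=tracker)
        boxes = results[0].boxes
        if boxes.id is not None:  # id is None when nothing is tracked
            ids = boxes.id.int().cpu().tolist()
            coords = boxes.xyxy.int().cpu().tolist()
            for (x1, y1, x2, y2), track_id in zip(coords, ids):
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 1)
                cv2.putText(frame, f"id{track_id}", (x1, y1 - 5),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

        cv2.imshow("tracking", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # assumed quit key
            break
    cap.release()
    cv2.destroyAllWindows()
```

Calling `run_track_object("soccer.mp4")` with a hypothetical video path would open a window with the custom boxes and IDs. Pointing `tracker` at your own copy of the YAML file is one way to try different threshold values.
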
Okay, let's take a look at this. It's going to take a minute to run, and after it's done loading we should see it up and running. Here we go: you can see everything is up and running, and each player has an ID associated with it. For players that are not occluded, like this person, ID 1, it does pretty well in terms of tracking. This guy here, ID 1, and ID 4 and ID 9, these guys are pretty much alone. But take a look at these players here that are very close together: they may have some challenge being properly identified, as you can see here. Take a look at this scene, in this area, these guys on the top right: there's a little bit of a challenge with this tracking algorithm. But some of these that are, again, spaced apart seem to be doing pretty well. Take a look up here at the player with ID 20: it crosses over and does a little swap. So these are some things that this algorithm has some trouble with.

Hopefully you got a better understanding of how to use ByteTrack. If you found this video helpful, give it a like and subscribe, and I'll see you in the next one. [Music]
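As a recap of the seven steps summarized earlier in the captions, the two-stage association can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the Ultralytics implementation: it skips the Kalman prediction step, uses a greedy best-IoU matcher in place of the Hungarian algorithm, and introduces a hypothetical `min_iou` cutoff standing in for the match threshold. The class and function names are made up for this example. The default thresholds are the values read out from the YAML file in the video.

```python
from dataclasses import dataclass
from itertools import count

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


@dataclass
class Track:
    track_id: int
    box: Box
    lost_frames: int = 0


def greedy_match(tracks, dets, min_iou):
    """Greedy stand-in for the Hungarian step: best IoU pairs first.

    Matched tracks are updated in place; returns (unmatched tracks,
    unmatched detections).
    """
    pairs = sorted(
        ((iou(t.box, d[0]), ti, di)
         for ti, t in enumerate(tracks) for di, d in enumerate(dets)),
        reverse=True,
    )
    used_t, used_d = set(), set()
    for score, ti, di in pairs:
        if score < min_iou:
            break
        if ti in used_t or di in used_d:
            continue
        tracks[ti].box = dets[di][0]
        tracks[ti].lost_frames = 0
        used_t.add(ti)
        used_d.add(di)
    rest_t = [t for i, t in enumerate(tracks) if i not in used_t]
    rest_d = [d for i, d in enumerate(dets) if i not in used_d]
    return rest_t, rest_d


class SimpleByteTracker:
    def __init__(self, high=0.25, low=0.1, new_track=0.6,
                 track_buffer=30, min_iou=0.2):
        self.high, self.low, self.new_track = high, low, new_track
        self.track_buffer, self.min_iou = track_buffer, min_iou
        self.tracks: list[Track] = []
        self._ids = count(1)

    def update(self, detections):
        """detections: list of (box, score). Returns [(track_id, box)]."""
        # Split detections by score into D high and D low.
        d_high = [d for d in detections if d[1] >= self.high]
        d_low = [d for d in detections if self.low <= d[1] < self.high]
        # First association: all tracks against D high.
        t_remain, d_remain = greedy_match(self.tracks, d_high, self.min_iou)
        # Second association: leftover tracks against D low, IoU only.
        t_re_remain, _ = greedy_match(t_remain, d_low, self.min_iou)
        # Still-unmatched tracks are kept as lost for up to track_buffer frames.
        for t in t_re_remain:
            t.lost_frames += 1
        self.tracks = [t for t in self.tracks
                       if t.lost_frames <= self.track_buffer]
        # Unmatched high-score detections above new_track start new tracks.
        for box, score in d_remain:
            if score >= self.new_track:
                self.tracks.append(Track(next(self._ids), box))
        return [(t.track_id, t.box) for t in self.tracks if t.lost_frames == 0]
```

In this sketch, a player whose detection score drops to something like 0.15 for a frame can still keep their ID through the second association, which is the behavior the low threshold is there for.
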
Info
Channel: Kevin Wood
Views: 1,031
Id: 6LGpf-a1K1Q
Length: 12min 1sec (721 seconds)
Published: Sat Mar 23 2024