Object Tracking with YOLOv5 and DeepSORT in OpenCV

Video Statistics and Information

Captions
In this video we're going to do object tracking with YOLOv5: we'll have a YOLOv5 model running for object detection, and then we'll use the DeepSORT algorithm to track the detected objects. This is a really cool one. In previous videos I've talked about how we can do object detection, and now we're going to add tracking on top of the object detector. When we're detecting objects, we're only working frame to frame: we pass a frame into the model and get detections out. So if we're running on a live webcam, for example, we might miss some detections from frame to frame. With a tracker we can follow our bounding boxes and detections over time, and keep track of our objects even when we lose detections for a few frames. Say we're running at 30 frames per second and we miss the detection for a couple of frames, because of occlusion, because the model simply doesn't detect the object, or because the confidence score doesn't clear our threshold; without a tracker we would lose the object completely. With a tracker we can follow the exact same object, assign an ID to it, track it over a number of frames, and play around with some parameters along the way.

But first of all, remember the Subscribe button and Bell notification under the video; only 10% of you watching are actually subscribed. It's a single click and it helps me and the channel out in a massive way.

Here we're going to use the YOLOv5 model. I also have a course on the YOLOv7 model if you want to use that instead; it's faster to run than YOLOv5, you get better performance, and I also show how to deploy it. You get the bounding boxes out and can use those detections directly with the DeepSORT algorithm to track them in the image, so definitely check that out. I also have courses on OpenCV GPU support, which can improve the speed by a lot, up to around 30 times, so you can go from a couple of frames per second to 30-40 frames per second and build real-time applications. In this video the YOLOv5 detector runs on the GPU, and the DeepSORT tracker will run on the GPU as well, so this is going to run really fast. We'll see the results at the end, but first let's jump straight into the code.

We start by importing the modules: OpenCV, NumPy, time, and torch; we use PyTorch to load our YOLOv5 object detector. Then we have a class for the YOLO detector. I have videos about how we initialize and build this class, so check those out if you want more details; here we'll just go over it briefly so we can get to the DeepSORT tracker afterwards. First we have our initialization function, which takes a model name if you want to run a custom object detector, or we can just use one of the pre-trained YOLO models.
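To make the structure concrete, here is a minimal sketch of how such a detector class can be initialized, assuming the standard Ultralytics torch.hub entry point; the class and argument names are illustrative rather than copied verbatim from the video:

```python
import torch

class YoloDetector:
    def __init__(self, model_name=None):
        # run on the GPU if CUDA is available, otherwise fall back to CPU
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        if model_name:
            # load a custom-trained YOLOv5 checkpoint
            self.model = torch.hub.load("ultralytics/yolov5", "custom",
                                        path=model_name)
        else:
            # load the pre-trained small model
            self.model = torch.hub.load("ultralytics/yolov5", "yolov5s",
                                        pretrained=True)
        self.model.to(self.device)
        self.classes = self.model.names  # class-index -> label mapping
```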
In that function we set the class names, and we set the device to CUDA if it's available, otherwise we run the model on the CPU. Then we load the model: we use PyTorch's torch.hub, pointing at the Ultralytics YOLOv5 repository. If we pass in a model name, it loads the custom YOLOv5 model that you've trained; I have videos about how to train your own YOLOv5 model, and again the YOLOv7 course, where we go over the architecture, how the model works, and its advantages over YOLOv5, so check that out if you're interested. So you can load either a custom model or a pre-trained YOLOv5 model; in this example, if no model name is passed in, we use the small model.

We also have a score_frame function, which is the forward pass we do on a frame. First we resize the frame to the correct format for YOLOv5, then we pass it into the model, run the forward pass, and get the results out. We can then split the results into the coordinates of the bounding boxes and the labels of the detections. Say we detect a person: we get the corners of the bounding box around me plus a label, which is the class being detected, in this case 'person'. We can also convert a class index to a label, and we can plot the boxes: for every detection greater than some confidence threshold, we can either draw it directly, or append it to a list, return that list, and do something with it afterwards. The latter is what we do in this video: we draw the detections only after they've been through the tracker.

In the plotting function we would normally take the coordinates of the top-left and bottom-right corners and draw a bounding box with them, but we skip that here since we draw after the tracker. Instead we convert each detection to the format the DeepSORT tracker needs: the top-left corner plus the width and the height. The fourth item in each detection row is the confidence score, and then we also pass along the label of the detected object. In this video we'll just use 'person', but you can change that to any of the other classes in the COCO dataset; if you're using your own dataset, you'll need to set up your class labels yourself. We could, for example, do cup detection instead, and it would detect cups in the image, but here we'll go with person. Finally we return the frame and the detections. That's the whole YOLOv5 detector class.
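A hedged sketch of that forward pass and of converting the raw results into the ([left, top, w, h], confidence, class) tuples that deep-sort-realtime expects; the helper names are illustrative, and the results.xyxyn layout is the standard torch.hub results format:

```python
import cv2

def score_frame(detector, frame):
    """Forward pass: returns labels and normalized box coordinates."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR
    results = detector.model([rgb])
    # results.xyxyn[0]: one row per detection -> [x1, y1, x2, y2, conf, class]
    labels = results.xyxyn[0][:, -1]
    cord = results.xyxyn[0][:, :-1]
    return labels, cord

def format_detections(labels, cord, frame, class_names,
                      confidence=0.3, wanted="person"):
    """Keep detections of one class above the threshold, in DeepSORT format."""
    h, w = frame.shape[:2]
    detections = []
    for i in range(len(labels)):
        row = cord[i]
        label = class_names[int(labels[i])]
        if row[4] >= confidence and label == wanted:
            x1, y1 = int(row[0] * w), int(row[1] * h)
            x2, y2 = int(row[2] * w), int(row[3] * h)
            # DeepSORT wants the top-left corner plus width and height
            detections.append(([x1, y1, x2 - x1, y2 - y1],
                               float(row[4]), label))
    return detections
```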
Again, you can use a custom model or a pre-trained one, and you can even use the YOLOv7 model if you've been through my course. Next we open up a video capture for the webcam and set the webcam resolution to 1280x720. We initialize our detector, the YoloDetector class, with the model name equal to None, because we're just using a pre-trained model to detect people in the frame.

Then we import the DeepSORT tracker. First you need to install it, which is just pip install deep-sort-realtime; in my case the requirement is already satisfied, but you should go ahead and pip install this deep-sort-realtime module. From it we import the DeepSort tracker and initialize it. You'll also get a folder on the left holding all the different pieces: a Kalman filter, linear assignment, and neural-network matching, so a neural network does the matching and sets up the metrics for how detections are matched to each other; that's why it's called Deep SORT rather than just the SORT algorithm we used in another video. There's a Detection class, which initializes a detection with the bounding box, the class, the confidence score, and so on; there's a Tracker; and it keeps a list of tracks, which will end up holding the tracked bounding boxes from our YOLO detector.

First we need to set up some parameters. The values here are just the defaults; I mainly want to show you which parameters the DeepSORT tracker can be initialized with. max_age is probably one of the most important ones: it's how many frames the tracker is allowed to miss before we discard a track, meaning the bounding box or detection we're tracking in the frame. n_init is the number of detections we need before initializing a track: here we need two detections in two consecutive frames before we initialize an object as something we want to track over a number of frames. And if we then miss the object for five frames, the track is deleted. If you're losing tracks within five frames, you can tune these parameters for your own application. So with two consecutive detections we initialize a track, we follow it until we've lost it for five frames, and as you'll see later when we run the program, it tracks really nicely around the frame. There are some other parameters we won't go into detail with: a feature extractor, a neural-network budget, a metric for the matching cosine distance, and the non-maximum suppression overlap value, which you can play around with for your own application. But the most important ones are max_age and n_init.
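A minimal sketch of the capture and tracker setup, assuming the deep-sort-realtime package; max_age=5 and n_init=2 mirror the values discussed above, while the remaining values are, to my knowledge, the library defaults:

```python
import cv2
from deep_sort_realtime.deepsort_tracker import DeepSort

# webcam capture at 1280x720
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

# pre-trained detector from the first sketch
detector = YoloDetector(model_name=None)

object_tracker = DeepSort(
    max_age=5,                # frames a track may go unmatched before deletion
    n_init=2,                 # consecutive detections needed to confirm a track
    nms_max_overlap=1.0,      # non-maximum suppression overlap threshold
    max_cosine_distance=0.2,  # appearance-matching (cosine distance) gate
    nn_budget=None,           # max stored appearance features per track
    embedder="mobilenet",     # neural-network feature extractor
    embedder_gpu=True,        # run the embedder on the GPU if available
)
```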
Now we can go down to the main loop and open up the webcam. We read in a frame from the webcam and start our timer, so we can measure how many frames per second we actually get running this YOLO object detector plus tracking the objects with the DeepSORT algorithm. Then we do a detection: we call detector.score_frame, pass in the image, do the forward pass, and get the results out. Next we call detector.plot_boxes with the results and the image, and pass in the height, the width, and the confidence threshold, so we can rescale the coordinates to the image and only keep detections above the confidence score; it returns the detections and the image. We're not doing anything with the image right now, but you could draw the raw detections in one color and the tracked objects in another, and compare how the detector performs with and without the tracker.

So now we have our predictions, and we pass them into the object tracker: we call object_tracker.update_tracks with the detections and the image, and it returns the tracks. You can see that it expects the bounding boxes as a list of detections, where each tuple holds the left-top corner, the width and height, the confidence score, and the detection class; that's the format our detections need to be in when we feed them into update_tracks. If you step inside this function, it holds all the different algorithms within DeepSORT, the Kalman filter and so on. First it computes embeddings of the detections, then it runs non-maximum suppression. Before updating the tracker it runs the prediction step of the Kalman filter, and after the tracks have been predicted it runs the update step. If we don't have any detections, it only does the prediction step, and we just use the track from the Kalman filter that is following the objects; but if we do have good detections we want to use, it updates the tracks with them. This loop runs over and over, a prediction step and an update step for the Kalman filter, and that's how the Kalman filter tracks these detections and bounding boxes over a number of frames. That's basically the idea behind DeepSORT: it's the SORT algorithm running a Kalman filter under the hood, with deep-learning appearance metrics on top to match the bounding boxes and detections with each other.
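Pulling that together, a sketch of one iteration of the detect-then-track handoff, reusing the hypothetical helpers from the earlier snippets (score_frame, format_detections, detector, object_tracker):

```python
import time

# inside the capture loop
ret, img = cap.read()
start = time.perf_counter()

labels, cord = score_frame(detector, img)
detections = format_detections(labels, cord, img, detector.classes,
                               confidence=0.3, wanted="person")

# each detection is ([left, top, w, h], confidence, class);
# update_tracks runs NMS, the Kalman predict/update cycle, and the
# appearance matching internally, and returns the current tracks
tracks = object_tracker.update_tracks(detections, frame=img)
```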
update_tracks then just returns the tracks, which are all the tracked bounding boxes from our detections. We loop through the tracks, and for each one we get the track ID and the bounding box; we convert the LTRB format into our corner coordinates, the top-left and the bottom-right. Then we draw a rectangle and put text with OpenCV on our image: we draw the bounding box and put the ID on top of it. Right now we're not just doing detections, we're doing tracking, so we need to keep track of specific objects. If you only have per-frame detections and you miss a couple of frames, you can't really keep track of the ID; you don't know which object is which. So this is tracking objects around the scene rather than just getting the values of where an object is in a single frame. Again, we plot the bounding box with the ID on top so we can keep track of the correct box, and you can also use the ID for indexing bounding boxes: if you want to know how people are moving in the frame, or you're tracking vehicles, or you have some attendance system, then you actually need to keep track of the IDs. We end our timer and calculate the number of frames per second we get, and if we hit Escape on the keyboard we terminate the program, release the webcam, and destroy all the windows we've opened with OpenCV.
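A sketch of that drawing stage, using the Track accessors from deep-sort-realtime (is_confirmed, track_id, to_ltrb); the window name and colors are arbitrary choices:

```python
# still inside the capture loop, after update_tracks
for track in tracks:
    if not track.is_confirmed():
        continue  # skip tracks that haven't reached n_init detections yet
    ltrb = track.to_ltrb()  # left, top, right, bottom
    x1, y1, x2, y2 = map(int, ltrb)
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
    cv2.putText(img, f"ID: {track.track_id}", (x1, y1 - 10),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

fps = 1.0 / (time.perf_counter() - start)
cv2.putText(img, f"FPS: {fps:.1f}", (20, 40),
            cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
cv2.imshow("YOLOv5 + DeepSORT", img)
if cv2.waitKey(1) & 0xFF == 27:  # Escape terminates the program
    break

# after the loop
cap.release()
cv2.destroyAllWindows()
```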
Now let's run the program and see the results of this object detection and tracking. The program opens up, we can see the frame, and we're detecting one person in it. We get around 60 frames per second on my GPU, and again an ID is assigned on top of the bounding box, so it just keeps track of me; it doesn't lose me at all in the frame, even though I'm moving around. There we lost the track, you can see, for probably five frames; I should actually tune that threshold, because at 60 frames per second it only takes a few missed frames before we discard the track. But as I move around, you can see how nicely it tracks me in the frame. Again, you should fine-tune the values for your own application.

We can also try another class, just to see how it works if we move the camera around instead. So we go up and change the class to 'cup', hit save, and run the program again, and I'll point the webcam down so we can track a cup instead. Here we can see the detection for the cup class; this is actually a glass, but it falls under the cup class as well, and it keeps track of it really nicely even though we move it around. There we lost it; a glass is quite hard to track around the image frame, because it doesn't match the class as well as an actual cup would. But we keep track of it, it's ID 19, even as I move the camera around; again, you'd tune the threshold values. If I take up another cup, it snaps onto it directly, and we can move the camera around and still keep track of the object.

So this is a really nice object detector and a really nice tracker, and it can be used for a lot of different applications. If you're doing object detection and want to keep the IDs and that information as well, it's really good to use a tracker, and they're used in most real-life applications. Even with some occlusions we can still keep track of the object: you can see here that the detector may have lost the cup for a moment, but we still track it. If you have a static environment where occlusions come in front of your camera, for example people walking in front of it in some kind of warehouse environment, you can set the maximum number of frames that can be missed to a really high value; then even if you have occlusions for a couple of seconds, you can still keep track of every object behind that occlusion, behind the person walking in front of your camera. That's one of the advantages of using a tracker together with an object detector.

If I point the camera up here, you can see that running only the detector we get around 130-140 frames per second on my GPU, and when we add the tracking we're down to around 70-80 frames per second. That's still really fast, and we're getting the advantages of the tracker, following the objects around the image frame instead of just doing detections on individual frames.

That's pretty much it for this video, guys; it's been really cool. This can be used for a lot of different applications in the real world, because often you don't just want detections on individual frames, you actually want to keep track of where your objects are in the scene, and this is really good for that. Thank you for watching, and again, remember the Subscribe button and Bell notification on the video. I also have courses you can check out: one on OpenCV GPU support, where we cover all the different aspects of using OpenCV with the GPU, and one on the YOLOv7 model, covering the architecture, how to train it, and how to deploy it with OpenCV, ONNX, and PyTorch. Thank you for watching this video, and I'll see you in the next one. Bye for now!
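One practical note on the tuning mentioned above: max_age is counted in frames, so the wall-clock time a track survives an occlusion depends on your frame rate. A hedged rule of thumb (my framing, not from the video) is to pick the occlusion duration you want to ride out and convert it to frames:

```python
# illustrative only: convert a desired occlusion tolerance in seconds
# into a max_age value in frames, given the pipeline's measured FPS
fps_estimate = 60          # measured frames per second of the pipeline
occlusion_seconds = 2.0    # how long an object may disappear (assumption)
max_age = int(fps_estimate * occlusion_seconds)  # -> 120 frames

object_tracker = DeepSort(max_age=max_age, n_init=2)
```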
Info
Channel: Nicolai Nielsen
Views: 24,766
Keywords: object detection opencv, object tracking, object tracking python, object tracking opencv, how to track objects in opencv, opencv, opencv python, centernet, SORT, sort object tracking, deepsort object tracking, track detected objects, track people in opencv, people tracker, opencv tracker, tracking objects computer vision, computer vision, tracking objects, object tracker, object tracker opencv, yolov5 tracking, yolov5 object detection, yolov5 object tracking, yolov5
Id: IuVnYfg4vPQ
Length: 17min 47sec (1067 seconds)
Published: Wed Dec 07 2022