YOLOv8 native tracking | Step-by-step tutorial | Tracking with Live Webcam Stream

Captions
Two days ago the YOLOv8 team dropped the news that they added native support for trackers to their pip package. This is great news, because most of the trackers I have worked with are terribly packaged; by the way, you can watch me struggling with ByteTrack in one of our recent videos, the link is in the top right corner and in the description below. Anyway, I decided to put that tracking capability to the test and use it both in CLI and SDK scenarios, and I even managed to build a small object detection and counting app that uses the YOLOv8 pip package both for detection and tracking.

This time, because we want access to a live webcam, we'll use VS Code instead of a Jupyter notebook. We start in a fresh directory and first create a Python virtual environment. We activate it and install the two Python packages we are going to need. The first one is ultralytics, the package that contains the YOLOv8 model; the installation might take a while because it pulls in quite heavy dependencies like OpenCV, torch, and torchvision. The second one is supervision; it will give us easy bounding box annotation plus the ability to count tracked objects, but that will come at the very end of the video.

Now, just to confirm that everything installed properly, we will use the ultralytics YOLOv8 CLI to run tracking on my webcam stream. To do it we pass a few arguments. The first one is the weights; this argument basically dictates whether you use a small or large model, and whether you run object detection or instance segmentation. By the way, when you run that command for the first time, like I'm doing, it will download the weights to your machine, so that will take a few seconds depending on your internet connection. The second one is source; it can be anything: an image, a link to a YouTube video, and in our case it's zero, which is the ID of the webcam on my machine. The last one is show=True, which results in us seeing the inference on the screen instead of it just happening in the background.

The CLI is absolutely great when you just want to quickly execute some standard procedure, and running object detection plus tracking on a video stream is pretty standard stuff. But whenever you want to do something a little bit custom, you are basically bound to using the SDK, and that is what this video is mainly about. So without further ado, let's jump into VS Code and build an object detection, tracking, and counting app that runs on our webcam stream.

We start by creating an empty Python file and a main function inside it; it will just pass for now. We make sure that the function is executed if the whole file is run as a standalone script, and we add a print statement inside the main function just to confirm that everything works. Let's jump into the terminal and test; sure enough, it works as expected. Cool, that was easy. Now we go back to VS Code and, at the very top of the script, add an import statement to import YOLO from ultralytics. We can go into the main function and create our first instance of that model. I'm going to use the YOLOv8l weights; I'm using an RTX 3070, so that GPU should be just enough to run the large model in real time. I call results = model.track, pass our webcam as the source once again, and add show=True exactly as I did in the CLI. We can see that the weights are downloaded once again, because we are using a different model size than in the CLI example.
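To make that setup concrete, here is a minimal sketch of this first version of the script. The filename main.py and the yolov8l.pt weights are just the choices described above, and the commented CLI call is the equivalent one-liner:

```python
# main.py - minimal sketch of the first version of the script.
# Equivalent CLI call: yolo track model=yolov8n.pt source=0 show=True
from ultralytics import YOLO


def main():
    # yolov8l.pt = large detection weights; swap in yolov8l-seg.pt for
    # instance segmentation, or a path to your own custom-trained .pt file
    model = YOLO("yolov8l.pt")

    # source=0 is the default webcam; show=True displays the annotated stream
    model.track(source=0, show=True)


if __name__ == "__main__":
    main()
```

Running `python main.py` downloads the weights on first use and then streams annotated webcam frames until you close the window.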
Once the download is done, it runs pretty well on the live stream: it takes around 15-17 milliseconds to process a single frame, and that's good enough for us. Now I can very easily go from object detection to instance segmentation just by changing the name of the weights I pass to the constructor of the YOLO model. You can see that we download the weights once again, and now we are running instance segmentation in real time. Similarly, if you would like to use your own custom model for tracking, you just pass the path to your local .pt file into the same constructor. And if you don't have a custom YOLOv8 model trained yet, make sure to watch our tutorial to learn how to do it; the link, once again, is in the top right corner and in the description below.

Cool, now let's go back to object detection, because that's what I want to do in this video. As we saw, the current script basically triggers the processing, and only when it's done can the rest of the script continue. I don't want it that way; I would like access to every frame separately. To do that we pass an additional argument to the model.track method called stream and set it to True. As a result the method returns a Python generator, and we can use a for loop to process every frame separately. Every entry is a YOLOv8 result object, and that object has a set of fields that we will access one by one, starting with the original frame. Because I plan to use custom annotators, I need access to the original image to be able to draw on it. Now, just to confirm that everything works properly, I use OpenCV imshow to display the current frame on the screen, and I add a small breaking mechanism so I can press Escape to kill the app. When I run the current version of the script in the terminal, it does nothing other than display the video coming from my webcam on the screen. So far so good.

So let's add bounding box annotations; this is the moment where supervision comes into play. We add an import statement at the very top of the file and then convert the result coming from YOLOv8 into supervision Detections; we can use from_yolov8 to do that. Thanks to that conversion, we can use all the tools in the supervision pip package to process those detections, and one of those tools is the bounding box annotator. Just above the for loop we create an instance of that class and pass a few optional arguments into the constructor; those arguments influence the appearance of the bounding boxes, and the full list, along with examples, is of course in the supervision documentation. Now we go back to the for loop and use the bounding box annotator's annotate method to draw our bounding boxes onto the frame: we pass the frame as the scene along with our detections, and we can test the script. Looks like we have a small typo in the text_thickness argument, so let's fix it and rerun the script, and after just a few seconds we have a working application that does object detection on the live stream. That's cool. The problem, however, is that we are displaying class IDs instead of class names, so let's fix that. We can easily override the default text displayed over the bounding boxes; we just need to provide a list of labels. We use a list comprehension to loop over our detections and build the text that will be displayed over each box: we use the names dict located in the model to access the class name, add the confidence, and pass the labels as an additional argument to our annotate method.
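Putting those pieces together, here is a sketch of the streaming loop with supervision annotation. It assumes an early (circa 0.3) supervision release, where the converter is called Detections.from_yolov8 and BoxAnnotator.annotate still accepts a labels argument; newer releases renamed the converter to from_ultralytics and moved label drawing into a separate LabelAnnotator:

```python
import cv2
import supervision as sv
from ultralytics import YOLO


def main():
    model = YOLO("yolov8l.pt")

    # styling of the boxes and label text
    box_annotator = sv.BoxAnnotator(
        thickness=2,
        text_thickness=2,
        text_scale=1,
    )

    # stream=True turns model.track into a generator that yields one result per frame
    for result in model.track(source=0, stream=True):
        # the raw frame, so we can draw our own annotations on it
        frame = result.orig_img

        # convert the ultralytics result into a supervision Detections object
        detections = sv.Detections.from_yolov8(result)

        # one label per detection: class name + confidence
        labels = [
            f"{model.names[class_id]} {confidence:0.2f}"
            for confidence, class_id
            in zip(detections.confidence, detections.class_id)
        ]

        frame = box_annotator.annotate(scene=frame, detections=detections, labels=labels)

        cv2.imshow("yolov8", frame)

        # press Escape to quit
        if cv2.waitKey(30) == 27:
            break


if __name__ == "__main__":
    main()
```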
When we run the script once again, we see much more detailed labels on top of the bounding boxes: the class name along with the confidence. Everything we've seen up to now is basically the regular YOLOv8 SDK API, but now we'll finally play with the output of the tracker. The team decided to store tracker IDs inside the boxes class for object detection models and inside the masks class for instance segmentation models. So let's access the result.boxes.id value and store it in detections.tracker_id. Before we do that, we need to convert the PyTorch tensor into a NumPy array, so I call .cpu().numpy() and cast the array values from float to int. Now we go back to our list comprehension and add the tracker ID at the very beginning of the label, and when we run the script we see all the information we wanted: the tracker ID, the class name, and the probability.

The problem, however, is that the moment there are no detections in the scene, the script crashes. This is because we no longer have the tracker IDs tensor inside our boxes class. To prevent the crash, we check whether boxes.id is None and only save it if it's still there. Now when I run the script, you can see that the orange is in the scene, I can take it away, and the script keeps running smoothly without any problems. Perfect.

One thing that is a bit annoying is that whenever I grab an object in the scene, I produce an additional detection from the person class. This is obviously normal behavior for the YOLOv8 model: my hand is there, so the model detects it. However, I would like to prevent that from happening. We can do it by filtering out all detections with class ID equal to zero, the class ID that corresponds to the person class in the COCO dataset. And right now you can see that my hand is completely ignored: the hand is there, but the person class is not.

The integration of tracking into our script went really fast, so I decided to go a bit crazy and add one more piece of functionality to our application. Quite recently I recorded a video where I showed you how to count objects crossing a line, using vehicles as an example. You can watch that video; it's visible right now in the top right corner, and you will find the link in the description below. Anyway, I decided to add the same functionality to our live script, and that's also quite easy. The first thing we need to do is decide on the location of the line. In my case I want the line to split the screen in half, and the resolution of my video is 640 by 480.
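Here is a sketch of those tracker-ID and filtering steps, wrapped in a hypothetical helper (add_tracking_labels is my name, not part of either library) so it stays self-contained. It assumes a supervision version whose Detections supports boolean-mask indexing and a tracker_id field:

```python
import supervision as sv


def add_tracking_labels(result, detections: sv.Detections, names: dict):
    """Attach tracker IDs, drop the person class, and build display labels.

    Call this inside the streaming loop, right after sv.Detections.from_yolov8(result).
    """
    # tracker IDs live in result.boxes.id and are None when nothing is tracked,
    # which is exactly the case that crashed the earlier version of the script
    if result.boxes.id is not None:
        detections.tracker_id = result.boxes.id.cpu().numpy().astype(int)

    # filter out the person class (class_id 0 in the COCO label map) so a hand
    # reaching into the frame is ignored
    detections = detections[detections.class_id != 0]

    # prepend the tracker ID to each label; fall back to "-" when IDs are absent
    tracker_ids = (
        detections.tracker_id
        if detections.tracker_id is not None
        else ["-"] * len(detections)
    )
    labels = [
        f"#{tracker_id} {names[class_id]} {confidence:0.2f}"
        for confidence, class_id, tracker_id in zip(
            detections.confidence, detections.class_id, tracker_ids
        )
    ]
    return detections, labels
```

In the loop you would then call something like `detections, labels = add_tracking_labels(result, detections, model.names)` before annotating the frame.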
The first point will be located in the middle of the top edge, the second point in the middle of the bottom edge. Next I create an instance of the LineZone class and pass my start and end points in the constructor. I also create another class, the LineZoneAnnotator, which is responsible for drawing the line and displaying the counters. We pass very similar arguments as with the bounding box annotator: line thickness, text thickness, text scale, all the regular stuff. The line is getting pretty long, so let's break it into a few separate ones for better readability. Now let's scroll a little lower, just under the bounding box annotator call (that line is also pretty long, so let's break it too), and just under that line we will trigger our line zone and annotate it. Like I said, first we call the trigger method of our line zone, passing detections as the argument, and then we call the annotate method of our line zone annotator, passing our frame and our line zone as arguments. And we are basically ready to test our script.

You can see the line, that's good, and the counters are at zero. Now we bring the apple into the scene; it gets tracker ID number one. We bring the orange into the scene; it gets tracker ID number three, because our hand got tracker ID number two. We can freely move those objects around, and whenever they cross the line we see the counter increasing. Cool.

That's all for today. I really hope you enjoyed this quick tutorial; we strove to produce it as fast as possible after the news dropped that the YOLOv8 team added the trackers to the pip package. I think the functionality works really well, it's very simple to use, and I will certainly add it to my armory, because I absolutely hate the installation process of most trackers, as I've probably already said three times in this video. Obviously there will be use cases where you want full control over the tracker, maybe filtering out detections before the tracker sees them, and in those cases you will most likely need to install a tracker on its own and write a lot of custom code. But as we saw in this tutorial, you can build pretty complicated logic with just the built-in tracker, and by the way, supervision works well with it too, so keep that in mind. Anyway, as you can see, we try to stay up to date with all the changes in the major computer vision frameworks, so if you want to be up to date too, make sure to like and subscribe. My name was Peter; see you next time, bye.
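Finally, a sketch of the line-counting additions, assuming supervision's LineZone / LineZoneAnnotator API; the 640x480 geometry comes from the video, and update_line_counter is just a hypothetical wrapper to show where the calls go inside the loop:

```python
import supervision as sv

# 640x480 webcam frames: a vertical line at x=320 splits the screen in half,
# running from the middle of the top edge to the middle of the bottom edge
line_zone = sv.LineZone(start=sv.Point(320, 0), end=sv.Point(320, 480))

# same kind of styling arguments as the box annotator
line_zone_annotator = sv.LineZoneAnnotator(
    thickness=2,
    text_thickness=2,
    text_scale=1,
)


def update_line_counter(frame, detections: sv.Detections):
    """Call inside the streaming loop, after the boxes have been drawn."""
    # trigger() checks which tracked detections crossed the line and
    # updates the in/out counters
    line_zone.trigger(detections=detections)
    # annotate() draws the line and its counters onto the frame
    return line_zone_annotator.annotate(frame=frame, line_counter=line_zone)
```

Note that the counters only update for detections that carry a tracker ID, which is why the tracking step above has to come first.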
Info
Channel: Roboflow
Views: 27,256
Keywords: yolo, tracking, multiple object tracking, multitarget tracking, traffic control, yolov8, bytetrack, deepsort, yolo tracking, yolo v8, object tracking, sort, yolo object tracking, yolov8 object tracking, yolo v8 object tracking, python real time object tracking, video object tracking, deep sort, vehicle tracking, conveyor tracking, computer vision tutorial, computer vision, yolov8 bytetrack, yolov8 native tracking, yolov8 tracker
Id: Mi9iHFd0_Bo
Length: 14min 16sec (856 seconds)
Published: Fri Mar 10 2023