Traffic Analysis with YOLOv8 and ByteTrack - Vehicle Detection and Tracking

Captions
A few months ago we created a tutorial showing you how to track and count vehicles crossing a virtual line on a road. Today we are taking that project to the next level: we will build a real-time traffic analysis application that will allow you not only to detect and track vehicles, but also to understand the direction of their movement. It will count how many of them turn right, turn left, or travel straight, and all of that on a very busy roundabout with four points of entry. This is such a cool project, so buckle up, because we have a lot to cover.

But before we start, let me just tell you that the whole project is open source. We made it accessible on GitHub as one of the example scripts in the supervision repository; the link to the whole repo is in the description below. Inside you will find instructions showing you how to set up a local Python environment, how to download the input video that I'm using in this tutorial, and how to download the custom YOLOv8 model that I trained on my very small dataset of aerial images. The dataset itself is accessible on Roboflow Universe; the link to the dataset is also in the description below. Okay, enough of the talking, let's dive in.

We start with a pretty much empty Python file. It looks like there is a lot of code over here, but there actually isn't: all I have is a few import statements at the very top and argument-processing logic in the main section. I can pass five arguments: the path to my custom model weights, the source and target video paths, and the IOU and confidence thresholds. I pass all of those arguments into VideoProcessor and trigger the process_video method. If we jump into the VideoProcessor implementation, it's a pretty much empty class: there are three methods, all of them blank. If I go to the terminal and trigger the script with some example arguments, all I get is a print statement with the names of those arguments and their values.

So now we can go into the constructor, remove those print statements, and assign my arguments to class properties. I can now go up and add one more import statement: we will import YOLO from ultralytics (here is the documentation if you want to take a look) and create an instance of that model as one of the fields in my class, instantiating the model with the source weights path as an argument. Now, in the process_video function, we will start to write some logic. We begin with a loop over the frames of the video, using supervision's get_video_frames_generator, which returns every frame as a numpy array. We can then pass each frame to the process_frame method, get a result, and display that result using OpenCV. I'm adding some additional logic to conveniently kill the script when I press Q.

Now we can jump into the process_frame method. The first thing I do here is use our model to run inference on the frame. I'm passing the confidence and IOU thresholds and setting verbose to False to prevent additional logging. Then I'm using supervision to convert the result of the model into a Detections object; we will use that class to filter and annotate, so it will be quite important. Now we can go into the constructor and create an instance of a bounding box annotator. That annotator will draw boxes around the detections with an additional label, and its annotate method takes detections as an argument. We can decide on the color scheme of those boxes, so I'm using the default color palette from supervision. I decided to specify it at the top of the file, because we will do a lot of cool annotations in this script and I wanted to have a single reference.
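As a rough sketch of the skeleton described so far (class and method names follow the narration; the CLI argument parsing is omitted, and details like sv.ColorPalette.default() versus the newer sv.ColorPalette.DEFAULT, or sv.BoxAnnotator accepting a labels argument, vary between supervision versions):

```python
import cv2
import numpy as np
import supervision as sv
from ultralytics import YOLO

# Single color reference for every annotator in the script.
COLORS = sv.ColorPalette.default()


class VideoProcessor:
    def __init__(
        self,
        source_weights_path: str,
        source_video_path: str,
        target_video_path: str = None,  # used later when saving output
        confidence_threshold: float = 0.3,
        iou_threshold: float = 0.7,
    ) -> None:
        self.source_video_path = source_video_path
        self.target_video_path = target_video_path
        self.confidence_threshold = confidence_threshold
        self.iou_threshold = iou_threshold

        self.model = YOLO(source_weights_path)
        self.box_annotator = sv.BoxAnnotator(color=COLORS)

    def process_video(self) -> None:
        # Loop over frames; each frame arrives as a numpy array.
        frame_generator = sv.get_video_frames_generator(
            source_path=self.source_video_path
        )
        for frame in frame_generator:
            processed_frame = self.process_frame(frame)
            cv2.imshow("frame", processed_frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press Q to quit
                break
        cv2.destroyAllWindows()

    def process_frame(self, frame: np.ndarray) -> np.ndarray:
        # Run inference with the thresholds passed on the CLI.
        result = self.model(
            frame,
            verbose=False,
            conf=self.confidence_threshold,
            iou=self.iou_threshold,
        )[0]
        detections = sv.Detections.from_ultralytics(result)
        return self.annotate_frame(frame, detections)

    def annotate_frame(
        self, frame: np.ndarray, detections: sv.Detections
    ) -> np.ndarray:
        # Annotate a copy so the original frame stays untouched.
        annotated_frame = frame.copy()
        return self.box_annotator.annotate(annotated_frame, detections)
```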
With the bounding box annotator instance created in the constructor, we now make a copy of the input frame, use the annotate method I mentioned, and run the script. Okay, we see that the model runs as expected. There is a little bit of flickering, especially with underrepresented classes like bus or truck, and there are also some false detections. It probably won't matter, because later on we'll filter out detections and keep only those that are on the road.

Okay, now it's time to add tracking. It will allow us to assign a unique ID to every vehicle visible in the scene and keep that ID consistent across the whole video. To do that we'll use ByteTrack, a tracker that we have already used in several videos on this channel. However, in the past we used the original ByteTrack implementation, and now ByteTrack is available in supervision, so I decided to use this opportunity to show you how easy it is to plug it into a quite complicated project like the one we are building today.

The first thing we need to do is create an instance of ByteTrack. We do that in the constructor so that we can access the tracker in all the methods. Now we can jump straight into process_frame, and right after the inference we add one more line and pass our detections into the tracker's update_with_detections method. It will update the detections and add tracker_id as an additional field. The last thing we need to do is jump into the annotate_frame method and make sure that we display that tracker ID; that's why I'm using a list comprehension to create a list of labels and pass that list into the annotator. That's it, and when we run the script we see that the bounding box annotator displays the tracker ID instead of the class ID. The color of the box is still tied to the class of the vehicle, but we'll change that in a second and make sure that the color is tied to the zone that was triggered when the car drove into the roundabout.

Before we do that, we need to specify the zones. Here I'm pasting the zone coordinates defining the colorful rectangles you see in the visualization, and now I will use those coordinates to instantiate polygon zones. PolygonZone is a utility from the supervision library that lets us easily tell whether a detection is inside or outside a zone. To instantiate a polygon zone I need a few arguments: the polygon with the coordinates, of course, but also the resolution of the full video frame and the triggering position. The triggering position defines which part of the bounding box should be used to decide whether the detection is in the zone; we can use the center of the bounding box, the bottom center, or any corner of the box, and if that point is in the zone, the whole detection counts as in. In the case of aerial images it makes a lot of sense to use the center of the bounding box, so I'm passing the center position as the default value of this argument. The whole function returns a list of polygon zones, so I just need to go back, import List and Tuple from typing, and use a list comprehension to loop over the polygons and, like I said, pass the polygon, frame resolution, and triggering position to the constructor.

Now we can go a little lower, into the constructor of VideoProcessor, and use one more utility from the supervision library, VideoInfo, to extract the frame resolution of our input video. It's very easy: we just use from_video_path and pass one of the arguments we got in the constructor. Now I can instantiate zones in and zones out.
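A minimal sketch of the tracking and zone setup, assuming the supervision API as it existed at the time of recording (newer versions replace frame_resolution_wh and triggering_position with triggering_anchors); the polygon vertices below are placeholders, not the coordinates pasted in the video:

```python
from typing import List, Tuple

import numpy as np
import supervision as sv

# Placeholder vertices; the real zone coordinates are pasted in the video.
ZONE_IN_POLYGONS = [np.array([[592, 282], [900, 282], [900, 82], [592, 82]])]
ZONE_OUT_POLYGONS = [np.array([[950, 282], [1250, 282], [1250, 82], [950, 82]])]


def initiate_polygon_zones(
    polygons: List[np.ndarray],
    frame_resolution_wh: Tuple[int, int],
    triggering_position: sv.Position = sv.Position.CENTER,
) -> List[sv.PolygonZone]:
    # One PolygonZone per polygon; the CENTER anchor suits aerial footage.
    return [
        sv.PolygonZone(
            polygon=polygon,
            frame_resolution_wh=frame_resolution_wh,
            triggering_position=triggering_position,
        )
        for polygon in polygons
    ]
```

The pieces wired into VideoProcessor might then look like this:

```python
# Inside VideoProcessor.__init__: the tracker and the two zone lists.
self.tracker = sv.ByteTrack()
video_info = sv.VideoInfo.from_video_path(source_video_path)
self.zones_in = initiate_polygon_zones(
    ZONE_IN_POLYGONS, video_info.resolution_wh, sv.Position.CENTER
)
self.zones_out = initiate_polygon_zones(
    ZONE_OUT_POLYGONS, video_info.resolution_wh, sv.Position.CENTER
)

# Inside VideoProcessor.process_frame, right after inference:
detections = self.tracker.update_with_detections(detections)

# Inside VideoProcessor.annotate_frame: labels showing tracker IDs.
labels = [f"#{tracker_id}" for tracker_id in detections.tracker_id]
annotated_frame = self.box_annotator.annotate(
    annotated_frame, detections, labels
)
```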
To simplify the implementation, I created two lists of zones: the entry zones and the exit zones. Later on in the video we will use those zones to create two types of events, a car entering and a car leaving the roundabout, and we will use that information, along with the zone ID associated with each event, to figure out whether the car drove straight or turned left or right. Now we just need to specify all the required arguments for the initiate_polygon_zones function: first we initiate zones in, and just below we do the same for zones out.

Great, our zones are defined; now it's time to draw them on the video. To do that we loop over zones in and zones out at the same time using Python's zip function, and we also use enumerate to get the zone ID. The actual drawing is done with the draw_polygon function from the supervision library: we call draw_polygon and pass the coordinates of our in zone and the color that we want to use, and just a little below we do the same for the out zone. We can now take a look at the result, and the zones are being drawn. Looks awesome.

Now we can start to think about how to filter our detections based on those zones. We will place our filtering logic inside the process_frame function. Just below the tracker we create a, for now, empty list of detections that triggered zones, and then we loop over the entry zones. You see that PolygonZone has a method called trigger, so we loop over the zones and call zone_in.trigger, pass our detections, and as a result we get a boolean numpy array with true/false values depending on whether each detection triggered the zone or not. Now we can use that mask to filter the detections and keep only those that actually triggered the zone. We do that for every zone; as a result we get a new detections object per zone, and we append those objects to the initially empty list. Then we use Detections.merge to merge the multiple detections objects into a single one. I'll just do a little cleanup: I guess we don't need a separate variable for the mask, we can pass it directly into the detections index.

What you see on the screen now is that we only keep detections that are currently in a zone; all other detections are discarded. Of course, what we would really like is to keep all detections that have been in any of those zones at any moment in time. We can't get that information directly from the zone, but we can create a piece of logic that will manage it for us.
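The zone drawing and filtering steps might look roughly like this (a sketch; variable names follow the narration, and COLORS.by_idx is one way to pick matching palette colors):

```python
# Inside VideoProcessor.annotate_frame: draw entry and exit zones
# in matching colors so each direction of travel has its own color.
annotated_frame = frame.copy()
for i, (zone_in, zone_out) in enumerate(zip(self.zones_in, self.zones_out)):
    annotated_frame = sv.draw_polygon(
        annotated_frame, zone_in.polygon, COLORS.by_idx(i)
    )
    annotated_frame = sv.draw_polygon(
        annotated_frame, zone_out.polygon, COLORS.by_idx(i)
    )

# Inside VideoProcessor.process_frame, just below the tracker update:
detections_in_zones = []
for zone_in in self.zones_in:
    # trigger() returns a boolean mask: True where the detection's
    # center anchor falls inside the zone polygon.
    detections_in_zone = detections[zone_in.trigger(detections=detections)]
    detections_in_zones.append(detections_in_zone)
# Merge the per-zone detections back into a single Detections object.
detections = sv.Detections.merge(detections_in_zones)
```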
Let's scroll to the very top of the file and create a new class that will be responsible for storing that information; let's call it DetectionsManager. Of course we create a constructor for that class, and inside it we create a dictionary, let's call it tracker_id_to_zone_id, and give it the Dict[int, int] type. Like the name suggests, it is responsible for mapping a tracker ID to the zone that was triggered by the detection with that tracker. The update method of this manager takes all detections plus the list of detections that triggered zones in this specific frame, and returns a new detections object containing only detections that have triggered an entry zone at any moment in time. Inside that function we have two nested for loops: first we loop over the per-zone detections, then we loop over the tracker IDs in those detections, and we update our mapping dictionary with a pair of tracker ID and zone ID. But instead of assigning the key and value with bracket notation, we do it with the setdefault function.

So what's the difference between updating a Python dictionary using standard square-bracket notation and using setdefault? Well, setdefault only sets the value if the key does not yet exist in the dictionary, while square-bracket notation updates the value even if the key is already there. That is something we want to avoid: a car can pass through multiple zones, and we want to record only the first zone that was triggered. We could use an extra if statement to check whether the key is already in the dictionary, or use setdefault and do both things at once.

Here we are using numpy's vectorize function to map our tracker IDs to zone IDs. All you have to do is pass the callback function that performs the actual mapping and the input numpy array that you want to map. This time, instead of accessing the value with square-bracket notation, we use the get function. We do that to prevent the KeyError that would be raised if we passed a tracker ID that is not defined in our dictionary. We need to take that scenario into consideration because there are cars that will never cross an entry zone, like the cars that are in the middle of the roundabout at the beginning of the video. The get function lets us handle those cases and return a default zone ID; I decided to use -1, because our zones are numbered starting from 0 and up, so -1 should never occur as a valid zone ID. Now, at the very end, I can use supervision to filter out any detections with class ID equal to -1, because we are only interested in visualizing vehicles that have passed through an entry zone.

The initial implementation of our manager is ready, and we can plug it into our processing pipeline. The first thing we need to do is create an instance of the manager in the constructor. Then we scroll a little lower into process_frame, remove Detections.merge, and trigger manager.update instead. The hardest part is behind us: we can persist the information about the zone that was originally triggered by each vehicle and filter out any cars that didn't trigger a zone.

Now, to improve the quality of our visualizations, it would be awesome to draw the path traveled by each vehicle. To do that we use TraceAnnotator. It's a feature that is not yet officially released in supervision, but you can still access it if you use the release candidate package version, at least at the time of this recording. I go into the constructor and, just below the box annotator, create an instance of TraceAnnotator, pass the same colors, and define the thickness and the length of the trace. The only thing left to do is go into the annotate function and trigger the annotate method. That small change alone made our visualizations a lot cooler and more informative: because we always display the position of a vehicle up to 100 frames into the past, you can easily see which cars are moving slower and which faster.
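A sketch of the DetectionsManager as described so far; the empty-array guard around np.vectorize is an assumption added here, since np.vectorize cannot run on size-0 input without otypes:

```python
from typing import Dict, List

import numpy as np
import supervision as sv


class DetectionsManager:
    def __init__(self) -> None:
        # Maps each tracker ID to the FIRST entry zone it triggered.
        self.tracker_id_to_zone_id: Dict[int, int] = {}

    def update(
        self,
        detections_all: sv.Detections,
        detections_in_zones: List[sv.Detections],
    ) -> sv.Detections:
        for zone_in_id, detections_in_zone in enumerate(detections_in_zones):
            for tracker_id in detections_in_zone.tracker_id:
                # setdefault only writes when the key is absent, so the
                # first triggered zone wins; bracket assignment would
                # overwrite it every time the car entered another zone.
                self.tracker_id_to_zone_id.setdefault(tracker_id, zone_in_id)

        if len(detections_all) > 0:
            # Recolor detections by entry zone: map tracker IDs to zone
            # IDs, with .get returning -1 for vehicles that never crossed
            # an entry zone (e.g. cars already inside the roundabout).
            detections_all.class_id = np.vectorize(
                lambda x: self.tracker_id_to_zone_id.get(x, -1)
            )(detections_all.tracker_id)
        else:
            detections_all.class_id = np.array([], dtype=int)
        # Keep only vehicles that passed through some entry zone.
        return detections_all[detections_all.class_id != -1]
```

The trace annotator described above could be wired in like this (parameter values here are illustrative):

```python
# Inside VideoProcessor.__init__, just below the box annotator:
self.trace_annotator = sv.TraceAnnotator(
    color=COLORS, position=sv.Position.CENTER, trace_length=100, thickness=2
)

# Inside VideoProcessor.annotate_frame:
annotated_frame = self.trace_annotator.annotate(annotated_frame, detections)
```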
Now the last thing we have to do is identify exit events, so that we can close travel paths and understand how many cars flow from each entry zone into each exit zone. To do that we go back to our DetectionsManager and add a new argument to the update function: the list of detections associated with triggered exit zones. We loop over those detections, then loop over every tracker ID in them, and check if the tracker ID is in tracker_id_to_zone_id. If so, that means this tracker ID already triggered one of the entry zones, so let's find out which zone that was; we can do that by simply accessing the value in the dictionary.

Now we go back into the constructor of our DetectionsManager and create a new field that will record the finished paths. It's also a dictionary, but a little more complicated: the zone-out ID is the key, and the value is another dictionary, where the zone-in ID is the key and the value is a set of tracker IDs that traveled this path. We go back into our nested for loop and once again use setdefault, with the zone-out ID as the key and an empty dictionary as the value; then we access that inner dictionary and use setdefault again, this time with the zone-in ID as the key and an empty set as the value. Finally, we add our tracker ID to that set.

Now we scroll back to the process_frame function, where we need to perform similar calculations as before, only this time for our zones out. First we update our for loop to iterate over zones in and zones out at the same time using the zip function; then we create the exit-zone detections list, capture the information about triggered zones, and add it to that list. Finally, we pass the list as an argument to manager.update and run the script. We use a small print statement to display the state of the closed paths in our manager, and we see that the moment the red car enters the yellow zone, the dictionary is updated; the same happens with the blue car. Looks like everything works as expected. So far so good.

Using a print statement to display the state of the counters is kind of lame, though. Annotating the frames and creating separate counters for every entry zone will add a lot of flavor to this project and make everything look a lot cooler. To do that we iterate over zones out, additionally using enumerate to get the ID of each zone, and we use supervision's get_polygon_center utility to calculate the center of every out zone; we are going to place our counters exactly there. We then check whether a given zone-out ID is present in the recorded-paths dictionary. If not, that means no path from any entry zone to that exit zone has been finished yet, and we don't display anything. If the key is present, we display our counters: we loop over every zone-in ID present in the inner dictionary, count how many objects are in the set, and use the draw_text utility from supervision to create a nice-looking counter, colored to match the source zone and located in the middle of the exit zone. The problem is that, as in the case of the yellow zone, there can be multiple source zones, and our code as it stands would draw one counter on top of another. To prevent that, we add one more enumerate and shift the text anchor's y coordinate by 40 pixels for every zone ID that we display.
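A sketch of these last pieces: triggering both zone lists, closing paths in the manager, and drawing the counters. The attribute name recorded_paths and the argument name detections_out_zones are assumptions based on the narration; the published script may use different names:

```python
# Inside VideoProcessor.process_frame: trigger entry and exit zones
# together, then let the manager reconcile them.
detections_in_zones, detections_out_zones = [], []
for zone_in, zone_out in zip(self.zones_in, self.zones_out):
    detections_in_zones.append(detections[zone_in.trigger(detections=detections)])
    detections_out_zones.append(detections[zone_out.trigger(detections=detections)])
detections = self.detections_manager.update(
    detections, detections_in_zones, detections_out_zones
)

# Inside DetectionsManager.update: close a path whenever a tracked
# vehicle that already entered somewhere triggers an exit zone.
# Shape: {zone_out_id: {zone_in_id: {tracker_id, ...}}}
for zone_out_id, detections_out_zone in enumerate(detections_out_zones):
    for tracker_id in detections_out_zone.tracker_id:
        if tracker_id in self.tracker_id_to_zone_id:
            zone_in_id = self.tracker_id_to_zone_id[tracker_id]
            self.recorded_paths.setdefault(zone_out_id, {})
            self.recorded_paths[zone_out_id].setdefault(zone_in_id, set())
            self.recorded_paths[zone_out_id][zone_in_id].add(tracker_id)

# Inside VideoProcessor.annotate_frame: one counter per finished path,
# drawn at the center of the exit zone in the source zone's color.
for zone_out_id, zone_out in enumerate(self.zones_out):
    zone_center = sv.get_polygon_center(polygon=zone_out.polygon)
    if zone_out_id in self.detections_manager.recorded_paths:
        counts = self.detections_manager.recorded_paths[zone_out_id]
        for i, zone_in_id in enumerate(counts):
            count = len(counts[zone_in_id])
            text_anchor = sv.Point(
                x=zone_center.x,
                y=zone_center.y + 40 * i,  # stack counters 40 px apart
            )
            annotated_frame = sv.draw_text(
                scene=annotated_frame,
                text=str(count),
                text_anchor=text_anchor,
                background_color=COLORS.by_idx(zone_in_id),
            )
```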
At last we can take a look at the final result. Vehicles are filtered and annotated based on the entry zone they initially triggered, we can see how fast they move thanks to the trace annotator, and we can easily see how many cars came from every direction. Overall, I have to admit I'm quite proud of this project. And that's all for today. Of course, analyzing traffic flow is only the beginning; you can do so much more. You can calculate the speed of moving cars, estimate traffic density, or even automatically detect traffic violations. Let me know in the comments if you would like me to continue this project and show you what else is possible. And please let me know in the comments whether you like the format of this video: I usually use Google Colab, and today I worked from an IDE, so I'm super curious which one you think is better. If you found this video useful, make sure to like and subscribe, and stay tuned for more computer vision content coming to this channel soon. My name is Peter, and I'll see you next time. Bye!
Info
Channel: Roboflow
Views: 26,334
Keywords: Python, Computer Vision, Object Detection, Supervision, Roboflow, Tracking, Zones, Traffic Analysis, Car Tracking, Aerial Images, YOLOv8, ByteTrack, Real-Time Object Detection, Real-Time Tracking
Id: 4Q3ut7vqD5o
Length: 23min 9sec (1389 seconds)
Published: Wed Sep 06 2023