Vehicle and Pedestrian Detection Using YOLO-NAS and the KITTI Dataset

Captions
Hello everyone, my name is Aarohi and welcome to my channel. In today's video we will perform vehicle and pedestrian detection using the YOLO-NAS algorithm and the KITTI dataset. Before moving to the practical implementation, let's understand why vehicle and pedestrian detection is important, then discuss the YOLO-NAS algorithm and the KITTI dataset in brief, and after that we will build our vehicle and pedestrian detection application.

Vehicle and pedestrian detection are critical components in the development and deployment of autonomous vehicles and smart-city applications. These technologies enable machines to recognize and respond to the presence of vehicles and pedestrians in their environment, ensuring safer navigation and interaction with urban settings. In the context of autonomous vehicles, detection systems use a combination of sensors, cameras, and AI algorithms to accurately identify and track the positions of other vehicles, pedestrians, cyclists, and other potential obstacles on the road. This information is crucial for an autonomous system to make informed decisions about speed, direction, and when to apply the brakes, among other actions, so that it avoids collisions and ensures the safety of all road users. For smart-city applications, vehicle and pedestrian detection technologies are employed to enhance urban infrastructure and traffic management systems. By integrating these technologies into traffic lights, surveillance systems, and other urban infrastructure, cities can monitor and analyze traffic flow and pedestrian movements more efficiently. That enables smarter decision-making about traffic-light timing adjustments, congestion management, and the optimization of pedestrian crossings, which ultimately leads to improved traffic efficiency and pedestrian safety. So that's how a vehicle and pedestrian detection application can be helpful.

Now let's talk about the algorithm. YOLO-NAS is a state-of-the-art real-time object detection model developed by Deci AI. The model was generated by Deci's AutoNAC neural architecture search technology, and it stands out among the object detection models on the market by offering the best trade-off between accuracy and latency, surpassing models like YOLOv5, YOLOv6, YOLOv7, and YOLOv8 with its combination of accuracy and speed. In the comparison graph you can see that all versions of YOLO-NAS (small, medium, and large, with and without quantization) achieve good accuracy, and the mAP is higher than that of the previous state-of-the-art model, YOLOv8. The use of NAS is one of the secret ingredients behind YOLO-NAS's success, since none of the previous YOLO-family models were generated that way. The other factor that makes YOLO-NAS one of the fastest object detection models to date is its quantization-friendly basic block: YOLO-NAS introduces a new basic block that is friendly to quantization, which addresses one of the significant limitations of previous YOLO models, so the network can be converted to 8-bit (INT8) without losing much accuracy. YOLO-NAS also leverages advanced training schemes and post-training quantization to enhance performance. These are the reasons I decided to work with YOLO-NAS today.
Now let's see the dataset. Search for "KITTI dataset" and open the first link: this is the KITTI dataset website. KITTI is a popular dataset for computer vision tasks, particularly in the context of autonomous driving, and since its release it has become one of the most cited and used datasets in autonomous driving and computer vision research. The data was collected from a variety of urban, suburban, and highway environments, offering a broad spectrum of real-world scenarios. This diversity helps in training models that are capable of operating under different settings and conditions, which makes them more versatile and reliable. The KITTI dataset includes detailed annotations for vehicles, pedestrians, cyclists, and other objects in both 2D and 3D formats, so we can perform 2D object detection, 3D object detection, tracking, and many other computer vision tasks with it.

Before I show you the 2D and 3D data, click on "setup". This page shows the entire setup of the car on which the data was collected: all the sensors and their details are listed there, along with everything else used to rig the car for data collection. The other tabs ("stereo", "flow", "depth", and so on) show that the KITTI benchmark includes a variety of data types; we'll come back to that, but for now look under "object", where you will find 2D object, 3D object, and bird's-eye-view annotations. For today's problem we are performing object detection with YOLO-NAS, which is a 2D object detection model, so go to the 2D object page. On that page, the first link, "Download left color images of object data set" (12 GB), is the file we will download: click it, the download will start, and you will get a ZIP archive in your Downloads folder. To get the training labels, click "Download training labels of object data set" (5 MB); you will get another ZIP file in your Downloads folder.

Let me open the Downloads folder and show you. Here are the images, which I have unzipped, and here is the other archive with the training labels, also unzipped. Inside the image folder (data_object_image_2) you get two folders, training and testing. Open training, and inside one more folder you can see all the images: we have 7,481 training images. Opening one shows the kind of street scenes you get. Going back and opening testing, we have 7,518 test images.
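As a quick sanity check after unzipping, you can count the extracted files. This is a minimal sketch assuming both archives were extracted into a folder named dataset (that folder name is my assumption):

```python
# Count the extracted KITTI files (paths assume extraction into "dataset/").
import os

base = 'dataset'
train_imgs = os.listdir(os.path.join(base, 'data_object_image_2/training/image_2'))
test_imgs  = os.listdir(os.path.join(base, 'data_object_image_2/testing/image_2'))
labels     = os.listdir(os.path.join(base, 'data_object_label_2/training/label_2'))
print(len(train_imgs), len(test_imgs), len(labels))   # expect 7481, 7518, 7481
```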
So those are the images. Now let's look at their corresponding annotation files. Opening the label folder, notice that it only contains a training folder: under images we had both training and testing, but labels are only provided for the training set. Inside it we have 7,481 annotations in text format, so we have no annotations for the test files, only for the training data. Let's open the first annotation file and try to understand the values in it. Here is the class name followed by a row of numbers.

Each line in an annotation file describes one object in the scene. This text file has only one line, which means there is only one object in the image; if an image had three objects, the file would have three lines, and each line would give the information for one object. Now let's go through the fields. The first field is the object type, in this case Pedestrian; it indicates the type of object annotated. The KITTI dataset has eight object classes (Car, Pedestrian, Van, Cyclist, Truck, Misc for miscellaneous, Tram, and Person_sitting) plus a DontCare marker, which simply refers to a region of the image containing objects that are not of interest for the specific task. After that we have 0.0, the truncation: it measures how much of the object lies outside the image boundaries and ranges from 0 (fully visible) to 1 (not visible). In our case it is 0, so the object is fully within the frame. The next value, 0, indicates the level of occlusion of the object: 0 means fully visible, 1 means partially occluded, 2 means largely occluded, and 3 means unknown, so only the values 0, 1, 2, or 3 can appear here. Then we have -0.20, the observation angle of the object relative to the camera view, in radians. After that come four values for the bounding box. Remember, we are going to use YOLO-NAS, which is an object detection model (the super-gradients library behind it also covers tasks like image classification and segmentation, but today we focus on detection), and for object detection we need a bounding box: a rectangular box around the object. These four values define the pixel coordinates of the bounding box
surrounding the object in the image: x_min, then y_min, then x_max, and finally y_max. After the bounding box we have three more values, 1.89, 0.48, and 1.20, which give the 3D dimensions of the object: its height (1.89 m), width (0.48 m), and length (1.20 m). Then comes the location: three values giving the 3D position of the object in camera coordinates (x right, y down, z forward), also in meters. Finally, the last value, 0.01, is the rotation of the object around the Y (up) axis, in radians. So every annotation file in the KITTI dataset carries this kind of detail for each object: object type, truncation, occlusion, observation angle, 2D bounding box, 3D dimensions, 3D location, and rotation.

Now that we have downloaded the data, our next task is to convert these annotations into the format the YOLO model accepts. For a YOLO model we only need the class label and the bounding box, so apart from the bounding box coordinates and the class name, none of the other fields are required. You have to write a script that produces the annotations in the YOLO format; a sketch of such a converter follows.
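Here is a minimal sketch of such a converter. The folder names, and the choice to drop DontCare regions, are my assumptions, so adjust them to your setup. YOLO-format labels store one object per line as class_id x_center y_center width height, with all four box values normalized by the image dimensions:

```python
# Sketch of a KITTI-to-YOLO label converter (folder names are assumptions
# matching the layout described in this video; adjust them to your setup).
import os
from PIL import Image

# Full KITTI class list; the position in this list becomes the YOLO class ID.
CLASSES = ['Car', 'Pedestrian', 'Van', 'Cyclist', 'Truck',
           'Misc', 'Tram', 'Person_sitting', 'DontCare']

IMG_DIR   = 'dataset/data_object_image_2/training/image_2'
LABEL_DIR = 'dataset/data_object_label_2/training/label_2'
OUT_DIR   = 'dataset/labels_yolo/train'   # move the 481 validation files out afterwards
os.makedirs(OUT_DIR, exist_ok=True)

for fname in os.listdir(LABEL_DIR):
    stem = os.path.splitext(fname)[0]
    # KITTI images vary slightly in size, so read each image's dimensions.
    with Image.open(os.path.join(IMG_DIR, stem + '.png')) as im:
        img_w, img_h = im.size
    yolo_lines = []
    with open(os.path.join(LABEL_DIR, fname)) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            cls = parts[0]
            if cls == 'DontCare':   # skip regions not of interest (my choice)
                continue
            x1, y1, x2, y2 = map(float, parts[4:8])
            # KITTI stores pixel corners; YOLO wants a normalized center box.
            xc = (x1 + x2) / 2 / img_w
            yc = (y1 + y2) / 2 / img_h
            w  = (x2 - x1) / img_w
            h  = (y2 - y1) / img_h
            yolo_lines.append(f'{CLASSES.index(cls)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}')
    with open(os.path.join(OUT_DIR, stem + '.txt'), 'w') as out:
        out.write('\n'.join(yolo_lines) + '\n')
```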
okay so that's what I have done over here over here okay so created this validation folder and here I have pasted those 480 images here okay now in the same way now let's go to the this label folder in the same way uh now is over here I have the annotations which are in the YOLO format I have converted the annotations now because because uh what we want for YOLO for YOLO algorithm we need the class name and the bounding box coordinates so I have that over here train labels in the same way I have created another folder with the name of validation labels and in this Val labels I have pasted the annotations files which are related to those 480 images which we have pasted just in the validation folder okay so let's open it and you can see we have 480 annotations in it and in train labels we have 7,1 IM um annotations over here now let's open the very first annotation file and see now you can see these are the bounding box coordinates and this one is the class ID of The Pedestrian object okay and now our data set is in uh YOLO format now we can use the algorithm on it okay so now now let's come over here so guys here you can see this Jupiter notebook this is the Jupiter notebook in which I have written the code to train our model save the train weights and then to make predictions now guys if you want to run this code on Google collab I'll show you the steps first how to perform this um uh training on Google collab okay and after that I will show you how to work it on a local computer now let me open the Google collab notebook first and show you what I have done over there so let's go here and this is the uh my uh Google Drive and in my Google Drive I have created this folder and in this folder you can see this is the Jupiter notebook I'm talking about where I've done all the training and rest of the part and this is the data set folder this is the same data set folder which I have just explained you you can open it and see inside this label you will have labels for training and validations and inside the images you will have training validation and test images okay so your data should data set should be over here like this this is our data set and this is our code okay now let's open this Jupiter not uh this notebook and see so guys first thing is uh if you're working on a Google collab notebook so you have to select the GPU first and how you can do that just go to edit notebook setting and from notebook setting under this Hardware accelerator select this GPU and then save okay after doing that you can run this command and you can see if this if GPU is being used by a notebook okay collab notebook and here you can see uh this is the GPU we are using and guys same thing if you want to see this GPU detail with the help of this py code then you can run this import torch and then if Huda is available get the name and this you can see the GPU name is Tesla T4 and GPU is available true means we are this notebook is using the GPU okay so first this is the first step after that in this line I am just um mounting my Google Drive okay and guys after writing this in this line here so here what I'm doing is uh I am just cre creating a symbolic link which so with the help of this is basically just a link so instead of writing this whole thing every time whenever I want to provide the path of the Google Drive I have created a link which means is instead of writing this whole thing every time I will just write my drive and the entire path of the drive will be given okay will be used so you can see in next 
You can see it in the very next cell: without the link I would have to write the whole Drive path before changing into the project folder, but after creating the link I just write the short path, then cd into the folder that holds our code and dataset.

Once you are inside that folder, using YOLO-NAS is very simple: you just need to install the super-gradients package with pip install super-gradients, and once that is installed, install the imutils package as well. That's it; your environment is ready and you can train your model.

The first thing when training is to define the trainer, and this cell shows how: we import the Trainer and set a checkpoint directory and an experiment name. What that means is simply that when the training finishes, the trained model is saved inside the checkpoints folder; within it you get another folder with the experiment name, and that is where the last weights, the best weights, all the logs, and everything else related to the training get saved. You can change these and give them any names you like.

After that come the datasets and data loaders. super-gradients is fully compatible with PyTorch datasets and data loaders, so you can use your own data loader as is; if you want to learn more about working with super-gradients datasets, data loaders, and configuration files, you can check the links given in the notebook. Here I import the data loaders, and inside the dataset parameters you provide the path of your training images and labels, your validation images and labels, and your classes. The KITTI dataset has these classes: Car, Pedestrian, Van, Cyclist, Truck, Misc, Tram, Person_sitting, and DontCare.

From this cell onward, every step is the same whether you are using a Google Colab notebook or working on your local computer, so I am switching to the Jupyter notebook on my local machine and will show the remaining steps there. Here is the trainer, and below it the paths of my training images, my training labels, the validation images and validation labels, and the list of classes we have in the KITTI dataset. Then we create one data loader for the training set and one for the validation set: inside each, the data directory, image directory, label directory, and class list are taken from the dataset-parameters dictionary we defined above, and we pass all the
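Collected in one place, those cells look roughly like this. It's a minimal sketch assuming the folder layout from earlier; the directory names, batch size, and worker count are my assumptions:

```python
# Trainer and data-loader setup with super-gradients (a sketch, not the
# video's exact cell contents).
from super_gradients.training import Trainer
from super_gradients.training.dataloaders.dataloaders import (
    coco_detection_yolo_format_train,
    coco_detection_yolo_format_val,
)

CHECKPOINT_DIR = 'checkpoints'
trainer = Trainer(experiment_name='yolo_nas_kitti', ckpt_root_dir=CHECKPOINT_DIR)

dataset_params = {
    'data_dir': 'dataset',
    'train_images_dir': 'data_object_image_2/training/image_2',
    'train_labels_dir': 'labels_yolo/train',        # converted YOLO-format labels
    'val_images_dir': 'data_object_image_2/validation/image_2',
    'val_labels_dir': 'labels_yolo/validation',
    'classes': ['Car', 'Pedestrian', 'Van', 'Cyclist', 'Truck',
                'Misc', 'Tram', 'Person_sitting', 'DontCare'],
}

train_data = coco_detection_yolo_format_train(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['train_images_dir'],
        'labels_dir': dataset_params['train_labels_dir'],
        'classes': dataset_params['classes'],
    },
    dataloader_params={'batch_size': 16, 'num_workers': 2},
)

val_data = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['val_images_dir'],
        'labels_dir': dataset_params['val_labels_dir'],
        'classes': dataset_params['classes'],
    },
    dataloader_params={'batch_size': 16, 'num_workers': 2},
)
```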
values like this, which gives us the train data loader and the validation data loader.

After that, let's pick the model we want to use. We are not training a model from scratch; we will start from a pre-trained one. YOLO-NAS comes in three sizes: small, medium, and large. Here I am using yolo_nas_l, where L means the large model; in num_classes we provide the number of classes in our dataset, and pretrained_weights set to coco means the large YOLO-NAS model we are using was pre-trained on the COCO dataset. The model variable now holds our model.

Next we have to define some training parameters. You can change these to see how your model behaves and fine-tune the values: you can set the optimizer, the initial learning rate, the number of warm-up epochs, and so on. One more thing to remember when you use this code: you have to set the number of classes in two places inside the training parameters, once in the loss and once in the validation metric. These training parameters will be used when you start training the model.

To train the model you only need to call trainer.train. Remember, the trainer was our very first cell: we imported the Trainer and told it to save everything related to the training (the logs, the trained models, and any other related files) inside the checkpoints folder, under the experiment name. To trainer.train we pass the model we defined above, the training parameters, the training data, and the validation data. I trained this model for 200 epochs. After training you get the checkpoints folder; open it, and under the experiment folder you can see all the files and models related to the run: the latest model, the best model (which we will use for making predictions), the average model, and the other log files, which you can explore to learn more. That is how we train the model.

Now let's use the best trained model to make predictions. I import models first, and here are our class names again (the eight object classes plus DontCare). Inside the best_model variable I load the yolo_nas_l architecture with the same number of classes and the trained weights: from the checkpoints folder where we just saved the weights, I pick the best model. I have a folder named test_images; from it I pick one image, make predictions on it with the predict method, and .show() displays the output on the screen.
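Those cells, condensed into one sketch that continues from the names defined in the previous block (trainer, dataset_params, train_data, val_data, CHECKPOINT_DIR). The hyperparameter values mirror the common super-gradients fine-tuning recipe and are assumptions you should tune; the checkpoint and image paths are placeholders:

```python
import os
from super_gradients.training import models
from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.models.detection_models.pp_yolo_e import (
    PPYoloEPostPredictionCallback,
)

num_classes = len(dataset_params['classes'])

# Large YOLO-NAS variant, starting from COCO-pretrained weights.
model = models.get('yolo_nas_l', num_classes=num_classes, pretrained_weights='coco')

train_params = {
    'warmup_initial_lr': 1e-6,
    'initial_lr': 5e-4,
    'lr_warmup_epochs': 3,
    'lr_mode': 'cosine',
    'cosine_final_lr_ratio': 0.1,
    'optimizer': 'AdamW',
    'zero_weight_decay_on_bias_and_bn': True,
    'ema': True,
    'average_best_models': True,
    'max_epochs': 200,
    'mixed_precision': True,
    # num_classes appears here (in the loss) ...
    'loss': PPYoloELoss(use_static_assigner=False, num_classes=num_classes, reg_max=16),
    # ... and here (in the validation metric): the "two places" mentioned above.
    'valid_metrics_list': [
        DetectionMetrics_050(
            score_thres=0.1,
            top_k_predictions=300,
            num_cls=num_classes,
            normalize_targets=True,
            post_prediction_callback=PPYoloEPostPredictionCallback(
                score_threshold=0.01, nms_top_k=1000,
                max_predictions=300, nms_threshold=0.7,
            ),
        )
    ],
    'metric_to_watch': 'mAP@0.50',
}

trainer.train(model=model, training_params=train_params,
              train_loader=train_data, valid_loader=val_data)

# Reload the best checkpoint and predict on a single test image
# (the experiment folder and image name are placeholders).
best_model = models.get(
    'yolo_nas_l',
    num_classes=num_classes,
    checkpoint_path=os.path.join(CHECKPOINT_DIR, 'yolo_nas_kitti', 'ckpt_best.pth'),
)
best_model.predict('test_images/000123.png').show()
```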
Let me show you the test_images folder: into it I just pasted a few images from the test set. That is how you make predictions on a single image. Now suppose you have a folder with multiple images and you want to run predictions on all of them; here is how. You write the name of that folder, using the entire path if necessary: in my case the Jupyter notebook and the test_images folder are in the same directory, so I have written only the folder name, but if your folder is at another location you have to provide the path accordingly. I call the predict method on the folder of images and then display all the images in it with their detections: here is the first image with the car detected, then the second, third, fourth, and fifth.

Now suppose you want to test a video. I use two variables: one for the name of the input video, which sits in the test_videos folder, and one for the name under which I want to save the output video inside the output_videos folder. Then I move the model to the device, call the predict method on the input video, and to save the result I call .save with the output file name. Let's watch the input video first, and then the output: going into the output_videos folder, you can see the output, where we are detecting the pedestrians, the cars, and the vans. In the same way, here is another example: this is the second input video, and running detection on it gives this output in the output_videos folder.
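As a sketch, the folder and video prediction cells look like this, continuing from the best_model loaded above (the file and folder names are hypothetical placeholders; in super-gradients, predict() accepts a single image path, a directory of images, or a video path):

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Predict on every image in a folder and display the detections.
best_model.predict('test_images').show()

# Predict on a video and save the annotated result.
input_video_path = 'test_videos/input_video.mp4'      # placeholder name
output_video_path = 'output_videos/output_video.mp4'  # placeholder name
best_model.to(device).predict(input_video_path).save(output_video_path)
```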
So that is how you can use YOLO-NAS for object detection. If you want to know more about the YOLO-NAS architecture, you can visit the page I'll link in the description section. You can also visit the Deci AI super-gradients page, where you will find different notebooks and tutorials: there you can learn more about the different data loaders, how to work with your own custom datasets in detail, and all the other parameters and features available, all on the super-gradients GitHub page. And do check the super-gradients project itself: it is an open-source, PyTorch-based deep-learning framework, and it is the framework with which YOLO-NAS was trained, so you can learn more about it there. Today we have discussed only object detection, but with super-gradients you can also perform semantic segmentation, pose estimation, and image classification; you can try those models too, and in my upcoming videos I will work on the semantic segmentation, pose estimation, and image classification models. Until then, feel free to try them yourself. I hope this video was helpful; the link to the code is given in the description section, so you can check it out. If you like my video, please like, share, and subscribe to my channel. Thank you for watching.
Info
Channel: Code With Aarohi
Views: 862
Keywords: objectdetection, computervision, autonomousvehicles, yolonas, yolov7, yolov8, yolo
Id: Dv-RLQMXtVY
Length: 34min 53sec (2093 seconds)
Published: Thu Feb 22 2024