Oriented Object Detection using YOLOv5-OBB: Inference, Training Using GPU and Implementation

Video Statistics and Information

Captions
In the previous tutorial we used YOLOv5-OBB to do inference on the CPU, but for real-life applications the GPU should be used to maximize YOLO's performance. In this tutorial we will make some modifications to the code so that YOLOv5-OBB runs on the GPU, and we will also train YOLO to do inference on custom classes. Finally, we will create a very simple application using YOLOv5-OBB: we will make a camera rotate by the angle detected by YOLO-OBB so that the camera aligns with the hand-pointing direction.

We will begin this tutorial by installing an annotation tool, which we will use to create training data. First, install python3-pip; pip is a package management system used to install and manage software packages written in Python. Next, install the pyqt5-dev-tools package; it contains various support tools for PyQt5 developers, such as the user interface compiler and the resource file generator. Then install lxml, a library for processing XML and HTML in Python.

Now we are going to set up the annotation tool. We will use the labelImg2 repository; thanks to the authors for sharing this great work. On this page we can find instructions on how to install and use the tool. Clone the repository. Before annotating, we have to create a label: move to the data directory inside the labelImg2 folder, open the predefined_classes.txt file, and add the label "hand" at the top of the labels list. Now go back to the labelImg2 folder, open a terminal, and execute the labelImg.py script. From the File menu, click the "Open Dir" button, select the directory that contains the images you want to annotate, and click the Open button. Select the rotated rectangle button. First draw a small rectangle using the left mouse button, then rotate it to the desired angle using the right button. We can move the rectangle itself when the cursor is on the rectangle, and we can change the side lengths or rotate it when the cursor is on one of its vertices. When changing the side lengths and the rotation angle, try to fit your object inside the rectangle as precisely as possible. Once you are satisfied with the annotation, press Ctrl+S to save it; the annotation file is saved in the same directory as the images. As we can see, the annotation file contains the center coordinates of the rectangle, the side lengths, and the rotation angle. We have to repeat this procedure for every image we will use for training.

After we have created the dataset, we still have one thing to do. As described in the YOLOv5-OBB repository, each annotation should be represented by the coordinates of the four vertices, the category, and the difficulty. To do the conversion we will use a script from this repository, so clone it and execute the VOC-XML-to-YOLOv5-OBB conversion script. Note that we have to specify the directory of the XML files we want to convert and the file of class names. If the conversion succeeds, text files are generated; as we can see, they follow the format required by YOLOv5-OBB.

In the previous tutorial we installed YOLOv5-OBB for the CPU, but in this tutorial we will use the GPU. As I described in the previous tutorial, for the installation we should execute this command, so let's do that. As we can see, an error emerges. This error is discussed on several forums, including this one: in recent PyTorch versions the THC headers are deprecated. Let's see what modifications we should make to fix this error. The code on the left is the original and the code on the right is the new one; we have to include these five header files. ATen is the foundational tensor and mathematical operation library on which everything else is built. Now let's try to install one more time. We still have an error; this time there are several undefined functions and statements. This happens because function names in ATen differ from those in THC, as explained on this page: for example, THCCeilDiv should be changed to at::ceil_div. Functions such as THCudaCheck and THCudaMalloc are also mentioned there, so we can use this information to fix our code.

Let's see the modifications we have made in the poly_nms_cuda code. First, in line 214 we replace THCCeilDiv with at::ceil_div. Second, THCudaMalloc is replaced with the CUDA caching allocator's raw_alloc; the caching allocator performs memory allocation using the CUDA memory allocator, and memory is allocated on a given device and stream. Third, we replace THCudaCheck with C10_CUDA_CHECK. Fourth, we replace THCudaFree with raw_delete. Now let's execute the installation command once again: we have successfully installed YOLOv5-OBB.

Now let's run inference: execute the detect.py script. Note that, unlike in the previous tutorial, we specify 0 for the device argument, which means we will use GPU number zero; the other arguments remain the same. We get a runtime error. The message says that we are trying to do calculations mixing tensors defined on the GPU and on the CPU, which is prohibited. Let's check the code. The problem seems to be in the nms_rotated_wrapper.py script. Here is the line where the error occurs, but the problem actually lies in the line above: a tensor is created on the CPU, whereas we specified the GPU as the execution device. To fix this we should modify three files: detect.py, general.py, and nms_rotated_wrapper.py. Let's see the modifications; on the left side we have the original code and on the right side the modified code. We change the non_max_suppression_obb function so that it takes the device type as an argument. In general.py we modify the non_max_suppression_obb function definition and also make the obb_nms call pass the device type. Finally, in the obb_nms function we modify the function definition and the torch.arange statement to take the device type. Now we are ready to run inference. Open a new terminal and execute the detect.py script; note that in the previous tutorial we specified CPU as the device, but this time we specify 0, which means GPU number zero. After inference is done we can see the results in the runs folder: inference is successful.

Before training we should place our training data properly. In the dataset folder we have a hand_data folder, which contains the training data and the validation data; both folders have the same structure. In the train folder we have an images folder and a labelTxt folder: the images folder holds the training images and the labelTxt folder holds the annotation files. Note that the names of the images and of the corresponding annotation files should be the same. In the training data YAML file we specify the relative paths to the train folder and the validation folder. Even though we will train only one class, it is said that the training process is more stable when there are more than two classes, so here we have added the class "background" as a dummy class.

To train, we run the train.py script. When running this script we should specify the weight file, the dataset, the hyperparameter file, the batch size, the device type, and the image size. This time we set the batch size to 4 and the number of epochs to 80, but these figures vary depending on your GPU specification and data size. After training is complete, move to the folder where the results are saved, copy the best.pt file, and move it to the YOLOv5-OBB folder. In this tutorial we already did this and renamed the file to best_hand.pt. Now run the detect.py script. After inference is done, move to the folder where the results are saved: the hand is successfully recognized.

We need to install one more library before moving to the next step: the Dynamixel SDK. This package provides Dynamixel servo control functions using packet communication. Now let's look at the code that rotates a servo toward the direction the hand is pointing. This code is based on the detect.py script, so I will mainly explain the servo-related parts. In this line we convert the inference results from a tensor to a NumPy array. Here we extract the bounding box angle; note that the result comes in the five-parameter representation, so the angle is the fifth parameter. If there are no results, we re-initialize the bounding box angle. Here the difference between the servo angle and the bounding box angle is calculated. We set a limit on this difference, because if the servo moves too fast it will negatively affect inference. Here the servo angle is calculated from the current servo angle and the difference, and here the target angle is sent to the servo. Now open a new terminal. Before executing the Python script we should set permissions so that the program can access the USB port. Execute the detect-servo Python script; note that we should set 0 for the source argument because we are using a USB camera. Even though the recognition precision is not great, we can see that the servo rotates toward the hand-pointing direction.
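The training-data YAML described above might look like the following sketch; the exact paths and keys are assumptions based on the folder names mentioned in the video and the usual YOLOv5-style dataset format, with the dummy background class included.

```yaml
# hand_data.yaml — paths relative to the YOLOv5-OBB folder (assumed layout)
train: dataset/hand_data/train/images
val: dataset/hand_data/val/images
nc: 2                          # one real class plus the dummy class
names: ["hand", "background"]
```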
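As a concrete illustration of the annotation conversion described above, here is a minimal Python sketch that turns a five-parameter annotation (center, side lengths, rotation angle) into the vertex-based line YOLOv5-OBB expects. The function names are mine and the angle is assumed to be in radians; only the output format (x1 y1 … x4 y4 category difficulty) comes from the repository requirement mentioned in the video.

```python
import math

def obb_to_vertices(cx, cy, w, h, angle):
    """Rotate the four corners of a w-by-h box around its center (cx, cy)."""
    c, s = math.cos(angle), math.sin(angle)
    corners = [(-w / 2, -h / 2), (w / 2, -h / 2),
               (w / 2, h / 2), (-w / 2, h / 2)]
    # Standard 2-D rotation of each corner offset, then translation to the center
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in corners]

def to_yolov5_obb_line(cx, cy, w, h, angle, category, difficulty=0):
    """Format one annotation as 'x1 y1 x2 y2 x3 y3 x4 y4 category difficulty'."""
    coords = " ".join(f"{v:.1f}" for pt in obb_to_vertices(cx, cy, w, h, angle)
                      for v in pt)
    return f"{coords} {category} {difficulty}"

print(to_yolov5_obb_line(100.0, 100.0, 40.0, 20.0, 0.0, "hand"))
# → 80.0 90.0 120.0 90.0 120.0 110.0 80.0 110.0 hand 0
```

For a non-zero angle the same code traces out the rotated rectangle drawn in labelImg2, so one such line per object, written to a text file named after the image, matches the converter's output.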
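The three-file device fix described above can be summarized structurally: the device chosen in detect.py is threaded down through non_max_suppression_obb() and obb_nms() so the index tensor is created on the same device as the predictions. This sketch models tensors as plain (data, device) dictionaries purely to make the call chain runnable without PyTorch; in the real code the last step is torch.arange(n, device=device).

```python
def arange(n, device):
    # Stand-in for torch.arange(n, device=device): the index "tensor"
    # is created on the caller-supplied device instead of defaulting to the CPU.
    return {"data": list(range(n)), "device": device}

def obb_nms(pred, device="cpu"):
    idx = arange(len(pred["data"]), device=device)
    # Before the fix this check would fail: the index tensor lived on the
    # CPU while the predictions lived on the GPU, causing the runtime error.
    assert idx["device"] == pred["device"]
    return idx

def non_max_suppression_obb(pred, device="cpu"):
    # general.py: simply forwards the device argument downward
    return obb_nms(pred, device=device)

# detect.py chose device 0, so the predictions already live on "cuda:0"
pred = {"data": [[0.0, 0.0, 10.0, 5.0, 0.3]], "device": "cuda:0"}
keep = non_max_suppression_obb(pred, device="cuda:0")
print(keep["device"])  # → cuda:0
```

The design point is that no function in the chain hard-codes a device: each one receives it from its caller, so CPU and GPU runs use the same code path.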
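The servo update described above (compute the difference between the detected angle and the current servo angle, clamp it, then add it to the servo angle) can be sketched as follows; MAX_STEP and the function name are illustrative assumptions, not values from the video.

```python
MAX_STEP = 5.0  # maximum change in degrees per inference frame (assumed)

def update_servo_angle(servo_angle, bbox_angle, max_step=MAX_STEP):
    """Move servo_angle toward bbox_angle by at most max_step degrees."""
    diff = bbox_angle - servo_angle
    # Clamp the step so a large detection jump rotates the camera gradually;
    # moving too fast would blur the image and hurt the next inference.
    diff = max(-max_step, min(max_step, diff))
    return servo_angle + diff

angle = 0.0
for _ in range(4):                 # four frames tracking a 12-degree target
    angle = update_servo_angle(angle, 12.0)
print(angle)  # → 12.0
```

The clamp is what keeps the camera stable: the target is reached over several frames instead of in one fast swing, so each intermediate frame is still sharp enough for detection.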
Info
Channel: robot mania
Views: 3,141
Keywords: robotics, deep learning, ROS, jetson, YOLO
Id: ALD9KfCfZk4
Length: 14min 56sec (896 seconds)
Published: Sun Jul 02 2023