Thank you for joining this CS Board video about
YOLO-NAS, a new and improved member of the YOLO family of object detection models, which can also be used as a foundational model in this field. YOLO models have been around for a while now. They were introduced in 2015 with the paper You Only Look Once, which is what the acronym YOLO stands for, and over the years we saw various improved versions, up until YOLOv8, which was presented earlier this year by Ultralytics. Now, a company called Deci has released YOLO-NAS and showed that it achieves strong results at very low latency, with the best accuracy-latency tradeoff to date. In this video we will talk about how they were able to do that.
Let's start with Neural Architecture Search, which is what the NAS in the YOLO-NAS name stands for. Most of the time, model architectures are
designed by human experts. Since there is a huge number of potential model architectures, it is likely that even if we reach great results, we did not nail exactly the best choice of architecture out there, and we could still find a different architecture that would yield better results. This is why Neural Architecture Search was invented, and it includes three main components: a search space, which defines the set of valid architectures to choose from; a search algorithm, which is in charge of how to sample candidate architectures from the search space, as there is no way to try them one by one; and an evaluation strategy, which is used to compare candidate architectures.
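To make these three components concrete, here is a minimal, hypothetical sketch in Python of a naive random-search NAS loop. The search space, the scoring heuristic and every number in it are made up for illustration; real systems like Deci's AutoNAC are far more sophisticated.

```python
import random

# Search space: each architecture is one choice of depth, width and block type.
SEARCH_SPACE = {
    "depth": [2, 3, 4, 5],
    "width": [32, 64, 128],
    "block": ["plain_conv", "residual", "rep_vgg"],
}

def sample_architecture():
    """Search algorithm (here: plain random search) samples one candidate."""
    return {name: random.choice(options) for name, options in SEARCH_SPACE.items()}

def evaluate(arch):
    """Evaluation strategy: score a candidate on an accuracy-latency tradeoff.
    In practice this means training (or estimating) the model and measuring
    latency on the target hardware; these proxies are placeholders."""
    accuracy_proxy = arch["depth"] * arch["width"]        # bigger -> more accurate
    latency_proxy = arch["depth"] * arch["width"] ** 1.5  # bigger -> slower
    return accuracy_proxy - 0.01 * latency_proxy

best = max((sample_architecture() for _ in range(100)), key=evaluate)
print("best candidate:", best)
```

The key point is the separation of concerns: you can swap the random sampling for reinforcement learning or evolutionary search without touching the search space or the evaluation strategy.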
So in order to come up with a model architecture for YOLO-NAS, Deci used their own neural architecture search implementation, called AutoNAC, which stands for Automated Neural Architecture Construction. They provided AutoNAC with the details needed to search over possible YOLO architectures, which created an initial search space of huge size, and AutoNAC, which is hardware aware, found an optimal architecture for YOLO-NAS, optimized for the Nvidia T4, in a process that took 3,800 GPU hours. Another attribute that helped
YOLO-NAS reach great results at super low latency is its quantization-aware architecture. So what does that even mean? Well, object detection in real time is
critical for various applications, such as the safe operation of autonomous cars, so we want to deploy object detection models on cars, phones and more, instead of running the model in the cloud. However, edge devices have limited resources, which makes it hard to deploy large models on them due to their size and long inference time.
due to their size and also long inference time. Quantization in machine learning usually refers to
the process of reducing the precision of the model weights so they will consume less memory
and run faster. This however many times comes with a decrease in the model accuracy
The quantization technique used in YOLO-NAS is INT8 quantization, which converts the model weights from float32 to int8, so each weight takes one byte in memory instead of four.
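To see what that conversion looks like, here is a minimal sketch of symmetric per-tensor int8 quantization in Python with NumPy. The tensor values are made up, and real frameworks use calibration data and per-channel scales, but the core idea is just a scale factor mapping float32 values onto the integer range.

```python
import numpy as np

# A toy float32 weight tensor.
weights = np.array([0.02, -0.57, 1.30, -1.28], dtype=np.float32)

# Symmetric int8 quantization: map [-max|w|, +max|w|] onto [-127, 127].
scale = np.abs(weights).max() / 127.0
q_weights = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize to see the (small) rounding error that quantization introduces.
dequantized = q_weights.astype(np.float32) * scale

print(q_weights)                          # int8 values, 1 byte each instead of 4
print(dequantized)                        # close to the original float32 weights
print(weights.nbytes, q_weights.nbytes)   # 16 bytes vs. 4 bytes
```

Storing one byte per weight instead of four cuts weight memory by 4x, and int8 arithmetic also runs much faster on hardware like the T4's tensor cores.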
They were able to apply this quantization with minimal accuracy loss thanks to a new building block called QARepVGG, which they instructed the neural architecture search algorithm to include. QARepVGG was recently introduced in a research paper from Meituan. We'll save the details about this building block for a different video, but in short, it is an improved version of the RepVGG block, which is commonly used in object detection models, and it significantly reduces the loss of accuracy after quantization. They also used a hybrid quantization technique that applies quantization only to specific layers in the model, to balance information loss and latency.
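As a rough illustration of that per-layer decision, here is a hypothetical sketch in Python. The model, the sensitivity scores and the threshold are all invented for illustration; the actual layer selection in YOLO-NAS is done by Deci's tooling.

```python
import torch.nn as nn

# A toy three-layer model; hybrid quantization decides per layer
# whether the latency gain of int8 is worth the accuracy cost.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3),
    nn.Conv2d(16, 32, 3),
    nn.Conv2d(32, 64, 3),
)

# Hypothetical per-layer sensitivity, e.g. the accuracy drop measured
# when only that layer is quantized (these numbers are invented).
sensitivity = [0.90, 0.05, 0.08]

# Keep sensitive layers in float32, quantize the rest.
plan = {}
for i, layer in enumerate(model):
    plan[i] = "float32" if sensitivity[i] > 0.10 else "int8"

print(plan)  # {0: 'float32', 1: 'int8', 2: 'int8'}
```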
Before moving on, if you like this content then please subscribe to the channel and hit the like button, as it will help this channel grow. Let's move on to talk about YOLO-NAS being a foundational model. So what does it mean to be a foundational model?
"Say we have two different tasks at hand, one model to detect bone
fractures objects in an image and another task to detect fish objects in an aquarium.
Instead of training each model from scratch, we can start with two instances of YOLO-NAS
and fine-tune the first model on a bone scans dataset and fine-tune the second model on an
aquarium dataset. With this approach we enjoy the strength of YOLO-NAS pretraining that was done
on very large datasets with advanced techniques, while still adapting better to specific
use cases thanks to the final fine-tuning." Thank you for watching and taking the time
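As a sketch of what that starting point looks like in code, YOLO-NAS is available through Deci's open-source super-gradients library. The snippet below loads the pretrained COCO weights with a new number of classes; treat it as an illustration rather than a full training recipe, and note that the single-class counts are placeholders for the two hypothetical datasets.

```python
from super_gradients.training import models

# Two instances of the same pretrained YOLO-NAS foundation model, one per
# task; "coco" loads the pretrained checkpoint, and the detection head is
# re-initialized for the new number of classes.
fracture_model = models.get("yolo_nas_s", num_classes=1, pretrained_weights="coco")
fish_model = models.get("yolo_nas_s", num_classes=1, pretrained_weights="coco")

# From here, each model would be fine-tuned on its own dataset (bone scans
# vs. aquarium images) using super-gradients' Trainer and dataloaders.
```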
Thank you for watching and taking the time to learn about YOLO-NAS. Please let me know in the comments section if you have specific questions about the model.