YOLO-NAS - A New Best Object Detection Model!

Video Statistics and Information

Thank you for joining this CS Board video about YOLO-NAS, a new and improved version in the YOLO family of object detection models, which can be used as a foundational model in this field.

YOLO models have been around for a while now. They were first presented in 2015 in the paper You Only Look Once, which is what the abbreviation YOLO stands for, and over the years we saw various improved versions, up to YOLOv8, which was presented earlier this year by Ultralytics.

Now, a company called Deci has released YOLO-NAS and showed that it achieves great results at very low latency, with the best accuracy-latency tradeoff to date. In this video we will talk about how they were able to do that.

Let's start with Neural Architecture Search, which is what NAS stands for in the YOLO-NAS name. Most of the time, model architectures are designed by human experts. Since there is a huge number of potential model architectures, it is likely that even if we reach great results, we did not land exactly on the best architecture out there, and a different architecture could still yield better results.

As a result, Neural Architecture Search was invented, and it includes three main components: a search space, which defines the set of valid architectures to choose from; a search algorithm, which is in charge of how to sample candidate architectures from the search space, since there is no way to try them one by one; and an evaluation strategy, which is used to compare candidate architectures.

So in order to come up with the model architecture for YOLO-NAS, Deci used their own neural architecture search implementation called AutoNAC, which stands for Automated Neural Architecture Construction.
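To make the three components concrete, here is a minimal sketch of a neural architecture search loop in Python. It is not AutoNAC, whose internals the video does not describe. The search space knobs, the accuracy and latency formulas, and all function names are made-up assumptions for illustration, and the search algorithm is plain random sampling. The latency budget stands in for the hardware-aware part of the search.

```python
import random

# Search space: one choice per knob defines a valid architecture.
# The knobs and their values are hypothetical.
SEARCH_SPACE = {
    "depth": [2, 4, 6, 8],
    "width": [32, 64, 128],
    "kernel": [3, 5],
}


def sample_architecture(rng):
    """Search algorithm (random sampling): pick one option per knob."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def estimate_accuracy(arch):
    """Made-up proxy: larger models score higher, with diminishing returns."""
    capacity = arch["depth"] * arch["width"] * arch["kernel"]
    return capacity / (capacity + 500.0)


def estimate_latency_ms(arch):
    """Made-up cost model standing in for measurements on target hardware."""
    return 0.002 * arch["depth"] * arch["width"] * arch["kernel"] ** 2


def evaluate(arch, latency_budget_ms):
    """Evaluation strategy: reject anything over budget, else rank by accuracy."""
    if estimate_latency_ms(arch) > latency_budget_ms:
        return float("-inf")
    return estimate_accuracy(arch)


def random_search(num_samples, latency_budget_ms, seed=0):
    """Sample candidates and keep the best one that fits the latency budget."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_samples):
        arch = sample_architecture(rng)
        score = evaluate(arch, latency_budget_ms)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


best_arch, best_score = random_search(num_samples=50, latency_budget_ms=10.0)
print(best_arch, best_score, estimate_latency_ms(best_arch))
```

A real system would replace the two estimate functions with actual training runs (or a cheaper proxy) and latency measurements on the target device, and would typically use a smarter search algorithm than random sampling.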
They provided AutoNAC with the details needed to search the space of possible YOLO architectures, which created an initial search space of huge size. AutoNAC, which is hardware-aware, then found an optimal architecture for YOLO-NAS, optimized for the Nvidia T4, in a process that took 3,800 GPU hours.

Another attribute that helped YOLO-NAS reach great results at super low latency is its quantization-aware architecture. So what does that even mean?

Well, real-time object detection is critical for various applications, such as the safe use of autonomous cars, so we want to deploy object detection models on cars, phones and more, instead of running the model in the cloud. However, the resources of edge devices are limited, which makes it hard to deploy large models on them because of their size and long inference time.

Quantization in machine learning usually refers to the process of reducing the precision of the model weights so that they consume less memory and run faster. This, however, often comes with a decrease in model accuracy.

The quantization technique used in YOLO-NAS is called INT8 quantization, which converts the model weights from float32 to int8, so each weight takes one byte in memory instead of four.

They were able to do that thanks to a new building block called QARepVGG, which they instructed the neural architecture search algorithm to include. QARepVGG was recently introduced in a research paper from Meituan.
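The float32 to int8 conversion described above can be sketched in a few lines of plain Python. This is a generic symmetric, per-tensor scheme chosen for illustration; the video does not say which exact scheme YOLO-NAS uses, and the function names and sample weights here are hypothetical. The sketch shows both effects mentioned in the transcript: each stored weight shrinks from four bytes to one, and the restored values differ slightly from the originals, which is where the accuracy loss comes from.

```python
from array import array


def quantize_int8(weights):
    """Map float weights to int8 values in [-127, 127] with a single scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return array("b", quantized), scale  # "b" = signed 8-bit integers


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


# Hypothetical weights, stored as 32-bit floats ("f").
weights = array("f", [0.82, -1.27, 0.003, 0.5, -0.41])

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding to the nearest int8 step loses at most about half a step per weight.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("bytes per weight:", weights.itemsize, "->", quantized.itemsize)
print("int8 values:", list(quantized))
print("max rounding error:", max_error, "step size:", scale)
```

Note how the small weight 0.003 falls below half a quantization step and is rounded to zero. Errors of this kind are what a quantization-aware block such as QARepVGG is meant to keep from hurting accuracy.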
We'll save the details about this building block for a different video, but in short, it is an improved version of the RepVGG block that is commonly used in object detection models, and it significantly reduces the loss of accuracy after quantization.

They also used a hybrid quantization technique that applies quantization only to specific layers in the model, to balance information loss and latency.

Before moving on, if you like this content, please subscribe to the channel and hit the like button, as it will help this channel grow.

Let's move on to talk about YOLO-NAS being a foundational model. So what does a foundational model mean? Say we have two different tasks at hand: one model to detect bone fractures in an image, and another to detect fish in an aquarium. Instead of training each model from scratch, we can start with two instances of YOLO-NAS, fine-tune the first on a bone scans dataset and fine-tune the second on an aquarium dataset. With this approach we enjoy the strength of the YOLO-NAS pretraining, which was done on very large datasets with advanced techniques, while still adapting better to specific use cases thanks to the final fine-tuning.

Thank you for watching and taking the time to learn about YOLO-NAS. Please let me know in the comments section if you have specific questions about the model.
Channel: AI Papers Academy
Views: 4,873
Keywords: YOLO, YOLONAS, yolo-nas, yolo nas, object detection, deep learning, computer vision, foundational model, quantization, real time object detection, neural architecture search, NAS, AutoNAC
Id: uWBh0meTg08
Length: 4min 57sec (297 seconds)
Published: Fri May 05 2023