How to Train TensorFlow Lite Object Detection Models Using Google Colab | SSD MobileNet

Captions
Hey everyone! In this video we'll learn how to train a TensorFlow Lite object detection model on our own dataset. We'll walk through the process of preparing training data, training the model, and exporting it, all using Google's free servers inside Google Colab. By the end of this video you'll have a fully trained, lightweight object detection model that you can run on computers, a Raspberry Pi, cell phones, or other edge devices. I'll train a coin detection model as an example for you to follow along with. This model can be used in a change counter application that tells you the total value of the change in an image. I'll provide the dataset and sample code for this application, but you can also use your own dataset to train an entirely different model. This video walks through a Google Colab notebook I wrote for training models. All you need to do is open the notebook in your web browser to follow along. Click the link in the video description below and let's get started!

Colab is a free Google service that allows you to write and run Python code through your web browser. It connects to a virtual machine on Google's servers that comes complete with a Linux OS, a file system, a Python environment, and, best of all, a free GPU. We'll upload our training data to this Colab session and use it to train our model. Click the Connect button to initialize the environment. Make sure you're using a GPU-equipped machine by going to Runtime and Change runtime type and confirming that GPU is selected in the Hardware accelerator dropdown.

The first step in training a machine learning model is to create a dataset. We need to gather and label at least 200 images to use for training the model. If you don't want to gather images yet and just want to practice training a model, you can skip this step for now and download my coin training images in Step 3. Building a good image dataset is the most important part of training a model. I made a YouTube video that gives step-by-step instructions on how to gather images and label them using an annotation program called LabelImg. The video also shows dataset tips and best practices that will help improve your model's accuracy. Go check it out!

To gather images, use a phone or webcam to take pictures of your objects with a variety of backgrounds and lighting conditions. For my coin detector, I took pictures with my phone and also set up a fixture to take pictures with my Raspberry Pi camera. You can also use images you find online, but I recommend taking your own pictures because it usually results in better accuracy for your application. Once you've gathered about 200 images, use LabelImg to draw bounding boxes around each object in each image. Again, my other YouTube video will walk you through how to do this.

When you're done gathering and labeling images, you should have a folder full of images and an annotation file for each image. Create a zip file called "images.zip" and add all of the images and annotation data to it. We'll upload this to the Colab session after the next step.
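If you'd rather script that step than use your file manager, here's a minimal sketch that builds the archive on your local machine. It assumes your images and their .xml annotation files all live in a local folder named "images":

```python
import shutil

# Bundle everything in the local "images" folder (JPGs plus their
# matching .xml annotation files) into images.zip, ready to upload.
shutil.make_archive("images", "zip", root_dir="images")
```

This puts the files at the top level of the archive; if you lay out your zip differently, just make sure the unzip step in the notebook matches.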
Okay, the hard part's over! Now we can let the Colab notebook do the rest of the work. First, we'll install the TensorFlow Object Detection API inside this Colab session. The API contains scripts and libraries that we'll use for training the model. Click the Play button - oops! When you click the Play button you'll get a warning about the notebook not being authored by Google. Go ahead and just click "Run anyway". Then click the Play button on these first four blocks of code and allow each block to execute. It'll take several minutes to get everything installed. If you see any errors or warnings related to package dependencies, or requests to restart the runtime, you can just ignore them.

We'll verify the API installed correctly by clicking Play on the next code block, letting it execute, and checking that it prints "model built successfully" when finished. Okay! It says it ran okay, so we're good to go. If you get errors, go to the Common Errors section at the bottom of the notebook to see how to resolve them. You can also comment on this video or send me a tweet on Twitter at @EdjeElectronics, and I'll see if I can help.

Next, we need to transfer our training images onto the Colab virtual machine. There are a few options for doing this. The easiest way is to upload your images through Google Colab: click the folder icon and then drag your "images.zip" file into the sidebar. It'll upload the images directly into the Colab file system. The little orange circle shows the progress of the upload, which may take a while depending on your internet speed. Keep in mind that the images will be deleted if the Colab session disconnects, so you'll need to re-upload them each time you restart.

Another option is to upload the "images.zip" file to Google Drive and then link it to the Colab file system. Read the instructions here to see how to do so. The nice thing about using Drive is that you don't need to re-upload your zip file every time you use this Colab. If you have a slow internet connection or your dataset is more than 100 megabytes, this will save a lot of time.

Finally, as a third option, you can just use my coin dataset and practice training a model with that. I've uploaded 750 labeled coin images to Dropbox. Download them into the Colab file system by clicking Play on this block. These coins are United States currency, but I know I have a lot of international viewers on this channel, so keep an eye out for new download links to coin datasets from other countries.

At this point, whether you used option one, two, or three, you should be able to click the folder icon and see your "images.zip" file. Now that the dataset is uploaded, let's unzip it and create some folders to hold the images. Click on the next code block to unzip the images and split them into train, validation, and test folders. Each of these image sets has a different purpose: the "train" images are used for the actual training of the model, the "validation" images are used to periodically check progress during training, and the "test" images are used by us at the end of training to visually check how accurate the model is. I wrote a Python script to randomly move 80% of the images to the train folder, 10% to the validation folder, and 10% to the test folder. Click Play on this code block to download and run the script. Now, when you check the file list, you should see an images folder with train, validation, and test folders that each have images in them.
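If you're curious what that split script is doing under the hood, the core idea looks something like this. This is a simplified sketch, not the actual script from the notebook; the source folder path is hypothetical, and it assumes each image has a matching .xml annotation file alongside it:

```python
import random
import shutil
from pathlib import Path

image_dir = Path("images/all")  # hypothetical folder holding every image/.xml pair

images = [p for p in image_dir.iterdir()
          if p.suffix.lower() in (".jpg", ".jpeg", ".png", ".bmp")]
random.shuffle(images)

# 80% train, 10% validation, 10% test
n = len(images)
buckets = {
    "train": images[: int(0.8 * n)],
    "validation": images[int(0.8 * n): int(0.9 * n)],
    "test": images[int(0.9 * n):],
}

for split, files in buckets.items():
    dest = Path("images") / split
    dest.mkdir(parents=True, exist_ok=True)
    for img in files:
        shutil.move(str(img), str(dest / img.name))      # move the image...
        xml = img.with_suffix(".xml")
        if xml.exists():
            shutil.move(str(xml), str(dest / xml.name))  # ...and its annotation
```

Shuffling before slicing is what makes the split random, and moving the .xml file together with its image keeps every annotation next to the picture it describes.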
Next, we need to convert the dataset into TFRecords, which is a data format used by TensorFlow. We'll use Python scripts to do the conversion, but first we have to define a label map for our model. A label map is a simple text file with a list of the classes that you want your model to detect. We can create this text file in Colab using the command in the next code block. Replace "class1", "class2", and "class3" with the classes you used when you labeled your images. For example, for my change counter model, I'll put "penny", "nickel", "dime", and "quarter". Make sure you spell the classes correctly. Once you've listed your classes, click Play on the code block. A labelmap.txt file will appear in your list of files in the content folder. It's just a basic text file with your list of classes. Here's what it looks like if you download it.

With the label map defined, we can create the TFRecords. Download and run the conversion scripts using the next two code blocks. The scripts will also create a labelmap.pbtxt file, which contains the label map in a different format that's needed by TensorFlow. Then click Play on the next code block to store the paths to the TFRecords and label map. We'll use them later in this Colab.

Okay! The next step is to set up the training configuration for our model. We'll select the model from the TensorFlow Model Zoo, which has a list of models that we can fine-tune on our own dataset. The models have varying levels of speed and accuracy. I wrote a blog post comparing the performance of several models from the Model Zoo. It shows the accuracy each model achieves when it's trained on my coin dataset and the FPS each one runs at on the Raspberry Pi 4. Go check it out to see which model will work best for your application.

I set up this notebook to make it easy to switch between models for training. You can select which model you want to train by changing the text in the "chosen_model" variable to match one of the options below. For this video, I'll select the "ssd-mobilenet-v2-fpnlite-320" model, but feel free to try one of the other models. Click Play on the code block once you've made your selection. Next, we'll download the pre-trained weights and pipeline configuration file for the selected model. Click Play on this code block to download them.

The pipeline configuration file sets all of the parameters for training the model. Two key parameters are "num_steps" and "batch_size": "num_steps" defines the total number of steps to use for training the model, and "batch_size" sets the number of images to use in each training step. As good starting values, let's set "num_steps" to 40,000 and "batch_size" to 16. During training, if you see that the model hasn't converged within 40,000 steps, you can increase "num_steps" to a higher value and try again.

The next code block defines other information for the config file, like the file path to the pre-trained model files. Run it and confirm that it prints the correct number of classes for your detector. Next, we'll rewrite the pipeline configuration file to use the parameters we just specified. This code block will go into the config file and overwrite it with the necessary information, such as the number of classes, batch size, paths to the training files, and so on. Click Play to run it. If you're curious, you can check the config file's contents by clicking Play on this next block. It displays the full file in the browser, and it contains all the configuration parameters that are used for training. Finally, click Play on the next code block to set the path to the pipeline file and the model training directory.
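If you're wondering how a config rewrite like that works, the Object Detection API ships helpers in object_detection.utils.config_util for reading and writing pipeline configs. Here's a rough sketch of the same idea; the paths and values below are illustrative, and this is not the notebook's exact code:

```python
from object_detection.utils import config_util

# Load the pipeline config into a dictionary of proto objects.
pipeline_file = "/content/models/mymodel/pipeline.config"  # illustrative path
configs = config_util.get_configs_from_pipeline_file(pipeline_file)

# Override the key training parameters.
configs["model"].ssd.num_classes = 4  # e.g. penny, nickel, dime, quarter
configs["train_config"].batch_size = 16
configs["train_config"].fine_tune_checkpoint = (
    "/content/models/mymodel/checkpoint/ckpt-0")
configs["train_config"].fine_tune_checkpoint_type = "detection"
configs["train_input_config"].label_map_path = "/content/labelmap.pbtxt"
configs["train_input_config"].tf_record_input_reader.input_path[:] = [
    "/content/train.tfrecord"]

# Serialize the modified config and write it back out for training.
pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_proto, "/content/models/mymodel/")
```

Setting "fine_tune_checkpoint_type" to "detection" is what tells the trainer to start from the pre-trained detection weights rather than training from scratch.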
All right! We're ready to start training our object detection model. Before we kick it off, we can start up a TensorBoard session to monitor training progress. Click Play on this code block, give it a few seconds, and a TensorBoard interface will appear. It won't show anything yet, because we haven't started training.

We'll use the "model_main_tf2.py" script from the TensorFlow Object Detection API for training. The script parses the configuration file, loads the model and dataset, and then starts training the model. Click Play to begin training. The program will initialize and display some log messages. Once it's done initializing, it will start displaying training messages every 100 steps. It takes a while, so if it seems like nothing is happening, just wait a couple of minutes. If you encounter any errors, please visit the Common Errors section at the bottom of this notebook.

Training takes anywhere from two to six hours, depending on the model, the number of training steps, and the batch size you're using. So let's let it train for a while. You can minimize the window and work on something else while it's training.

Okay, our model has been training for about two and a half hours, and we've trained for about 29,000 steps. Let's go back up and check TensorBoard to see how training is going. Click refresh to update the interface. TensorBoard has several graphs that show the model's loss over time; the one to watch is the total loss graph. As the model trains, the overall loss will decrease, and we should keep training the model until the loss stops decreasing. It looks like the loss is still going down a little bit, so let's keep on training. If you do want to stop training early, click the Stop button or right-click and select "Interrupt execution". Otherwise, training will stop automatically once it reaches the number of steps we specified earlier in the "num_steps" variable - in this case, 40,000. Make sure to be present when training stops, because if the session is idle for about 15 minutes, Colab will disconnect and delete your runtime, and you'll have to start over.

Okay, our model has been trained! Now that training is done, we need to convert our model to TensorFlow Lite format. Run the first code block to freeze your model graph in a TFLite-compatible format. When that's done, run the next code block to convert the graph to a TFLite file. The resulting .tflite file contains the neural network and weights of your object detection model in an optimized FlatBuffer format.

Our custom model has been trained and converted to TFLite format, but how well does it actually perform at detecting objects in images? Let's use it on the images in the test folder to visualize how accurate it is. Click Play on the next block to define a function that loads the model, runs it on each image, and displays the results. The code is based on the "TFLite_detection_image.py" script from my GitHub repository, so feel free to use it as a starting point for your own application. The next block lets you set the confidence threshold and the number of images to test. Click Play on this block to start running inference. The inference results from each image will display in the browser, and they should give you a sense of how well your model actually performs at detecting objects in new images.
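At the core of that inference function is TensorFlow Lite's Interpreter API. Stripped down to the essentials, running a detection model on one preprocessed image looks roughly like this - a sketch, not the notebook's exact code, and note that the order of the output tensors can vary between models, so check your model's output details:

```python
import numpy as np
import tensorflow as tf

# Load the converted model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height, width = input_details[0]["shape"][1:3]  # e.g. 320x320 for this model

# A real application would load and resize an image here; this placeholder
# just has the right shape and dtype for a float-input model.
image = np.zeros((1, height, width, 3), dtype=np.float32)

interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()

# SSD-style models typically output boxes, classes, scores, and a count,
# but the ordering is not guaranteed - inspect output_details to be sure.
boxes = interpreter.get_tensor(output_details[0]["index"])[0]
classes = interpreter.get_tensor(output_details[1]["index"])[0]
scores = interpreter.get_tensor(output_details[2]["index"])[0]

min_conf_threshold = 0.5
for box, cls, score in zip(boxes, classes, scores):
    if score > min_conf_threshold:
        ymin, xmin, ymax, xmax = box  # normalized coordinates in [0, 1]
        print(f"class {int(cls)}: score {score:.2f} at "
              f"({xmin:.2f}, {ymin:.2f}, {xmax:.2f}, {ymax:.2f})")
```

The boxes come back in normalized coordinates, so you multiply them by the original image's width and height before drawing them.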
In my case, the change counter model is mostly accurate, but let's see if we can find any images where it incorrectly detects coins. It's still looking pretty good... here we go: in this one it incorrectly identified a nickel as a dime. But really, it's doing pretty well. When you run this with your own model, if you don't see any detections drawn on the test images, go back up, change "min_conf_threshold" to 0.01 (which is basically the minimum it can be), and run the block again. You should then see some boxes drawn on your images.

Next, we can quantitatively measure the model's performance by calculating its mAP on the test images. The higher the mAP score, the more accurate the model is. To learn more about how accuracy is measured for object detection models, check out this insightful article from Roboflow that explains mAP.

We'll use an mAP calculator script from another GitHub repository to determine our model's mAP score. Run the first block to clone the repository, remove its existing sample data, and then download a script that I wrote for interfacing with the calculator. Then, run the next script to copy the images and annotation data from our test folder into the appropriate folders in the repository. These will be used as the ground truth data that our model's detection results are compared to. Click Play on the next block to convert our annotation data to the format that's expected by the calculator tool. Next, we'll reuse the same inference function from the previous step to run our model on every image in the test folder. Unlike last time, it'll just save a .txt file with the predicted bounding box data for each image, rather than displaying the results. Go ahead and click Play to run the code block.

Now that we have detection results and ground truth data to compare them to, we can calculate mAP. Click Play on this last code block to run my script for calculating the COCO mAP metric. The final score reported is your model's overall mAP score. Ideally it should be above 50%. If it isn't, you can increase your model's accuracy by adding more images to your dataset. See my dataset video for tips and tricks on how to capture good training images and improve accuracy.
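If you want some intuition for what the calculator is actually matching on: mAP is built on intersection over union (IoU). A detection counts as correct when its box overlaps a ground-truth box of the same class by more than some IoU threshold, and the COCO metric averages over thresholds from 0.5 to 0.95. Here's a minimal IoU computation - just the core building block, not the calculator script's actual code:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Coordinates of the overlapping region (zero area if they don't intersect).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: two partially overlapping boxes
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # prints 0.1428... (25 / 175)
```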
Now that your custom model has been trained and converted to TFLite format, it's ready to be downloaded and deployed in an application. Run the next two cells to copy the model and label map files into a folder, zip the folder, and download it to your computer. The "custom_model_lite.zip" file containing the model will be downloaded into your Downloads folder.

Okay, so now that we've downloaded our trained TFLite model, what can we do with it? Well, TensorFlow Lite models are great for running on a wide variety of hardware, including PCs, embedded systems, Raspberry Pis, phones, and whatever other edge device you can think of. This section of the notebook provides links to instructions for running your model on various devices, including the Raspberry Pi, Android phones, and Windows, Linux, or macOS computers. I'll update this section with instructions for other devices as I write them.

As you guys know, I love Raspberry Pis, so in this video I'll show how to deploy your model on a Raspberry Pi. TFLite models are great for running on the Raspberry Pi because they require less compute power than regular TensorFlow models. The quantized SSD-MobileNet-FPNLite model runs at about 2.6 FPS on my Raspberry Pi 4. To run your model on the Raspberry Pi, you first need to install TensorFlow Lite and prepare a Python environment for your application. Check out my TensorFlow Lite on the Raspberry Pi video for step-by-step instructions on how to set it up; it only takes about 20 minutes to step through the whole process. If you'd rather just run the model on your PC, follow the instructions in my Windows TensorFlow Lite setup guide. I'll be writing guides for macOS and Linux too.

Once you've got TFLite set up on your Raspberry Pi, move the "custom_model_lite.zip" file that you downloaded from Colab over to your Pi. You can upload it to Google Drive, use a USB thumb drive, or use whatever your favorite file transfer method is. I'll copy my model onto a USB drive and then fly it over to my Raspberry Pi.

This Pi has already been set up with TensorFlow Lite and has a folder called "tflite1" that holds all the scripts and model files for running detection. Move the "custom_model_lite.zip" file into the tflite1 folder. Then open a terminal, issue "cd tflite1", and then "unzip custom_model_lite.zip". Before we run the detection script, let's activate the virtual environment by issuing "source tflite1-env/bin/activate". Now you can run the TFLite scripts with your model by using the "--modeldir" argument. For example, to run the webcam detection script, issue "python TFLite_detection_webcam.py --modeldir=custom_model_lite". A window will appear showing a live feed from your webcam with boxes drawn around objects of interest. You can press 'q' to quit.

Finally, if you trained a coin detection model and want to try it out with my example change counter application, issue "python examples/ChangeCounter.py --modeldir=custom_model_lite". Point the camera at coins on a surface. I built a cool camera stand out of K'nex, but you can also just hold the camera above the table with your hand. The program should identify each coin and calculate the total value of all the coins it sees. This is just one example of the many cool applications you can create using computer vision and machine learning. Check out my website and YouTube channel for other examples.

You can squeeze some more performance out of your model using a compression technique called quantization. Step 9 of this notebook shows how you can quantize a model with TensorFlow and recalculate its accuracy. If you have a Coral USB Accelerator, the notebook also shows how to compile your model for the Edge TPU. I'll release a follow-up video walking through these steps soon.

Okay, you did it! At this point you should have a fully trained TensorFlow Lite model running on your Raspberry Pi or other edge device. There are a ton of different ways you can use AI-powered computer vision to solve everyday problems, and I'll keep making videos and examples showing how to build cool programs that use object detection. Stay tuned for more videos. In the meantime, if you wind up making a cool application with TensorFlow Lite, you can comment here or tweet it to me on Twitter and I'll share it with the rest of my followers. Thanks so much for watching this video, and I hope it was helpful. As always, good luck with your projects, and I'll see you next time.
Info
Channel: Edje Electronics
Views: 76,434
Id: XZ7FYAMCc4M
Length: 23min 18sec (1398 seconds)
Published: Mon Feb 13 2023