Hey everyone! In this video we'll learn how to
train a TensorFlow Lite object detection model on our own dataset. We'll walk through
the process of preparing training data, training the model, and exporting it all
using Google's free servers inside Google Colab. By the end of this video you'll
have a fully trained lightweight object detection model that you can run on computers, a
Raspberry Pi, cell phones or other edge devices. I'll train a coin detection model as an example
for you to follow along with. This model can be used in a change counter application that
tells you the total value of change in an image. I'll provide the dataset and sample code
for this application but you can also use your own dataset to train an entirely different
model. This video walks through a Google Colab notebook I wrote for training models. All
you need to do is open the notebook in your web browser to follow along. Click the link in the
video description below and let's get started! Colab is a free Google service that allows you
to write and run Python code through your web browser. It connects to a virtual machine on
Google servers that's complete with a Linux OS, a file system, Python environment, and best of
all a free GPU. We'll upload our training data to this Colab session and use it to train our
model. Click the connect button to initialize the environment. Make sure you're using a
GPU-equipped machine by going to Runtime, selecting Change runtime type, and confirming that GPU is
selected in the Hardware accelerator dropdown. The first step in training a machine
learning model is to create a dataset. We need to gather and label at least 200
images to use for training the model. If you don't want to gather images yet and just
want to practice training a model, you can skip this step for now and download my coin training
images in Step 3. Building a good image data set is the most important part of training a model.
I made a YouTube video that gives step-by-step instructions on how to gather images and label
them using an annotation program called LabelImg. The video also shows data set tips and best
practices that will help improve your model's accuracy. Go check it out! To gather images
use a phone or webcam to take pictures of your objects with a variety of backgrounds and
lighting conditions. For my coin detector I took pictures with my phone and also set up a fixture
to take pictures with my Raspberry Pi camera. You can also use images you find online but
I recommend taking your own pictures because it usually results in better accuracy for
your application. Once you've gathered about 200 images, use an annotation program called
LabelImg to draw bounding boxes around each object in each image. Again, my other YouTube
video will walk you through how to do this. When you're done gathering and labeling images,
you should have a folder full of images and an annotation file for each image. Create a
zip file called "images.zip" and add all of the images and annotation files into it. We'll
upload this to the Colab session after the next step.
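If you'd rather script this step than use your file browser, here's a minimal Python sketch.
It assumes your pictures and LabelImg .xml files all sit together in a local folder named "images":

import zipfile
from pathlib import Path

# Bundle every image and annotation file in the "images" folder into images.zip.
image_dir = Path("images")
with zipfile.ZipFile("images.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    for f in image_dir.iterdir():
        if f.suffix.lower() in (".jpg", ".jpeg", ".png", ".xml"):
            zf.write(f, arcname=f.name)

Either way, the end result is a single images.zip file holding all your images and their annotations.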
Okay, the hard part's over! Now we can let the Colab notebook do the rest of the work. First, we'll install the TensorFlow Object
Detection API inside this Colab. The API contains scripts and libraries that we'll use for
training the model. Click the Play button - oops! So when you click the play button you'll get this
warning about it not being authored by Google. Go ahead and just click "Run anyway". Then, click the
play button on these first four blocks of code. Allow each block to execute. It'll take
several minutes to get everything installed. If you see any errors or warnings related
to package dependencies or requests to restart the runtime, you can just ignore them. We'll verify the API installed correctly
by clicking play on the next code block, letting it execute, and verifying it says
model built successfully when finished. Okay! It says it ran okay, so we're good to
go. If you get errors, go to the Common Errors section at the bottom of the notebook to see
how to resolve them. You can also comment on this video or send me a tweet on Twitter at
@EdjeElectronics and I'll see if I can help. We need to transfer our training images onto the
Colab virtual machine. There are a few options for doing this. The easiest way is to just upload
your images through Google Colab. Click the folder icon and then drag your
"images.zip" file into the sidebar. It'll upload the images directly into the Colab
file system. It may take a while depending on your internet speed; this little orange
circle shows the progress of the upload. Keep in mind
that the images will be deleted if the Colab session disconnects, so you'll need
to re-upload them each time you restart. Another option is to upload the "images.zip"
file to Google Drive and then link it to the Colab file system. Read the instructions here to
see how to do so. The nice thing about using Drive is you don't need to re-upload your zip file
every time you use this Colab. If you have a slow internet connection or if your data set is more
than 100 megabytes, this will save a lot of time. Finally, as a third option, you can
also just use my coin dataset and practice training a model with that.
I've uploaded 750 labeled coin images to Dropbox. Download them into the Colab
file system by clicking Play on this block. These coins are United States currency, but I
know I have a lot of international viewers on this channel. Keep an eye out for new download
links to coin data sets from other countries. At this point, whether you used option one,
two, or three, you should be able to click the folder icon and see your "images.zip"
file. Now that the data set is uploaded, let's unzip it and create some folders to hold the
images. Click on the next code block to unzip the images and split them into train, validation,
and test folders. Each of these image sets has a different purpose. The "train" images are
used for the actual training of the model, the "validation" images are used to periodically
check progress during training, and the "test" images are used by us at the end of training to
visually check how accurate the model is. I wrote a Python script to randomly move 80% of the images
to the train folder, 10% to the validation folder, and 10% to the test folder. Click Play on this
code block to download and run the script. Now, when you check the file list, you
should see an images folder with train, validation, and test folders
that each have images in them.
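For reference, the split script does something along these lines. This is just a simplified
sketch, not the exact script the notebook downloads, and the folder paths are assumptions:

import random
import shutil
from pathlib import Path

# Randomly send 80% of images to train, 10% to validation, and 10% to test,
# moving each image's matching .xml annotation along with it.
src = Path("/content/images/all")          # hypothetical folder holding the unzipped images
images = sorted(p for p in src.iterdir() if p.suffix.lower() in (".jpg", ".jpeg", ".png"))
random.shuffle(images)

n = len(images)
splits = {"train": images[:int(0.8 * n)],
          "validation": images[int(0.8 * n):int(0.9 * n)],
          "test": images[int(0.9 * n):]}

for name, files in splits.items():
    dest = Path("/content/images") / name
    dest.mkdir(parents=True, exist_ok=True)
    for img in files:
        shutil.move(str(img), str(dest / img.name))
        xml = img.with_suffix(".xml")
        if xml.exists():
            shutil.move(str(xml), str(dest / xml.name))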
Next we need to convert the data set into TFRecords, which is a data format used by TensorFlow. We'll use Python scripts to do the
conversion, but first we have to define a label map for our model. A label map is a simple text
file with a list of classes that you want your model to detect. We can create this text file in
Colab using the command in the next code block. Replace "class1", "class2", and "class3" with the
classes you used when you labeled your images. For example, for my change counter model, I'll
put "penny", "nickel", "dime", and "quarter". Make sure you spell the classes correctly. Once
you've listed your classes, click Play on the code block. A labelmap.txt file will appear in your
list of files in the content folder. It's just a basic text file with your list of classes.
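The code block that creates it does something roughly like this; it's a sketch with my coin
classes filled in, so swap in your own class names:

# Write one class name per line to labelmap.txt in the Colab content folder.
classes = ["penny", "nickel", "dime", "quarter"]
with open("/content/labelmap.txt", "w") as f:
    f.write("\n".join(classes) + "\n")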
Here's what it looks like if you download it. With the label map defined, we
can create the TFRecords. Download and run the conversion scripts
using the next two code blocks. The scripts will also create
a labelmap.pbtxt file, which contains the label map in the protobuf text format that TensorFlow expects.
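If you're curious, the generated labelmap.pbtxt follows the standard Object Detection API
format, with one item entry per class and IDs starting at 1. For my coin classes it looks
roughly like this:

item {
  id: 1
  name: 'penny'
}
item {
  id: 2
  name: 'nickel'
}
item {
  id: 3
  name: 'dime'
}
item {
  id: 4
  name: 'quarter'
}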
Finally, click Play on the next code block to store the paths to the TFRecords and label map.
We'll use them later in this Colab. Okay! The next step is to set up training
configuration for our model. We'll select the model from the TensorFlow Model Zoo,
which has a list of models that we can fine tune on our own dataset. The models
have varying levels of speed and accuracy. I wrote a blog post comparing performance
of several models from the Model Zoo. It shows the accuracy each model achieves when it's
trained on my coin data set and the FPS they run at on the Raspberry Pi 4. Go check it out to see
which model will work best for your application. I set up this notebook to make it easy to switch
between models for training. You can select which model you want to train by changing the text in
the "chosen_model" variable to match one of the options below for this video. I'll select the
"ssd-mobilenet-v2-fpnlite-320" model. Feel free to try one of the other models. Click Play on
the code block once you've made your selection. Next, we'll download the pre-trained
weights and pipeline configuration files for the selected model. Click Play
on this code block to download them. The pipeline configuration file sets all of
the parameters for training the model. Two key parameters are "num_steps" and "batch_size".
"num_steps" defines the total number of steps to use for training the model, and "batch_size" sets
the number of images to use in each training step. For good numbers to start
with, let's set "num_steps" 40,000 and "batch_size" to 16. During training,
if you see that the model hasn't converged within 40,000 steps, you can increase "num_steps"
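Inside the pipeline config itself, those two values live in the train_config section, which
looks roughly like this fragment (standard TF2 Object Detection API field names; all the
other fields are omitted here):

train_config {
  batch_size: 16
  ...
  num_steps: 40000
}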
The next code block defines other
information for the config file, like the file path to the pre-trained
model files. Run it and confirm that it prints the correct number
of classes for your detector. Next, we'll rewrite the pipeline configuration
file to use parameters that we just specified. This code block will go into the config file
and override it with the necessary information, such as the number of classes, batch size, paths to the
training files, and so on. Click Play to run it.
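As a rough idea of what that block does, here's a simplified sketch that edits a few values
in the config file with regular expressions. The file path is an assumption, and the notebook's
actual block rewrites more fields than this:

import re

# Read the pipeline config, substitute a few values, and write it back out.
config_path = "/content/models/mymodel/pipeline.config"   # hypothetical path
with open(config_path) as f:
    config_text = f.read()

config_text = re.sub(r"batch_size: \d+", "batch_size: 16", config_text)
config_text = re.sub(r"num_steps: \d+", "num_steps: 40000", config_text)
config_text = re.sub(r'label_map_path: ".*?"',
                     'label_map_path: "/content/labelmap.pbtxt"', config_text)

with open(config_path, "w") as f:
    f.write(config_text)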
If you're curious, you can check the config file's contents by clicking this next block. It displays
the full file in the browser. The file contains all the configuration
parameters that are used for training. Finally, click Play on the next code block to set the path to the pipeline file
and model training directory. All right! We're ready to start
training our object detection model. Before we start training, we can start up a
TensorBoard session to monitor training progress. Click Play on this code block, give it a few
seconds, and a TensorBoard interface will appear.
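The TensorBoard block boils down to the standard Colab magic commands pointed at the
training folder (the folder name here is an assumption):

# Load the TensorBoard extension and point it at the directory where training
# checkpoints and event logs will be written.
%load_ext tensorboard
%tensorboard --logdir /content/training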
It won't show anything yet, because we haven't started training. We'll use the "model_main_tf2.py" script
from the TensorFlow Object Detection API for training. The script parses the
configuration file, loads the model and dataset, and then starts training
the model. Click Play to begin training.
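Behind the scenes, the cell runs a command along these lines. The paths are assumptions
for this sketch, but the flags are the standard ones accepted by model_main_tf2.py:

# Launch training from a Colab cell using the Object Detection API's training script.
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=/content/models/mymodel/pipeline.config \
    --model_dir=/content/training \
    --alsologtostderr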
The program will initialize and display some log messages. Once it's done initializing,
every 100 steps. It takes a while, so if it seems like nothing is happening, just wait a
couple minutes. If you encounter any errors, please visit the Common Errors section
at the bottom of this notebook. Training takes anywhere from two to
six hours, depending on the model, number of training steps, and batch size
you're using. Now, let's let it train for a while. You can minimize the window and
work on something else while it's training. Okay, our model has been training for about two
and a half hours, and we've trained for about 29,000 steps. Let's go back up and check
TensorBoard to see how training is going. Click refresh to update the interface. TensorBoard has several graphs that
show the model's overall loss over time. You want to look at the total loss graph.
As the model trains, the overall loss will decrease. We should keep training the
model until the loss stops decreasing. It looks like the loss is still going down just a
little bit, so let's keep on training. If you do want to stop training early, click the Stop button
or right-click and select "Interrupt execution". Otherwise, training will stop automatically
once it reaches the number of steps we specified earlier in the "num_steps"
variable. In this case, it's 40,000. Make sure to be present when training stops,
because if the session is idle for about 15 minutes, Colab will disconnect and delete
your runtime, and you'll have to start over. Okay our model has been trained!
Now that training is done, we need to convert our model
to a TensorFlow Lite format. Run this first code block to freeze your
model graph in a TFLite compatible format. When that's done, run the next code block
to convert the graph to a TFLite file. The resulting .tflite file contains
the neural network and weights of your object detection model in
an optimized FlatBuffer format.
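Roughly speaking, the two blocks are doing the following; this is a sketch, and the
directory names are assumptions:

# 1) Export the trained checkpoint into a TFLite-compatible saved model using the
#    export script that ships with the Object Detection API.
!python /content/models/research/object_detection/export_tflite_graph_tf2.py \
    --pipeline_config_path=/content/models/mymodel/pipeline.config \
    --trained_checkpoint_dir=/content/training \
    --output_directory=/content/custom_model_lite

# 2) Convert the exported saved model into a .tflite FlatBuffer file.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("/content/custom_model_lite/saved_model")
tflite_model = converter.convert()
with open("/content/custom_model_lite/detect.tflite", "wb") as f:
    f.write(tflite_model)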
Our custom model has been trained and converted to TFLite format, but how well does it actually perform at detecting objects in images? Let's use
it on the images in the test folder to visualize how accurate it is. Click Play on the next block
to define a function to load the model, run it on each image, and display the result. The code is
based off the "TFLite_detection_image.py" script from my GitHub repository, so feel free to use
it as a starting point for your own application. The next block lets you set the confidence
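If you want to see the core of what that function does, here's a condensed sketch of running
the .tflite model on a single image with the TFLite interpreter. The image path is an assumption,
the 320x320 size matches the model I selected, and the order of the output tensors can vary
between exports, so check output_details on your own model:

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the TFLite model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path="/content/custom_model_lite/detect.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Preprocess one test image to the model's expected input size.
image = Image.open("/content/images/test/example.jpg").convert("RGB").resize((320, 320))
input_data = (np.float32(image) - 127.5) / 127.5      # typical normalization for float SSD models
input_data = np.expand_dims(input_data, axis=0)

# Run inference and pull out the detection results (tensor order assumed; verify on your model).
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()
boxes = interpreter.get_tensor(output_details[1]["index"])[0]    # bounding boxes
classes = interpreter.get_tensor(output_details[3]["index"])[0]  # class indices
scores = interpreter.get_tensor(output_details[0]["index"])[0]   # confidence scores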
The next block lets you set the confidence threshold and the number of images to test. Click Play on this block to start running inference. The inference results from each image will
display in the browser. The results should give you a sense of how well your model
actually performs at detecting objects in new images. In my case, the change counter
model is mostly accurate... but let's see if we can find any images where it incorrectly detects
coins. It's still looking pretty good. Here we go. In this one, it incorrectly identified
a nickel as a dime. But, it's really doing pretty well. So when you run this with your own model,
if you don't see any results drawn in the test images, go ahead and go back up and change
"min_conf_threshold" to 0.01, which is basically the minimum it can be, and run the block again.
You should see some boxes drawn on your images. Next, we can quantitatively measure the
model's performance by calculating the model's mAP on the test images. The higher
the mAP score, the more accurate the model is. To learn more about how accuracy is
measured for object detection models, check out this insightful article
from RoboFlow that explains mAP. We'll use an mAP calculator script from
another GitHub repository to determine our model's mAP score. Run the first block to clone
the repository, remove the existing sample data from the repository, and then download a script
that I wrote for interfacing with the calculator. Then, run the next script to copy the images
and annotation data from our test folder to the appropriate folders in the repository.
These will be used as the ground truth data that our model's detection results are compared
to. Click Play on the next block to convert our annotation data to the format that's expected by
the calculator tool. Next, we'll reuse the same inferencing function from the previous step to
run our model on every image in the test folder. Unlike last time, it'll just save a .txt file
with the predicted bounding box data for each image, rather than displaying the results. Go
ahead and click Play to run the code block. Now that we have detection results and ground
truth data to compare them to, we can calculate mAP. Click Play on this last code block to run
my script for calculating the COCO mAP metric. The final score reported is your model's
overall mAP score. Ideally it should be above 50%. If it isn't, you can increase your model's
accuracy by adding more images to your dataset. See my dataset video for tips and tricks on how to
capture good training images and improve accuracy. Now that your custom model has been
trained and converted to TFLite format, it's ready to be downloaded and deployed in an
application. Run the next two cells to copy the model and label map files into a folder, zip
the folder, and download it to your computer. The "custom_model_lite.zip" file containing the model will be downloaded
into your Downloads folder. Okay, so now that we've downloaded our
trained TFLite model what can we do with it? Well, TensorFlow Lite models are great for running
on a wide variety of hardware including PCs, embedded systems, Raspberry Pis, phones, and
whatever other edge device you can think of. This section of the notebook provides links to
instructions for running your model on various devices including the Raspberry Pi, Android
phones, or Windows, Linux, or Mac OS computers. I'll update this section with instructions
for other devices as I write them. As you guys know, I love Raspberry Pis,
so in this video, I'll be showing how to deploy your model on a Raspberry Pi.
TFLite models are great for running on the Raspberry Pi because they require less
compute power than regular TensorFlow models. The quantized SSD-MobileNet-FPNLite model
runs at about 2.6 FPS on my Raspberry Pi 4. To run your model on the Raspberry Pi, first
you need to install TensorFlow Lite and prepare a Python environment for your application. Check
out my TensorFlow Lite on the Raspberry Pi video for step-by-step instructions on how to set
it up. It only takes about 20 minutes to step through the whole process. If you'd rather just
run it on your PC, follow the instructions on my Windows TensorFlow Lite setup guide. I'll
be writing a guide for Mac OS and Linux too. Once you've got TFLite set up on your Raspberry
Pi, move the "custom_model_lite.zip" file that you downloaded from Colab over to your
Pi. You can upload it to Google Drive, or use a USB thumb drive, or do whatever
your favorite file transfer method is. I'll copy my model onto a USB drive and
then fly it over to my Raspberry Pi. This Pi has already been set up with
TensorFlow Lite, and has a folder called "tflite1" that holds all the scripts
and model files for running detection. Move the "custom_model_lite.zip"
file into the tflite1 folder. Then open a terminal, issue "cd tflite1"
and then "unzip custom_model_lite.zip". Before we run the detection script, let's activate the virtual environment by
issuing "source tflite1-env/bin/activate". Now, you can run the TFLite scripts with your
model by using the "--modeldir" argument. For example, to run the webcam detection script, issue "python TFLite_detection_webcam.py
--modeldir=custom_model_lite". A window will appear showing
a live feed from your webcam with boxes drawn around objects of interest. You can press 'q' to quit. Finally, if you trained a coin
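Putting those terminal steps together, a typical session on the Pi looks like this; these are
the same commands from above, collected in one place:

cd tflite1
unzip custom_model_lite.zip
source tflite1-env/bin/activate
python TFLite_detection_webcam.py --modeldir=custom_model_lite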
Finally, if you trained a coin detection model and want to try it out with my example change counter application, issue "python examples/ChangeCounter.py
--modeldir=custom_model_lite". Point the camera at coins on a surface.
I built a cool camera stand out of K'nex, but you can also just hold it above the
table with your hand. The program should identify each coin and calculate the
total value of all the coins it sees. This is just one example of the many cool
applications you can create using computer vision and machine learning. Check out my
website and YouTube channel for other examples. You can squeeze some more performance out of
your model using a compression technique called quantization. Step 9 of this notebook shows how
you can quantize a model with TensorFlow and recalculate its accuracy. If you have a Coral
USB Accelerator, the notebook also shows how to compile your model for the Edge TPU. I'll release a
follow-up video walking through these steps soon.
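As a preview, post-training quantization with the TFLite converter looks roughly like this.
It's a sketch under assumed paths and input size, and a real representative dataset should feed
actual preprocessed training images instead of the random stand-in arrays used here:

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield sample inputs so the converter can calibrate quantization ranges.
    for _ in range(100):
        yield [np.random.rand(1, 320, 320, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("/content/custom_model_lite/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_quant_model = converter.convert()

with open("/content/custom_model_lite/detect_quant.tflite", "wb") as f:
    f.write(tflite_quant_model)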
Okay, you did it! At this point you should have a fully trained TensorFlow Lite model running
There's a ton of different ways you can use AI-powered computer vision to solve everyday
problems. I'll keep making videos and examples showing how to build cool programs that use
object detection. Stay tuned for more videos. In the meantime, if you wind up making
a cool application with TensorFlow Lite, you can comment here or tweet it to me on Twitter
and I'll share it with the rest of my followers. Thanks so much for watching this
video, and I hope it was helpful. As always good luck with your
projects, and I'll see you next time.