How to train an object detection model - ML on Raspberry Pi with MediaPipe

Captions

[MUSIC PLAYING] PAUL RUIZ: Hey, all. I'm Paul Ruiz, an engineer with the developer relations team on Google ML. And this is the ML on Raspberry Pi with MediaPipe series, where you will learn about the basics of machine learning, along with how you can use Google's newest on-device machine learning tool, MediaPipe, to add useful features to your own Raspberry Pi apps. In the last video, which I have linked in the video description below, you learned how to use object detection on your device using a custom model for tracking pieces of a toy. In this video, you will learn about how to create your own object detection model that you can deploy to your Raspberry Pi. This video is going to focus on a really high-level approach to training. So your results will vary, as there's a lot of different things to be aware of for optimizing and refining a model. But that's definitely outside of our scope today. So I think the goal here is just to make sure that you have enough of an understanding to be able to train your own model specifically for prototyping ideas for your IoT device. So let's go ahead and get started. The first, and most important step, in training a new model is collecting good data. In this example, I'll be using about 300 pictures that I took of these toy pieces in different orientations, some partially off screen, and others hidden behind other objects. One tip here is that you want your data to be as close as you can get it to what you would see in a real-world environment. I would also recommend the site Kaggle for finding a great selection of images and other data that you can use when training your machine learning models if you don't have access to your own custom data. After you have gathered all of the pictures representing the objects that you want to detect, it's time to start labeling them. 
While there's a variety of tools out there for this step, I'm going to use an open source one called Label Studio, which I'll link to in the video description. Though, there's always advantages and disadvantages to any sort of tooling that's out there. So definitely feel free to use whatever is going to work best for you. There's a few different ways to install Label Studio on your computer. So I'll defer that step to their own documentation. Once you have it installed and running, it's time to create a new project. Go ahead and click on the blue Create icon at the top-right corner. On the next screen, you can name your project. It's worth noting that you're actually going to need two projects, one for training, and one for validation. Training data, as you might guess, is the data that you'll use for training your new model. Validation data is how you can track the accuracy of your training process using images that the model has not seen before during training. We'll start by working with our training data. So make sure you name your project so that you know that this is where you have your training data stored. After you have named your project, head over to the labeling setup screen and select the object detection with bounding boxes template. This will open up a new screen that lets you customize a few features specifically to annotate object detection data. Go ahead and remove the two default labels. Then head over to the Add Labels Name text box. The very first item you will add is one called background. This is required by the COCO data format that we're going to use for training. So make sure you don't miss it. After you have the background label written down, you can start adding your additional labels within that field with each being on their own line. In this case, I'm going to have five different toy pieces. So I'll add circle, rectangle, triangle, square, and pentagon. Once you've added those labels, click on the Save button. 
This will bring you to a screen where you can import your data. So let's take a moment to really understand how we want to separate our data. Even though we're going to start with a project for training, we'll actually spend this step separating out our validation data so it doesn't get mixed in with our training data. This is an incredibly important step because you don't want your model validating its accuracy against data that it has already trained on, as this would be like taking a test after you've already seen all of the answers. Here, you can see that I have all of the images that I've taken of the toy pieces. For this example, I'm going to take roughly 10% of the images, or 30, to run validation. This step will vary a lot depending on your own needs and decisions when training a model. But for this quick prototype, it'll do what I need. So I'm going to create a new folder called validation, where I'll place those test images so they're separated from my training data. To make sure that I have a diverse selection of images representing the pieces for validation, I've already selected a few pictures that contain all of the pieces, as well as a collection of images that only contain one type of piece. Now that the validation data is separated out into its own folder, everything else should be used for training. Getting back to Label Studio, click on the Import button. Then drag every training image into the project. Depending on how many images you have, you might need to do this a few times with a smaller selection because there is an import limit of 100 pictures. For each import step, once everything is loaded, click on the blue Import button in the top-right corner. When you're done importing your images, it's time to start the long process of annotating them. Unfortunately, this is part of the collecting good data step, which you may remember is the most important step in training. So you're going to want to take your time. 
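The roughly 10% hold-out described above can also be scripted. The sketch below is not what the video does (the validation pictures there are picked by hand for variety); it swaps in a seeded random sample instead. The function name `split_validation` and the sample file names are hypothetical, made up for illustration.

```python
import random


def split_validation(filenames, fraction=0.1, seed=0):
    """Return (train, validation) lists, holding out about `fraction` of the files."""
    names = sorted(filenames)
    if not names:
        return [], []
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    count = max(1, round(len(names) * fraction))
    validation = sorted(rng.sample(names, count))
    held_out = set(validation)
    train = [name for name in names if name not in held_out]
    return train, validation


# Made-up file names standing in for the ~300 toy pictures.
images = [f"img_{i:03d}.jpg" for i in range(300)]
train, validation = split_validation(images)
print(len(train), len(validation))  # 270 30
```

A random sample will not guarantee that every piece type shows up in the validation set, so it is still worth checking the held-out images by eye, as the video does.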
Let's get started by clicking on the first item in the list of imported images. On this next screen, you'll want to select the label for each object that you want to detect. In this first image, I'll hit the 2 key on my keyboard to select circle. And then I'll highlight the circle toy piece. You'll want to make sure that this is as accurate as you can get it. So if you need to make any adjustments, click on the square to change the size and location. Then you can hit the Escape key when you're done with it. After labeling the item, click on the blue Submit button. Then move on to the next image. Because this is a long and tedious process, I'm going to pause right here so I can do my annotations. Then I'll come back once they're done. If you're following along with this tutorial, feel free to pause the video here, grab a snack and a drink, put on a YouTube video, and good luck as you start annotating your data. Don't forget to double check your annotations at the end to verify that your data is good. [MUSIC PLAYING] All right. Hopefully, that all went well for you and you have a whole lot of annotated images, and you're ready to move on. Coming back to the main screen for Label Studio, it's time to click on the Export button in the top-right corner. This will give you a variety of options to select from for the export format. But we're going to select COCO since it's one of the most commonly used formats for object detection. This will take a moment after clicking on Export. But then you should end up with a new zip file on your computer that contains the labels and images for your training data. So here's the tricky part. Let's go into that folder and open the result.json file. This is going to be a long file that maps your images to their annotations in a usable way. While I'm using Sublime Text here, you may have a different text editor. But open the Find option and search for the word circle. 
You might remember that I said the background item should be item zero. But here, you can see the order is different from the order in which you added your labels in Label Studio. This is because Label Studio uses an alphabetical order for the category IDs with capital letters coming before lowercase letters. What you will want to do is change the ID number for background to zero, then change the ID for circle, which is currently zero, to five so that they're swapped. This isn't my favorite step. But like I mentioned earlier, there's advantages and disadvantages to any tooling. This is just something to be aware of when you're working with this one. After you've swapped the categories here, you will want to do a find and replace to change any annotations that have a category ID of zero so that they're now five, matching up to the new value of circle. All right. So there's one more step while we have this file open. If you scroll back up to the top of the file, you can see that we have an item called file_name within each image object. Each of these points to one of your images under the images folder. That said, when we get to Model Maker, it won't expect that additional images path because it just assumes that's where the images are. So let's go ahead and remove it with another find and replace. At this point, you can save and close the result.json file. Getting back to Label Studio, it's time to annotate your validation data. I'm sorry. But at least this should go a lot faster. Go through the same steps of creating an object detection project, making sure that you type in your labels exactly like you did with your training data. Then import the validation photos that you separated out earlier. Once you're done annotating those images, export the validation data in the COCO format. Next, you'll want to do similar steps with the new result.json file to make sure the background category is set to ID zero. 
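The two find-and-replace passes described above (swapping the background category to ID zero and dropping the images path from each file name) could also be done with a short script instead of a text editor. This is a minimal sketch under assumptions: the helper name `fix_coco_labels` is hypothetical, the input is the exported COCO JSON already loaded into a dictionary (for example with `json.load`), and the sample data is made up to mirror the five toy labels.

```python
import copy


def fix_coco_labels(coco, background="background", prefix="images/"):
    """Return a copy where `background` has category ID 0 and file names lose `prefix`."""
    fixed = copy.deepcopy(coco)
    by_name = {cat["name"]: cat for cat in fixed["categories"]}
    if background not in by_name:
        raise ValueError(f"no category named {background!r}")

    # Swap IDs: background takes 0, and whatever held 0 takes background's old ID.
    old_background_id = by_name[background]["id"]
    remap = {0: old_background_id, old_background_id: 0}
    for cat in fixed["categories"]:
        cat["id"] = remap.get(cat["id"], cat["id"])
    fixed["categories"].sort(key=lambda cat: cat["id"])

    # Apply the same swap to every annotation so boxes keep their labels.
    for ann in fixed["annotations"]:
        ann["category_id"] = remap.get(ann["category_id"], ann["category_id"])

    # Drop the leading folder so each file name is relative to the images folder.
    for image in fixed["images"]:
        if image["file_name"].startswith(prefix):
            image["file_name"] = image["file_name"][len(prefix):]
    return fixed


# Made-up sample shaped like a COCO export; real files carry more fields.
sample = {
    "categories": [
        {"id": 0, "name": "circle"},
        {"id": 1, "name": "pentagon"},
        {"id": 2, "name": "rectangle"},
        {"id": 3, "name": "square"},
        {"id": 4, "name": "triangle"},
        {"id": 5, "name": "background"},
    ],
    "images": [{"id": 0, "file_name": "images/photo_001.jpg"}],
    "annotations": [{"id": 0, "image_id": 0, "category_id": 0}],
}
fixed = fix_coco_labels(sample)
print([(cat["id"], cat["name"]) for cat in fixed["categories"]])
print(fixed["images"][0]["file_name"], fixed["annotations"][0]["category_id"])
```

Working on the parsed structure avoids one risk of a plain-text find and replace, which is accidentally changing an unrelated zero elsewhere in the file.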
Then, double check that the other IDs and labels in your validation data match up to the information in your training data. They should, since the category IDs are alphabetical with capital letters coming first. But it's always good to verify in case there was a typo in your validation labels. If they don't match, do some find and replace operations to fix that. If your validation IDs don't match during the training phase, then your model will simply not work correctly. Also, don't forget to remove the images part from the file_name paths. Congratulations. You should have all of the data you need for training a new object detection model in MediaPipe Model Maker. Now, it's time to switch over to Colab to do the training. I'll also provide a link to this Model Maker program in the description below. The first thing you'll want to do in Colab is to create a new runtime and connect to it. For this example, I'm going to start a runtime with a GPU to make things go a little bit faster, though I've also tested this with the free CPU option. After your runtime has started, you will want to place your data into the Colab file directory. Go to the file section on the left of the screen and create a new directory. For this example, I just called mine blocks. Under that, I created two more directories called train and validation which, as you might guess, are where you'll put your training and validation data. Finally, create a folder named images under both of those. Now you get the fun job of copying all of that data that you've collected into the Colab environment. Everything from your train project in Label Studio should go under the train folder. And everything from your validation project should go into your validation folder. Something to pay attention to here is that Colab only lets you upload so many items at a time. So you will need to copy over your images in smaller batches. 
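The check that validation IDs and labels line up with the training data can be automated. The following is only a sketch: `category_mismatches` is a hypothetical helper, and it assumes both label files have been loaded into dictionaries with a COCO-style `categories` list. The sample data, including the deliberate typo, is made up.

```python
def category_mismatches(train_coco, validation_coco):
    """List (id, train_name, validation_name) for every ID whose names differ."""
    train = {cat["id"]: cat["name"] for cat in train_coco["categories"]}
    validation = {cat["id"]: cat["name"] for cat in validation_coco["categories"]}
    problems = []
    for cat_id in sorted(set(train) | set(validation)):
        if train.get(cat_id) != validation.get(cat_id):
            problems.append((cat_id, train.get(cat_id), validation.get(cat_id)))
    return problems


names = ["background", "pentagon", "rectangle", "square", "triangle", "circle"]
train_labels = {"categories": [{"id": i, "name": n} for i, n in enumerate(names)]}

# Same labels, but with a typo in one validation label.
typo_names = [n if n != "square" else "sqaure" for n in names]
validation_labels = {"categories": [{"id": i, "name": n} for i, n in enumerate(typo_names)]}

print(category_mismatches(train_labels, validation_labels))  # [(3, 'square', 'sqaure')]
```

An empty list means the two files agree on every category ID and name.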
You will also need to rename the result.json file to labels.json to match what Model Maker is expecting. After you've copied everything over, you should have a folder structure that looks like this: the blocks directory containing the train and validation folders, each with its own labels.json file and images folder. With the data in its proper place, it's time to work through the actual training script. You'll start by installing the MediaPipe Model Maker dependencies. Once everything is finished installing, you will import everything you need for this task, including the object detector tool in MediaPipe Model Maker. After that, you will create values pointing to the paths with all of your new data. Then do a quick check by displaying all of their category IDs and names. From there, you can create datasets from everything that you've already uploaded. Then display their sizes to make sure everything looks correct. In this case, I'm seeing the 285 images that I annotated for training and the 31 that I set aside for validation. The next step is to create the options that you'll use during training, including the underlying model that will be used as a base, and the path where you will save the newly trained model. The hparams value here is also something that you can use for optimizing your training process. So I'll include a link to more information in the video description. After that, it's time for the actual training phase. For this example, the training is going to happen over 30 iterations, also known as epochs, with each iteration going through 35 steps. This is going to take a while. When I went through this process, it took about an hour and a half to complete when using Colab's CPU runtime, and 35 minutes when using a GPU runtime. So as much as I'm sure you all would love to just hang out watching the training happen in this video for that entire duration, I'm going to take this opportunity to pause recording and come back once we're ready for the next step. [MUSIC PLAYING] Great. 
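The 35 steps per epoch mentioned above is consistent with 285 training images and a batch size of 8. The batch size is an assumption here, since the video does not state it, and the sketch also assumes that a final partial batch is dropped. The function name `steps_per_epoch` is hypothetical, and the per-step timings are rough estimates derived from the approximate durations quoted in the video.

```python
def steps_per_epoch(num_images, batch_size):
    """Number of full batches in one pass over the data (partial batch dropped)."""
    return num_images // batch_size


train_images = 285  # from the dataset size shown in the video
batch_size = 8      # assumption: not stated in the video
epochs = 30         # from the video

steps = steps_per_epoch(train_images, batch_size)
total_steps = steps * epochs
print(steps, total_steps)  # 35 1050

# Rough seconds per step from the quoted run times (about 35 min GPU, 90 min CPU).
gpu_seconds_per_step = 35 * 60 / total_steps
cpu_seconds_per_step = 90 * 60 / total_steps
print(round(gpu_seconds_per_step, 1), round(cpu_seconds_per_step, 1))
```

If you change the batch size or add more images, the step count per epoch changes with it, which is one reason training time varies so much between runs.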
Now that the model is done training, you can use this code to convert it into a TensorFlow Lite file, then download it onto your computer. Once the model is downloaded, it's time to actually test it. Rather than transferring it to your Raspberry Pi, let's take a look at MediaPipe Studio, which is a web tool that lets you load your own models for MediaPipe Tasks to test them in a simplified environment. This is especially useful when you want to make sure your model works as expected before using it in your own apps. I'll post the link to the object detection page in MediaPipe Studio below. Under the Model Selection dropdown, scroll to the bottom and select Choose A Model File. This will let you select the new object detection model that you just created. Once that's loaded, you can either use your computer's webcam or open up an image to run object detection and test your results. Here, you can see that I'm able to hold up multiple toy pieces to have them classified and tracked using my new object detection model. One thing worth mentioning is that these are actually a set of pieces that I separated out from those that I used in my original dataset. That way, the model has never seen these specific pieces before. While we're here, it's important to know that the model might not work exactly like you'd expect since we skipped over a lot of the fine-tuning techniques to create more of a prototyping model than anything else. When I did this the first time, I realized that I took all of my training images against a white background. So my model had a lot of difficulty when I held pieces up without that background. It's totally fine. Anytime something doesn't go as expected, it's just experience. And you'll do better the next time. The beauty of MediaPipe Studio is that it helps you debug those issues a bit more easily. All right. So that was a lot. 
In this video, you learned about how you can take data that you want to use for object detection, annotate it, train a new model using MediaPipe Model Maker, and test that model out using MediaPipe Studio. Now that you know how to do all of that, let's go back to working with a new Raspberry Pi example with MediaPipe Tasks. And like always, we're excited to see all the cool things you make with MediaPipe on the Raspberry Pi. So let us know in the comments what you've made or what you want to make. And I'll see you in the next video. [MUSIC PLAYING]

Info

Channel: Google for Developers

Views: 1,974

Keywords: Google, developers, MediaPipe, ml on raspberry pi, raspberry pi, machine learning for raspberry pi apps, ml, machine learning, object detection, how to train a object detection model

Id: X9554zNNtEY

Length: 15min 1sec (901 seconds)

Published: Mon Dec 18 2023