[MUSIC PLAYING] PAUL RUIZ: Hey, all. I'm Paul Ruiz, an engineer
with the developer relations team on Google ML. And this is the ML on Raspberry
Pi with MediaPipe series, where you will learn about the
basics of machine learning, along with how you can
use Google's newest on-device machine learning
tool, MediaPipe, to add useful features to your
own Raspberry Pi apps. In the last video, which I have
linked in the video description below, you learned how
to use object detection on your device using a
custom model for tracking pieces of a toy. In this video, you
will learn about how to create your own object
detection model that you can deploy to your Raspberry Pi. This video is going to focus
on a really high-level approach to training. So your results will
vary, as there are a lot of different things to
be aware of for optimizing and refining a model. But that's definitely
outside of our scope today. So I think the goal here
is just to make sure that you have enough
of an understanding to be able to train your
own model specifically for prototyping ideas
for your IoT device. So let's go ahead
and get started. The first, and most important,
step in training a new model is collecting good data. In this example, I'll be using
about 300 pictures that I took of these toy pieces in
different orientations, some partially off
screen, and others hidden behind other objects. One tip here is that
you want your data to be as close as you can
get it to what you would see in a real-world environment. I would also recommend
the site Kaggle for finding a great selection
of images and other data that you can use when training
your machine learning models if you don't have access
to your own custom data. After you have gathered all
of the pictures representing the objects that
you want to detect, it's time to start
labeling them. While there's a variety of
tools out there for this step, I'm going to use
an open source one called Label Studio,
which I'll link to in the video description. Though there are always
advantages and disadvantages to any sort of tooling
that's out there. So definitely feel
free to use whatever is going to work best for you. There are a few different
ways to install Label Studio on your computer. So I'll defer that step to
their own documentation. Once you have it
installed and running, it's time to create
a new project. Go ahead and click on
the blue Create icon at the top-right corner. On the next screen, you
can name your project. It's worth noting
that you're actually going to need two
projects, one for training, and one for validation. Training data, as
you might guess, is the data that you'll use
for training your new model. Validation data is
how you can track the accuracy of your
training process using images that
the model has not seen before during training. We'll start by working
with our training data. So make sure you
name your project so that you know that this is
where you have your training data stored. After you have
named your project, head over to the
Labeling Setup screen and select the Object Detection
with Bounding Boxes template. This will open up
a new screen that lets you customize a few
features specifically to annotate object
detection data. Go ahead and remove
the two default labels. Then head over to the
Add Label Names text box. The very first item you will
add is one called background. This is required by
the COCO data format that we're going to
use for training. So make sure you don't miss it. After you have the background
label written down, you can start adding
your additional labels within that field with each
being on their own line. In this case, I'm going to
have five different toy pieces. So I'll add circle, rectangle,
triangle, square, and pentagon. Once you've added those labels,
click on the Save button. This will bring you to a screen
where you can import your data. So let's take a moment
to really understand how we want to separate our data. Even though we're going to start
with a project for training, we'll actually spend this step
separating out our validation data so it doesn't get mixed
in with our training data. This is an incredibly
important step because you don't want your
model validating its accuracy against data that it's
already trained with, as this would be
like taking a test after you've already
seen all of the answers. Here, you can see that
I have all of the images that I've taken
of the toy pieces. For this example, I'm
going to take roughly 10% of the images, or 30,
to run validation. This step will vary a lot
depending on your own needs and decisions when
training a model. But for this quick prototype,
it'll do what I need. So I'm going to create a new
folder called validation, where I'll place those test
images so they're separated from my training data. To make sure that I have a
diverse selection of images representing the
pieces for validation, I've already selected
the few pictures that contain all of the
pieces, as well as a collection of images that
only contain one type of piece. Now that the validation
is separated out into its own folder,
everything else should be used for training. Getting back to Label Studio,
click on the Import button. Then drag every training
image into the project. Depending on how
many images you have, you might need to do this a few
times with a smaller selection because there is an import
limit of 100 pictures. For each import step,
once everything is loaded, click on the blue Import
button in the top-right corner. When you're done
importing your images, it's time to start the long
process of annotating them. Unfortunately, this is part
of the collecting good data step, which, as you may
remember, is the most important step in training. So you're going to
want to take your time. Let's get started by
clicking on the first item in the list of imported images. On this next screen,
you'll want to select the label for each object
that you want to detect. In this first image, I'll
hit the 2 key on my keyboard to select circle. And then I'll highlight
the circle toy piece. You'll want to make
sure that this is as accurate as you can get it. So if you need to
make any adjustments, click on the square to
change the size and location. Then you can hit the Escape
key when you're done with it. After labeling the item, click
on the blue Submit button. Then move on to the next image. Because this is a long
and tedious process, I'm going to pause right here
so I can do my annotations. Then I'll come back
once they're done. If you're following
along with this tutorial, feel free to pause the video
here, grab a snack and a drink, put on a YouTube video,
and good luck as you start annotating your data. Don't forget to double check
your annotations at the end to verify that
your data is good. [MUSIC PLAYING] All right. Hopefully, that all
went well for you and you have a whole
lot of annotated images, and you're ready to move on. Coming back to the main
screen for Label Studio, it's time to click on the Export
button in the top-right corner. This will give you a
variety of options to select from for the export format. But we're going to
select COCO since it's one of the most commonly used
formats for object detection. This will take a moment
after clicking on Export. But then you should end up with
a new zip file on your computer that contains the labels and
images for your training data. So here's the tricky part. Let's go into that folder and
open the result.json file. This is going to
be a long file that maps your images to their
annotations in a usable way. While I'm using
Sublime Text here, you may have a different
text editor. Either way, open the Find option and
search for the word circle. You might remember that I said
the background item should be item zero. But here, you can
see the order is different from the
order in which you added your labels in Label Studio. This is because Label Studio
uses an alphabetical order for the category IDs with
capital letters coming before lowercase letters. What you will want
to do is change the ID number for
background to zero, then change the ID
for circle, which is currently zero, to
five so that they're swapped. This isn't my favorite step. But like I mentioned
earlier, there are advantages and disadvantages
to any tooling. This is just something
to be aware of when you're working with this one. After you've swapped
the categories here, you will want to do
a find and replace to change any annotations that
have a category_id of zero so that they're now
five, matching up to the new value of circle. All right. So there's one more step
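The manual swap described above can also be scripted. The following is a minimal sketch, assuming the export is a standard COCO JSON file with `categories` and `annotations` lists. The function names `swap_category_ids` and `fix_label_file` are hypothetical helpers for illustration and are not part of Label Studio or MediaPipe.

```python
import json
from pathlib import Path


def swap_category_ids(coco: dict, first_label: str = "background") -> dict:
    """Return a copy of a COCO dict in which `first_label` has category ID 0.

    Whatever category currently holds ID 0 receives the old ID of
    `first_label`, and every annotation is updated to match.
    """
    ids_by_name = {c["name"]: c["id"] for c in coco["categories"]}
    if first_label not in ids_by_name:
        raise ValueError(f"label {first_label!r} not found in categories")

    old_id = ids_by_name[first_label]
    # Two-way mapping: old_id becomes 0, and 0 becomes old_id.
    remap = {old_id: 0, 0: old_id}

    categories = [
        {**c, "id": remap.get(c["id"], c["id"])} for c in coco["categories"]
    ]
    categories.sort(key=lambda c: c["id"])

    annotations = [
        {**a, "category_id": remap.get(a["category_id"], a["category_id"])}
        for a in coco.get("annotations", [])
    ]
    return {**coco, "categories": categories, "annotations": annotations}


def fix_label_file(path: str) -> None:
    """Rewrite a COCO JSON file in place with the swapped IDs."""
    label_path = Path(path)
    coco = json.loads(label_path.read_text())
    label_path.write_text(json.dumps(swap_category_ids(coco), indent=2))


# Small in-memory example mirroring the situation in the video:
# circle currently has ID 0 and background has ID 5.
example = {
    "categories": [
        {"id": 0, "name": "circle"},
        {"id": 5, "name": "background"},
    ],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 0, "bbox": [10, 20, 30, 40]},
    ],
}
fixed = swap_category_ids(example)
print(fixed["categories"])
print(fixed["annotations"])
```

Doing the swap in one pass avoids the ordering problem that a plain find and replace can run into, where an ID that was just changed gets changed again by the next replacement.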
while we have this file open. If you scroll back up
to the top of the file, you can see that we have
an item called file_name within each image object. This points to all of your
images under the images folder. That said, when we
get to Model Maker, it won't expect that
additional images path because it just assumes
that's where the images are. So let's go ahead and remove it
with another find and replace. At this point, you can save
and close the result.json file. Getting back to
Label Studio, it's time to annotate
your validation data. I'm sorry. But at least this
should go a lot faster. Go through the same steps of
creating an object detection project, making sure that
you type in your labels exactly like you did
with your training data. Then import the
validation photos that you separated out earlier. Once you're done
annotating those images, export the validation
data in the COCO format. Next, you'll want to do similar
steps with the new result.json file to make sure the background
category is set to ID zero. Then, double check that
the other IDs and labels in your validation data
match up to the information in your training data. They should, since
the category IDs are alphabetical with
capital letters coming first. But it's always good
to verify in case there was a typo in
your validation labels. If they don't match, do some find and
replace operations to fix that. If your validation IDs don't
match during the training phase, then your model will
simply not work correctly. Also, don't forget to remove the
images part from each file_name path. Congratulations. You should now have
all of the data you need for training a
new object detection model in MediaPipe Model Maker. Now, it's time to switch over
to Colab to do the training. I'll also provide a link
to this Model Maker program in the description below. The first thing you'll
want to do in Colab is to create a new
runtime and connect to it. For this example, I'm going to
start a runtime with a GPU to make things go a little bit
faster, though I've also tested this with the free CPU option. After your runtime
has started, you will want to place your data
into the Colab file directory. Go to the file section
on the left of the screen and create a new directory. For this example, I
just called mine blocks. Under that, I created
two more directories called train and validation
which, as you might guess, is where you'll put your
training and validation data. Finally, create a folder named
images under both of those. Now you get the fun job of
copying all of that data that you've collected into
the Colab environment. Everything from your train
project in Label Studio should go under
the train folder. And everything from
your validation project should go into your
validation folder. Something to pay
attention to here is that Colab only lets you
upload so many items at a time. So you will need to copy over
your images in smaller batches. You will also need to
rename the result.json file to labels.json to match what
Model Maker is expecting. After you've copied
everything over, you should have a folder
structure that looks like this. With the data in
its proper place, it's time to work through
the actual training script. You'll start by installing
the MediaPipe Model Maker dependencies. Once everything is
finished installing, you will import
everything you need for this task, including
the object detector tool in MediaPipe Model Maker. After that, you will create
values pointing to the paths with all of your new data. Then do a quick
check by displaying all of their category
IDs and names. From there, you can create
data sets from everything that you've already uploaded. Then display their sizes to
make sure everything just looks correct. In this case, I'm
seeing the 285 images that I annotated for
training and the 31 that I set aside for validation. The next step is to
create the options that you'll use during training,
including the underlying model that will
be used as a base, and the path where you will
save the newly trained model. The hparams value
here is also something that you can use for optimizing
your training process. So I'll include a link
to more information in the video description. After that, it's time for
the actual training phase. For this example,
the training is going to happen
over 30 iterations, also known as epochs,
with each iteration going through 35 steps. This is going to take a while. When I went through
this process, it took about an hour
and a half to complete when using Colab's CPU
runtime, and 35 minutes when using a GPU runtime. So as much as I'm sure you all
would love to just hang out watching the training
happen in this video for that entire
duration, I'm going to take this opportunity
to pause recording and come back once we're
ready for the next step. [MUSIC PLAYING] Great. Now that the model
is done training, you can use this
code to convert it into a TensorFlow Lite file, then
download it onto your computer. Once the model is downloaded,
it's time to actually test it. Rather than transferring
it to your Raspberry Pi, let's take a look
at MediaPipe Studio, which is a web
tool that lets you load your own models
for MediaPipe Tasks to test them in a
simplified environment. This is especially
useful when you want to make sure your model
works as expected before using it in your own apps. I'll post the link to
the object detection page in MediaPipe Studio below. Under the Model Selection
dropdown, scroll to the bottom and select Choose a Model File. This will let you select
the new object detection model that you just created. Once that's loaded, you can
either use your computer's webcam or open up an image
to run object detection and test your results. Here, you can see that I'm
able to hold up multiple toy pieces to have them classified
and tracked using my new object detection model. One thing worth mentioning
is that these are actually a set of pieces that I
separated out from those that I used in my original data set. That way, the model has never
seen these specific pieces before. While we're here, it's important
to know that the model might not work exactly like you'd
expect since we skipped over a lot of the
fine-tuning techniques to create more of a prototyping
model than anything else. When I did this
the first time, I realized that I took all
of my training images against a white background. So my model had a
lot of difficulty when I held pieces up
without that background. It's totally fine. Anytime something doesn't go as
expected, it's just experience. And you'll do better
the next time. The beauty of
MediaPipe Studio is that it helps you debug those
issues a bit more easily. All right. So that was a lot. In this video, you learned
about how you can take data that you want to use
for object detection, annotate it, train a new model
using MediaPipe Model Maker, and test that model out
using MediaPipe Studio. Now that you know how
to do all of that, let's go back to working
with a new Raspberry Pi example with MediaPipe Tasks. And like always,
we're excited to see all the cool things you make
with MediaPipe on the Raspberry Pi. So let us know in the
comments what you've made or what you want to make. And I'll see you
in the next video. [MUSIC PLAYING]
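The training script is described in the video but does not appear in this transcript. What follows is a minimal sketch of those steps, assuming the `mediapipe-model-maker` package has been installed in Colab (for example with `pip install mediapipe-model-maker`) and that the data sits under a `blocks` directory as described earlier. The function names `train_detector` and `steps_per_epoch`, the export directory, the cache directories, the choice of base model, and the batch size of 8 are all assumptions for illustration and are not taken from the video.

```python
def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of full batches in one pass over the training images."""
    return num_images // batch_size


def train_detector(
    train_dir: str = "blocks/train",
    validation_dir: str = "blocks/validation",
    export_dir: str = "exported_model",
    epochs: int = 30,
    batch_size: int = 8,
):
    """Train and export an object detector with MediaPipe Model Maker."""
    # Imported here so this file can be loaded without Model Maker installed.
    from mediapipe_model_maker import object_detector

    # Each folder is expected to hold labels.json and an images directory.
    train_data = object_detector.Dataset.from_coco_folder(
        train_dir, cache_dir="/tmp/od_data/train"
    )
    validation_data = object_detector.Dataset.from_coco_folder(
        validation_dir, cache_dir="/tmp/od_data/validation"
    )
    print("train size:", train_data.size)
    print("validation size:", validation_data.size)

    # Assumed settings; the video only states 30 epochs of 35 steps each.
    hparams = object_detector.HParams(
        export_dir=export_dir, epochs=epochs, batch_size=batch_size
    )
    options = object_detector.ObjectDetectorOptions(
        supported_model=object_detector.SupportedModels.MOBILENET_MULTI_AVG,
        hparams=hparams,
    )

    model = object_detector.ObjectDetector.create(
        train_data=train_data,
        validation_data=validation_data,
        options=options,
    )

    # Writes a TensorFlow Lite model file into export_dir.
    model.export_model()
    return model


# With 285 training images and an assumed batch size of 8:
print(steps_per_epoch(285, 8))
```

Under the assumed batch size of 8, 285 training images give 35 full batches per epoch, which is consistent with the 35 steps mentioned in the video, though the video does not state the batch size. Calling `train_detector()` in a Colab cell would then run the training and leave the exported model in the `exported_model` directory, ready to download.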