In today's tutorial we are going to run an experiment: we are going to find out how much data we
need to train a machine learning model. There is a huge misconception in machine learning which says
that the more data you use to train a model, the better. That is not true, because the more data you
use to train a model, the more time you spend training it, and the more time you spend annotating and
curating the data. More data makes the entire process way more expensive. It's not about maximizing
the amount of data you use to train a model; it's actually the other way around: you want to minimize
it. Obviously you want to achieve a very good performance, a very good accuracy, when you train a
machine learning model, but for a given threshold of performance you want to minimize how much data
you use to train the model.

Now let me give you more details about the experiment we will be doing today. I
have already prepared all the datasets. You can see them over here: each one of these directories is
a different dataset, and the difference between them is how many images we have in each one. We are
going to train an object detector with each one of these datasets using yolov8, and then we are going
to compare their performances. For example, the one which is called 10 has exactly 10 images, the one
called 50 has 50 images, the one called 100 has 100 images, and so on. Then we have these other
datasets, which are comprised of 200, 500, 1,000, 2,000 and 4,000 images.

We are just going to use all the default parameters. As you can see over here, the only parameter we
are going to specify is the number of epochs, which we are going to set to 20. So we are going to
train each one of these object detectors for 20 epochs, we are going to take the model produced at
the end of the training process, at the end of the 20 epochs, and then the only thing we are going to
do is compare the performances of all these models. This is exactly the experiment we will be doing
today. This tutorial is about showing you this experiment and its results; it's not really about
showing you the whole training process, how to train an object detector using yolov8 on a custom
dataset. If you want to know how I trained each one of these object detectors, I invite you to take a
look at one of my previous videos, where I show you the entire process of training an object detector
using yolov8 on a custom dataset. In that video I show you every single detail involved in this
process: how to annotate the data, how to train the model, how to evaluate the performance of the
model and so on. But for now let's continue. Now let me show you the data I used to train this
model. In each one of these cases we are going to train an object detector to detect ducks. This is
the data we will be using today: you can see we have many images of ducks, and this is the data we
are going to use to train all these object detectors. For each one of these datasets, the only thing
I did was take a random sample of the images you can see in this directory. For example, for the
dataset which is comprised of only 10 images I took 10 images at random from this directory, for the
dataset which is comprised of 50 images I took 50 images at random from this directory, and so on.

This is a dataset I downloaded from the Google Open Images Dataset version 7, which is an amazing
dataset with a lot of images, a lot of categories and millions of annotations you can use to train
your machine learning models. If you want to know how I downloaded this data, I invite you to take a
look at another previous video where I show you every single step of how to download an object
detection dataset from the Google Open Images Dataset version 7. That video is not available on my
YouTube channel; it is available on my Patreon, to all my Patreon supporters. That is all about the
data we are going to use. Let me tell you something else about the experiment.
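
The random sampling described above could be sketched roughly as follows. This is only an illustration, not the script used in the video: the function name `sample_training_set`, the directory arguments, the list of image extensions and the fixed seed are all assumptions, and a real yolov8 dataset would also need the matching label files copied alongside the images.

```python
import random
import shutil
from pathlib import Path

# Hypothetical helper: the name, arguments and extension list are assumptions.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def sample_training_set(pool_dir, out_dir, n_images, seed=0):
    """Copy n_images randomly chosen image files from pool_dir into out_dir.

    Returns the sorted list of copied file names.
    """
    # Sort first so that the same seed always produces the same sample.
    pool = sorted(
        p for p in Path(pool_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    if n_images > len(pool):
        raise ValueError(f"asked for {n_images} images but the pool has {len(pool)}")

    rng = random.Random(seed)
    chosen = rng.sample(pool, n_images)  # sampling without replacement

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in chosen:
        shutil.copy2(src, out / src.name)
    return sorted(p.name for p in chosen)
```

Calling it once per size (10, 50, 100, 200, 500, 1,000, 2,000, 4,000) with a different output directory each time would give one training set per dataset, each drawn independently from the same pool of duck images.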
Remember, we are going to train each one of these object detectors for exactly 20 epochs, we are
going to take the model produced at the end of the training process, and we are going to compute the
performance of this model on a test set which is comprised of 100 images. This is very important: we
are always going to use the same test set of 100 images. We are going to change the dataset we use as
training set, but the test set is always going to be exactly the same, because otherwise the models
are evaluated on different data, their scores are not comparable, and the experiment doesn't make
any sense whatsoever.
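
One way to double-check that every dataset really shares the same test set is to compare the test file names across the dataset directories. The sketch below is an assumption-laden illustration: the helper name `same_test_set` and the `images/test` sub-directory are hypothetical, so adjust them to however your datasets are laid out.

```python
from pathlib import Path


def same_test_set(dataset_dirs, test_subdir="images/test"):
    """Return True if every dataset directory holds the same test file names.

    test_subdir is an assumed layout, not something confirmed by the video.
    """
    name_sets = [
        {p.name for p in (Path(d) / test_subdir).iterdir() if p.is_file()}
        for d in dataset_dirs
    ]
    # Every set of file names must be equal to the first one.
    return all(names == name_sets[0] for names in name_sets)
```

Comparing names only catches missing or extra files; comparing file hashes would be a stricter check if the same name could point to different images.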
Let me show you a few examples. I'm going to open these two directories over here, the one that's
comprised of 10 images and the one that's comprised of 50 images. You can see that the data is already
in the format we need in order to train a model with yolov8. If I open the directory called images in
each one of them, you can see that in the first case the training set is comprised of 10 items and the
test set is comprised of 100 items, and in the other dataset the training set is comprised of 50 items
and the test set is also comprised of 100 items. Now I'm going to open the test set in each one of
these datasets, and you can see that we have exactly the same images in each one of them, because we
are going to use exactly the same 100 images to test the performance of each one of the object
detectors. The only thing we are going to change is the training set; we need to use exactly the same
test data in all cases. So this is exactly the experiment
we will be doing today. Now let me show you the results, because remember this video is not about
how I trained these object detectors; it is only about the experiment and its results. I'm going to
take this script I have over here, which takes the results from the training processes on all these
different datasets and produces a few plots we are going to use to analyze the results of the
experiment. You can see we have two plots over here. One of them is the mean average precision (mAP)
in the last epoch as a function of the dataset size, and I'm talking about the mean average precision
on the test set: the mAP is on the Y axis and the dataset size is on the X axis. The other plot is the
training time as a function of the dataset size: the training time is on the Y axis and the dataset
size is on the X axis. So let's get started analyzing this plot we
have over here. You can see that the mean average precision, the performance, increases as we
increase the dataset size. We start with a mean average precision of around 60%, and for the last
dataset, with 4,000 images, we have a mean average precision of 91.1%. But please notice that,
although the mean average precision keeps increasing, in some cases it is only a very small
improvement. For example, from 50 images to 100 images we have pretty much the same mean average
precision. And if I show you these four models over here, you can see that the mean average
precision, although it increases with the dataset size, is not really increasing that much:
for the dataset with 500 images we have a mean average precision of around 86%, I'm looking at this
number over here, and for the dataset with 4,000 images we have 91%. So it's increasing, but it's only
a small improvement. If we look at the training time as a function of the dataset size, you can see
that the training time grows steeply, roughly in proportion to the dataset size. Let me do exactly
the same as before: I'm going to take these four models and compare their performance with their
training times. You can see that although we have only a small improvement in the mean average
precision, we have a huge increase in the training time. If we take the model we trained with 500
images and the one we trained with 4,000 images, we are improving the mean average precision by only
about 0.05, that is 5 percentage points, because in one case we have a 0.86 mean average precision and
in the other we have 0.91. But if we look at the training time, you can see that it increases by a
factor of seven.
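
The comparison can be written out as a few lines of arithmetic. The numbers below are the approximate values quoted in this part of the video (0.86 and 0.91 mAP, about 500 and about 3,500 seconds of training); the variable names are only illustrative.

```python
# Approximate values read off the plots in the video.
map_500, map_4000 = 0.86, 0.91    # mAP on the test set
time_500, time_4000 = 500, 3500   # training time in seconds

abs_gain = map_4000 - map_500          # absolute gain in mAP
gain_points = abs_gain * 100           # the same gain in percentage points
rel_gain = abs_gain / map_500 * 100    # gain relative to the 500-image model, in %
time_factor = time_4000 / time_500     # how many times longer training takes

print(f"mAP gain: {gain_points:.1f} percentage points ({rel_gain:.1f}% relative)")
print(f"training time: {time_factor:.1f}x longer")
# mAP gain: 5.0 percentage points (5.8% relative)
# training time: 7.0x longer
```

Note that the gain is 0.05 in absolute mAP, which is 5 percentage points, not 0.05%.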
If we take this value over here, which is 500 seconds, and compare it with this other value over
here, which is 3,500 seconds, you can see that it takes seven times longer to train a model with 4,000
images than it takes to train a model with 500 images. We have a small improvement in the mean average
precision but a huge increase in the training time. That's the first conclusion we should take from
looking at these plots. This is very important, because it's not only about achieving the highest
performance; you have to look at many other factors. If you are taking much more time to train the
model and you are only gaining a very small amount of performance, then maybe it doesn't make sense.
You will need to decide in each particular case whether it makes sense or not, but I would say that in
the most generic case it probably doesn't. Now let me show you another
way to evaluate the performance of all these models, which is looking at some results. Remember,
in my previous videos I always told you that the mean average precision and all these metrics are
very important, but at the end of the day the most important thing is to look at how the model
performs on a few images, a few samples, a few videos. So I prepared this video over here, in which we
have many ducks swimming in the water. We are going to use this video to see how each one of the
object detectors we trained performs. This is a very important test. Always remember: look at the
mean average precision, look at all these numbers, but also take a look at how the model performs on a
few videos and a few images, because otherwise the numbers alone can be misleading. These are the
videos I produced with the results of each model, and I'm going to open them one next to the other so
it's much easier to evaluate all of them
at the same time. These are the results. This is the video I produced with the model I trained with
only 10 images, this is the one I produced with the model I trained with 50 images, this is the one
for 100 images, and so on for all the models. You can see that in the first two cases, with the
datasets of only 10 images and of 50 images, the model doesn't perform well at all: we are not
detecting any duck whatsoever. This is the first thing we should notice. With the model I trained
with 100 images you can see we are detecting something. It doesn't work very well, we have many
misdetections and it's not very stable, but at the very least we are detecting something. The model
trained with 200 images also performs okay, with a few misdetections, and the same happens for the
models trained with 500 images and with 1,000 images: it's okay but it's not perfect. The model
trained with 2,000 images performs better, I really like how it performs, and the one I trained with
4,000 images also performs very well. So these are all the results from all the models, and there are
many conclusions we can take from here. The first one is about the two models we trained with 10
images and with 50 images, which are not detecting any duck whatsoever. If we go back to the
performance plot, you can see that the model we trained with 10 images has a 60% mean average
precision and the model we trained with 50 images has something like a 73% mean average precision.
If we looked at the mean average precision by itself we would say it doesn't perform that badly: it's
not perfect, but 60% or 73% looks like an okay performance. But if we look at these videos, you can
see that the 60% and the 73% are completely meaningless in this case, because we are detecting nothing
whatsoever. This is a very important conclusion, and it is one of the reasons why I always tell you:
look at the mean average precision, look at the accuracy, look at all these metrics, but also look at
how the model performs on a few images and a few videos, because otherwise the metrics may be
meaningless. And
there is another very important conclusion. If we look at the video we produced with the model we
trained with 100 images, we can see that it performs okay: it has many misdetections and is not stable
at all, but at the very least we are detecting something. And if we look at the mean average precision
of the model we trained with 50 images and the model we trained with 100 images, it is pretty much the
same in both cases, something like 73%. So we have pretty much the same mean average precision, but
the performance on the video is completely different: with 100 images we are detecting something,
and with 50 images we are not detecting anything whatsoever. The mean average precision is important
to take into consideration, but also take many other things into consideration, because, as this
example shows, the mean average precision by itself does not tell you how the model behaves on your
data. Now let's take a look at this other video again
with all the results. If I were to select the best models based on the performance we have over
here, I would say that the best ones are these two, the model we trained with 2,000 images and the
model we trained with 4,000 images. In these two cases the detections are more stable and we have the
fewest misdetections: we are detecting all the ducks and everything looks very stable. If you ask me,
I don't really see a huge difference between these two videos; I would say they perform pretty much
the same.

Now let's get back to the plots, and focus on the training time as a function of the dataset size. If
we look at the training time of these two models, you can see that the model we trained with 4,000
images took pretty much twice as long to train as the model we trained with 2,000 images. Both of them
perform pretty much the same based on the example I showed you, so in this particular case it seems
it doesn't make sense to train the model with more than 2,000 images: you are not really improving the
performance that much, and you are wasting a lot of time and therefore a lot of money. This is another
very interesting conclusion from these results. This is going to be pretty much all for this video;
this is the experiment I wanted to show you in this tutorial.
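
The idea from the beginning of the video, picking the least data that reaches a given performance threshold, can be sketched in a few lines. The mAP values below are only the approximate ones mentioned in the narration (the values for 200, 1,000 and 2,000 images are not quoted, so they are left out), and the helper name and the thresholds are hypothetical.

```python
def smallest_dataset_meeting(results, threshold):
    """Return the smallest dataset size whose mAP reaches threshold, or None."""
    for size in sorted(results):
        if results[size] >= threshold:
            return size
    return None


# Approximate test-set mAP per training-set size, as quoted in the video.
map_by_size = {10: 0.60, 50: 0.73, 100: 0.73, 500: 0.86, 4000: 0.911}

print(smallest_dataset_meeting(map_by_size, 0.85))  # 500
print(smallest_dataset_meeting(map_by_size, 0.95))  # None
```

As the duck videos show, a number like this is only a starting point: the models with 60% and 73% mAP detected nothing on the video, so the choice should still be confirmed by looking at real predictions.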
Let me know what you think about this video in the comments below, and let me know if you would like
me to make similar videos in the future with other types of experiments. I have ideas for other
experiments we could run in other tutorials. This is going to be all for this video. My name is
Felipe, I'm a computer vision engineer, and see you on my next video.