Hey, my name is Felipe, and welcome to my channel. In this video we are going to work with image
classification and feature extraction: we are going to build an image classifier which internally
uses feature extraction in order to classify images. This is a very advanced computer vision
tutorial, so let's get started.
Now let me show you the data we will be using in today's tutorial. We are going to work with the
same weather dataset we already used in two of my previous videos: the one where I showed you how
to train an image classifier using YOLOv8, and the one where I showed you how to train an image
classifier using Teachable Machine. It is exactly the same weather dataset.
Let me show you quickly what this dataset looks like. We have four different categories: cloudy,
rain, shine and sunrise. This is the cloudy category. If I show you the rain category, you can see
we have many pictures of very rainy days. If I show you the shine category, this is how the images
look, and if I show you the sunrise category, you can see we have many pictures of sunrises.
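To double-check the category folders described above, a small helper can count the images in each one. This is only a sketch: the function name count_images_per_category and the list of image extensions are assumptions made for illustration, and the directory you pass in is whatever folder holds one subfolder per category on your machine.

```python
import os

# File extensions treated as images; an assumption, adjust to your dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_category(root_dir):
    """Return {category_name: number_of_image_files} for one dataset folder."""
    counts = {}
    for category in sorted(os.listdir(root_dir)):
        category_dir = os.path.join(root_dir, category)
        if not os.path.isdir(category_dir):
            continue  # skip stray files next to the category folders
        counts[category] = sum(
            1
            for name in os.listdir(category_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling it on a folder that contains the cloudy, rain, shine and sunrise subfolders should return one count per category, which is a quick way to spot an empty or misnamed folder before training.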
So this is the data we will be using in this tutorial. You can see we have two directories: one of
them is train, where the training data is located, and the other one is called val, which is the
validation data. Now let me show you something else. If I go to my browser, this is the GitHub
repository we are going to use as a feature extractor, because remember, in this tutorial we are
going to work with image classification but also with feature extraction, and this is exactly how
we are going to extract features from our images. This repository is called Image 2 Vec with
PyTorch, and it is available as a Python package. So this is how we are going to extract features
from our images.
This is the repository we are going to be using in this tutorial. Now let's go to PyCharm, because
it's time to start working. First I create a new Python project: I go to File, New Project, and I
select the directory where I want to create this project, which in my case is something like this,
tutorial. I'm going to create this new project using a virtual environment and Python 3.8. I click
on Create, then Create from Existing Sources, This Window, and the Python project has been created.
Now let's install the requirements we need in order to work on this tutorial. I go to File,
Settings, then Project, Python Interpreter, and I click on the plus button. The first library we
are going to install is the one from the repository, which is available as a Python package:
img2vec_pytorch. So let's search for img2vec; this is the one we need, the PyTorch one. I click
here and then Install Package. Then we also need to install two additional libraries. One of them
is scikit-learn, so I search for scikit-learn, select this one over here, and press Install
Package. Then I also need to install Pillow, and that's pretty much all. So my Python packages
are being installed, but in the meantime let's start working on the code for this tutorial. I press
OK, and then I create a new file, main.py. This is the file we are going to use to train our image
classifier, that is, to do the feature extraction and also the image classification. I'm going to
enlarge it a little, and the first thing I'm going to do is write down the four steps we will take
to complete this process. The first one is to prepare the data we are going to use to train this
classifier. Then we are going to train the model, a classifier. Then we are going to test the
performance of this classifier, and finally we are going to save the model to our computer. So
these are the four steps of today's tutorial. Now let's get started. The first thing we need to do
is the import: from img2vec_pytorch import Img2Vec. Then we create img2vec = Img2Vec(), a new
object which will be the feature extractor we are going to use to extract features from our
images. Now let's prepare the data.
So I define a new variable, data_dir, which is the directory where my data is located. My data is
in the current directory, the directory where I created this project and where the main.py file we
are working on lives: it's in a directory called data, and within it there is another directory
called weather dataset. So I define data_dir as data, weather dataset. Then let's create two
additional variables. One of them is train_dir, which is where my training data is located:
os.path.join(data_dir, 'train'). The other is val_dir, the validation directory, which is
os.path.join(data_dir, 'val'). I also need to import os, otherwise this is not going to work, so:
import os. And that's pretty much all. Remember, within weather dataset we have two directories,
one called train and the other called val. Now let's continue. Now let's walk through all the
files in these two
directories, and this is how we are going to do it: for dir in [train_dir, val_dir], and inside it,
for category in os.listdir(dir). We are iterating over each one of the categories. Then we make
another loop, which is for image_path in os.listdir(os.path.join(dir, category)), so we are
iterating over all the images within this category directory. In other words, we iterate over all
the categories, each one of the four directories we have over here, and for each of these
directories we iterate over all the images you can see here. Now we define another variable,
image_path_, which is os.path.join(dir, category, image_path). Then we make another import, from
PIL import Image, because we are going to read each one of these images. We load each image like
this: image = Image.open(image_path_).
Now we are going to compute the features from this image, and this is how we do it: we call
img2vec.get_vec and we pass in the image. That's it, this is all it takes to compute the features
from this image.
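The two calls just described, creating the extractor and asking it for a feature vector, can be sketched as follows. The helper names make_extractor and extract_features are made up for this example, and the conversion to RGB is an added precaution that is not part of the video; Img2Vec and get_vec are the names provided by the img2vec_pytorch package.

```python
from PIL import Image


def make_extractor():
    """Create the Img2Vec feature extractor (pretrained weights are fetched on first use)."""
    # Imported lazily so the helper below can be tried with any object
    # that offers a compatible get_vec method.
    from img2vec_pytorch import Img2Vec

    return Img2Vec()  # package defaults; pass arguments to change model or device


def extract_features(extractor, image_path):
    """Open one image and return its feature vector."""
    # Converting to RGB is a precaution (an assumption, not from the video):
    # grayscale or RGBA files would otherwise not have the three channels
    # the backbone expects.
    with Image.open(image_path) as image:
        return extractor.get_vec(image.convert("RGB"))
```

With the package installed, something like features = extract_features(make_extractor(), image_path) should return a one-dimensional NumPy array for the image, whose length depends on the chosen model.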
The result is going to be called something like features. Now I am going to create two lists, and
I'm going to do it over here: one of them will be features and the other one will be labels. And
for this object, maybe it's a better idea to call it something like image_features. Now we append:
features.append(image_features), and then labels.append, where the label is the category.
These are our labels: each one of these directory names is one of our four categories, so in order
to append the label we need to append the directory name, that is, this variable over here which
is called category. Now let's continue. I'm going to define another variable, data, which is going
to be a dictionary. And I'm going to do something like this: I am going to iterate not only over
the directories but also over an index, so the loop becomes for j, dir in
enumerate([train_dir, val_dir]). Now dir is going to be train_dir in the first iteration and
val_dir in the second iteration, and j will be 0 in the first iteration and 1 in the second. By
doing so, we can do something like this.
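As a quick illustration of what enumerate produces here, consider the tiny sketch below. The two path strings are placeholders, and the loop variable is named split_dir instead of dir only because dir is also the name of a Python built-in function.

```python
train_dir = "train"  # placeholder paths, only for illustration
val_dir = "val"

# enumerate yields (index, item) pairs, starting at index 0.
pairs = list(enumerate([train_dir, val_dir]))

for j, split_dir in pairs:
    print(j, split_dir)
# prints:
# 0 train
# 1 val
```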
Now I write data[['training_data', 'validation_data'][j]] = features, and then the same with the
labels: data[['training_labels', 'validation_labels'][j]] = labels. Remember that in the first
iteration dir is the training directory and j is 0, so if we access index 0 of this list we get
training data; in the second iteration j is 1, so we access the second element, which is
validation data. So we create the key according to the iteration we are in, and the same happens
for the labels. But to be clearer, let me show you exactly what this dictionary looks like. I'm
going to execute the code as it is, and at the end I add something like print(data.keys()). Let's
see what the keys of this dictionary are. I press play. Okay, the execution is now completed, and
these are the keys we have saved in this dictionary: training data, training labels, validation
data and validation labels. I invite you to go through this code again and again until it is 100%
clear to you what we are doing here. Please notice we are iterating over all the images, all the
training images and all the validation images; we are opening these images, we are computing the
features, and then we are saving this data into these two lists, and then we are saving these
lists in this dictionary under the appropriate keys. But for now let's continue: we have saved all
of our data into this dictionary.
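Putting the data preparation step together, a minimal sketch could look like the function below. It follows the loops described above, but several details are assumptions: the function name build_dataset, passing the extractor in as a parameter, the exact key spellings with underscores, sorting the directory listings, and the RGB conversion. The features and labels lists are created fresh for each split so that training and validation data do not get mixed.

```python
import os

from PIL import Image


def build_dataset(data_dir, extractor):
    """Extract features for every image under data_dir/train and data_dir/val."""
    data = {}
    splits = [os.path.join(data_dir, "train"), os.path.join(data_dir, "val")]
    for j, split_dir in enumerate(splits):
        features, labels = [], []  # fresh lists for each split
        for category in sorted(os.listdir(split_dir)):
            category_dir = os.path.join(split_dir, category)
            for file_name in sorted(os.listdir(category_dir)):
                image_path = os.path.join(category_dir, file_name)
                with Image.open(image_path) as image:
                    image_features = extractor.get_vec(image.convert("RGB"))
                features.append(image_features)
                labels.append(category)  # the folder name is the label
        # j selects the key for the current split: 0 -> training, 1 -> validation
        data[["training_data", "validation_data"][j]] = features
        data[["training_labels", "validation_labels"][j]] = labels
    return data
```

Here extractor is any object with a get_vec method, such as the Img2Vec instance created earlier in main.py, and data_dir is the folder that contains the train and val directories.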
Now it's time to train the model. In order to train this model, let's go to my browser quickly,
because we are going to use a classifier from scikit-learn, and let me show you all the different
classifiers we could use. I also invite you to take a close look at this website so you become
more familiar with all the different classifiers scikit-learn offers. In this tutorial we are
going to use the random forest classifier, this one over here, but we could also use any other
classifier from this list. The only reason we are going to use a random forest classifier is that
I like it: I have used it many times in many projects, I think it's very robust, it's very easy to
use, and it's very easy to understand what's going on. But that's the only reason; we could also
use any other classifier from here. So let's go back to PyCharm.
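To make the point about interchangeable classifiers concrete, here is a small sketch on made-up two-dimensional vectors standing in for real image features. The two alternatives, LogisticRegression and KNeighborsClassifier, are simply examples picked for this illustration, not classifiers used in the video; what matters is that every scikit-learn classifier is trained and queried with the same fit and predict calls.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Tiny made-up feature vectors standing in for real image features.
X_train = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 5.0], [5.1, 5.2]]
y_train = ["cloudy", "cloudy", "cloudy", "rain", "rain", "rain"]
X_new = [[0.1, 0.1], [5.1, 5.1]]

candidates = {
    "random_forest": RandomForestClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=3),
}

predictions = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)  # same call for every classifier
    predictions[name] = list(model.predict(X_new))
    print(name, predictions[name])
```

Swapping the classifier in the tutorial would therefore only change the import and the line that creates the model.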
Now I make another import, from sklearn.ensemble import RandomForestClassifier, and I define
another variable called model: model = RandomForestClassifier(). We are creating a new instance of
this classifier. Then, in order to train the model, the only thing we need to do is call model.fit
and pass in the training data, data['training_data'], and the training labels,
data['training_labels']. That's all it takes to train this classifier; you can see how simple this
is. Now let's continue: it's time to test the performance of the image classifier we have just
trained. To do so I make another import, from sklearn.metrics import accuracy_score. Then I call
model.predict(data['validation_data']), since we want our validation data, which is saved here,
and the result will be something like y_pred. Then we call accuracy_score, passing in y_pred and
data['validation_labels']. So we are computing the accuracy score with the validation data, which
is completely unseen data for our classifier. We trained our classifier with the training data, so
it has never seen the validation data before, and this is exactly the type of data we should use
to compute the accuracy score, to test the performance of our model.
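The training and testing steps above can be sketched as one small function. The name train_and_score, the dictionary key spellings and the random_state argument (added so that repeated runs give the same forest) are assumptions for this example; RandomForestClassifier, fit, predict and accuracy_score are the scikit-learn names mentioned in the walkthrough.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score


def train_and_score(data, random_state=0):
    """Fit a random forest on the training split and score it on validation."""
    model = RandomForestClassifier(random_state=random_state)
    model.fit(data["training_data"], data["training_labels"])

    y_pred = model.predict(data["validation_data"])
    # accuracy_score is documented as (y_true, y_pred); accuracy itself
    # does not depend on the order of the two arguments.
    score = accuracy_score(data["validation_labels"], y_pred)
    return model, score
```

The returned score is a fraction between 0 and 1, so an accuracy of 94.4 percent would show up as roughly 0.944.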
So this is going to be something like score, and now I print score. Then we are going to save the
model, but let's take it one step at a time: let's execute this script and see if everything is
okay. The execution is now completed, we didn't get any error, and this is the accuracy: we got
94.4 percent accuracy, which is a very high level of accuracy. Now remember, I told you this is
exactly the same dataset we used in two of my previous videos, the one where I showed you how to
train an image classifier using YOLOv8 and the one using Teachable Machine. So I invite you to
take a look at those two videos and compare the accuracy we are getting here with the accuracy we
got there, using YOLOv8 and using Teachable Machine. That's your homework
from today's tutorial. Now let's continue. The only thing we need to do now is move on to the next
step, which is saving the model. This is how we are going to save it: we import another library,
pickle, and we write something like with open('model.p', 'wb') as f, and inside we call
pickle.dump(model, f), where model is the object we are going to save. And that's pretty much all.
Then I close the file, and I run this file again. Now if we go to my file system, you can see that
we have a file called model.p, and this is the model we have just saved. So everything is working
just fine. Now let me show you how to take this model, how
to load this model to produce inferences with it. I'm going back to PyCharm and I create a new
file called infer.py, and this is where we are going to make our inference. I take a few lines
from here: I import this object, Img2Vec, and also this other object from PIL, Image. Then it is a
very similar process. I create this object over here, img2vec, which is the object we use to
compute our features, and then I define an image path, which will be one of my images in the
validation data. I just take a random image in this directory, something like cloudy 4.jpg. Then I
do something like image = Image.open(image_path), and I compute the features from this image with
features = img2vec.get_vec(image). Then I load the model, or maybe I can load the model over here:
with open('model.p', 'rb') as f, and model = pickle.load(f). Obviously we need to import pickle,
otherwise this is not going to work: import pickle. And that's pretty much all.
Now we call model.predict and we pass in the features, wrapped in a list because predict expects a
batch of samples, and the result will be something like prediction. Now let's print prediction and
see what happens. This is the result we are getting: cloudy, which is exactly this image's
category. Remember, we took this image from the cloudy directory, and all the images in this
directory belong to the cloudy category.
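The saving step from main.py and the inference steps from infer.py can be sketched together like this. The helper names save_model, load_model and predict_category are made up for the example, and the RGB conversion is again an added precaution; model.p is the file name used in the walkthrough.

```python
import pickle

from PIL import Image


def save_model(model, path="model.p"):
    """Serialize a trained model to disk."""
    with open(path, "wb") as f:  # the with block closes the file for us
        pickle.dump(model, f)


def load_model(path="model.p"):
    """Load a model saved with save_model. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_category(model, extractor, image_path):
    """Return the predicted category name for a single image."""
    with Image.open(image_path) as image:
        features = extractor.get_vec(image.convert("RGB"))
    # predict expects a batch (2D input), so wrap the single vector in a list.
    return model.predict([features])[0]
```

In infer.py this would be used as predict_category(load_model(), img2vec, image_path), where img2vec is the Img2Vec instance and image_path points to one of the validation images.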
Everything seems to be working just fine. This is going to be all for this video. My name is
Felipe, I'm a computer vision engineer. If you enjoyed this video, I invite you to click the like
button and to subscribe to my channel. This is going to be all for today, and see you in my next
video.