Hey everyone, and welcome to this Python and OpenCV course. In this course, we'll be talking about everything you need to know to get started with OpenCV in Python. We're going to start off with the very basics, that is, reading images and video, manipulating those media files with image transformations, and how to draw shapes and put text on those files. Then we're going to move on to the more advanced parts of OpenCV, that is, switching between color spaces, bitwise operators, masking, histograms, edge detection and thresholding. And finally, to sum things up, we'll be talking about face detection and face recognition in OpenCV, so how to detect and find faces in an image and how to recognize them using inbuilt methods. In the last video, we'll be building a deep computer vision model to classify between the characters in The Simpsons based off some images. All material discussed will be available on my GitHub page, and all relevant links will be put up in the description below. If that sounds exciting, don't forget to head over and subscribe to my channel. And I'll see you in the course.
Hey everybody, and welcome to this Python and OpenCV course. Over the next couple of videos, we're going to be talking about using the OpenCV library to perform all sorts of image- and video-related processing and manipulations. Now, I won't be delving into what OpenCV really is, but just to be brief: it is a computer vision library that is available in Python, C++ and Java. Computer vision is an application of deep learning that primarily focuses on deriving insights from media files, that is, images and video. Now, I'm going to assume that you already have Python installed on your system, and a good way to check this is by going to the terminal and typing python --version. Make sure you're running a version of Python that is at least 3.7 or above; what we do in this course won't really work in some older versions of Python, and especially not Python 2, so just make sure that you have the latest version installed. Go ahead to python.org and download the latest version from there. Now, assuming that you've done this, we can proceed to installing the packages we require in this course.
The first one is OpenCV. So go ahead and do a pip install opencv-contrib-python. Now, sometimes you may find people telling you to install just opencv-python. Well, opencv-python is basically the main package, the main module, of OpenCV; opencv-contrib-python includes everything in the main module, as well as the contribution modules provided by the community. So this is the one I recommend you install, as it includes all of OpenCV's functionality. You may also notice that OpenCV tries to install the NumPy package. Now, NumPy is kind of a scientific computing package in Python that's extensively used in matrix and array manipulations, transformations, reshaping and things like that. We'll be using NumPy in some of the videos in this course, but don't worry if you've never used it before; it's simple and relatively easy to get started with. Now, the next package I'd like you to install is caer. So go ahead and do pip install caer. Now, slight disclaimer: this is a package that I built to basically help you speed up your workflow. Caer is basically a set of utility functions that will prove super useful to you in your computer vision journey; it has a ton of helper functions that will help speed up your workflow. Although we're not going to be using this for a good part of this course (in fact, we'll only begin to use it in the last video, when we're building a deep computer vision model), I recommend you install it now so that you don't have to worry about the installation process later on. If you're interested in contributing to this package, or simply want to explore the codebase, I'll leave a link to its GitHub page in the description below.
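As a quick sanity check (a minimal sketch, assuming both pip installs above succeeded, and that both packages expose the usual __version__ attribute), you can verify that everything imports cleanly:

```python
# Verify the installs; a clean import means both packages are ready to use
import cv2 as cv
import caer

print(cv.__version__)
print(caer.__version__)
```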
Okay, that's it for this video. In the next video, we'll be talking about how to read images and video in OpenCV. So I'll see you guys in the next video.

Hey everybody, and welcome back to another video. In this video, we're going to be talking about how to read images and video in OpenCV. So I have a bunch of images in this photos folder, and a couple of videos in this videos folder. In the first half of this video, we'll be talking about how to read in images in OpenCV, and towards the end we'll actually be talking about how to read in videos. So let's start off by creating a new file and call it read.py. And the first thing we have to do is actually import cv2 as cv. The way we read in images in OpenCV is by making use of the cv.imread method. Now, this method basically takes in a path to an image and returns that image as a matrix of pixels. Specifically, we're going to be trying to read this image of a cat here, so we're going to say 'photos/cat.jpg', and we're going to capture this image in a variable called img. Now, you can also provide absolute paths, but since this photos folder is inside my current working directory, I'm going to reference those images relatively. Once we've read in our image, we can actually display it by using the cv.imshow method. This method basically displays the image in a new window. The two parameters we need to pass into this method are the name of the window, which in this case is going to be 'Cat', and the actual matrix of pixels to display, which in this case is img. And before we actually move ahead, I do want to add an additional line: cv.waitKey(0). Now, cv.waitKey is basically a keyboard binding function: it waits for a specific delay, or time in milliseconds, for a key to be pressed. So if you pass in zero, it basically waits for an infinite amount of time for a keyboard key to be pressed. Don't worry too much about this; it's not really that important for this course, but we will be discussing some parts of it towards the end of this video. So let's actually save this and run it by saying python read.py, and the image is displayed in a new window. Cool. Now, this was a small image, an image of size 640 by 427. Next, we're going to try and read in this image of the same cat, but a much larger version, a 2400 by 1600 image. So we're going to say 'cat_large.jpg'. Let's save that and run. And as you can see, this image goes way off screen. The reason for this is that the dimensions of this image are far greater than the dimensions of the monitor that I'm currently working on. Currently, OpenCV does not have an inbuilt way of dealing with images that are far larger than your computer screen. There are ways to mitigate this issue, and we'll be discussing them in the next video, when we talk about resizing and rescaling frames and images. But for now, just know that if you have large images, they will possibly go off screen.
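Putting the pieces from this half of the video together, a minimal read.py might look like this (the file names are the ones used in the video, so treat the paths as placeholders for your own images):

```python
import cv2 as cv

# Read an image in as a matrix of pixels (path is relative to the working directory)
img = cv.imread('photos/cat.jpg')

# Display the image in a new window named 'Cat'
cv.imshow('Cat', img)

# Wait indefinitely for a key to be pressed before the window closes
cv.waitKey(0)
```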
So that's it for reading images; we can then move on to reading videos in OpenCV. So let's call this 'reading videos'. What we're going to do is actually read in this video of a dog, and the way we read in videos is by creating a capture variable and setting this equal to cv.VideoCapture. Now, this method either takes an integer argument like 0, 1, 2, 3, etc., or a path to a video file. You would provide an integer argument like 0, 1, 2 or 3 if you are using your webcam or a camera that is connected to your computer. In most cases, your webcam would be referenced by using the integer 0; but if you have multiple cameras connected to your computer, you could reference them by using the appropriate argument. For example, 0 would reference your webcam, 1 would reference the first additional camera connected to your computer, 2 would reference the second camera, and so on. But in this video, we'll actually be looking at how to read already existing videos from a file path. Specifically, we'll be reading this video of a dog here, and the way we do that is by providing the path, so 'videos/dog.mp4'. Now, here's where reading videos is kind of different from reading images: in the case of reading videos, we actually use a while loop and read the video frame by frame. So we're going to say while True, and the first thing we want to do inside this loop is say isTrue, frame = capture.read(). Now, capture.read() basically reads in this video frame by frame; it returns the frame and a boolean that says whether the frame was successfully read in or not. To display this video, we actually display each individual frame, so we do this by saying cv.imshow, we call the window 'Video', and we pass in the frame. And finally, for some way to stop the video from playing indefinitely, we say: if cv.waitKey(20) & 0xFF == ord('d'), then we want to break out of this while loop. And once that's done, we can release the capture pointer, and we can destroy all windows. And we can get rid of this. So basically, just to recap: the capture variable is an instance of the VideoCapture class. Inside the while loop, we grab the video frame by frame by utilizing the capture.read method; we display each frame of the video by using the cv.imshow method; and finally, for some way to break out of this while loop, we say if cv.waitKey(20) & 0xFF == ord('d'), which basically says that if the letter d is pressed, then break out of this loop and stop displaying the video. And finally, we release the capture device and destroy all the windows, since we don't need them anymore. So let's save that and run, and we get the video displayed in a window.
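Here's that loop assembled into one runnable sketch (the dog.mp4 path is the one from the video; note that the isTrue guard is something the video's version leaves out, which is exactly why the error discussed next appears):

```python
import cv2 as cv

capture = cv.VideoCapture('videos/dog.mp4')

while True:
    # Read the video frame by frame; isTrue reports whether the read succeeded
    isTrue, frame = capture.read()
    if not isTrue:
        break   # without this guard, cv.imshow raises the -215 error described below

    cv.imshow('Video', frame)

    # Break out of the loop if the letter 'd' is pressed
    if cv.waitKey(20) & 0xFF == ord('d'):
        break

capture.release()
cv.destroyAllWindows()
```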
But once it's done, you will notice that the video suddenly stops and you get this error; more specifically, a -215 Assertion failed error. Now, if you ever get an error like this -215 Assertion failed, what it means in almost all cases is that OpenCV could not find a media file at the particular location that you specified. The reason why it happened in this video is that the video ran out of frames: OpenCV could not find any more frames after the last frame in this video, so it unexpectedly broke out of the while loop by itself by raising a cv2 error. And you're going to get the same error if we comment this out, uncomment the image code, and specify a wrong path to the image. So let's add the cv.waitKey(0), save that and run, and we get the exact same error. This basically again says that OpenCV could not find the image, or the video frame, at that particular location; basically, it could not be read. That's what it's saying. So that's pretty much it for this video. We talked about how to read in images in OpenCV, and how to read in videos using the VideoCapture class. In the next video, we'll be talking about how to rescale and resize images and video frames in OpenCV. So see you then.

Hey everyone, and welcome back. In this video, we're going to be talking about how to resize and rescale images and video frames in OpenCV. Now, we usually resize and rescale video files and images to prevent computational strain. Large media files tend to store a lot of information, and displaying them takes up a lot of the processing power that your computer needs to assign. So by resizing and rescaling, we're actually trying to get rid of some of that information. Rescaling video implies modifying its height and width to a particular height and width. Generally, it's always best practice to downscale, or change the width and height of your video files to a smaller value than the original dimensions. The reason for this is that most cameras, your webcam included, do not support going higher than their maximum capability. So for example, if a camera shoots in 720p, chances are it's not going to be able to shoot in 1080p or higher. So to rescale a video frame or an image, we can create a function called rescale_frame, and we can pass in the frame to be resized and a scale value, which by default we're going to set to 0.75. What I'm going to do next is say width = frame.shape[1] * scale, and I'm going to copy this and do the same thing for the height. Remember, frame.shape[1] is basically the width of your frame or your image, and frame.shape[0] is basically the height of the image. Now, since width and height need to be integers, I can convert these floating-point values by casting them to an int. Then we're going to create a variable called dimensions and set this equal to a tuple of (width, height), and we can return cv.resize of the frame and the dimensions, and we can pass in an interpolation of cv.INTER_AREA. We'll be talking about cv.resize in an upcoming video, but for now, just know that it resizes the frame to a particular dimension. So that's all this function does: it takes in a frame, and it scales that frame by a particular scale value, which by default is 0.75.
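Here's that function written out (a sketch using the names from the video):

```python
import cv2 as cv

def rescale_frame(frame, scale=0.75):
    # frame.shape[1] is the width, frame.shape[0] is the height
    width = int(frame.shape[1] * scale)
    height = int(frame.shape[0] * scale)
    dimensions = (width, height)

    # Resize the frame to the new dimensions
    return cv.resize(frame, dimensions, interpolation=cv.INTER_AREA)
```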
So let's actually try to see this in action. Let's go back to read.py, grab this code, and paste it here; we don't need this part for now, so comment these out. What I'm going to do is, after I've read in the frame, I'm going to create a new frame called frame_resized and set this equal to rescale_frame(frame), and let's leave the scale value at the default of 0.75. And we can actually display this resized video by passing in frame_resized. So let's save that and run python rescale.py. That was an error; okay, we don't need this line. Let's close that out, save and run. And this was our original video, and this is the resized video, the video resized by 0.75, so 75%. We can modify this by changing the scale value to maybe 0.2, so we're rescaling to 20%, and we get an even smaller video in a new window. So let's close that out. Now, you can also apply this to images. So let's uncomment that out, change that to cat.jpg, and we can do a cv.imshow of 'Image' and pass in the resized image, and we can create the resized image by calling rescale_frame and passing in img. So let's save that and run. This is the small video, we're not concerned with that; this is the big image, the large image; and this is the resized version of that image. So let's close that out. Now, there is another way of rescaling, or resizing, video frames specifically, and that's by using the capture.set method; this one is specifically for videos. So let's go ahead and try to do that. Let's call this def change_res; we're changing the resolution of the video, and we can pass in a width and a height. What we're going to do is say capture.set(3, width), and we're going to do the same thing with capture.set(4, height). Now, 3 and 4 basically stand for properties of this capture class: 3 references the width, and 4 references the height. You can also expand this to maybe change the brightness of the video, and I think you can reference that by setting the property to 10. But for now, we're going to be interested in the width and the height. Now, I do want to point out that the rescale_frame method will work for images, videos, and live video; basically, for everything, you can use rescale_frame. But the change_res function only works for live video, that is, video you read in from an external camera or your webcam, for instance: video that is going on currently. It's not going to work on standalone video files, video files that already exist; it just doesn't work. So if you're trying to change the resolution of live video, then go with change_res; if you're trying to change the resolution of an already existing video, then go with the rescale_frame function.
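And here's a sketch of that live-video variant (in the video the function uses a capture defined elsewhere; here it's passed in explicitly, and capture is assumed to be a cv.VideoCapture opened on a camera, e.g. cv.VideoCapture(0)):

```python
def change_res(capture, width, height):
    # Property 3 is the capture's frame width, property 4 is its frame height;
    # this only takes effect on live video (webcams and connected cameras)
    capture.set(3, width)
    capture.set(4, height)
```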
So that's pretty much it for this video; we talked about how to resize and rescale video frames and images in OpenCV. In the next video, we'll be talking about how to draw shapes and write text on an image. So that's everything; I'll see you guys in the next video.

Hey everyone, and welcome back to another video. In this video, we're going to be talking about how to draw and write on images. So go ahead and create a new file and call it draw.py. We're going to import cv2 as cv, and we're going to import the NumPy package that OpenCV installed previously; we'll import that as np. We'll read in an image by saying img = cv.imread('photos/cat.jpg'); we can display that image in a new window, and we can do a cv.waitKey(0). Now, there are two ways we can draw on images: by actually drawing on standalone images, like this image of a cat, or by creating a dummy image, a blank image, to work with. And the way in which we can create a blank image is by saying blank = np.zeros with a shape of (500, 500), and give it a data type of 'uint8'; uint8 is basically the data type of an image. So if you want to see what this image looks like, we can say cv.imshow('Blank', blank), save that, and run python draw.py. And this is basically the blank image that you can draw on. So we're going to be using that instead of drawing on this cat image, but feel free to use the cat image if you'd like. So the first thing we're going to do is try to paint the image a certain color. The way we do this is by referencing all the pixels of blank and setting them equal to 0, 255, 0, thereby painting the entire image green. And we can display this image by saying cv.imshow('Green') and passing in the blank image; save that and run. Cannot broadcast? Yeah, okay, you need to give it a shape with three channels: basically, we are giving it a shape of height, width, and the number of color channels, so (500, 500, 3). Just keep that in mind; save that. And this is the green image that we get. Cool. We can even change this and try to change it to red, 0, 0, 255; save that, and we get a red image over here. Now, you can also color a certain portion of the image by basically giving it a range of pixels. So we can say 200 to 300, and then from 300 to 400. Save that and run, and you get a red square in this image. The next thing we're going to do is draw a
rectangle, and the way we do this is by using the cv.rectangle method. This method takes in an image to draw the rectangle over, which in this case is blank, and it takes in point 1, point 2, a color, a thickness, and a line type if you'd like. So point 1 will specifically be (0, 0), which is the origin, and we can go all the way across to (250, 250). Let's give it a color of (0, 255, 0), which is green, and give it a thickness of, let's say, 2, which is basically the thickness of the borders. And once that's done, we can display this image: let's call this 'Rectangle' and pass in the blank image. We can comment the earlier part out, since we don't need it anymore, and we get a green rectangle that goes all the way from the origin to (250, 250). You can play around with it if you like, so we can go from 250 to maybe 500, and it goes all the way across the image, so it basically divides the image in half. Now, there is a way of filling in this rectangle with a certain color, and the way we do this is, instead of saying thickness=2, we say thickness=cv.FILLED. That basically fills in the rectangle, and we get this solid green rectangle. Alternatively, you can also specify the thickness as -1, and we get the same result. What we can also do, instead of giving it fixed values like 250 and 500, is say blank.shape[1] // 2 and blank.shape[0] // 2. Let's save that and run. Image is not defined? Right, this is blank, not img; save that and run. And we get a nice little rectangle, or square if you will, in this image. What it basically did is scale the rectangle so that, instead of being this entire square, the rectangle has dimensions half those of the original image.

So, moving on, let's try and draw a circle. This is also fairly straightforward: we do a cv.circle, and we pass in the blank image, and we give it a center, which is basically the coordinates of the center; for now, let's set this to the midpoint of this image by saying (250, 250). Let's give it a radius of 40 pixels, a color of (0, 0, 255), which is red in BGR, and a thickness of, let's say, 3. We can display this image, say 'Circle', and pass in blank. And we get a nice little circle over here that has its center at (250, 250) and a radius of 40 pixels. Again, you can also fill in this circle by giving it a thickness of -1; here, we get a nice little dot in the middle. Cool.

Now, there's something else that I forgot, and that is how to draw a line, a standalone line, on the image. That, again, is fairly straightforward: to draw a line, we use the cv.line method, and this takes in the image to draw the line on and two points; let's just copy these points, basically everything. This basically draws a line from (0, 0) to half the image dimensions, so (250, 250), and it draws a line of color (0, 255, 0). Let's set this to full white instead, (255, 255, 255), and the thickness you can specify as 3. Then we display this image: cv.imshow, call this 'Line', and pass in the blank image, and we get a line that goes all the way across from (0, 0) to (250, 250). Let's try and play around with this, and let's draw a line from (100, 250) that goes all the way to (300, 400). Save that, and you've got a line that goes from (100, 250) to (300, 400). Cool.
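Collected into one sketch, the drawing calls from this video look like this:

```python
import cv2 as cv
import numpy as np

# A blank 500x500 image with 3 color channels to draw on
blank = np.zeros((500, 500, 3), dtype='uint8')

# Paint a region of the image red (OpenCV uses BGR, so red is (0, 0, 255))
blank[200:300, 300:400] = 0, 0, 255

# Filled green rectangle from the origin to the image's midpoint
cv.rectangle(blank, (0, 0), (blank.shape[1] // 2, blank.shape[0] // 2),
             (0, 255, 0), thickness=cv.FILLED)

# Red circle centered at (250, 250) with a 40-pixel radius
cv.circle(blank, (250, 250), 40, (0, 0, 255), thickness=3)

# White line from (100, 250) to (300, 400)
cv.line(blank, (100, 250), (300, 400), (255, 255, 255), thickness=3)

cv.imshow('Drawn', blank)
cv.waitKey(0)
```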
And finally, the last thing that we will discuss in this video is how to write text on an image. Now, the way we do this is very straightforward: we do a cv.putText, and this will put text on the blank image. We specify what we want to put on, so let's say 'Hello'. We can give it an origin, which is basically where we want to draw the text from; let's set this to (225, 225). And we also specify a font face. Now, OpenCV comes with inbuilt fonts, and we will be using cv.FONT_HERSHEY_TRIPLEX; you have complex, you have duplex, you have plain, you have script simplex, and a lot of other inbuilt fonts, but for now, let's use triplex. Let's give this a font scale, which is basically how much you want to scale the font by; let's set this to 1.0, since we don't want to scale the font. Let's give it a color of (0, 255, 0), and give it a thickness of 2. Comment the earlier part out, and we can display this image: cv.imshow, let's call this 'Text', and pass in the blank image. And we get some text that is placed on the image. You can play around with it and say 'Hello, my name is Jason'. Save and run, and it goes off screen. As when we're dealing with large images, there's no real way of handling this, except for maybe changing the origin a bit. So we can do that by saying, let's say, (0, 225), and now it starts from zero and reads 'Hello, my name is Jason'.
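As code, that call might look like this:

```python
# Write green text on the blank image; origin (0, 225) keeps it on screen
cv.putText(blank, 'Hello, my name is Jason', (0, 225),
           cv.FONT_HERSHEY_TRIPLEX, 1.0, (0, 255, 0), thickness=2)

cv.imshow('Text', blank)
cv.waitKey(0)
```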
So that's it for this video. We talked about how to draw shapes, how to draw lines, rectangles and circles, and how to write text on an image. Now, in the next video, we'll be talking about basic functions in OpenCV that you're most likely going to come across in whatever project in computer vision you end up doing. So if that's it, I'll see you guys in the next video.

Hey everyone, and welcome back to another video. In this video, we're going to be talking about the most basic functions in OpenCV that you're going to come across in whatever computer vision project you end up building. So let's start off with the first function, and that is converting an image to grayscale. So we've read in an image, and we've displayed that image in a new window, and currently this is a BGR image, a three-channel blue, green and red image. Now, there are ways in OpenCV to essentially convert those BGR images to grayscale, so that you only see the intensity distribution of pixels rather than the color itself. The way we do that is by saying gray = cv.cvtColor; we pass in the image that we want to convert from, which is img, and we specify a color code. This color code is cv.COLOR_BGR2GRAY, since we're converting a BGR image to a grayscale image. And we can go ahead and display this image by saying cv.imshow, passing 'Gray' and passing in the gray image. Save that and run python basic.py. And this was the original image, and this is the grayscale image. Let's try this with another image; this one is an image of a park in Boston. Save, and maybe change the window name to 'Boston'. And this is the BGR image in OpenCV, and this is its corresponding grayscale image. So nothing too fancy: we've just converted from a BGR image to a grayscale image.

The next function we're going to discuss is how to blur an image. Now, blurring an image essentially removes some of the noise that exists in the image. For example, in an image there may be some extra elements that are there because of bad lighting when the image was taken, or maybe some issues with the camera sensor, and so on. One of the ways we can reduce this noise is by applying a slight blur. There are way too many blurring techniques, which we will get into in the advanced part of this course, but for now, we're just going to use the Gaussian blur. So what we're going to do is create a blurred image
by saying blur = cv.GaussianBlur. This will take in a source image, which is img; it will take in a kernel size, which is actually a two-element tuple, which is basically the window size that OpenCV uses to compute the blur on the image. We'll get into this in the advanced part of this course, so don't worry too much about it; just know that this kernel size has to consist of odd numbers. So let's start off real simple and keep the kernel size at (3, 3). Another thing that we have to specify is cv.BORDER_DEFAULT. So go ahead and try to display this image, saying 'Blur' and passing in blur. Now, you will be able to notice some of the differences in this image, and that is because of the blur that is applied to it: the people in the background are pretty clear in the original image, and over here they're slightly blurred. To increase the blur in this image, we can essentially increase the kernel size from (3, 3) to (7, 7); save that and run, and this image is way more blurred than the previous one. So that's it for blurring.

The next function we're going to discuss is how to create an edge cascade, which is basically trying to find the edges that are present in the image. Now again, there are many edge detectors that are available, but for this video, we're going to be using the Canny edge detector, which is pretty famous in the computer vision world. Essentially, it's a multi-step process that involves a lot of blurring and then a lot of gradient computations and things like that. So we're going to say canny = cv.Canny; we pass in the image, and we pass in two threshold values, which for now I'm going to set to 125 and 175. Let's go ahead and try to display this image, call it 'Canny Edges', and pass in canny. Save that and run. And these were the edges that were found in this image. As you can see, there were hardly any edges found in the sky, but a lot of features in the trees and the buildings, and quite a few features and edges in the grass and such. We can reduce some of these edges by essentially blurring the image, and the way we do that is, instead of passing in the img, we pass in the blur. Save that and run, and as you can see, there were far fewer edges found in the image. This is a way you can basically reduce the number of edges that are found, either by a lot, by applying a lot of blur, or getting rid of just some of the edges by applying a slight blur.
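Here's the pipeline so far as one sketch (the park image path is a placeholder for whatever image you're using):

```python
import cv2 as cv

img = cv.imread('photos/park.jpg')

# Convert the BGR image to grayscale
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

# Gaussian blur with a 7x7 kernel (the kernel dimensions must be odd)
blur = cv.GaussianBlur(img, (7, 7), cv.BORDER_DEFAULT)

# Canny edge cascade; passing the blurred image yields far fewer edges
canny = cv.Canny(blur, 125, 175)

cv.imshow('Canny Edges', canny)
cv.waitKey(0)
```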
Now, the next function we're going to discuss is how to dilate an image using a specific structuring element. The structuring element that we are going to use is actually these edges, the Canny edges that were found. So we're going to call this 'dilating the image', and the way we do that is by saying dilated = cv.dilate. This will take in the structuring element, which is basically the Canny edges, and it will take a kernel size, which we'll specify as (3, 3) for now, and it will also take iterations=1. Now, dilation can be applied using several iterations at a time, but for now, we're just going to stick with one. So go ahead and try to display this image by saying cv.imshow, call it 'Dilated', and pass in dilated. Save that and run. And if these were the edges, these are the dilated edges. We can maybe increase the kernel size to (7, 7) and try to see what that does. Hold on; nothing much happened, not much difference there. Let's try to increase the number of iterations to maybe 3. And it's definitely way thicker, though you're only going to see subtle differences in the number of features and edges that you find. Now, there is a way of eroding this dilated image to get back the structuring element. It's not going to be perfect, but it will work in some cases. So we're going to call this 'eroding', and we say eroded = cv.erode; it will take in the dilated image, so pass in dilated; it will take a kernel size, so let's start off with (3, 3); and give it iterations=1, just for now. And then we display this image, call it 'Eroded', and pass in eroded. And if this was your structuring element, and this was your dilated image, this is basically the result you get from eroding that image. Now, it isn't the same as the structuring element, but you can just about make out the features. You can see that between this and this, there is a subtle change in the edges and the thickness of those edges. We can maybe try to match these values, so that there is an attempt to get back this edge cascade. And yes, we got the edges back: as you can see, if you compare these two, they look pretty much the same, and the edges are the same. So essentially, if you follow the same steps, you can, in most cases, get back the same edge cascade.
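In code, the dilate-then-erode round trip might look like this (canny being the edge image from the earlier snippet; as in the video, the kernel is given as a plain tuple):

```python
# Dilate the edge cascade over 3 iterations
dilated = cv.dilate(canny, (7, 7), iterations=3)

# Erode with the same kernel and iterations to approximately recover the edges
eroded = cv.erode(dilated, (7, 7), iterations=3)
```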
And probably the last function that we're going to discuss is how to resize and crop an image. So we're going to start with resize. We came across resizing video frames and images in one of the previous videos, but we're just going to touch on the cv.resize function a bit more. So we're going to say resized = cv.resize; this will take in an image to be resized, and it will take in a destination size, which let's set to (500, 500). So this essentially takes in this image of the park and resizes that image to 500 by 500, ignoring the aspect ratio. We display this image by saying cv.imshow('Resized', resized); save that and run. And let's go back: if this is the original image, this is the image that was resized to 500 by 500. Now, by default, there is an interpolation that occurs in the background, and that is cv.INTER_AREA. This interpolation method is useful if you are shrinking the image to dimensions that are smaller than the original dimensions. But in some cases, if you are trying to enlarge the image and scale it up to much larger dimensions, you would probably use INTER_LINEAR or INTER_CUBIC. Now, cubic is the slowest among them all, but the resulting image that you get is of a much higher quality than with INTER_AREA or INTER_LINEAR. So let's touch on cropping. Cropping works by utilizing the fact that images are arrays, so we can employ something called array slicing, and we can select a portion of the image on the basis of pixel values. So we can say cropped = img, and we can select a region from 50 to 200, and from 200 to 400. And we can display this image, call it 'Cropped', and pass in cropped. And this is a cropped region of, let's go back here, this original image; if you try to superimpose them, it's basically this portion. So that's pretty much it for this video. We talked about the most basic functions in OpenCV: converting an image to grayscale, applying some blur, creating an edge cascade, dilating the image, eroding that dilated image, resizing an image, and cropping an image using array slicing.
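And the last two functions as code:

```python
# Resize to 500x500, ignoring the aspect ratio; INTER_AREA suits shrinking,
# while INTER_LINEAR or INTER_CUBIC suit enlarging
resized = cv.resize(img, (500, 500), interpolation=cv.INTER_AREA)

# Crop via array slicing: rows 50 to 200, columns 200 to 400
cropped = img[50:200, 200:400]
```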
In the next video, we're going to be talking about image transformations in OpenCV: that's translation, rotation, resizing, flipping and cropping. So if you have any questions, leave them in the comments below; otherwise, I'll see you guys in the next video.

Hey everyone, and welcome back to this Python and OpenCV course. In this section, we're going to cover basic image transformations. Now, these are common techniques that you would likely apply to images, including translation, rotation, resizing, flipping and cropping. So let's start off with translation. Translation is basically shifting an image along the x and y axes; so using translation, you can shift an image up, down, left, right, or any combination of the above. To translate an image, we can create a translating function; we're going to call this def translate. This translation function will take in an image to translate, and an x and a y; x and y basically stand for the number of pixels you want to shift along the x axis and the y axis, respectively. To translate an image, we need to create a translation matrix. So we're going to call this transMat and set it equal to np.float32, and this will take in a list with two lists inside of it: in the first list, we're going to say 1, 0, x, and in the second, 0, 1, y. And since we're using NumPy, we need to import NumPy: import numpy as np. Once we've created our translation matrix, we can get the dimensions of the image by saying dimensions, which is a tuple of img.shape[1], which is the width, and img.shape[0], which is the height. And we can return cv.warpAffine; this will take in the image, the matrix, so transMat, and it will take in the dimensions. And with that done, we can essentially translate our image. Before we do that, I do want to mention that if you have negative values for x, you're essentially translating the image to the left; negative y values imply shifting up; positive x values imply shifting to the right; and, as you guessed, positive y values shift it down. So let's create our first translated image. We're setting this equal to translate; we're going to pass in the image, and we're going to shift the image right by 100 pixels and down by 100 pixels. Let's do a cv.imshow of 'Translated' and pass in translated; save that and run python transformations.py. And this is your translated image: it was shifted down by 100 pixels and shifted to the right by 100 pixels. So let's change that: let's shift the image left by 100 pixels and down by 100 pixels; so we pass in a negative value for x, and it moved to the left. Feel free to play around with these values as you see fit; just know that negative x shifts to the left, negative y shifts it up, positive x shifts it to the right, and positive y values shift it down.
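Here's the translation function in full (a sketch matching the names used in the video):

```python
import cv2 as cv
import numpy as np

def translate(img, x, y):
    # -x shifts left, -y shifts up, +x shifts right, +y shifts down
    transMat = np.float32([[1, 0, x], [0, 1, y]])
    dimensions = (img.shape[1], img.shape[0])
    return cv.warpAffine(img, transMat, dimensions)

# e.g. shift right by 100 pixels and down by 100 pixels:
# translated = translate(img, 100, 100)
```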
Moving on, let's talk about rotation. Rotation is exactly what it sounds like: rotating an image by some angle. OpenCV allows you to specify any rotation point that you'd like to rotate the image around. Usually that's the center, but with OpenCV, you could specify any arbitrary point: it could be any corner, it could be 10 pixels to the right and 40 pixels down, and you can rotate the image around that point. So to rotate the image, we can create a rotating function; let's call this def rotate. This will take in an image, an angle to rotate by, and a rotation point, which we're going to set as None by default. We're going to grab the height and width of the image by setting these equal to img.shape of the first two values. Basically, if the rotation point is None, we are going to assume that we want to rotate around the center, so we're going to say the rotation point is equal to width divided by two and height divided by two, using integer division. And we can essentially create the rotation matrix like we did with the translation matrix, by setting rotMat equal to cv.getRotationMatrix2D. We're going to pass in the center, the rotation point; an angle to rotate by, which is angle; and a scale value. Now, we're not interested in scaling the image while we rotate, so we can set this to 1.0. Finally, we can set a dimensions variable equal to the width and the height, and we can return the rotated image, which is cv.warpAffine of the image, rotMat, and the destination size, which is dimensions. And that's it; that's all we need for this function. So we can create a rotated image by setting this equal to rotate, and we can rotate the original image by 45 degrees. So let's display this image, call it 'Rotated', and pass in rotated. Save that and run. And this is your rotated image; as you can see, it was rotated counterclockwise by 45 degrees. If you wanted to rotate this image clockwise, just specify a negative value for this angle, and it will rotate the image clockwise. Now, you can also rotate a rotated image; that is, take this image and rotate it by 45 degrees further. So let's call this rotated_rotated, equal to rotate of rotated, and we can rotate this image by another 45 degrees. And we can do a cv.imshow, call it 'Rotated Rotated', and pass in rotated_rotated. And this is your rotated rotated image. Now, the reason why these black regions were included is that, where there's no part of the image, it's going to be black by default. So when you took this image and rotated it by 45 degrees, you essentially rotated the image but introduced these black triangles. Now, if you try to rotate this image further by some angle, you are also rotating those black triangles along with it, and that's why you get this kind of skewed image, with these additional triangles included over here. So save yourself the trouble and basically add up these angles: we can change the angle to 90 and rotate the original image in one go, and this is essentially the image that we were trying to go for. Rather than taking this image, rotating it 45 degrees, and rotating that 45-degree image by a further 45 degrees, save yourself the trouble and add those two angle values.
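And here's the rotation function written out (again, a sketch using the video's names):

```python
def rotate(img, angle, rot_point=None):
    (height, width) = img.shape[:2]

    if rot_point is None:
        # Rotate around the center of the image by default
        rot_point = (width // 2, height // 2)

    # Positive angles rotate counterclockwise; negative angles rotate clockwise
    rotMat = cv.getRotationMatrix2D(rot_point, angle, 1.0)
    dimensions = (width, height)
    return cv.warpAffine(img, rotMat, dimensions)
```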
So far, we've covered two image transformations: translation and rotation. Now we're going to explore how to resize an image. This is nothing too different from what we've discussed previously, but let's touch on resizing just a bit. We can create a resized variable and set this equal to cv.resize; we can pass in the image to resize and a destination size of maybe (500, 500). By default, the interpolation is cv.INTER_AREA; you can maybe change this to INTER_LINEAR or INTER_CUBIC. It's definitely a matter of preference, depending on whether you're enlarging or shrinking the image. If you're shrinking the image, you would probably go for INTER_AREA, or stick with the default; if you're enlarging the image, you could probably use INTER_LINEAR or INTER_CUBIC. Cubic is slower, but the resulting image is of a higher quality. Again, nothing too different from what we discussed before. So we can display this image, call it 'Resized', and pass in resized. Save that, run, and we've got a resized image.

Next up we have flipping: how to flip an image. We don't need to define a function for this; we just need to create a variable and set this equal to cv.flip. This will take in an image and a flip code. Now, this flip code can be 0, 1, or -1. Zero basically implies flipping the image vertically, that is, over the x axis; 1 specifies that you want to flip the image horizontally, or over the y axis; and -1 basically implies flipping the image both vertically and horizontally. So let's start off with 0, flipping it vertically: cv.imshow, call this 'Flip', and pass in flip. Save and run, and this is the image flipped vertically. Let's try out a horizontal flip. To really see whether it was a horizontal flip, we can bring these two images together, and if they look like mirror images, then it was flipped horizontally. This is kind of a symmetric image, so it's not that obvious, but bring them together and you can maybe spot the difference. We could also try to flip the image both vertically and horizontally by specifying -1 as the flip code. And the image was flipped both vertically as well as horizontally: mirror images, but reversed mirror images.
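As a one-line sketch of the flip codes:

```python
# Flip codes: 0 flips vertically (over the x axis),
# 1 flips horizontally (over the y axis), -1 flips both
flip = cv.flip(img, -1)
cv.imshow('Flip', flip)
```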
And the last method is cropping. Now, we've discussed cropping already, so again, I'm just going to touch on it. We can create a variable called cropped and set this equal to img, and perform some array slicing: so 200 to 400, and 300 to 400. Save that and run. We didn't display this, so cv.imshow, call this 'Cropped', pass in cropped, save and run. And this is the cropped image; if we try to bring it together with the original, yeah, it's basically this portion here. Okay. So that's pretty much it for this video. We talked about translating an image, rotating that image, resizing an image, flipping an image, and cropping those images. We are basically just covering the basics, the basic image transformations; there are, of course, way more transformations that you could possibly do with OpenCV, but just to keep this course simple and beginner-friendly, I'm only covering the basic transformations. So that's it for this video. In the next video, we're going to be talking about how to identify contours in an image. So if you have any questions, leave them in the comments below; otherwise, I'll see you guys in the next video.
Hey everyone, and welcome back to another video. In this video, we're going to be talking about how to identify contours in OpenCV. Now, contours are basically the boundaries of objects: the line or curve that joins the continuous points along the boundary of an object. From a mathematical point of view, they're not the same as edges; for the most part, you can get away with thinking of contours as edges, but from a mathematical point of view, contours and edges are two different things. Contours are useful tools when you get into shape analysis and object detection and recognition. So in this video, I sort of want to introduce you to the idea of contours and how to identify them in OpenCV. The first thing I've done is read in an image file, and I've displayed that image using the cv.imshow method. The next thing I want to do is convert this image to grayscale by saying gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY), and we can display this, just so we're on the same footing. I'm going to run this: python contours.py. And we get a gray image over here. Now, after this, I want to essentially grab the edges of the image using the Canny edge detector. So I'm going to say canny = cv.Canny; we're going to pass in the img, and we're going to give it two threshold values, so 125 and 175. And we can display this image, calling it 'Canny Edges' and passing in canny. I'll save that and run it; ah, I didn't save it, save it and run. And these are the edges that were there in the image. Now, the way we find the contours of this image is by using the findContours method. This method basically returns two things, contours and hierarchies; essentially, this is equal to cv.findContours. This takes in the edges, so canny; it takes in a mode in which to find the contours, and this is either cv.RETR_TREE, if you want all the hierarchical contours, or RETR_EXTERNAL, if you want only the external contours, or RETR_LIST, if you want all the contours in the image. The next thing we pass in is actually the contour approximation method; for now, we're going to set this to cv.CHAIN_APPROX_NONE. So let's just have a top-down look at what this function does.
Essentially, the cv.findContours method looks at the structuring element, or the edges found in the image, and returns two values: the contours, which is essentially a Python list of all the coordinates of the contours that were found in the image; and the hierarchies, which is really out of the scope of this course, but essentially refers to the hierarchical representation of contours. So for example, if you have a rectangle, and inside the rectangle you have a square, and inside of that square you have a circle, this hierarchy is essentially the representation that OpenCV uses to relate these contours. Now, RETR_LIST is one of the modes in which this findContours method finds and returns the contours: RETR_LIST returns all the contours it finds in the image. We also have RETR_EXTERNAL, which we discussed; it retrieves only the external contours, the ones on the outside. And RETR_TREE returns all the hierarchical contours, all the contours that are in a hierarchical system. For now, I'm just going to set this to RETR_LIST, to return all the contours in the image. The next one we have is the contour approximation method; this is basically how we want to approximate the contours. CHAIN_APPROX_NONE does nothing: it just returns all of the contours. Some people prefer to use CHAIN_APPROX_SIMPLE, which essentially compresses all the contours that are returned into the simple ones that make the most sense. So for example, if you have a line in an image: if you use CHAIN_APPROX_NONE, you are essentially going to get all the contour points, all the coordinates of the points of that line; CHAIN_APPROX_SIMPLE essentially takes all of those points of that line and compresses them into the two endpoints only, because that makes the most sense: a line is defined by only two endpoints, and we don't want all the points in between. That, in a nutshell, is what this entire function is doing.
that were found by finding the length of this list. So we can print print length of this
list. And we can say fair, we can say we can say these many contused. Found. Okay,
so let's say that and Ron. And we found 2794 quantos in the image. And this is huge. This
is a lot of code who's ever found in the image. So let's do a couple of things. Let's try
to change this chain approx symbol to chain approx none, and see what that does. See how
that affects our length. Now there isn't any difference between those two, because I'm
guessing that there were no points to compress and sin there are a lot of edges and points
in this image. So there wasn't a lot of compression. So let's change the back to symbol. And actually,
what we want to do is I want to blow this image before I find the edges. So let's do
this. Let's do a blue is equal to CV dot Gaussian Blur can pass in the gray image. And we can
give the kernel size of let's let's do a lot of blur. So five by five. And maybe we can
give it by the default of CV dot border on disko default. And we can if you want to,
and we can display this image, call this blur and pass an error we can find the edges on
this blurred image. So let's close below. And as you can see this significant reduction
in the number of Quorn twos that were found just by blurring the image. So it went all
the way from 2794 to 380. That's closest seven times just by blurring the image with the
kernel size of five by five. Okay, now there is another way of finding the corner shoes
is that it's stead of using this canny edge detector, we can use another function in open
CV, and that is threshold. So I'm just going to comment this out. And down here, what I'm
going to do is I'm going to say, ret Thresh is equal to CV don't threshold, this will
take in the gray image, and we've taken a threshold value of 125 and a maximum value
of 255. I don't worry too much about thresholding. For now, just know that threshold essentially
looks at an image and tries to binarize that image. So if a particular pixel is below 125,
if the density of that pixel is below 125, it's going to be set to zero or blank. If
it is above 125, it is set to white or two by five. That's all it does. And in the find
quantities method, we can essentially pass in the thrush value. So let's save that. Let's
close this out and try to run that. Type. Okay. threshold missing. Okay, I think I forgot
one part, where to specify a threshold and type. So this is CV dot Thresh. On this go,
binary, binary raising the image basically. Okay, let's run that. And there were 839 contours
that were found, we can visualize that let's print ad to display this Thresh. Image, passing
Thresh. Same that run. And this was the thresholded image you're using 125. close this out, using
125 as our threshold value, and 255 as a maximum value, we got this thresholded image. And
when we tried to find the current use on this image, we got 839 concepts. Now don't worry
too much about this thresholding business, we'll discuss this in the advanced section
of this goes more in depth just know that thresholding attempts to binarize an image,
take an image and convert it into binary form that is either zero or black, or white, or
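As code, the thresholding alternative might look like this:

```python
# Simple binary threshold: pixel intensities below 125 become 0 (black),
# those above 125 become 255 (white)
ret, thresh = cv.threshold(gray, 125, 255, cv.THRESH_BINARY)

contours, hierarchies = cv.findContours(thresh, cv.RETR_LIST,
                                        cv.CHAIN_APPROX_SIMPLE)
print(f'{len(contours)} contour(s) found!')
```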
Now, what's cool in OpenCV is that you can actually visualize the contours that were found on the image, by essentially drawing over the image. So what we'll do real quick is actually import numpy as np, and after this, I'm going to create a blank variable and set this equal to np.zeros of img.shape of the first two values, and maybe give it a data type of 'uint8'. We can display this image, call it 'Blank', and pass in blank, just to visualize and have a blank image to work with. Let's save that: we've got a blank image, of the same dimensions as our original image. So what I'm going to do is draw these contours on that blank image, so that we know what kind of contours OpenCV found. The way we do that is by using the cv.drawContours method. It takes in an image to draw over, which will be blank; it takes in the contours, which has to be a list, which in this case is just the contours list; it takes in a contour index, which is basically how many of the contours you want to draw on the image; since we want to draw all of them, we can specify -1. Give it a color; let's set this, in BGR, to red: (0, 0, 255). And we can give it a thickness of maybe 2. And we can display the blank image, so let's call this 'Contours Drawn', and we can pass in blank. Save that and run. Okay, there was an error; I think this has to be the full img.shape, so that the blank image has color channels. Okay, so these were the contours that were drawn on the image. If you take a look at the thresholded image, it's not the same thing: what I believe it attempted to do is, instead, it found the edges of this thresholded image and attempted to draw them out on this blank image. Let's set the thickness to maybe 1, so that we have a crisper view. Okay, so these were the contours that were drawn on the image.
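Here's the visualization step as a sketch:

```python
import numpy as np

# A blank image with the same dimensions (and color channels) as the original
blank = np.zeros(img.shape, dtype='uint8')

# Draw all contours (-1) in red with a thickness of 1
cv.drawContours(blank, contours, -1, (0, 0, 255), 1)

cv.imshow('Contours Drawn', blank)
cv.waitKey(0)
```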
And in fact, let's try to visualize that alongside Canny: uncomment that out, run. Blur is not defined; okay, that has to be img. Okay, let's look at the Canny output, and let's look at this. Okay, it's not the same thing, and that makes sense, because our findContours method didn't use Canny as the basis for detecting and finding the contours. But we can do that: let's not use the thresholding method, and instead let's use Canny. So we can pass in canny here. Save that and run. And, okay, that's pretty much the same thing, right? These two are basically mirror images of each other. Like I said, you can get away with thinking of contours as edges. They're not the same thing, but you can think of them as edges, because from a programming point of view, they're kind of like the edges of the image, right? They are the boundaries, curves that join the points along the boundary; those are basically edges. So let's try to blur that image; let's uncomment that out and see what that does. I don't think that had any effect, because we didn't pass in blur. Okay, 380 contours were found, and the two are mirror images of each other. So generally, what I recommend is that you use the Canny method first, and then try to find the contours using that, rather than thresholding the image and then finding the contours on that. Because, as we will discuss in the advanced section, this type of thresholding, simple thresholding, has its disadvantages, maybe because we're passing in just one value and binarizing the image using that threshold value. It's not the most ideal; but in most cases it is the most favored kind of thresholding, because it's the simplest, and it does the job pretty well. So that's pretty much it for this video. We talked about how to identify contours in OpenCV in two ways: first, finding the edge cascade of the image using the Canny edge detector and finding the contours using that; and also binarizing the image using cv.threshold and finding the contours on that. So if you have any questions, leave them in the comments below; I'll be sure to check them out. Otherwise, as always, I'll see you guys in the next video.
Hey everyone, and welcome back to another video. We are now at the advanced section of this course, where we are going to discuss the advanced concepts in OpenCV. So what we're going to be doing in this video is actually discussing how to switch between color spaces in OpenCV. Now, color spaces are basically spaces of colors: systems of representing an array of pixel colors. RGB is a color space; grayscale is a color space; we also have other color spaces, like HSV, LAB, and many more. So let's start off with trying to convert this image to grayscale. We're going to convert from a BGR image, which is OpenCV's default way of reading in images, and we're going to convert that to grayscale. The way we do that is by saying gray = cv.cvtColor; we pass in the image, and we specify a color code, which is cv.COLOR_BGR2GRAY, since we're converting from a BGR image format to a grayscale format. And we can display this image, say 'Gray', and pass in gray. Let's save that and run python spaces.py. We had a problem, a missing comma; save and run. And this is the grayscale version of this BGR image. Pretty cool. Grayscale images basically show you the distribution of pixel intensities at particular locations of your image. So next, let's try converting this image to an HSV format, so from BGR to HSV. HSV is also called hue-saturation-value, and is kind of based on how humans think and conceive of color. The way we convert that is by saying hsv = cv.cvtColor; we pass in the img variable, and we specify a color code, which is cv.COLOR_BGR2HSV. And we can display this image, call it 'HSV', and pass in hsv. Let's save that. And this is the HSV version of this BGR image; as you can see, there is a lot of green in this area, and the skies are reddish. Now, we also have another kind of color space, and that is called the LAB color space. So we're going to convert from BGR to LAB. This is sometimes represented as L*a*b, but feel free to use whichever you want. So lab = cv.cvtColor; we pass in the img and cv.COLOR_BGR2LAB. Then cv.imshow, call it 'LAB', pass in lab, and run that. And this is the LAB version of this BGR image. This kind of looks like a washed-down version of the BGR image; but hey, that's the LAB format, which is more tuned to how humans perceive
color. Now when I started off with this goes, I mentioned that open CV reads in images in
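Before moving on, here's a compact sketch of the three conversions so far; a minimal example, and the file path is just a placeholder:

    import cv2 as cv

    img = cv.imread('photos/park.jpg')

    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)   # BGR -> grayscale
    hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)     # BGR -> HSV
    lab = cv.cvtColor(img, cv.COLOR_BGR2LAB)     # BGR -> LAB

    cv.imshow('Gray', gray)
    cv.imshow('HSV', hsv)
    cv.imshow('LAB', lab)
    cv.waitKey(0)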
Now, when I started off with this course, I mentioned that OpenCV reads in images in a BGR format, that is, blue, green and red. And that's not the system we currently use to represent colors outside of OpenCV. Outside of OpenCV, we use the RGB format, which is kind of like the reverse of the BGR format. Now, if you try to display this img
image in a Python library that's not OpenCV, you're probably going to see an inversion of colors. And we can demo that real quick. Let's import matplotlib.pyplot as plt, and we can comment the earlier display code out. And we can try to display this image variable: we're going to say plt.imshow, pass in the image, and then say plt.show. Let's comment the OpenCV display out, save that and run. And this is the image you get. Now, if you compare it with the image that OpenCV read,
this is completely different; these two are completely different images. And the reason for this is that this image is a BGR image, and OpenCV displays BGR images correctly. But if you take this BGR image and try to display it in matplotlib, for instance, matplotlib has no idea that it's a BGR image and displays it as if it were an RGB image. So that's why you see an inversion of color: where there's red over here, you see blue; where there's blue over here, you see red. And there are ways to convert
this from BGR to RGB, and that is by using OpenCV itself. So let's comment that out and uncomment the rest. Right over here, let's say BGR to RGB. What we're going to say is rgb is equal to cv.cvtColor; we pass in the BGR image, and we specify a color code, which is cv.COLOR_BGR2RGB. And we can try to display this image in OpenCV and see what it shows, calling it 'RGB'. And we can also display it in matplotlib, so I've passed in the rgb image and we can do plt.show. Save that and run python spaces.py. What I'm most interested in is this window and this one. Now, again, you see an inversion of colors, but this time in OpenCV, because now you provided OpenCV an RGB image and it assumed it was a BGR image; that's why there's an inversion of colors. But when we pass the RGB image to matplotlib, whose default is RGB, it displays the proper image. So just keep this in mind when you're working with multiple libraries, including OpenCV and matplotlib for instance, because of the inversion of colors that tends to take place between these two libraries.
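To make that concrete, here's a minimal sketch of the round trip; again, the file path is a placeholder:

    import cv2 as cv
    import matplotlib.pyplot as plt

    img = cv.imread('photos/park.jpg')        # OpenCV reads this in as BGR

    plt.imshow(img)                           # matplotlib assumes RGB, so colors invert
    plt.show()

    rgb = cv.cvtColor(img, cv.COLOR_BGR2RGB)  # convert BGR -> RGB first
    plt.imshow(rgb)                           # now the colors display properly
    plt.show()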
So now, another thing: we've converted BGR to grayscale, BGR to HSV, BGR to LAB, and BGR to RGB. What we can do is the inverse of that: we can convert a grayscale image to BGR, an HSV image to BGR, a LAB image to BGR, an RGB image to BGR, and so on. But here's one of the downsides: you cannot convert a grayscale image to HSV directly. If you wanted to do that, what you'd have to do is convert the grayscale to BGR, and then from BGR to HSV. So let's do the reverse conversions real quick, starting with HSV to BGR. Okay, so the first thing we
do is define hsv_bgr, basically converting from HSV to BGR: it's equal to cv.cvtColor, this will take in the HSV image, and the color code will be cv.COLOR_HSV2BGR. And we can display this image; let's call it 'HSV --> BGR' and pass in hsv_bgr. Save that and run. Okay, we're not interested in the other windows, so let's close them out. But essentially, this is the HSV-to-BGR image: if this was the HSV image, we converted it back to BGR. And we can try the same with LAB. So let's call this lab_bgr, copy the line and paste it, and we can get rid of the matplotlib display since it's not needed here. Save and run. Okay, that was a mistake: I passed the HSV image instead of the LAB image into the LAB-to-BGR conversion. That was my mistake. Cool. So if this was the LAB version, this is the LAB-to-BGR version: from BGR to LAB and from LAB back to BGR. So that's pretty
much it for this video. We discussed how to convert between color spaces: from BGR to grayscale, HSV, LAB, and RGB. And if you want to convert from grayscale to LAB, for instance, note that there is no direct method; what you can do is convert that grayscale to BGR, and then from BGR to LAB. I don't think there's a way to do it directly; if OpenCV came up with a feature like that, it would be nice, but it's not going to hurt you to write two or three extra lines of code. It's hardly hard.
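As a rough sketch of one such indirect conversion, assuming the same placeholder image as before:

    import cv2 as cv

    img = cv.imread('photos/park.jpg')
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # there is no direct grayscale -> LAB (or grayscale -> HSV) code, so go via BGR
    gray_bgr = cv.cvtColor(gray, cv.COLOR_GRAY2BGR)   # step 1: grayscale -> BGR
    lab = cv.cvtColor(gray_bgr, cv.COLOR_BGR2LAB)     # step 2: BGR -> LAB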
In the next video, we will be talking about how to split and merge color channels in OpenCV. If you have any questions, leave them in the comments below. Otherwise, I'll see you guys in the next video. Hey everyone, and
welcome back to another video. In this video, we're going to be talking about how to split
and merge color channels in OpenCV. Now, a color image basically consists of multiple channels: red, green, and blue. All the images you see around you, all the BGR or RGB images, are basically these three color channels merged together. Now, OpenCV allows you to
split an image into its respective color channels. So you can take a BGR image and split it into
blue, green and red components. So that's what we're going to be doing in this video,
we're going to be taking this image of the park that we had seen in previous videos,
and we're going to split that into its three color channels. So the way we do that is by
saying b, g, r, which stand for the respective color channels, and setting this equal to cv.split of the image. cv.split basically splits the image into blue, green and red. And we can display these by saying cv.imshow: call one 'Blue' and pass in b, do the same for the green image and pass in g, and the same for the red part with r. We can also visualize the shapes of these images. So let's first print img.shape, then print b.shape, then g.shape, then r.shape. Basically, we're printing the shapes and dimensions of the original image and of the blue, green and red channels, and we're also displaying those images. So let's run python split_merge.py.
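Here's a compact sketch of what we have so far; the file path is a placeholder, and the printed shapes will of course depend on your image:

    import cv2 as cv

    img = cv.imread('photos/park.jpg')
    b, g, r = cv.split(img)     # split into blue, green and red channels

    cv.imshow('Blue', b)
    cv.imshow('Green', g)
    cv.imshow('Red', r)

    print(img.shape)   # e.g. (427, 640, 3) -- three color channels
    print(b.shape)     # e.g. (427, 640)    -- a single channel
    print(g.shape)
    print(r.shape)
    cv.waitKey(0)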
And these are the images you get back: this is the blue image, this is the green image, and this is the red image. Now, these are depicted and displayed as grayscale images
that show the distribution of pixel intensities. Regions where it's lighter show that there is a far higher concentration of those pixel values, and regions where it's darker represent little or even no presence of that channel. So take a look at the blue channel first. If you compare it with the original image, you will see that the sky is almost white; this basically shows you that there is a high concentration of blue in the sky, and not so much in the trees or the grass. Let's take a look at the green: there is a fairly even distribution of pixel intensities between the grass, the trees, and some parts of the sky. And take a look at the red color channel: you can see that the parts of the trees that are red are whiter, and the grass and the sky are not that white in this red image, meaning there is not much red in those regions. Now, coming back, let's take a look at the shapes of the images. The first shape
stands for the original image, the BGR image; the additional element in the tuple represents the number of color channels, and three represents the three color channels blue, green and red. Now, if we look at the shapes of the b, g and r components, we don't see a three in the tuple. That's because the number of channels of each component is one; it's not mentioned in the tuple, but it is one. That's why, when you try to display such an image using cv.imshow, it displays as a grayscale image, because grayscale images have a channel depth of one. Now, let's try and merge these color channels together. The way we do that is by saying
merged is equal to cv.merge, and what we do is pass in a list of b, g, r. Save that, and let's display it: cv.imshow, call this the 'Merged' image, and pass in merged. So let's save that and run. And we get back the merged image by basically merging the three individual color channels red, green and blue. Now, there is an additional way of looking at the
actual color there is in each channel. So instead of showing you grayscale images, it shows you the actual color involved: for the blue channel you see blue, for the red channel you see red. And the way we do that is by reconstructing the image. These channel images are basically grayscale images, but what we can do is create a blank image using NumPy. Essentially, we're going to say blank is equal to np.zeros, set to the shape of the image but only the first two values, and give it a data type of uint8, which is the datatype used for images. And to display the blue color channel, what we're going to do down here is say blue is equal to cv.merge, and pass in the list b, blank, blank. We do the same thing for green, setting it equal to cv.merge of blank, g, blank; and the same thing for red, setting it equal to cv.merge of blank, blank, r. Basically, this blank image consists of just the height and the width, not the color channels of the image. So by merging the blue channel into its respective slot among blue, green and red, we're setting the green and red components to black and only displaying the blue channel. We're doing the same thing for green by setting the blue and red components to black, and the same thing for red by setting the blue and green components to black. And we can display these as 'Blue', 'Green' and 'Red'. Let's save that and run.
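Here's a sketch of that reconstruction, with the blank image standing in for the two channels we black out:

    import cv2 as cv
    import numpy as np

    img = cv.imread('photos/park.jpg')   # placeholder path
    b, g, r = cv.split(img)

    # blank single-channel image with the same height and width as the original
    blank = np.zeros(img.shape[:2], dtype='uint8')

    blue = cv.merge([b, blank, blank])    # green and red set to black
    green = cv.merge([blank, g, blank])   # blue and red set to black
    red = cv.merge([blank, blank, r])     # blue and green set to black

    cv.imshow('Blue', blue)
    cv.imshow('Green', green)
    cv.imshow('Red', red)
    cv.waitKey(0)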
And now you actually see each channel in its respective color. Take a look at this: you're now able to visualize the distribution much better. Lighter portions here represent a high distribution of blue, lighter portions here represent a high distribution of red, and lighter regions here represent a high distribution of green. So essentially, if you take these three color-channel images and merge them together, you get back the merged image. So that's pretty much it for this
video. We discussed how to split an image into its three respective color channels, how to reconstruct a channel to display the actual color involved, and how to merge the color channels back into the original image. In the next video, we'll be talking about how
to smooth and blur an image using various blurring techniques. If you have any questions,
leave them in the comments below. Otherwise, I'll see you guys in the next video. Hey,
everyone, and welcome back to another video. In this video, we're gonna address the concepts
of smoothing and blurring in OpenCV. Now, earlier I mentioned that we generally smooth an image when it tends to have a lot of noise, noise caused by camera sensors or by problems in lighting when the image was taken. And we can smooth out the image, or reduce some of the noise, by applying some blurring method. Now, previously,
we discussed the Gaussian Blur method, which is kind of one of the most popular methods
in blurring. But generally, you're going to see that Gaussian Blur won't really suit some
of your purposes. And that's why there are many blurring techniques that we have. And
that's what we're going to address in this video. Now, before we actually do that, I
do want to address a couple of concepts. Well, let's actually go to an image and discuss
what exactly goes on when you try to apply blur. So essentially, the first thing that
we need to define is something called a kernel or window. And that is essentially this window
that you draw over an image that has two lines here. Let's draw another line. So this is
essentially a window that you draw over a specific portion of an image. And something
happens on the pixels in this window. Let's change it to blue. Yeah. So essentially, this
window has a size, this size is called a kernel size. Now kernel size is basically the number
of rows and the number of columns. So over here, we have three columns and three rows.
So the kernel size for this is three by three. Now, essentially, we have multiple methods to apply some blur, but in all of them the blur is applied to the middle pixel as a result of the pixels around it, also called the surrounding pixels. Let's change that to a different color. So something happens to this middle pixel as a result of the surrounding pixels. With that in mind, let's go back and discuss the first method
of blurring, which is averaging. So essentially, in averaging we define a kernel window over a specific portion of an image; this window will compute the pixel intensity of the middle pixel, the true center, as the average of the surrounding pixel intensities.
Suppose this pixel intensity was one, this was two, and these were three, four, five, six, seven, eight; you get the point. Essentially, the new pixel intensity for the center will be the average of all the surrounding pixel intensities: summing up one plus two plus three plus four plus five plus six plus seven plus eight, and dividing by eight, which is the number of surrounding pixels. We use that result as the pixel intensity for the middle value, the true center. And this process happens throughout the image: the window slides to the right, and once that's done, it slides down, and the average is computed for all the pixels in the image. So let's try
to apply this and see what it does. What we're going to do is say average is equal to cv.blur. The cv.blur method is the method with which we can apply averaging blur. We define the source image, which is img, and give it a kernel size of, let's say, three by three. And that's it. We can display this image, calling it 'Average Blur'. Save that and run python smoothing.py. Gosh, we have to pass in average; save that and run. And this is basically the average blur that's applied. What the algorithm did in the background was define a kernel window of the specified size, three by three, and compute the center value for each pixel using the average of all the surrounding pixel intensities. The result is a blurred image. The higher the kernel size we specify, the more blur there is going to be in the image. So let's increase that to seven by seven and see what that does. And we get an image with way more blur. So
let's move on to the next method, which is the Gaussian blur. Gaussian basically does the same thing as averaging, except that instead of computing the plain average of all the surrounding pixel intensities, each surrounding pixel is given a particular weight, and the average of the products of those weights gives you the value for the true center. Using this method, you tend to get less blurring than with the averaging method, but the Gaussian blur looks more natural compared to averaging. So let's write that out: let's call this gauss and set it equal to cv.GaussianBlur. This will take in the source image, so img, a kernel size of seven by seven just to compare with the averaging, and another parameter that we need to specify: sigmaX, basically the standard deviation in the x direction, which for now we're just going to set to zero. And we can display that, calling it 'Gaussian Blur' and passing in gauss; save that and run. If you compare the two, you see that both of them use the same kernel size, but this one is less blurred compared to the average method. And the reason for this is that a certain weighting was used when computing the blur. Okay, so let's move on to the next method, and that is the median blur.
So let's go back to our image. Median blurring is basically the same thing as averaging, except that instead of finding the average of the surrounding pixels, it finds the median of the surrounding pixels. Generally, median blurring tends to be more effective in reducing noise in an image compared to averaging and even Gaussian blur, and it's pretty good at removing salt-and-pepper noise that may exist in the image. In general, people tend to use median blurring in advanced computer vision projects that depend on the reduction of a substantial amount of noise. So let's go back here. The way we apply the blur is by saying, let's call this median, and setting this equal to cv.medianBlur. We pass in the source image, and the kernel size will not be a tuple of three by three, but instead just the integer three. The reason for this is that OpenCV automatically assumes this kernel size will be three by three, just based off this integer. And we can display this; call it 'Median Blur' and pass in median. Let's compare it with the others, so I'll set that to seven. Comparing it with the Gaussian blur and the averaging blur, you can make out some differences between the images. It's as if this was your painting, still wet, and you took something and smudged it over the image, and you get something like this. Now, generally, median blurring is not meant for high kernel sizes like seven, or even five in some cases; it's more effective when reducing smaller amounts of noise in
the image. So let's change these all to three by three: copy that, change that to three by three, and change the median one to three. Now let's have a comparison between the three. This is your Gaussian blur, this is your averaging blur, and this is your median blur. Compared with the other two, you can see that there is somewhat less blurring with Gaussian; you can sort of make out the differences between them, very subtle, but there are a couple of differences.
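Here's a side-by-side sketch of the three methods with a three-by-three kernel, so you can compare them on your own images; the path is a placeholder:

    import cv2 as cv

    img = cv.imread('photos/cats.jpg')

    average = cv.blur(img, (3, 3))            # mean of the surrounding pixels
    gauss = cv.GaussianBlur(img, (3, 3), 0)   # weighted (Gaussian) mean
    median = cv.medianBlur(img, 3)            # median of the surrounding pixels

    cv.imshow('Average Blur', average)
    cv.imshow('Gaussian Blur', gauss)
    cv.imshow('Median Blur', median)
    cv.waitKey(0)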
Finally, the last method we're going to discuss is bilateral blurring. Bilateral blurring is often the most effective, and it's used in a lot of advanced computer vision projects, essentially because of how it blurs. Traditional blurring methods blur the image without caring whether they're reducing edges in the image or not; bilateral blurring applies blurring but retains the edges in the image. So you have a blurred image, but you get to retain the edges as well. So let's call this bilateral
and set this equal to cv.bilateralFilter. We pass in the image, and we give it a diameter of the pixel neighborhood; notice this isn't a kernel size, but in fact a diameter. So let's set this to five for now. Then give it a sigmaColor, which is basically the color sigma: a larger value for this color sigma means that more colors in the neighborhood will be considered when the blur is computed. So let's set this to 15 for now. And sigmaSpace is basically your space sigma: larger values for this space sigma mean that pixels further out from the central pixel will influence the blurring calculation. Let's look at that sigma space for a moment. In bilateral filtering, when the value for the central pixel, the true center, is being computed, giving larger values for the sigma space indicates whether you want pixels this far away, or maybe this far away, or even this far away, to influence that calculation. So if you give it really huge numbers, then even a pixel way out in this region might influence the computation of this pixel value. So let's set this to 15 for now, and let's display this image.
So call cv.imshow, call it 'Bilateral', and pass in bilateral. Let's save that and run. And this is your bilateral image. Let's compare it with all the previous ones we had: compared with this one, much better; compared with averaging, way better. Let's compare with median: the edges are only slightly blurred. If you compare it with the original image, they kind of look like the same thing; it almost looks like no blur was applied. So maybe let's increase this diameter to, I don't know, 10. Not much changed; the edges are still there, and it still looks a lot like the original image. So let's tweak the other parameters: let's set the sigma color to 35 and the sigma space to 25. We're only playing around with these values. And now you can basically make out that this is starting to look a lot like the median blur. With even larger values,
it starts looking like a smudged-painting version of the image: there's a lot of blur applied, but it ends up looking smudged. So definitely keep that in mind when you're applying blurring to an image, especially with bilateral and median blurring, because with higher parameter values for bilateral, or a higher kernel size for median blurring, you tend to end up with a washed-out, smudged version of the image. So definitely keep that in mind.
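As a sketch, here's bilateral filtering with the last set of values we tried above; treat them as starting points to tune, not as definitive settings:

    import cv2 as cv

    img = cv.imread('photos/cats.jpg')   # placeholder path

    # d: diameter of the pixel neighborhood (note: a diameter, not a kernel size)
    # sigmaColor: larger values mix more colors from the neighborhood into the blur
    # sigmaSpace: larger values let pixels further from the center influence the blur
    bilateral = cv.bilateralFilter(img, 10, 35, 25)

    cv.imshow('Bilateral', bilateral)
    cv.waitKey(0)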
That kind of summarizes what we've done in this video: we discussed averaging, Gaussian, median and bilateral blurring. In the next video, we'll be talking about bitwise operators in OpenCV. So again, like always, if you have any questions, leave them in the comments below. Otherwise, I'll see you guys in the next video. Hey everyone, and welcome back to another video. In this video, we're going to be talking about bitwise operators in OpenCV. Now, there are four
basic bitwise operators: AND, OR, XOR and NOT. If you've ever taken an introductory CS course, you will probably find these terms familiar, and they are in fact used a lot in image processing, especially when we're working with masks, like we'll do in the next video. At a very high level, bitwise operators operate in a binary manner: a pixel is turned off if it has a value of zero, and is turned on if it has a value of one.
So let's go ahead and import NumPy as np. What I'm going to do is create a blank variable and set it equal to np.zeros of size 400 by 400, and we can give it a datatype of uint8. I'm going to use this blank variable as a basis to draw a rectangle and a circle. So I'm going to say rectangle is equal to cv.rectangle; we say blank.copy(), and we pass in the starting point. Let's give it a margin of around 30 pixels on either side, so we're going to start from (30, 30) and go all the way across to (370, 370). And we can give it a color; since this is not a color image but rather a binary image, we can just give it one value, so 255, white, and give it a thickness of negative one, because we want to fill this shape in. Then I'm going to create another variable, circle, and set it equal to cv.circle. We say blank.copy(), we give it a center, which will be the absolute center, so (200, 200); let's give it a radius of 200, give it a color of 255, and fill in the circle with a thickness of negative one. So let's display these images and see what we're working with: we'll call this 'Rectangle' and pass in the rectangle, and do the same thing with the circle, calling it 'Circle' and passing in the circle. Save that and run python bitwise.py. So we have two images that we're going
to work with: this image of a rectangle and this image of a circle. So let's start off with the first basic bitwise operator, and that is bitwise AND. Before we actually discuss what bitwise AND really is, let me show you what it does. Essentially, I'm going to say bitwise_and is equal to cv.bitwise_and, and basically what I have to do is pass in two source images, which are these two: rectangle and circle. Now we can display this image; let's call it 'Bitwise AND', pass in bitwise_and, save and run. And essentially, you get back this image. What bitwise AND did was take these two images, place them on top of each other, and return the intersection. You can make out that when you take this image and put it over this image, there are regions that are not common to both; those are set to black, zero, while the common regions are returned. So the next one
is bitwise OR. Now, bitwise OR simply returns both the intersecting as well as the non-intersecting regions. So let's try this: bitwise_or is equal to cv.bitwise_or; we pass in rectangle, we pass in circle. Now we can display that; let's call it 'Bitwise OR' and pass in bitwise_or. Oops, fix that typo, save and run. Okay, bitwise OR basically returned this funky-looking shape. Essentially, what it did is take these two images, put them over each other, find the common regions, and also find the regions that are not common to both of these images, and superimpose them all. Basically, you can just put the two shapes together and find the resulting shape, and this is what you get: put this image over this one and you get this. Moving on, the next one is bitwise XOR, which is good
for returning only the non-intersecting regions. So: AND found the intersecting regions, OR brought back both the non-intersecting and the intersecting regions, and XOR finds only the non-intersecting regions. Let's do that: I'll call this bitwise_xor, equal to cv.bitwise_xor; we pass in the rectangle and we pass in the circle. We can display this with cv.imshow, call it 'Bitwise XOR', and pass in bitwise_xor. Save that and run. And here we have the non-intersecting regions of these two images when you put them over each other. Pretty cool. Just to recap: bitwise AND returns the intersecting regions; bitwise OR returns the non-intersecting regions as well as the intersecting regions; bitwise XOR returns the non-intersecting regions only. So essentially, if you take this bitwise XOR and subtract it from bitwise OR, you get bitwise AND; and conversely, if you subtract bitwise AND from bitwise OR, you get bitwise XOR. That's a good way of visualizing what exactly happens with these bitwise operators.
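Here's a compact sketch of all four operators on the rectangle and circle from above:

    import cv2 as cv
    import numpy as np

    blank = np.zeros((400, 400), dtype='uint8')
    rectangle = cv.rectangle(blank.copy(), (30, 30), (370, 370), 255, -1)
    circle = cv.circle(blank.copy(), (200, 200), 200, 255, -1)

    bitwise_and = cv.bitwise_and(rectangle, circle)   # intersecting regions only
    bitwise_or = cv.bitwise_or(rectangle, circle)     # intersecting + non-intersecting
    bitwise_xor = cv.bitwise_xor(rectangle, circle)   # non-intersecting regions only
    bitwise_not = cv.bitwise_not(rectangle)           # inverts white and black

    cv.imshow('AND', bitwise_and)
    cv.imshow('OR', bitwise_or)
    cv.imshow('XOR', bitwise_xor)
    cv.imshow('NOT', bitwise_not)
    cv.waitKey(0)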
And finally, the last operator we can discuss is bitwise NOT. Essentially, it doesn't find an intersection of anything; what it does is invert the binary color. So let's do that: let's call this bitwise_not, equal to cv.bitwise_not, and this only takes in one source image, so let's pass in the rectangle. And we can display this; let's call it 'Rectangle NOT' and pass in bitwise_not. Save that. Basically, what it did is, if you look at this image, it found all the white pixels in the image and inverted them to black, and all the black pixels it inverted to white. Essentially, it converted white to black and black to white. We can try that with the circle: let's call it 'Circle NOT' and pass in the circle here. Save and run, and the resulting circle NOT that you get is this: a white hole, or a black hole, for the physicists out there.
Okay, so that's pretty much it for this video. I just wanted to introduce you all to the idea of bitwise operations and how they work. In the next video, we'll be talking
about how to use these bitwise operations in a concept called masking. So if you have
any questions, leave them in the comments below. Otherwise, I'll see you guys in the
next video. Hey, everyone, and welcome back. In this video, we're going to be talking about
masking in OpenCV. Now, in the previous video we discussed bitwise operations, and using those bitwise operations we can perform masking in OpenCV. Masking essentially allows us to focus on certain parts of an image that we'd like to focus on. For example, if you have an image with people in it and you're interested in focusing on their faces, you could apply masking: mask over the people's faces and remove all the unwanted parts of the image. That's the high-level intuition behind it. So let's see how this works in OpenCV. I've basically
read in a file and displayed that image. The other thing I'm going to do is import NumPy as np, and then say blank is equal to np.zeros with the size img.shape, but only the first two values. Now, this is extremely important: the dimensions of the mask have to be the same size as those of the image. If they aren't, it's not going to work. We can give it a data type of uint8, and if you want to display this, we can; it's just going to be a black image. Call it 'Blank Image' and pass in blank. Essentially, what I'm going to do is draw a circle over this blank image and call that my mask. So I'm going to say mask is equal to cv.circle; we're going to draw on the blank image, and we give it a center at the center of the image, so img.shape[1] divided by 2 and img.shape[0] divided by 2, using integer division. We give it a radius of, I don't know, say 100 pixels, a color of 255, and a thickness of negative one. And we can visualize the mask as 'Mask', passing in mask. So let's run that: python masking.py. And this is essentially our mask, there's the blank image we're working with, and this is the image that we want to mask over. So
let's actually create a masked image. We're going to say masked is equal to cv.bitwise_and; the source images are img and img again, and we specify the parameter mask equal to mask, which is this circle image over here. And we can display this image, call it 'Masked Image', and pass in masked. Save that and run. And this is essentially your masked image: you took this image, put the mask over it, and found the intersecting region, by optionally passing mask equal to mask. That's exactly what we're doing. Cool.
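Here's a minimal sketch of the idea; the file path is a placeholder:

    import cv2 as cv
    import numpy as np

    img = cv.imread('photos/cats.jpg')

    # the mask must have the same height and width as the image
    blank = np.zeros(img.shape[:2], dtype='uint8')
    mask = cv.circle(blank, (img.shape[1] // 2, img.shape[0] // 2), 100, 255, -1)

    # bitwise AND of the image with itself, restricted to the mask
    masked = cv.bitwise_and(img, img, mask=mask)
    cv.imshow('Masked Image', masked)
    cv.waitKey(0)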
Now, play around with this. Let's maybe move the circle by a couple of pixels, say 45. Save and run; oops, it moved the wrong way, okay, this has to be plus 45. Save and run again, and now we get the cat's face. And instead of a circle, we can draw a rectangle: draw it on the blank image, give it a starting point and an end point, say, copy the center and add maybe 100 pixels this way and 100 pixels that way, get rid of what we don't need, and save that. Right, so this is the square,
and this is the resulting masked image. So let's actually try this with a different image. Let's try it with these cats: switch to the cats image, save that and run. And this is the masked result we get by putting these two over each other. Essentially, you can play around with this as you see fit. You can try different, weird shapes, and the way you get those weird shapes is by creating a circle and a rectangle, applying bitwise AND, and then using the resulting weird shape as your mask. So let's try that. We're going to call this circle, drawn on blank.copy(), and create a rectangle; let's grab the rectangle line from the bitwise script, copy it over, and use (30, 30) on the same blank shape. So let's create this weird_shape, equal to cv.bitwise_and of this circle and this rectangle; we don't need to specify anything else. Let's visualize it: close this out, call cv.imshow, call this the 'Weird Shape', passing in weird_shape, and run. Oops, a variable is undefined; fix the variable name. Okay, good. This is the weird shape that we get; we weren't really going for a half moon, but hey, whatever. Let's close this out and use this weird shape as the mask. So use weird_shape as the mask, and let's see the final masked image. This is essentially your weird-shape masked image; let's call it 'Weird Shape Masked Image', this little half moon here.
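As a sketch of building such a mask; the offsets and rectangle coordinates are just the ones we fiddled with above, so adjust freely:

    import cv2 as cv
    import numpy as np

    img = cv.imread('photos/cats.jpg')   # placeholder path
    blank = np.zeros(img.shape[:2], dtype='uint8')

    circle = cv.circle(blank.copy(),
                       (img.shape[1] // 2 + 45, img.shape[0] // 2), 100, 255, -1)
    rectangle = cv.rectangle(blank.copy(), (30, 30), (370, 370), 255, -1)

    weird_shape = cv.bitwise_and(circle, rectangle)   # the half-moon-ish mask
    masked = cv.bitwise_and(img, img, mask=weird_shape)

    cv.imshow('Weird Shape', weird_shape)
    cv.imshow('Weird Shape Masked Image', masked)
    cv.waitKey(0)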
And essentially, you can do pretty much anything you want with this; you can experiment with various shapes and sizes and stuff like that. But just know that the size of your mask has to have the same dimensions as those of your image. If you want to see why, let's deliberately give the blank image a different size: instead of using img.shape, let's give it a size of 300 by 300, which is definitely not the size of this image. And we get an 'assertion failed' error, something about the mask having to be the same size, in function so-and-so. So essentially, these need to be the same size; otherwise, it's going to fail and throw you an error.
So that's it for this video. We talked about masking; again, nothing too different, since we've essentially used the concept of bitwise AND from the previous video. And you will see, when we move on to computing histograms in the next video, where masking really comes into play and how it affects your histograms. So if you have any questions
again, leave them in the comments below. Otherwise, I'll see you in the next video. Hey, everyone,
and welcome back to another video. In this video, we're going to be talking about computing
histograms in open CV. Now histograms basically allow you to visualize the distribution of
pixel intensities in an image. So whether it's a color image, or whether it's a grayscale
image, you can visualize these pixel intensity distributions with the help of a histogram,
which is kind of like a graph or a plot that will give you a high level intuition of the
pixel distribution in the image. So we can compute a histogram for grayscale images and
compute a histogram for RGB images. So we're gonna start off with computing histograms
for grayscale images. So let's just convert this image to grayscale with cv.cvtColor: pass in the image and give it a color code of cv.COLOR_BGR2GRAY, then display this image as 'Gray', passing in gray. Now, to actually compute the grayscale
histogram, what we need to do is call this gray_hist and set it equal to cv.calcHist. This method will compute the histogram for the image we pass in. Now, the images argument is a list, so we need to pass in a list of images; since we're only interested in computing a histogram for one image, let's just pass in the grayscale image. The next thing we have to pass in is the channels, which basically specifies the index of the channel we want to compute a histogram for; since we are computing the histogram for a grayscale image, let's wrap this as a list and pass in zero. The next thing is to provide a mask: do we want to compute a histogram for a specific portion of the image? We will get to this later, but for now just set this to None. histSize is basically the number of bins we want to use for computing the histogram; when we plot the histogram, I'll talk about this concept of bins, but for now just set this to 256, wrapped as a list. And the last thing is to specify the ranges of all possible pixel values, which in our case is [0, 256]. And that's it. To plot this histogram, let's use matplotlib: import matplotlib.pyplot
as plt, and then we can instantiate a figure with plt.figure. Let's give it a title, 'Grayscale Histogram'. We can give it a label across the x axis, which we'll call 'Bins', and a y label set to the number of pixels, '# of pixels'. Then we can plot by saying plt.plot of the grayscale histogram, and we can give it a limit across the x axis: plt.xlim with a list of [0, 256]. And finally, we can display the plot with plt.show. Save that and run python histogram.py.
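Here's the whole thing as a compact sketch; the file path is a placeholder:

    import cv2 as cv
    import matplotlib.pyplot as plt

    img = cv.imread('photos/cats.jpg')
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # arguments: images, channels, mask, histSize (number of bins), ranges
    gray_hist = cv.calcHist([gray], [0], None, [256], [0, 256])

    plt.figure()
    plt.title('Grayscale Histogram')
    plt.xlabel('Bins')
    plt.ylabel('# of pixels')
    plt.plot(gray_hist)
    plt.xlim([0, 256])
    plt.show()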
And this is the distribution of pixels in this image. The bins across the x axis basically represent the intervals of pixel intensities. As you can see, there is a peak in this region, close to 50 or 60-ish; this means that in this image there are close to 4,000 pixels with an intensity around 60. And as you can see, there's a lot of peaking in this region, so between roughly 40 and 70 there are peaks of close to 3,000 pixels at those intensities. So let's try this with a different image; let's try it with the cats. I'm just going to save that and run. And there is a peaking of pixel values between 200 and 225, and this makes sense because most of the image is white; given that, you can deduce that there will be a peak towards white, or 255.
So this is computing the grayscale histogram for the entire image. What we can also do is create a mask, and then compute the histogram only over that particular mask. So let's do that. Let's go back to the masking code and grab the relevant lines. I'll set the blank to img.shape with the first two values, so the sizes are the same. Let's draw a mask, which will be cv.circle on the blank, with the center at img.shape[1] divided by 2 and img.shape[0] divided by 2, a radius of 100 pixels, a color of 255, and a thickness of negative one. We can display the mask, calling it 'Mask' and passing in mask. And here's where things get interesting: we can get the grayscale histogram for this mask, and the way we do that is by setting the mask parameter to mask instead of None. Let's see what that does to our histogram. 'np is not defined', great, I made a mistake here. Oh, that's right: this isn't exactly the mask, this is the circle. Essentially, we need to mask out the image, and the way we do that is by creating the masked version, setting it equal to cv.bitwise_and, passing in the grayscale image twice, and passing in the mask, which is equal to circle. Now
we can use that for the display. Sorry, I made a mistake, but hopefully things should be fine now. So this is the mask, and this is the histogram computed for this particular mask. As you can see, there is a peaking of pixel intensity values in this region, and there are smaller peaks in these regions down below. Let's try this with another image; let's pass in cats_2.jpg. This is our mask, and there is a peaking in this image towards 50.
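Here's a sketch of the masked version; note that the single-channel circle, not the masked image, is what gets passed to cv.calcHist:

    import cv2 as cv
    import numpy as np
    import matplotlib.pyplot as plt

    img = cv.imread('photos/cats.jpg')   # placeholder path
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    blank = np.zeros(img.shape[:2], dtype='uint8')
    circle = cv.circle(blank, (img.shape[1] // 2, img.shape[0] // 2), 100, 255, -1)
    masked = cv.bitwise_and(gray, gray, mask=circle)   # just for display

    gray_hist = cv.calcHist([gray], [0], circle, [256], [0, 256])
    plt.plot(gray_hist)
    plt.xlim([0, 256])
    plt.show()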
Okay, so that was it for computing grayscale histograms. Let's move on to computing a color histogram, that is, a histogram for a color image, a BGR image. So let's call this a color histogram. Instead of converting the image to grayscale, let's comment all of that out; we will use the mask later, so for now the bitwise AND will use img, img. And yeah, that's pretty much it. So let's start with
the color histogram. The way we do that is: let's define a tuple of colors and set it equal to ('b', 'g', 'r'). What I'm going to do next is say: for i, col in enumerate(colors). Inside, I'm going to compute the histogram by saying hist equals cv.calcHist; we're going to compute it over the image itself, the channels will be [i], this i over here, we're going to provide a mask of None for now, give it a histSize of [256], and give it ranges of [0, 256]. Then let's do plt.plot of hist and give it a color equal to col, and we can do plt.xlim of [0, 256]. For the figure setup, we can grab the earlier plotting code, copy it, and uncomment it, and we can do plt.show. So this should work. Are we missing something? Oh no, I don't think so; we're not computing this histogram over a mask yet, we'll get there next. Let's save that and run. Oh, cool. I made a small labeling mistake, this should say 'Color Histogram', but that shouldn't make much of a difference.
So this is the color histogram that we get for the original image, not for a mask. As you can see, this computed the plot for the blue channel, the red channel, and the green channel as well. Using this, you can make out that there is a peaking of blue pixels at an intensity around 30, a peaking of red probably around 50, and a peaking of green probably around 75 to 80. Cool. Using this, you can basically make out the distribution of pixel intensities of all three color channels.
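Here's that plotting loop in full, as a sketch:

    import cv2 as cv
    import matplotlib.pyplot as plt

    img = cv.imread('photos/cats.jpg')   # placeholder path
    colors = ('b', 'g', 'r')

    plt.figure()
    plt.title('Color Histogram')
    plt.xlabel('Bins')
    plt.ylabel('# of pixels')

    for i, col in enumerate(colors):
        # channel index i selects blue (0), green (1) or red (2)
        hist = cv.calcHist([img], [i], None, [256], [0, 256])
        plt.plot(hist, color=col)
        plt.xlim([0, 256])

    plt.show()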
So let's try and apply a mask by passing in the mask parameter. Let's see whether we have everything in order. Hmm, 'masks are not the same size'; okay, I finally got the error. Basically, the mask needs to be in a single-channel binary format. So instead of passing in the masked image, we pass in the circle as the mask. Now this should work without any errors. And now we get the color histogram for this particular mask. I made a mistake earlier because the masked image actually had three channels, and I attempted to use that three-channel image as the mask to calculate the histogram per channel, which isn't allowed in OpenCV. That was my mistake; I kind of used the wrong variable names and got confused. But essentially, this is it: you're computing the histogram for a particular section of this image, and this is what you get. There is a high peaking of red in this area, a high peaking of blue in this area, and
a high peaking of green somewhere over here. So essentially, that's it for this video. Histograms allow you to analyze the distribution of pixel intensities, whether for a grayscale image or for a color image. These are really helpful in a lot of advanced computer vision projects, when you're trying to analyze the image you get, and maybe trying to equalize the image so that there's no extreme peaking of pixel values here and there. In the next video, we'll be talking about how to threshold an image and the different types of thresholding.
As always, if you have any questions, leave them in the comments below. Otherwise, I'll
see you guys in the next video. Hey, everyone, and welcome back to another video. In this
video, we're going to be talking about thresholding in OpenCV. Now, thresholding is a binarization of an image: in general, we want to take an image and convert it to a binary image, that is, an image where pixels are either zero (black) or 255 (white). A very simple example of thresholding would be to take an image, take some particular value that we're going to call the threshold value, and compare each pixel of the image to this threshold value. If the pixel intensity is less than the threshold value, we set that pixel intensity to zero; and if it is above the threshold value, we set it to 255, or white. In this sense, we can create a binary image just
from a regular standalone image. So in this video, we're actually going to talk about
two different types of thresholding, simple thresholding and adaptive thresholding. So
let's start off with simple thresholding. In essence, before I talk about simple thresholding, I want to convert this BGR image to grayscale. So I'm going to say gray is equal to cv.cvtColor; we pass in the image, we pass in the color code, which is cv.COLOR_BGR2GRAY, and we can display this image, calling it 'Gray' and passing in gray. Cool. So let's start with the simple thresholding. To
apply this idea of simple thresholding, we use the cv.threshold function. This function returns two things, threshold and thresh, which are set equal to cv.threshold. In essence, it takes in the grayscale image; the grayscale image has to be passed in to this thresholding function. Then we pass in a threshold value; let's set this to 150 for now. And we have to specify something called a maximum value: if a pixel value is greater than 150, what do you want to set it to? In this case, we want to binarize the image, so we set it to 255. Finally, we specify a thresholding type, which is cv.THRESH_BINARY. What this does is look at the image, compare each pixel value to the threshold value, and if it is above that value, set it to 255; otherwise, if it falls below, set it to zero. So it returns two things: thresh, which is the thresholded image, the binarized image; and threshold, which is just the same threshold value you passed in, 150, returned back to you. So let's display this image: let's say cv.imshow, we'll call this 'Simple Thresholded', and we can pass in thresh. Let's save that and run python thresh.py, and this is the thresholded image
that you get. Again, this is nothing too different from when we discussed thresholding in one of the previous videos, but this is essentially what you get. So let's play around with these threshold values. Let's set this to 100 and see what that does: as a result, more parts of the image have become white. And of course, if you give it a higher value, fewer parts of the image will be white; let's set this to 225, and very few pixels in this thresholded image actually have a pixel intensity greater than 225. What we can do after this is create an inverse thresholded image. We can copy this line, and instead of thresh, I'm going to say thresh_inv, leaving everything else the same; let's set the threshold back to 150 here as well. And instead of passing in that type of thresholding, I'm going to say cv.THRESH_BINARY_INV. Let's call the window 'Simple Thresholded Inverse', and we can pass in thresh_inv. So let's save that and
run. And this is essentially the inverse of that image: instead of setting pixel intensities greater than 150 to 255, it sets whatever values are less than 150 to 255. That's essentially what you get: all the black parts of the image change to white, and all the white parts change to black. Cool. So that's simple thresholding.
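Here's a compact sketch of simple thresholding and its inverse:

    import cv2 as cv

    img = cv.imread('photos/cats.jpg')   # placeholder path
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # pixels above 150 become 255, the rest become 0
    threshold, thresh = cv.threshold(gray, 150, 255, cv.THRESH_BINARY)

    # inverse: pixels below 150 become 255, the rest become 0
    threshold, thresh_inv = cv.threshold(gray, 150, 255, cv.THRESH_BINARY_INV)

    cv.imshow('Simple Thresholded', thresh)
    cv.imshow('Simple Thresholded Inverse', thresh_inv)
    cv.waitKey(0)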
Let's move on now to adaptive thresholding. As you can imagine, we got different images when we provided different threshold values. One of the downsides to this is that we have to manually specify a threshold value. In some cases this might work; in more advanced cases it will not. So one of the things we can do is let the computer find the optimal threshold value by itself, and, using the values it finds, binarize the image. That is, in essence, the entire crux of adaptive thresholding. So let's set up a variable called adaptive
_thresh, and set this equal to cv.adaptiveThreshold. Inside, I want to pass in a source image, so let's set this to gray. I'm going to pass in a maximum value, which is 255; notice there is no threshold value. The adaptive method basically tells the machine which method to use when computing the optimal threshold value; for now, we're just going to set this to the mean of some neighborhood of pixels, so cv.ADAPTIVE_THRESH_MEAN_C. Next, we'll set a threshold type, which is cv.THRESH_BINARY; again, nothing different from the first example. And there are two other parameters I want to specify. The block size is essentially the neighborhood size, the kernel size, which OpenCV needs to use to compute the mean and find the optimal threshold value; for now, let's set this to 11. And finally, the last value we have to specify is the C value. This C value is an integer that is subtracted from the mean, allowing us to fine-tune our threshold. Again, don't worry too much about this; you could set it to zero, but for now, let's set this to 3. And once that's done, we can go ahead and display this image; let's call it 'Adaptive Thresholding' and pass in adaptive_thresh. So let's save that and run. And this is essentially your adaptive thresholding
method. Essentially, what we've done is define a kernel size, or window, that is drawn over this image; in our case, 11 by 11. What OpenCV does is compute a mean over those neighborhood pixels and find the optimal threshold value for that specific part. Then it slides over to the right and does the same thing, and slides down and does the same thing, so that it eventually slides over every part of the image. So that's how adaptive thresholding works.
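As a sketch, with the same parameter values we just used:

    import cv2 as cv

    img = cv.imread('photos/cats.jpg')   # placeholder path
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # blockSize=11 is the neighborhood the mean is computed over; C=3 is subtracted
    # from that mean to fine-tune the threshold
    adaptive_thresh = cv.adaptiveThreshold(
        gray, 255, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY, 11, 3)

    cv.imshow('Adaptive Thresholding', adaptive_thresh)
    cv.waitKey(0)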
If you wanted to flip this, we could change the threshold type to cv.THRESH_BINARY_INV, just to see what's really going on under the hood. Cool: all the white parts of the image change to black, and all the black parts change to white. So let's play around with these values. Let's set the block size to 13 and see what that does. Okay, definitely some difference from the previous hyperparameter. Let's set it back to 11, and let's set the C value to maybe 1. Okay, definitely more white. Let's set this to maybe 5; in a way, you can play around with these values, and the more you subtract from the mean, the cleaner the result tends to get; you can basically make out the edges now in this image. So let's maybe increase that to 9, and you get fewer white spots in the image; essentially, now you can make out the features better. Cool. So that was adaptive thresholding, thresholding
that computed the optimal threshold value on the basis of the mean. Now, we don't have to stick with the mean; we can go with something else. Instead of mean, let's set this to Gaussian: cv.ADAPTIVE_THRESH_GAUSSIAN_C. Let's save that and see what it does. And this is the thresholded image using the Gaussian method. The only difference Gaussian applies is that it adds a weight to each pixel value and computes the weighted mean across those pixels; that's why we were able to get a slightly nicer image than when we used the plain mean. But essentially, the adaptive thresholding mean works in some cases and the Gaussian works in other cases; there's no real one-size-fits-all, so really play around with these values and see what you get. But that's essentially all we have to discuss. For this video, we talked about two different types of thresholding: simple thresholding and adaptive thresholding. In simple thresholding, we have to manually specify a threshold value; in adaptive thresholding, OpenCV does that for us using a specific block size, or kernel size, and computes the threshold value on the basis of the mean or on the basis of the Gaussian distribution. In the next video, the last video in the advanced section of this course, we're going to be discussing how to compute gradients and edges in an image. So if you have any questions, leave them in the comments below. I'll be sure to check them out. Otherwise, I'll see you guys in the next video. Hey everyone, and welcome back to another video. In this
video, we're going to be talking about gradients and edge detection in OpenCV. Now, you could think of gradients as these edge-like regions that are present in an image. They're not the same thing; gradients and edges are completely different things from a mathematical point of view, but you can pretty much get away with thinking of gradients as edges from a programming perspective only. In the previous videos, we've discussed the Canny edge detector, which is kind of an advanced edge detection algorithm that is essentially a multi-step process. But in this video, we're going to be talking about two other ways to compute edges in an image, and those are the Laplacian and the Sobel methods. So let's start off with the Laplacian. The first thing I want to do is convert this image to grayscale, calling the cv.cvtColor method: we pass in the image and we say cv.COLOR_BGR2GRAY, then display this image as 'Gray' and pass in gray. So let's start with the
Laplacian. We're going to define a variable called lap and set this equal to cv.Laplacian. What this will do is take in a source image, which is gray for now, and it will take in something called a ddepth, or data depth, which for now we set to cv.CV_64F. Following along with what I do next, I'm going to say lap is equal to np.uint8, and inside I'm going to pass np.absolute, and we can pass in lap. And since I'm using NumPy, I can go ahead and import NumPy as np. Then, to display this image, call the cv.imshow method; let's call this 'Laplacian' and pass in lap. Save and run python gradients.py. Invalid syntax at cv dot... okay, it's cv.CV_64F. Save that. And these are essentially the Laplacian edges in the image; it kind of looks like an image that was drawn on a chalkboard and then smudged just a bit. But anyway, this is the Laplacian method. Let's try it with another image; let's try it with this park image. Call this the park, save that and run. And this essentially looks like a pencil shading of this image: all the edges that exist in the image, or at least most of them, are drawn over with a pencil and then lightly smudged. So those are essentially the Laplacian edges, you
could say. So again, don't worry too much about why we converted this to in the UI and
then we computed the absolute value. But essentially the Laplacian method computes the gradients
of this image the grayscale image. Generally this involves a lot of mathematics but Essentially,
when you transition from black to white and white to black, that's considered a positive
and a negative slope. Now, images itself cannot have negative pixel values. So what we do
is we essentially compute the absolute value of that image. So all the pixel values of
the image are converted to the absolute values. And then we convert that to a UI 28 to an
image specific datatype. So that's basically the crux of what's going on right over here.
So let's move on to the next one, and that is the Sobel gradient magnitude representation. Essentially, the way this works is that Sobel computes the gradients in two directions, x and y. So we're going to say sobelx, which holds the gradients computed along the x axis, and set this equal to cv.Sobel. We pass in the grayscale image and a data depth, which is cv.CV_64F, and we can give it an x direction, which we set to one, and a y direction, which we set to zero. Let's copy this and call it sobely, and instead of one, zero, we say zero comma one. And we can visualize these: let's do a cv.imshow, call this 'Sobel X', and pass in sobelx, and another cv.imshow, 'Sobel Y', with sobely. Run that, and these are essentially the gradients that were computed. Sobel Y was computed along the y axis, so you mostly see horizontal edges, while Sobel X was computed along the x axis, so you mostly see vertical edges. Now we can get the combined Sobel image by combining these two, sobelx and sobely. The way we do that is by saying combined_sobel and setting this equal to cv.bitwise_or, passing in sobelx and sobely. And we can display this image, so let's call cv.imshow with 'Combined Sobel' and pass in the combined Sobel. Let's run that, and this is essentially the combined Sobel that you get. Let's go back here: it essentially took these two, applied a cv.bitwise_or, and got this image. And if you want to compare this with the Laplacian, these are two completely different algorithms, so the results you get will be completely different.
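Continuing from the snippet above, here's a rough sketch of the Sobel steps we just walked through:

    # Gradients along each axis: (1, 0) is the x direction, (0, 1) is the y direction
    sobelx = cv.Sobel(gray, cv.CV_64F, 1, 0)
    sobely = cv.Sobel(gray, cv.CV_64F, 0, 1)

    # Combine the two directional gradients with a bitwise OR
    combined_sobel = cv.bitwise_or(sobelx, sobely)

    cv.imshow('Sobel X', sobelx)
    cv.imshow('Sobel Y', sobely)
    cv.imshow('Combined Sobel', combined_sobel)
    cv.waitKey(0)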
Okay, so let's compare both of these, the Laplacian and the Sobel, with the Canny edge detector. Let's go down here and say canny is equal to cv.Canny. We pass in the grayscale image and give it two threshold values of 150 and 175, and we're done. Let's display this image: call the window 'Canny' and pass in canny. So let's save that and see what that gives us.
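And for comparison, the Canny call is a one-liner on the same grayscale image:

    # Canny edge detector with two hysteresis threshold values
    canny = cv.Canny(gray, 150, 175)
    cv.imshow('Canny', canny)
    cv.waitKey(0)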
So comparing the three: the Laplacian gradient representation essentially returns kind of a pencil-shading version of the edges in the image; the combined Sobel computes the gradients in the x and y directions, and we can combine those two with a bitwise OR; and Canny is basically a more advanced algorithm that actually uses Sobel in one of its stages. Like I mentioned, Canny is a multi-stage process, and one of its stages uses the Sobel method to compute the gradients of the image. So essentially, you see that the Canny edge detector gives a much cleaner version of the edges that can be found in the image. That's why in most cases you're going to see Canny used, but in more advanced cases you'll probably see Sobel used a lot; not necessarily the Laplacian, but Sobel definitely. So that's pretty much it for this video. And
in fact, this video concludes the advanced section of this course. Moving on to the next
section, we will be discussing face detection and face recognition in OpenCV. We're actually going to touch on using Haar cascades to perform some face detection, and face recognition actually has two parts: the first is face recognition with OpenCV's built-in face recognizer, and the second part will be actually building our own deep learning model to essentially recognize some faces in an image. Again, like always, if you have any questions, leave them in the comments below. Otherwise, I'll see you guys in the next section. Hey, everyone,
and welcome back to another video. We are now at the last part of this Python and OpenCV course, where we are going to talk about face detection and face recognition in OpenCV. What we're going to be doing in this video is actually discussing how to detect faces in OpenCV using something called a Haar cascade. In the next video, we will talk about how to recognize faces using OpenCV's built-in face recognizer. And after that, we will be implementing our own deep learning model to recognize the Simpsons characters; we're going to create that from scratch and use OpenCV for all the preprocessing and displaying of images and stuff like that. So let's get into this video. Now, face detection
is different from face recognition. Face detection merely detects the presence of a face in an
image, while face recognition involves identifying whose face it is. Now, we'll talk more about
this later on in this course. But essentially, face detection is performed using classifiers.
A classifier is essentially an algorithm that decides whether a given image is positive
or negative, whether a face is present or not. Now, a classifier needs to be trained on thousands and tens of thousands of images with and without faces. But fortunately for us, OpenCV already comes with a lot of pre-trained classifiers that we can use in any program. Essentially, the two main kinds of classifiers that exist today are Haar cascades and more advanced classifiers called local binary patterns. We're not going to talk about local binary patterns at all in this course, but essentially these more advanced classifiers are not as prone to noise in an image as the Haar cascades are. So I'm currently at the OpenCV
GitHub page, where they store their Haar cascade classifiers. As you can see, there are plenty of Haar cascades that OpenCV makes available to the general public. You have a Haar cascade for an eye, a frontal cat face, a frontal face default, a full body, a left eye, a Russian license plate, a Russian plate number (I think that's the same thing), a Haar cascade to detect a smile, a Haar cascade for detection of the upper body, and things like that. So feel free to use whatever you want, but in this video, we're going to be performing face detection, and for this, we're going to use haarcascade_frontalface_default.xml. When you go ahead and open that, you're going to get about 33,000 lines of XML code. So what you have to do is essentially go to this Raw button, and you'll get all this raw XML code. All you have to do is press Ctrl A, or Command A if you're on a Mac, then Ctrl C or Command C, and then go to VS Code or your editor and create a new file. We're going to call this haar_face.xml, and inside this, I want to paste in those 33,000 lines of XML code. Go ahead and save that, and our classifier is ready. So we can go ahead and close this out. So
we're going to be using this Haar cascade classifier to essentially detect faces that are present in an image. In this file, called face_detect.py, I imported OpenCV and basically read in an image of a lady, a person, that is this image over here. And we can go real quick and display this. So let's run python face_detect.py, and we get an image in a new window. Cool. So let's actually implement our code. The first thing I want to do is convert this image to grayscale. Now, face detection does not involve skin tone or the colors that are present in the image; these Haar cascades essentially look at an object in an image and, using the edges, try to determine whether it's a face or not. So we really don't need color in our image, and we can go ahead and convert it to grayscale with cv.cvtColor, passing in the image and cv.COLOR_BGR2GRAY. We can display this; call the window 'Gray Person' and pass in gray. Let's save that and run. Okay, we have a gray person over here. So let's move on to essentially reading in this
have a blu ray person over here. So let's move on to essentially reading in this har
underscore face dot XML file. So the way we do that is by essentially create a har cascade
variable. So let's set this to her underscore cascade. And we're going to set this equal
to CV dot cascade classifier, in inside, what I essentially want to do is, is parsing the
path to this har to this XML file. That is as simple as saying har en disco face dot
XML. So this cascade classifier class will essentially read in those 33,000 lines of
XML code and store that in a variable called har underscore cascade. So now that we've
read in all har cascade file, let's actually try to detect the face in this image over
here. So what I'm going to do is essentially, say faces on school rect is equal to har underscore
cascade dot detect multi scale, and instead, we're going to pass in the image that we want
to detect based on. So this is great, we're going to pass in a scale factor. Now let's
set this to 1.1. Give it a variable called minimum neighbors, which essentially is a
parameter that specifies the number of neighbors rectangle should have to be called a face.
So let's set this to three for nap. So that's it. That's all we have to do. And essentially,
what this does is that detectMultiScale, a method on the CascadeClassifier instance, will take this image, use these scaleFactor and minNeighbors variables to detect a face, and return the rectangular coordinates of that face as a list, which goes into faces_rect. That's exactly why we're calling it faces_rect: rect, as in rectangle. You can print the number of faces that were found in this image by printing the length of this faces_rect variable. So let's do that. Let's print 'Number of faces found =' and pass in the length of faces_rect. Let's save that and run. As you can see, the number of faces found is one, and that's true, because there's only one person in this image. Cool. Now, utilizing the fact that faces_rect holds the rectangular coordinates for the faces that are present in the image, what we can do is loop over this list, grab the coordinates, and draw a rectangle over the detected faces. So let's do that. The
way we do that is by saying for (x, y, w, h) in faces_rect, and what we're going to do is draw a rectangle, cv.rectangle, over the original image, img. We give it point one, which is (x, y), and point two, which is (x+w, y+h). Let's give it a color, set to green, so (0, 255, 0), and a thickness of two. And that's it. We can display this image, so let's call the window 'Detected Faces' and pass in img. And if you look at this image, you can see the rectangle that was drawn over it. This, in essence, is the face that OpenCV's Haar cascades found in this image.
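Here's roughly what face_detect.py looks like at this point; the image path is a placeholder, and it assumes the haar_face.xml we saved earlier sits next to the script:

    import cv2 as cv

    img = cv.imread('Photos/lady.jpg')  # placeholder path
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # Read in the ~33,000 lines of XML we saved as haar_face.xml
    haar_cascade = cv.CascadeClassifier('haar_face.xml')

    # Returns the rectangular coordinates of every detected face as a list
    faces_rect = haar_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
    print(f'Number of faces found = {len(faces_rect)}')

    # Draw a green rectangle over each detected face
    for (x, y, w, h) in faces_rect:
        cv.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), thickness=2)

    cv.imshow('Detected Faces', img)
    cv.waitKey(0)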
So let's try this with another image. What I have here is an image of five people, so we're going to use that image and try to see how many faces OpenCV's Haar cascades can detect in it. Let's change the path to this group of five people, save and run. And I want to point out real quick that the number of faces found was actually seven. Now, we know that there are five people in this image, so let's actually see what OpenCV thought was a face. It actually detected all the faces in this image, all five people, but it also detected two other things, I guess: a stomach, and part of a neck. Now, this is to be expected, because Haar cascades are really sensitive to noise in an image. So if you have something that pretty much looks like a face, like a neck, it has the same structure a typical face would have. I don't know why her stomach was recognized as a face, but again, this is to be expected. One way we can try to minimize the sensitivity to noise is by modifying the scaleFactor and minNeighbors. So let's increase the minNeighbors to maybe six or seven. Save that and run. And as you can see, now six faces were found. So I guess
by increasing the minNeighbors parameter, we essentially stopped OpenCV from detecting her stomach as a face. So let's try this with another, more complex image: a group of people in group one. If I change that to group one, save and run, now, as you can see, the number of faces found was six, and we know that this is not six. So let's actually change this minNeighbors just a bit. Let's change it first to three and see how many faces are found. Now we got 14. Okay, some people at the back weren't chosen, because either their faces are not perfectly perpendicular to the camera, or they're wearing some accessories on the face, for example eyeglasses; this dude's wearing a hat, this dude's wearing a cap, and stuff like that. So let's actually change this to one and see what that gets us. Run, and now we got 19 faces that were found in this image. So by tweaking these values, you can essentially get a more robust result, but of course, by minimizing these values you're making OpenCV's Haar cascades more prone to noise. That's
the trade-off you need to consider. Now, again, Haar cascades are not the most effective at detecting faces. They're popular, but they're not the most advanced, and they're probably not what you would use if you were to build more advanced computer vision projects. I think for that, Dlib's face recognizer is more effective and less sensitive to noise than OpenCV's Haar cascades. But depending on your use case, Haar cascades are more popular: they're easy to use, and they require minimal setup. And if you wanted to extend this to videos, you could; all you have to do is apply the Haar cascade to each individual frame of the video. I'm skipping that because it's pretty self-explanatory. So that's pretty much it for this video. We discussed how to detect faces in OpenCV using OpenCV's Haar cascades. In the next video, we will actually talk about how to recognize faces in OpenCV using OpenCV's built-in face recognizer. So like always, if you have any questions, comments, concerns, whatever, leave them in the comments below. Otherwise, I'll see you in the next video. Hey everyone, and welcome back to another video. In this video, we will
learn how to build a face recognition model in OpenCV using OpenCV's built-in face recognizer. In the previous video, we dealt with detecting faces in OpenCV using Haar cascades; this video will actually cover how to recognize faces in an image. So what I have here are five folders for five different people. Inside each folder, I have about 20 images of that particular person: Jerry Seinfeld has 21 images, Elton John has 17, Mindy Kaling has 22, Ben Affleck has 14, and so on. What we're essentially going to do is use OpenCV's built-in face recognizer and train it right now on all of the images in these five folders. This is sort of like building a mini-sized deep learning model, except that we're not going to build any model from scratch; we're going to use OpenCV's built-in face recognizer, pass in these close to 90 images, and train the recognizer on them. So let's create a new file, and we're going to call this faces_train.py. We're going to import os, import cv2 as cv, and import numpy as np. The first thing I want to do is essentially create a list of all
the people in the images. These are essentially the names of the folders of these particular people. What you could do is manually type those in, or you could create an empty list, call it p, loop over every folder in this base folder with os.listdir, and say p.append(i). Then we can print p. Let's save that and run python faces_train.py, and we get the same list that we'd get by typing it out. So that's one way of doing it. What I'm going to do next is create a variable called DIR and set this equal to the base folder, that is, the folder which contains these five folders of these people. Cool. So with that done, what we can do is create a function
called create_train, that will loop over every folder in this base folder, and inside each folder it's going to loop over every image, grab the face in that image, and add it to our training set. Our training set will consist of two lists. The first one is called features, which holds the image arrays of the faces; let's set this to an empty list. The second list will hold our corresponding labels: for every face in the features list, what is its corresponding label, whose face does it belong to? One image could belong to Ben Affleck, the second image could belong to Elton John, and so on. So let's create the function. We're going to loop over every person in this people list and grab the path for that person; so for every folder in this base folder, we're going through each folder and grabbing the path to it. That's as simple as saying os.path.join, and we can join the DIR with person. And what I'm going to do is create a label variable and set this equal to people.index(person). Now, inside each folder, we're going to loop
over every image in that folder. We're going to say for img in os.listdir(path), and we're going to grab the image path: img_path is equal to os.path.join, joining the path variable with the image. Now that we have the path to an image, we're going to read in that image from this path. We'll create a variable called img_array, equal to cv.imread(img_path). We're going to convert this image to grayscale with cvtColor, passing in img_array and cv.COLOR_BGR2GRAY. Cool. Now, with that done, we can try to detect the faces in this image. Let's go back to face_detect.py, grab the Haar cascade classifier variable, and paste that here. We can create faces_rect and set this equal to haar_cascade.detectMultiScale; this will take in the gray image, a scaleFactor of 1.1, and a minNeighbors of four. And we can loop over every face in this faces_rect. So for (x, y, w, h) in faces_rect, we are going to grab the face's region of interest, basically cropping out the face in the image: we're going to say faces_roi is equal to gray[y:y+h, x:x+w]. And now that we have a face region of interest, we can append that to the features list, and append the corresponding label to the labels list: features.append(faces_roi), and labels.append(label). This label variable is essentially
the index of the person in the people list. Now, the idea behind converting a label to a numerical value is essentially to reduce the strain on your computer, by creating some sort of mapping between a string and a numerical label. The mapping we're going to use is the index of that particular list. So let's say I grab the first image, which is an image of Ben Affleck: the label for that would be zero, because Ben Affleck is at the zeroth index of this people list. Similarly, an image of Elton John would have a label of one, because he's at the second position, or the first index, in this people list. That's essentially the idea behind this. Now, with that done, we can try to run this and see whether we get any errors or not. And we can print the length of the features: let's say 'Length of the features list =' the length of features, and copy this for the labels: 'Length of the labels list =' the length of labels. That shouldn't give us any errors, so let's run it. And we get a features length of 100 and a labels length of 100. So essentially, we have 100 faces and 100 corresponding labels for those faces. So
we don't need this anymore. What we can do now is use these features and labels lists to train our recognizer. The way we do that is by instantiating our face recognizer as an instance of the cv.face.LBPHFaceRecognizer_create class; this will essentially instantiate the face recognizer. Now we can actually train the recognizer on the features list and the labels list, by saying face_recognizer.train and passing in the features list and the labels list. Before we actually do that, I do want to convert these features and labels lists to NumPy arrays. So we're going to say features is equal to np.array of features, and labels is equal to np.array of labels. Save that and run. Okay, we get an error about the data type, so let's set the dtype to object: dtype='object'. And we can print when this is done, so let's say 'Training done'. We can also go ahead and save these features and labels arrays: we're going to say np.save, call this features.npy, and pass in features, and np.save labels.npy, passing in labels. So
let's save that and run. Cool. So essentially, the face recognizer is now trained, and we can use it. But the problem here is that if we plan to use this face recognizer in another file, we'd have to separately and manually repeat this whole process of adding those images to a list, getting the corresponding labels, converting them to NumPy arrays, and training all over again. What we can do, and what OpenCV allows us to do, is save this trained model so that we can use it in another file, in another directory, in another part of the world, just by using a particular YAML source file. So we're going to repeat this process again, but the only change I'm going to make is to say face_recognizer.save, and we're going to give it the path to a YAML source file: we're going to save face_trained.yml. Let's run this process again: training done. And now you'll notice that you have a face_trained.yml file in this directory, as well as features.npy and labels.npy.
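Putting the whole of faces_train.py together, this is roughly what we've built; the base directory is a placeholder you'd point at your own folder of people:

    import os
    import cv2 as cv
    import numpy as np

    DIR = r'Faces/train'  # placeholder: one subfolder per person
    people = [folder for folder in os.listdir(DIR)]

    haar_cascade = cv.CascadeClassifier('haar_face.xml')

    features = []  # cropped face arrays
    labels = []    # index of each face's person in the people list

    def create_train():
        for person in people:
            path = os.path.join(DIR, person)
            label = people.index(person)
            for img_name in os.listdir(path):
                img_array = cv.imread(os.path.join(path, img_name))
                gray = cv.cvtColor(img_array, cv.COLOR_BGR2GRAY)
                faces_rect = haar_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
                for (x, y, w, h) in faces_rect:
                    features.append(gray[y:y+h, x:x+w])  # crop out the face
                    labels.append(label)

    create_train()
    print('Training done')

    features = np.array(features, dtype='object')
    labels = np.array(labels)

    # Train OpenCV's built-in LBPH face recognizer and save everything
    face_recognizer = cv.face.LBPHFaceRecognizer_create()
    face_recognizer.train(features, labels)
    face_recognizer.save('face_trained.yml')
    np.save('features.npy', features)
    np.save('labels.npy', labels)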
So let's actually use this trained model to recognize faces in an image. Let's close this out and create a new file, and we're going to call this face_recognition.py. Very simply, we're going to import numpy as np and cv2 as cv; we don't need os anymore, because we're not looping over directories. We can create our haar_cascade variable, so let's go up here and grab that. We can load our features and labels arrays by saying features is equal to np.load('features.npy'), and labels is equal to np.load('labels.npy'). And we can now read in this face_trained.yml file. Let's go over here, grab this line, and say face_recognizer.read, giving it the path to this YAML source file, face_trained.yml. That's pretty much all we need. Now we need the mapping, so let's grab this people list as well, and that's pretty much all we have to do. So let's create an img variable, set this to cv.imread, and give it a path. Let's grab an image from this validation folder; I have one of Ben Affleck, so let's try it with that, and it's a JPG file. We can convert that image to grayscale with cv.cvtColor, passing in the image and cv.COLOR_BGR2GRAY. Let's display this image; call the window 'Person'. So what we're going to do is first detect the face in the image.
The way we do that is by saying faces_rect is equal to haar_cascade.detectMultiScale: we pass in the gray image, a scaleFactor of 1.1, and a minNeighbors of four. And we can loop over every face in this faces_rect: for (x, y, w, h) in faces_rect, we grab the region of interest we're interested in, faces_roi, which is gray[y:y+h, x:x+w]. Now we can predict using this face recognizer: we get back a label and a confidence value by saying face_recognizer.predict on this faces_roi. Let's print 'Label =' label 'with a confidence of' confidence. And since we're using numerical values, we can say people[label] to get the name. We can also put some text on this image, just to show us what's really going on: we put the string people[label], the person in that image, on the image, give it an origin, let's say (20, 20), a font face of cv.FONT_HERSHEY_COMPLEX, a font scale of 1.0, a color of (0, 255, 0), and a thickness of two. And we can draw a rectangle over the face: we draw this over the image, give it (x, y) and (x+w, y+h), a color of (0, 255, 0), and a thickness of two. With that done, we can display this image, call the window 'Detected Face', and pass in the image. And finally, we can do a cv.waitKey(0). So let's save and see what we get: python face_recognition.py.
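For reference, here's a rough sketch of face_recognition.py at this point; the people list and the validation image path are placeholders based on the folders described above:

    import cv2 as cv

    # Placeholder: must match the order of the people list used during training
    people = ['Ben Affleck', 'Elton John', 'Jerry Seinfeld', 'Madonna', 'Mindy Kaling']

    haar_cascade = cv.CascadeClassifier('haar_face.xml')
    face_recognizer = cv.face.LBPHFaceRecognizer_create()
    face_recognizer.read('face_trained.yml')  # load the trained model

    img = cv.imread(r'Faces/val/ben_affleck/1.jpg')  # placeholder path
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # Detect the face, then predict whose it is
    faces_rect = haar_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
    for (x, y, w, h) in faces_rect:
        faces_roi = gray[y:y+h, x:x+w]
        label, confidence = face_recognizer.predict(faces_roi)
        print(f'Label = {people[label]} with a confidence of {confidence}')
        cv.putText(img, str(people[label]), (20, 20),
                   cv.FONT_HERSHEY_COMPLEX, 1.0, (0, 255, 0), thickness=2)
        cv.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), thickness=2)

    cv.imshow('Detected Face', img)
    cv.waitKey(0)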
We get an error: object arrays cannot be loaded when allow_pickle is false. Gosh, where's that? It's the np.load calls; we probably don't need those anymore, so let's comment them out. If you did want to use those arrays again, you could still use np.load; since the data types are objects, you'd just say allow_pickle=True. But we're not going to use them, so let's comment that out. Save. And okay, we get Ben Affleck with
a confidence of 60%. So that's pretty good. 60% is good, given the fact that we only trained this recognizer on 100 images. Let's try this with another image of Ben Affleck, maybe this one; copy that and paste it right across here. And this, again, is Ben Affleck, with a confidence of 94%. Pretty good. Let's go back. Let's go maybe to a different person; let's go to Madonna. Let's grab this image and change the path to Madonna. I'm not sure whether it will detect a face because of the hat, but let's try it anyway. Now, this is where you'll find that OpenCV's built-in face recognizer is not the best. It currently detects that the person in the image is actually Jerry Seinfeld, and that with a confidence of 110%. Maybe there's an error somewhere; I'm not sure why that went above 100, but I'm pretty sure there's an error somewhere. Essentially, this is where the discrepancies lie. It's not the best, so it's not always going to give you accurate results. So let's try this with another image. Let's
of 96.8% Okay, let's move on to elton john Watson had problems with elton john. Given
the fact that he looked pretty similar to Ben Affleck for some reason. Copy that chain
got to elton john just called john and print that. Okay, elton john with the confidence
of 67% pretty good. Okay, so not bad. This is more accurate than what I predicted. before
filming this video, I did a couple of trial runs, and I got very good results. For example,
elton john was continually detected as Jerry seinfield or Ben Affleck. Madonna was detected
as Ben Affleck, Ben Affleck was detected as Mindy kaylin. Minnie kailyn was detected as
elton john, and a whole bunch of weird results. So I guess that we did something right. I
must have done something wrong in the trial runs. But hey, we get good results. And that's
pretty good. Now, I'm not sure why that gave a confidence of 111%. Maybe there's an error
somewhere with the training sent. But I guess for the most part, you can ignore that. Given
the fact that we get pretty good results. So that's pretty much it. For this video,
we discussed face recognition in OpenCV. We essentially built a features list and a labels list and trained a recognizer on those two lists, and we saved the model as a YAML source file. In another file, we read in that saved YAML source file and made predictions on an image. In the next video, which will actually be the last video in this course, we will discuss how to build a deep learning model to detect and classify between 10 Simpsons characters. So if you have any questions, comments, concerns, whatever, leave them in the comments below. Otherwise, I'll see you in the next video. Hey, everyone, and welcome to the last video in this Python and
OpenCV course. Previously, we've seen how to detect and recognize faces purely in OpenCV, and the results we got were varied. There are a couple of reasons for that. One is the fact that we only had 100 images to train the recognizer on; this is a significantly small number, especially when you're training recognizers and building models. Ideally, you'd want to have at least a couple of thousand images per class. The second reason lies in the fact that we weren't using a deep learning model. As you go deeper into computer vision especially, you will see that there are very few things that can actually beat a deep learning model. So that's what we're going to be doing in this video: building a deep computer vision model to classify between the Simpsons characters. Now, generally OpenCV is used for preprocessing the data, that is, performing some sort of image normalization, mean subtraction, and things like that. But in this video, we're going to be building a very simple model, so we're not going to be using any of those techniques. In fact, we'll only be using the OpenCV library to read in images and resize them to a particular size before feeding them into the network. Now,
don't worry if you've never built a deep learning model before. This video will be using TensorFlow's implementation of Keras. I want to keep this video real simple, just so you have an idea of what really goes on in more advanced computer vision projects, and Keras actually comes with a lot of boilerplate code, so if you've never built a deep learning model before, don't worry, Keras will handle that for you. Kind of one of the prerequisites to building a deep learning model is actually having a GPU. A GPU is basically a graphics processing unit that will help speed up the training process of a network. But if you don't have one, again, don't worry, because we'll be using Kaggle, a platform which actually offers free GPUs for us to use. So real simple, before we get started, we need a couple of packages installed. If you haven't already installed caer at the beginning of this course, go ahead and do a pip install caer. The next package you require is canaro, and this is a package that I built specifically for deep learning models built with Keras; it will actually prove surprisingly useful to you if you're planning to go deeper into building deep computer vision models. Now, installing this package on your system will only make sense if you already have a GPU on your machine; if you don't, then you can basically skip this part. So we can do a pip install canaro. And canaro actually installs TensorFlow by default, so just keep that in mind. So with all the installations
out of the way, let's actually move on to the data that we're going to be using. The dataset we're going to use is the Simpsons characters dataset that's available on Kaggle. The actual data that we're interested in lies in this simpsons_dataset folder. This basically consists of a number of folders with several images inside each subfolder: Maggie Simpson has about 128 images, Homer Simpson has about 2,200 images, Abraham has about 913 images, and so on. Essentially, what we're going to do is use these images and feed them into our model to classify between these characters. The first thing we want to do is go to kaggle.com/notebooks and create a new notebook, and under Advanced Settings, make sure that the GPU is selected, since we're going to be using the GPU. After that, click Create, and we should get a notebook. We're going to rename this to Simpsons, and one thing I want to do is enable the internet, since we're going to be installing a couple of packages over the internet. To use the Simpsons characters dataset in our notebook, you need to head to Add Data and search for Simpsons, and the first one, by alexattia, should pop up; go ahead and click Add, and we can now use this dataset inside the notebook. The first thing I want to do is pip install caer and canaro. Now, the reason why I'm doing this yet again is because Kaggle does not come preinstalled with caer and canaro. I did tell you to install them on your machine, and the reason for that is so that you can work with them and experiment. So once that's
So we're going to input o s, we're going to input seer, we're going to input conero. We're
going to import NumPy. As NP we're going to input CV to add CV, and we're going to input
GC for garbage collection. Then next what we want to do is in basically when building
deep computer vision models, your model expects all your data or your image data to be of
the same size. So since we're working with image data, this size is the image size. So
all the data or the images in our data set will actually have to be resized to a particular
science before we can actually feed that into the network. Now with a lot of experiments,
I found that an image size of 80 by 80 works well, especially for this Simpsons data set.
Okay, the next variable we need is the channels. So how many channels do we want in our image.
And since we do not require color in our image, we're going to set this to one basically grayscale.
To run back. What we need next is we're gonna say car on the scope path is equal to the
base path where all the data where all the actual data lines, and that is in this Simpsons
on a school dataset, this is the base folder for where all our images are stored in. So
we're going to copy this file path. And we're going to paste that in that. Cool. So essentially,
what we're going to be doing now is grabbing the top 10 characters, the ones with the most images for their class. The way we're going to do that is by going through every folder inside simpsons_dataset, getting the number of images stored for that character, storing all of that information inside a dictionary, sorting that dictionary in descending order, and then grabbing the first 10 elements, the first n elements, of the dictionary. Hope that made sense. So what we're going to do is create an empty dictionary, and say for char in os.listdir of char_path, char_dict of char is equal to the length of os.listdir of os.path.join, joining the char_path with char. Essentially, all that we're doing is going through every folder, grabbing the name of the folder, getting the number of images in that folder, and storing all that information inside the dictionary called char_dict. Once that's done, we can sort this dictionary in descending order, and the way we do that is with char_dict is equal to caer.sort_dict of char_dict, with descending set to true. And finally, we can print the dictionary that we get. So this is the dictionary that we have. As you can see, Homer Simpson has the most images at close to 2,300, and we go all the way down to Lionel Hutz, who has only three images in the data. Now that we have this dictionary, what we're going to do is grab the names of the first 10 elements in this dictionary and store them in a characters list. So we're going to say characters is equal to an empty list, and for i in char_dict, characters.append the name, which is i[0]; and if count is greater than or equal to 10, we break. We need to initialize a count of zero and increment that count. Okay, once that's done, let's print what our characters list looks like. So we've essentially just grabbed the names of the top 10 characters.
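In code, that cell looks something like this; the dataset path is a placeholder, and caer.sort_dict follows the narration here, so double-check the helper name against your installed caer version:

    import os
    import caer

    char_path = r'../input/the-simpsons-characters-dataset/simpsons_dataset'  # placeholder

    # Count the images available for each character
    char_dict = {}
    for char in os.listdir(char_path):
        char_dict[char] = len(os.listdir(os.path.join(char_path, char)))

    # Sort in descending order of image count (returns name/count pairs)
    char_dict = caer.sort_dict(char_dict, descending=True)

    # Keep the names of the 10 characters with the most images
    characters = []
    count = 0
    for i in char_dict:
        characters.append(i[0])
        count += 1
        if count >= 10:
            break
    print(characters)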
So with that done, we can go ahead and create the training data, and that is as simple as saying train is equal to caer.preprocess_from_dir. We pass in the char_path, the characters, the number of channels, and the image size, and we set isShuffle equal to true. Essentially, what this will do is go through every folder inside char_path, which is simpsons_dataset, and look at every element inside characters. So it's going to look for Homer Simpson inside simpsons_dataset, and when it finds Homer Simpson, it's going to go inside that folder, grab all the images inside it, and add them to our training set. Now, as you may recall from the previous video, a training set is essentially a list, where each element is another list of the image array and the corresponding label. The label we had was basically the index of that particular string in the characters list, and that's the same type of mapping that we're going to use here. So Homer Simpson is going to have a label of zero, Ned Flanders a label of one, and so on. Once that's done, go ahead and run this. Basically, the progress is displayed in the output; if you don't want anything outputted, you can just set the verbosity to zero, but I'm going to leave things just as they are. Since there are a lot of images inside this dataset, this may take a while depending on how powerful your machine is. So, that only took about a minute or so to preprocess our data. So
essentially, let's try to see how many images there are in this training set. We do that by taking the length of train, and we have 13,811 images inside this training set. Let's actually try to visualize the images that are present in this dataset. We're going to import matplotlib.pyplot as plt, do a plt.figure, and give it a figsize of 30 by 30. Let's do a plt.imshow, and we can pass in the first element of the first element in this training set, so train[0][0], give it a color map of gray, and display this image. The reason why I'm not using OpenCV to display this image is because, for some reason, OpenCV does not display properly in Jupyter notebooks, so that's why we're using Matplotlib. This is basically the image that we get: barely legible to us, but to a machine, this is a valid image. Okay, the next thing we want to do is separate the training set into the features and labels. Right now, train is basically a list with 13,811 lists inside
it. Inside each of those sub-lists are two elements: the actual array and the label itself. So we're going to separate the feature set, or the arrays, and the labels into separate lists. The way we do that is by saying featureSet and labels is equal to caer.sep_train; we're going to pass in the training set and give it an IMG_SIZE of IMG_SIZE. Basically, what this is going to do is separate the training set into the feature set and labels, and also reshape this feature set into a four-dimensional tensor, so that it can be fed into the model with no restrictions whatsoever. So go ahead and run that. Once that's done, let's try to normalize the feature set. Essentially, we're going to normalize the data to be in the range of (0, 1), and the reason for this is that if you normalize the data, the network will be able to learn the features much faster than if you don't. So we're going to say featureSet is equal to caer.normalize, and pass in featureSet. Now, we don't have to normalize the labels, but we do need to one-hot encode them, that is, convert them from numerical integers to binary class vectors. The way we do that is by saying from tensorflow.keras.utils import to_categorical, and labels is equal to to_categorical, where we pass in the labels and the number of categories, which is basically the length of this characters list. Cool.
So once that's done, we can move ahead and create our training and validation data. Don't worry too much if you don't know what these are, but basically, the model is going to train on the training data and test itself on the validation data. We're going to say x_train, x_val, y_train, and y_val is equal to caer.train_val_split, and we're going to split the feature set and the labels using a particular validation ratio, which we're going to set to 0.2. That's basically what we're doing: we're splitting the feature set and labels into training sets and validation sets using a particular validation ratio, so 20% of this data will go to the validation set and 80% will go to the training set. Okay. Now, just to save on some memory, we can delete some of the variables we're not going to be using anymore. We do that by saying del train, del featureSet, del labels, and we can collect the garbage by saying gc.collect().
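Here's a sketch of the whole data-preparation flow we just went through. The caer calls follow the narration above, so treat the exact keyword names as assumptions that may differ between caer versions:

    import gc
    import caer
    from tensorflow.keras.utils import to_categorical

    IMG_SIZE = (80, 80)
    channels = 1  # grayscale

    # Build the training set: a list of [image array, label] pairs
    train = caer.preprocess_from_dir(char_path, characters, channels=channels,
                                     IMG_SIZE=IMG_SIZE, isShuffle=True)

    # Split into features and labels, reshaped into a 4-dimensional tensor
    featureSet, labels = caer.sep_train(train, IMG_SIZE=IMG_SIZE)

    # Normalize pixel values to (0, 1) and one-hot encode the labels
    featureSet = caer.normalize(featureSet)
    labels = to_categorical(labels, len(characters))

    # 80/20 split between training and validation data
    x_train, x_val, y_train, y_val = caer.train_val_split(featureSet, labels, val_ratio=.2)

    # Free the memory we no longer need
    del train, featureSet, labels
    gc.collect()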
Cool. Now, moving on, we need to create an image data generator. This is basically an image generator that will synthesize new images from already existing images, to help introduce some randomness to our network and make it perform better. We're going to say datagen is equal to canaro.generators.imageDataGenerator, and this basically instantiates a very simple image generator using the Keras library. Once that's done, let's create a training generator by setting train_gen equal to datagen.flow, where we pass in x_train and y_train and give it a batch size equal to BATCH_SIZE. Let's actually create some variables here: let's set the batch size to 32, and maybe let's train the network for 20 epochs. Once that's done, let's run that. With that done, we can actually proceed to building our model. So
let's call this section 'Creating the model'. Before making this video, I actually tried and tested out a couple of models and found one that provided me with the highest level of accuracy, so that's the same model architecture that we're going to be using. We're going to say model is equal to canaro.models.createSimpsonsModel. We're going to pass in an image size equal to the image size, set the number of channels equal to the number of channels, and set the output dimensions to 10, which is basically the length of our characters list. Then we can specify a loss, which is binary cross-entropy, set a decay of 1e-6, a learning rate of 0.001, a momentum of 0.9, and set Nesterov to true. This will essentially create the model using the same architecture I built, and will actually compile the model so that we can use it. So go ahead and run this, and we can try to print the summary of this model. Essentially, what we have is a functional model, since we're using Keras's functional API, and it has a bunch of layers and about 17 million parameters to train. Another thing that I want to do is create
something called a callbacks list. This callbacks list will contain something called a learning rate scheduler, which will schedule the learning rate at specific intervals so that our network can train better. We're going to say callbacks_list is equal to a LearningRateScheduler, and we're going to pass in canaro.lr_schedule. And since we're using a LearningRateScheduler, let's go ahead and import it: from tensorflow.keras.callbacks import LearningRateScheduler. That should about do it, so let's actually go ahead and train the model. We're going to say training is equal to model.fit; we're going to pass in the train_gen, say steps_per_epoch is equal to the length of x_train divided by the batch size, say epochs is equal to EPOCHS, give it validation_data equal to a tuple of x_val and y_val, say validation_steps is equal to the length of y_val divided by the batch size, and finally, say callbacks is equal to callbacks_list. And that should begin training. Once that is done, we end up with a baseline accuracy of close to 70%.
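Continuing from the variables above, the model-building and training cells look roughly like this; createSimpsonsModel and lr_schedule come from canaro as narrated, and the exact keyword argument names are an assumption that may vary by canaro version:

    import canaro
    from tensorflow.keras.callbacks import LearningRateScheduler

    BATCH_SIZE = 32
    EPOCHS = 20

    # Keras image generator that synthesizes variations of the training images
    datagen = canaro.generators.imageDataGenerator()
    train_gen = datagen.flow(x_train, y_train, batch_size=BATCH_SIZE)

    # Create and compile the model (keyword names assumed from the narration)
    model = canaro.models.createSimpsonsModel(IMG_SIZE=IMG_SIZE, channels=channels,
                                              output_dim=len(characters),
                                              loss='binary_crossentropy',
                                              decay=1e-6, learning_rate=0.001,
                                              momentum=0.9, nesterov=True)
    model.summary()

    # Schedule the learning rate at specific intervals so the network trains better
    callbacks_list = [LearningRateScheduler(canaro.lr_schedule)]

    training = model.fit(train_gen,
                         steps_per_epoch=len(x_train) // BATCH_SIZE,
                         epochs=EPOCHS,
                         validation_data=(x_val, y_val),
                         validation_steps=len(y_val) // BATCH_SIZE,
                         callbacks=callbacks_list)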
So here comes the exciting part: we're now going to use OpenCV to test how good our model is. What we're going to do is use OpenCV to read in an image at a particular file path, pass that to our network, and see what the model spits out. So let's go to this Simpsons test set, all the way down here. Let's look at our characters list first; let's just print that out to see what characters we trained on. Okay, let's look for Bart Simpson. It's a bit irritating to search through, but such is the dataset. Okay, we got an image of Bart Simpson, so let's copy its path and create a test_path variable set equal to that string. And what we're going to do is say img is equal to cv.imread(test_path). And just to display this image, we can use plt.imshow; we pass in the image, give it a color map of gray, and do a plt.show.
And okay, so this is an image of Bart Simpson. What we're going to do is create a function called prepare, which will basically prepare our image to be of the same size, shape, and dimensions as the images we used to train the model. This will take in a new image, and what it will do is convert the image to grayscale, so img is equal to cv.cvtColor, passing in the img and cv.COLOR_BGR2GRAY. We can resize it to our image size, so img is equal to cv.resize, resizing the image to IMG_SIZE. Then we reshape this image: img is equal to caer.reshape of img; we want to reshape the image to IMG_SIZE with channels equal to one. And we return img. So let's run that, go down here, and say predictions is equal to model.predict of prepare(img). And we can visualize these predictions, so let's print predictions, and essentially, this is what we get. To print the actual class, what we can do is print characters of np.argmax of predictions[0]. Now, to visualize this image, we can do a plt.imshow, pass in the image, and a plt.show. Let's grab this and move it down there. Okay.
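Here's a sketch of that testing cell; the test path is a placeholder, and caer.reshape follows the narration, so verify it against your caer version:

    import numpy as np
    import cv2 as cv
    import matplotlib.pyplot as plt

    test_path = r'../input/the-simpsons-characters-dataset/kaggle_simpson_testset/bart_simpson_0.jpg'  # placeholder

    def prepare(img):
        # Match the preprocessing used on the training images:
        # grayscale, resized to IMG_SIZE, reshaped with a single channel
        img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
        img = cv.resize(img, IMG_SIZE)
        img = caer.reshape(img, IMG_SIZE, 1)
        return img

    img = cv.imread(test_path)
    predictions = model.predict(prepare(img))

    # The predicted class is the index with the highest probability
    print(characters[np.argmax(predictions[0])])

    plt.imshow(img, cmap='gray')
    plt.show()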
So this is our image, and right now our model thinks that Bart Simpson is, in fact, Lisa Simpson. Okay. Let's try another image; let's try maybe this one, Bart Simpson 28. Let's go up there and change that to 28. Run that: this is Bart Simpson. Let's run this, and again, we got Lisa Simpson. So let's try a different image. Let's do Charles Montgomery Burns; copy this, all the way down there. We got Charles, predict, and we get Van Houten. Okay, definitely not the best model that we could have asked for, but hey, it is a model. Right now it has a baseline accuracy of 70%, although I would have liked it to get to at least 85%; in my tests, it had gone close to 90, 92%. I'm not sure exactly why this one went to 70%, but again, this is to be expected: building deep computer vision models is a bit of an art, and it takes time to figure out the best model for your project. So that's it for this Python and
OpenCV course. This course was basically kind of a general introduction to OpenCV and what it can do, and of course, we've only just scratched the surface; there's really a whole new world of computer vision out there. Now, while we obviously can't cover every single thing that OpenCV can do, I've tried my best to teach you what's relevant today in computer vision, and really one of its most interesting parts, building deep learning models, which is in fact where the future is: self-driving vehicles, medical diagnosis, and tons of other ways that computer vision is changing the world. All the code and material that was discussed throughout this course is available on my GitHub page, and the link to this page will be in the description below. Just before we close, I do want to mention that
although I did recommend you install caer in the beginning, we barely used it throughout the course. Now, it's probably not going to make sense to you right now, but if you plan to go deeper into computer vision, into building computer vision models, caer will actually prove to be a powerful package for you. It has a lot of helper functions to do just about anything. I'm constantly updating this repository, and if you want to contribute to these efforts, definitely do that; you can send a pull request with your changes, and if it's helpful, it will be merged into the official code base and you'll be added as a contributor. If you want to build deep learning models with Keras, then canaro will be useful to you. But again, for the most part, it's optional software. So anyway, with that said, I think I'll close up this course. If this course helped you in any way and got you more interested in computer vision, then definitely like this video and subscribe to my channel, as I'll be putting up useful videos on Python, computer vision, and deep learning. So I guess that's it. I hope you enjoyed this course, and I'll see you in another video.