Hello, welcome to the video lecture series
on digital image processing, that will be discussing
today that is a global processing approach which is also called Hough transformation.
So, after today’s lecture the student will be
able to explain and implement the local processing technique for linking the edge points and
also the global processing technique that is the
students will be implement the Hough transformation to link the edge points. So, let us just see that what we have seen
in the last class. We have said that ideally edge
detection technique should identify the pixels line on the boundary between the regions.
We say it is the boundary between the regions
because we assume that it is a transition, this
region is a transition region from a low intensity values, from low intensity region to a high
intensity region or from a high intensity region to a low intensity region. But while
trying to implement this, it has been found that the
edge points which we expect to be continuous to
give us a meaningful boundary description of a segment that cannot be achieved in practice.
Now, this is mainly due to two reasons. First one is due to non-uniform illumination of
the scene, if the scene is not uniformly illuminated
that leads to detection of edge points where the boundary points will not be continuous
and the second reason for getting this noncontinuous boundary points is the presence of noise,
that is if the image is noisy then after doing the edge detection operation either
the boundary points will not be continuous or there
may be some spurious edge points which are not actually edge points or the boundary points
of the any of the regions. So, to tackle this problem we have to go for
linking of the edge points so that after linking we
will get a meaningful description of the boundary of a particular segment. So, we have said
that there are mainly two approaches for edge linking operation. The first approach is the
local processing approach and the second approach is the global processing approach. In the
local processing approach what we do is we take an edge detected image, that is the image
that we have as an input, this is an image containing only the edge pixels. So, we assume that edge points will be white
and all the non edge points will be black and in
this edge detected image we analyze each pixel in a small neighboρod. So, for every point
(x, y) if that is an edge pixel we take a small
neighboρod of point (x, y) and we link the other
edge points within this neighboρod with the point (x, y) if they are similar in nature.
So, whenever we find that within the neighboρod
we have two edge points which are similar in
nature then we link these edge points and after linking all such edge points, we get
a boundary of pixels that are similar in nature.
So, basically what we get is a well-defined boundary of a particular segment. So, when
we say that we have to link the edge points which
are similar in nature, then we have to have some similarity measure. So, remember that
after edge detection operation for every edge point we get two quantities, one is the boundary
strength at that particular edge point and the
second quantity that is the direction of edge at that particular edge point.
So, by comparing the boundary strength as well as the direction of boundary at point
(x, y) and at a point which is in the neighboρod
of (x, y) we try to find out whether these two points
are similar or not. So, if these two points are similar we simply link them. So, for this what you have to do is, we take
an edge point (x, y) and we find out a point (x', y') , which is in the neighboρod of
(x, y). So, what we are doing is, we are taking this point (x, y) and considering the point (x',
y') , which is in the neighboρod of (x, y) and we
find out the difference of edge strength. We know that this f x, y is the
image function and f(x, y) gives the intensity value at location
(x, y) in the image. And f x, y this gives
you the gradient of the intensity value at location (x, y).
So, we compute the gradient at location (x, y) and also the gradient at location (x',
y') and if the difference between these two is less than
or equal to certain threshold T, where T is a
non-negative threshold and at the same time so, this gives you whether the strength is
similar or not and at the same time we also have to
check whether the direction at this edge which is
given by this α(x, y) at location (x, y) and α (x', y') at location (x', y') . So
if there, if the orientation or the direction of the edge is
also similar, that is the difference is less than or
equal to some angle threshold value A, then we consider these two points (x, y) and (x',
y') to be linked together.
So, in this particular case our (x', y') has to be in the neighboρod of (x, y). So, that
is what is represented by (x', y') N xy . So, this
is the local processing technique but as we said that we
are going for linking of the edge points because the edge points are discontinuous and
normally the neighboρods size that is taken is a small neighboρod. So, if (x', y') is
not within the neighboρod of (x, y) over a given, over
a given definition of the neighboρod in that case
(x', y') cannot be linked with the edge point (x, y). So, to solve this problem because (x', y')
, the two edge points can be far apart depending upon the amount of noise that you have or
the depending upon the lighting condition, but we
have to be, we should be able to link those points as well. So, in such cases the local
processing technique does not help to link the edge points. What we have to go for is
the global processing technique and the global
processing technique that we will discuss today is
called the Hough transformation. So, it is Hough transform. So, what is this Hough
Transform? The Hough transform is a mapping from the
spatial domain to a parameter space. So, let us
take an example, suppose I have this x, y co-ordinate system and I have a single straight
line in this x, y co-ordinate system and we know
that in the slope intercept form, this straight line
is described by an equation which is given by y = mx + c , where m is the slope of the
straight line and c is the intercept value. Now, for a particular straight line the values
of m and c will be constant. So, I represent them
by m1, c1 indicating that these two values, the slope and the intercept are constant for
a particular given straight line in the x-y
plane. So, you find that this particular straight line is
now defined, is now specified by two parameters, one is one of the parameters is m1, which
is the slope of the straight line and the other
parameter is c1, which is intercept. Now, if I mapped this straight line to the
parameter space because I have two parameters m
and c, that is slope and intercept. So, our parameter space will also be a 2 dimensional
space. So, what I am trying to do is? I am trying
to map this straight line in the parameter space. So,
I draw this mc plane. I will have the slope along one direction and the intercept c along
another direction and since for this given straight line y = m1x + c1 , m1 that is the
slope and c1, that is the intercept is fixed.
So, this particular straight line will be represented by a single point in the m-c plane
and this point is at location m1 and c1. So, we find
that when I mapped a given straight line in the
spatial domain to the parameter space, a straight line gets mapped to a single point. Now, let
us see what happens if we are given a point in the spatial domain that is in the x-y plane,
we are given a particular point let us see the
situation, what will happen in this case. So, now what I have is, I have again this
x-y plane and in the x-y plane, I have a single point
and let us assume the co-ordinate of this point is (x1, y1). Now, you find that equation
of any straight line in the x-y plane as we have
seen earlier in the slope-intercept form is given by y
is equal to mx plus c. Now, if this straight line y is equal to mx plus c has to pass through
this given point (x1, y1), then (x1, y1) must satisfy
this equation. So, in effect what I will get is, I will get
an equation that is y1 = mx1 + c , because this line
y = mx + c is passing through the given point (x1, y1) and this is the equation that has
to be satisfied by all the straight lines that passes
through this point (x1, y1). Now, we find that ideally I can have infinite number of straight
lines passing through this given point (x1, y1).
So, there will be infinite number of straight lines like this and for each of these straight
lines the value of the slope that is m and the intercept
c it will be different. So, if I now map this single straight line
in our parameter space that is mc plane, you find that
this m and c, these two become the variable whereas y1 and x1 they are the constants.
So, now what I can do is? I can rewrite this equation
that is y1 = mx1 + c , in this way, I can write it as c = -mx1 + y1 . So, here what I have is, I
have this x1 and y1, these two are constants and c and
m are variable. So, if I map this point (x1, y1) into our
parameter space. So, what I will have now is? I will
have this mc-plane and in the mc-plane you have find that c = - mx1 + y1 this now the
equation of a straight line. So, effectively what I get is? I get a straight line in the
mc-plane following the equation c = - mx1 + y1 . So,
we have seen two cases that is one case a straight
line in the x-y plane is mapped to a point in the mc-plane and in the other case if we
have a point in the x-y plane that is mapped to a
straight line in the mc-plane. And this is the basis
of the Hough transformation by using which we can link the different edge points which
are present in the image domain which is nothing
but the spatial domain or we can say that this is
nothing but our x-y plane. So, now let us see that, what happens if I
have two points in our spatial domain or the x-y
plane. So, again I go to our spatial domain or x-y plane. So, this is my x-axis and this
is my y- axis and suppose I have two points, one is
say (xi, yi) and the other point I have in this spatial
domain is (xj, yj). Now, if I draw a straight line passing through these points (xi, yi)
and (xj, yj). So, this is the straight line, which
is passes through these two points (xi, yi) and (xj, yj)
and we know that this straight line will have an equation of the form y equal to say m dash
x plus c dash.
So, what we have to do by using the Hough transformation is that, given these two points
(xi, yi) and (xj, yj) we have to find out he parameters
or the equation of the straight line which is
passing through these two points (xi, yi) and (xj, yj). Now, as we have seen earlier
that a point in the x-y plane is mapped to a straight line
in the mc-plane or in the parameter space. So,
here in this case since we have two points in the x-y plane this will be mapped to two
different straight lines in the mc-plane. So, if I draw in the mc-plane if I find the
mapping in the mc-plane it will be something like
this. So, I have this parameter space or mc-plane. The first straight line, the first point (xi,
yi) will be mapped to a straight line like this,
where the equation of this straight line will be
given by c = - mx i + yi and the second point will be mapped to another straight line say
something like this, where the equation of this straight line is the given by c = - mx
j + y j and you find that the point at which these two
straight lines meet, that is this particular point this
is the one which gives me the values of m and c.
So, this will give me the value of m' and c' , and this m' and c' are nothing but the
parameters of the straight line in the x-y plane, which passes through these two given
points (xi, yi) and (xj, yj). Now, if I consider
that there are infinite numbers of points or there are
large number of points lying on the same straight line in the x-y plane, each of these points
will be mapped to a particular straight line in the mc-plane like this. But each of these
straight lines will pass through this single point
(m', c') in the parameter space. So, by this what we seen is as we know that
if there is a single point in the parameter in the
spatial domain or in the x-y plane, that is mapped to a single straight line in the parameter
Space that is mc-plane. So, if I have a set of collinear points in the x-y plane each
of these collinear points will be mapped to a single
straight line in the parameter space or in the mcplane. But all these straight lines
corresponding to the points which are collinear in the x-y
plane will pass through, will intersect at a single point and this point in this case
is (m', c') the values of which, that is m' and c' , this
give us the slope and intercept values of the straight line on which all those collinear points in
the spatial domain lie. And this is the basic essence
of using Hough transformation for linking the edge points.
Now, how do you compute this Hough transformation? So, far what we have discussed, this
in the continuous domain that is our assumption is the values of m and c, or m and c are
continuous variable. But in our implementations since we are considering the digital cases
we cannot have the continuous values of m and
c. So, we have to see that how this Hough transformation
can really be implemented. So, for implementation, what you have to do is? This
entire mc space has to be subdivided into a
number of accumulator cells. So, as has been shown in this particular figure. So, here you find that what we have done is
this mc space mc-plane is divided into a number of smaller accumulator cells. So, here we
have a range of the slops which are the expected range of slops in a particular application
and the range is from a minimum, that is of the
minimum slope to a maximum slope. So, this is the minimum slope value mmin and
this is the maximum slope value, mmax. Similarly, for c this is also sub-divided
and the total range is within an expected maximum
value and a minimum value. So, cmin is the expected minimum value of the intercept c
and cmax is the expected maximum value of the
intercept c. So, within this minimum and maximum, within this range, this space is
divided into a number of accumulator cells. And this array, this 2 dimensional array let us
name is, name this as an array say A and each of
these accumulator cell locations can be indexed by say index i and j.
So, a cell location at location say (i, j) and I call this location as say cell location
(i, j) and the corresponding accumulator cell will have a
value say A(i, j). So, this A(i, j), this particular
cell the (i, j)th cell corresponds to the parameter values, let us say mi and cj. So,
an (i, j)th an accumulator cell (i, j) having an accumulated
value A(i, j) corresponds to the corresponding parameter values mi and cj and for implementation
what we do is? We initialize each of these accumulators to 0.
So, initially A(i, j) is set to 0 and that is our initialization. Okay. So, once we have
this array of accumulator cells then what we do in the
spatial domain we have a set of boundary points and in the parameter space we have a 2 dimensional
array of accumulator cells. So, what we have to do is, we have to take
a boundary point say (xk, yk), a single boundary point in the spatial domain and we have seen
earlier that these boundary point (xk, yk) is
mapped to a straight line in the parameter space that is in the mc-plane and the equation
of this straight line is given by c = - mx k
+ y k . Okay. So, what we have to do is we have to find
out the values of m and c from this particular equation. Now, in our case what we have is,
the values of m and c are not continuous but they
are discrete and as we have said that an (i, j)th accumulator cell corresponds to the corresponding parameter values mi and cj.
So, to solve for values of m and c, our basic equation is c = - mx k + y k .
So, here what we do is? We allow the value of m to vary from the minimum to the maximum
as we have said that we have chosen a range mmin to mmax. So, we allow the value of m
to take all possible values or all allowed values
as specified in our accumulator cell ranging from the mmin to mmax. And for each of these
values of m we solve for the corresponding value of c. Following this particular equation
c = - mx k + y k . Now, the value of c that you get by solving
this particular equation for a specific value of m
that may be a real number, whereas we have to deal with the discrete case. So, whenever
we get a value which may be a real number or
may be the value of c that we get which is not
allowed as per our definition of the accumulator cell then what we have to do is? This value
of c has to be rounded off to the nearest allowed value of c as specified in the accumulator
cell. So, if I have say m number of such possible
values of m, I get the m number of corresponding values of c by solving this equation c = - mx
k + y k . So, suppose by following this, a particular choice of m, say mk or we have
already used k for something else instead of calling
mk, let us call it say particular value of m, say mp. When I put this value of mp in
this equation suppose the corresponding value of
c, I get after solving this equation is say cq. And
you remember that we have initialized our accumulator cells to all the cells having
a value of 0.
So, whenever for a particular value of mp, I get a value of cq, that is the intercept
value then the operation I do is the corresponding accumulators
cell A(p, q) is incremented by 1. So, I make A p, q = A p, q +1 . So,
this I have to do for all the points all the boundary points in
our spatial domain, that is in the x-y plane and for each of these boundary points I have
to compute it for every possible or every allowed
value of m as allowed in our parameter space. So, at the end you will find because for every
computed value or computed values in mp, cq, I
am incrementing the accumulator cell by 1, where the accumulator cells were initialized
to 0. So, you find that at the end of the process
if an accumulator cell A(i, j) contains a value say
Q. So, we are considering this after we consider all the boundary points in the spatial domain.
For each of these, we compute the corresponding say mp and cq for all allowed values of m,
I find out the corresponding allowed values
of c and for each such pair of mp and cq, I do this
incrementation operation on the accumulator cell.
So, at the end if an accumulator cell say, A(i, j) contains a value of say Q. This indicates
that there are Q number of points lying on a straight
line whose equation is given by y = mi x + c j
because as we said that, this accumulator cell (i, j) corresponds to the slope value
of mi and the intercept value of cj and for every point,
wherever I get a corresponding value of m and c.
The corresponding accumulator cell is incremented by 1.
So, at the end of the process if a particular accumulator cell A(i, j) contains a value
Q. This is an indication that in the spatial domain.
I have capital Q number of points or boundary points
which are lying on the straight line y = mi x + c j . Now, the question is what is the
accuracy of this particular procedure, that is how accurate
is this estimation mi and cj. That depends upon
how many number of accumulator cells I will have in the accumulator array.
So, if I have a very large number of accumulator cells in the accumulator array then the
accuracy of the computed m and c will be quite high, whereas if I have small number of
accumulator cells in our accumulator array then the accuracy of this computed values
will be quite less. Now, the question is how can we
use this to detect the number of straight lines
present in the spatial domain. Let us consider a case like this. So, I have an image in the spatial domain
and the boundary points of this image are something like this. So, these are the boundary
points in the straight line. So, when I compute this Hough transformation I get an accumulator
cell. So, you find that in this particular straight line there are 1, 2, 3, 4, 5, 6,
7, 8, Eight number of points on this side, on this straight
line there are 1, 2, 3, 4, 5, Five number of points, on this part of the straight line
there are again five number of points, on this part
of the straight line there are 4 number of points. And
if I consider that this part is also a straight line, So, on this straight line there are
3 number of points.
So, in our accumulator cell at the end of this Hough transformation operation I will
get one cell with value equal to 8, I will get another
cell with value equal to 5, I will get one more cell
with value equal to 5, I will get one cell with value equal to 4 and I get another cell
with value equal to 3. And if I chose all these
values. Say, if I say that I will consider a straight
line to be significant if the corresponding accumulator cell contains a value greater
than or equal to 3, then by this process the number
of straight lines that I will be able to detect is this
straight line, this straight line, then this straight line as well as this straight line.
But if I say that I consider only those straight lines
to be significant where the number of points, number
of collinear points lying on the straight line is greater than or equal to 4, then I
will be detecting only this straight line, this straight
line, this straight line and this straight line.
So, again here you find that by choosing that how many points to be I should consider to
lying on a straight line so that the straight line will be significant and I consider this
to be a boundary straight line that also can be varied
or tuned depending upon the application by choosing the threshold on this number of points
lying on the straight line. Now, here you find that in the mc-plane though
we are able to find out the straight line segments but this particular formulation of
Hough transformation that is mapping from x-y
domain to the parameter domain that is the mc-plane has a serious problem. The problem
is in mc-plane what we are trying to do is we
are trying to find out the slope intercept value of
the straight line in the spatial domain. Now, the problem comes when this straight
line tries to be vertical, that is parallel to x-axis.
If the straight line is parallel to x-axis, then the slope of the straight line that is
the value of m tends to be ∞. And in this formulation we
cannot tackle the value of m which becomes very
large or which tends to become ∞. So, what should we do to solve this problem? So, to
solve this problem instead of considering the slope
intercept form, what we can do is we can make use of the normal representation of a straight
line. So, the normal representation of the straight
line is given by this, the formula is ρ = x cosθ + ysinθ and what do we get in
case of this normal representation? The line that we get is something like this. So, here again,
I have this straight line in the x-y plane but instead
of taking the slope intercept form, where the equation was given by y = mx + c , I take
the normal representation, where the equation
of the straight line is given by ρ = x cosθ + ysinθ .
What is ρ? ρ is the length of the perpendicular, it is the length of the perpendicular dropped
on the straight line drawn from the origin of the x-y plane and θ is the angle made
by this perpendicular with the x-axis. So, you find
that the parameters of the straight line which is
defined in this normal form, ρ = x cosθ + ysinθ . The parameters are ρ, which is
the length of the perpendicular drawn from the origin to
the given straight line and θ, which is the angle
formed by this perpendicular with the x-axis. So, unlike in the previous case where the
parameters are the slope m and c now my parameters become ρ and θ, and when I have
these two parameters ρ and θ then the situation is quite manageable that is I do not have
the situation of leading to a parameter which can
take an infinite value. So, we find that in this particular case what can be the maximum
value of ρ and the maximum value of θ. We consider the value of θ to be ranging
in the range of ±90o and the value of ρ that is the
length of the perpendicular to the straight line from the origin to be ρ = M 2 +N
2 , where MxN is the image size. And this is quite obvious,
because if I have an image of dimension say Mx N say here the image dimension is there
are M number of rows and say N number of columns and this is the origin of the image
plane. And in this you can find that I cannot draw
any straight line for which the value of θ will be beyond this range ±90o and the value
of ρ this should be ρ = M 2 +N 2 . But what
is the difference of our earlier formulation with this
formulation. Now, in this particular case our equation
of the straight line is taken as ρ = x cosθ + ysinθ .
Whereas in our earlier case our equation was y is equal to mx plus c. So, here you find
that in this particular case given a single points
say (x1, y1) in the spatial domain, in the parameter
domain, in the mc-plane the corresponding equation becomes c = - mx1 + y1 , which is
again the equation of a straight line.
So, when we consider the parameter space to be mc-plane, mc-space or we represent a
straight line in the slope-intercept form in that case a point is mapped to a straight
line in the parameter space. Whereas in this particular
case for the same given point (x1, y1), in the
parameter space the equation that we get is ρ = x1cosθ + y1sinθ . So, here you find
that this x1 and y1 these two are constants for a given
point (x1, y1) in the parameter space, whereas the ρ
and θ they are variables. So, a particular point in the spatial domain
is now mapped to a sinusoidal curve in the parameter domain or in the ρ-θ space. However,
if I have Q number of collinear points in the
x-y plane which will be mapped to Q number of straight lines in the mc-plane, but all
those straight lines will pass through a single
point. In this case, the same Q number of collinear
points in the x-y plane will be mapped to Q number of sinusoidal curve in the ρ-θ
plane but all those sinusoidal curves will intersect
at a particular point, at the single point, which gives
us the values of ρ and θ, which are the parameters for the straight line on which
all those Q number of collinear points lie.
Now, this is the only difference between the mc-plane and the ρ-θ plane. Apart from this
the formulation is exactly same as before. So,
for computation again here what we have to do is,
we have to define the ρ-θ space into a number of accumulator cells. So, the accumulator
cells are given like this as in this figure. So,
here again any (i, j)th accumulator cell, an accumulator
cell say this (i, j)th accumulator cell which will have an accumulator value of say A(i,
j) corresponds to our parameters θi and ρj.
So, again as per as we have done in a previous formulation that for a given point as we have
said seen that our equation becomes ρ = x cosθ + ysinθ , and for a given point say
(xk, yk) in the spatial domain, our equation becomes ρ
= x k cosθ + yk sinθ . What we do is, we allow the
value of this variable θ to assume any of these allowed values as given in this accumulator
cell. So, the θ can assume any of these allowed
values between this given maximum and minimum and we solve for the corresponding value of
the ρ. And because the solution of ρ that you get
may not be one of the allowed values. So, what we have to do is, we have to round off
the value of ρ to one of the nearest allowed
values in our ρ axis. So, again as before that at the end of the
process if an accumulator cell say A(i, j) contains a
value equal to Q. This means that, there are Q number of collinear points in the spatial
domain, which in the spatial domain lying on the straight line, which satisfy the equation ρ j = xcosθi + ysinθ i . So, again as before
depending upon the number of points, putting a
threshold on the number of points which we consider to be a significant point or not,
I can determine how many straight lines in the given
boundary image, I will extract which will give
me a meaningful boundary description. Now, let see by applying this technique, what
kind of result we can get here. Here what we
have shown is as we have said that every point in the spatial domain is mapped to a
sinusoidal curve in the parameter domain, in the ρ-θ plane. So, you find that this
point 1, this point 1 over here has been mapped to a straight
line. The point 2 has been mapped to a sinusoidal curve as has been given by this,
the point 3 again has been mapped to this particular sinusoidal curve, 4 has been mapped
to this particular sinusoidal curve and 5 has
been mapped to this particular sinusoidal curve. And now we find that if I want find
out the equation of the straight line passing through
say 2, 3 and 4, these three points. Then you find
the 2, 3 and 4 these three sinusoidal curves are meeting at this particular point in the
ρ-θ plane.
So, the corresponding cell will have a value equal to 3, indicating that there are three
points lying on the straight line, which satisfy
this particular value of θ and this particular value of ρ.
So, from here I can get the parameters of the straight line on which these three points
2,3,4 they are lying and same is true for other
cases as well, for example, 1, 3 and 5, So, we find
that this is curve for 1, this is the curve for 3 and this is the curve for 5 and all
of them are meeting at this particular point. So, here whatever the value of θ and ρ,
I get, that is the parameter of the straight line passing
through these points 1, 3 and 5. So, by applying this you find that in one of our previous
classes we had shown this image, and after edge detection operation what we get is the
edge points as given on this right hand side. And
now, if I apply the Hough transformation and try
to detect the four most significant straight lines then by applying Hough transformation,
I find these four straight lines which are most
significant and the boundary which is specified by these four straight lines.
So, I can always find out that what are these vertex locations and this is a rectangular
region which is actually the boundary of this particular
object region. So, with this we come to the end of our today’s discussion that is the
global edge linking operation where the global edge
linking is done by using Hough transformation and as we have said that this Hough
transformation is nothing but a process of mapping from the spatial domain to the parameter
space. Thank you.