Image Segmentation Global Processing (Hough Transform)

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Hello, welcome to the video lecture series on digital image processing, that will be discussing today that is a global processing approach which is also called Hough transformation. So, after today’s lecture the student will be able to explain and implement the local processing technique for linking the edge points and also the global processing technique that is the students will be implement the Hough transformation to link the edge points. So, let us just see that what we have seen in the last class. We have said that ideally edge detection technique should identify the pixels line on the boundary between the regions. We say it is the boundary between the regions because we assume that it is a transition, this region is a transition region from a low intensity values, from low intensity region to a high intensity region or from a high intensity region to a low intensity region. But while trying to implement this, it has been found that the edge points which we expect to be continuous to give us a meaningful boundary description of a segment that cannot be achieved in practice. Now, this is mainly due to two reasons. First one is due to non-uniform illumination of the scene, if the scene is not uniformly illuminated that leads to detection of edge points where the boundary points will not be continuous and the second reason for getting this noncontinuous boundary points is the presence of noise, that is if the image is noisy then after doing the edge detection operation either the boundary points will not be continuous or there may be some spurious edge points which are not actually edge points or the boundary points of the any of the regions. So, to tackle this problem we have to go for linking of the edge points so that after linking we will get a meaningful description of the boundary of a particular segment. So, we have said that there are mainly two approaches for edge linking operation. The first approach is the local processing approach and the second approach is the global processing approach. In the local processing approach what we do is we take an edge detected image, that is the image that we have as an input, this is an image containing only the edge pixels. So, we assume that edge points will be white and all the non edge points will be black and in this edge detected image we analyze each pixel in a small neighboρod. So, for every point (x, y) if that is an edge pixel we take a small neighboρod of point (x, y) and we link the other edge points within this neighboρod with the point (x, y) if they are similar in nature. So, whenever we find that within the neighboρod we have two edge points which are similar in nature then we link these edge points and after linking all such edge points, we get a boundary of pixels that are similar in nature. So, basically what we get is a well-defined boundary of a particular segment. So, when we say that we have to link the edge points which are similar in nature, then we have to have some similarity measure. So, remember that after edge detection operation for every edge point we get two quantities, one is the boundary strength at that particular edge point and the second quantity that is the direction of edge at that particular edge point. So, by comparing the boundary strength as well as the direction of boundary at point (x, y) and at a point which is in the neighboρod of (x, y) we try to find out whether these two points are similar or not. So, if these two points are similar we simply link them. So, for this what you have to do is, we take an edge point (x, y) and we find out a point (x', y') , which is in the neighboρod of (x, y). So, what we are doing is, we are taking this point (x, y) and considering the point (x', y') , which is in the neighboρod of (x, y) and we find out the difference of edge strength. We know that this f  x, y  is the image function and f(x, y) gives the intensity value at location (x, y) in the image. And f  x, y  this gives you the gradient of the intensity value at location (x, y). So, we compute the gradient at location (x, y) and also the gradient at location (x', y') and if the difference between these two is less than or equal to certain threshold T, where T is a non-negative threshold and at the same time so, this gives you whether the strength is similar or not and at the same time we also have to check whether the direction at this edge which is given by this α(x, y) at location (x, y) and α (x', y') at location (x', y') . So if there, if the orientation or the direction of the edge is also similar, that is the difference is less than or equal to some angle threshold value A, then we consider these two points (x, y) and (x', y') to be linked together. So, in this particular case our (x', y') has to be in the neighboρod of (x, y). So, that is what is represented by (x', y')  N xy . So, this is the local processing technique but as we said that we are going for linking of the edge points because the edge points are discontinuous and normally the neighboρods size that is taken is a small neighboρod. So, if (x', y') is not within the neighboρod of (x, y) over a given, over a given definition of the neighboρod in that case (x', y') cannot be linked with the edge point (x, y). So, to solve this problem because (x', y') , the two edge points can be far apart depending upon the amount of noise that you have or the depending upon the lighting condition, but we have to be, we should be able to link those points as well. So, in such cases the local processing technique does not help to link the edge points. What we have to go for is the global processing technique and the global processing technique that we will discuss today is called the Hough transformation. So, it is Hough transform. So, what is this Hough Transform?   The Hough transform is a mapping from the spatial domain to a parameter space. So, let us take an example, suppose I have this x, y co-ordinate system and I have a single straight line in this x, y co-ordinate system and we know that in the slope intercept form, this straight line is described by an equation which is given by y = mx + c , where m is the slope of the straight line and c is the intercept value. Now, for a particular straight line the values of m and c will be constant. So, I represent them by m1, c1 indicating that these two values, the slope and the intercept are constant for a particular given straight line in the x-y plane. So, you find that this particular straight line is now defined, is now specified by two parameters, one is one of the parameters is m1, which is the slope of the straight line and the other parameter is c1, which is intercept. Now, if I mapped this straight line to the parameter space because I have two parameters m and c, that is slope and intercept. So, our parameter space will also be a 2 dimensional space. So, what I am trying to do is? I am trying to map this straight line in the parameter space. So, I draw this mc plane. I will have the slope along one direction and the intercept c along another direction and since for this given straight line y = m1x + c1 , m1 that is the slope and c1, that is the intercept is fixed. So, this particular straight line will be represented by a single point in the m-c plane and this point is at location m1 and c1. So, we find that when I mapped a given straight line in the spatial domain to the parameter space, a straight line gets mapped to a single point. Now, let us see what happens if we are given a point in the spatial domain that is in the x-y plane, we are given a particular point let us see the situation, what will happen in this case.   So, now what I have is, I have again this x-y plane and in the x-y plane, I have a single point and let us assume the co-ordinate of this point is (x1, y1). Now, you find that equation of any straight line in the x-y plane as we have seen earlier in the slope-intercept form is given by y is equal to mx plus c. Now, if this straight line y is equal to mx plus c has to pass through this given point (x1, y1), then (x1, y1) must satisfy this equation. So, in effect what I will get is, I will get an equation that is y1 = mx1 + c , because this line y = mx + c is passing through the given point (x1, y1) and this is the equation that has to be satisfied by all the straight lines that passes through this point (x1, y1). Now, we find that ideally I can have infinite number of straight lines passing through this given point (x1, y1). So, there will be infinite number of straight lines like this and for each of these straight lines the value of the slope that is m and the intercept c it will be different. So, if I now map this single straight line in our parameter space that is mc plane, you find that this m and c, these two become the variable whereas y1 and x1 they are the constants. So, now what I can do is? I can rewrite this equation that is y1 = mx1 + c , in this way, I can write it as c = -mx1 + y1 . So, here what I have is, I have this x1 and y1, these two are constants and c and m are variable. So, if I map this point (x1, y1) into our parameter space. So, what I will have now is? I will have this mc-plane and in the mc-plane you have find that c = - mx1 + y1 this now the equation of a straight line. So, effectively what I get is? I get a straight line in the mc-plane following the equation c = - mx1 + y1 . So, we have seen two cases that is one case a straight line in the x-y plane is mapped to a point in the mc-plane and in the other case if we have a point in the x-y plane that is mapped to a straight line in the mc-plane. And this is the basis of the Hough transformation by using which we can link the different edge points which are present in the image domain which is nothing but the spatial domain or we can say that this is nothing but our x-y plane.   So, now let us see that, what happens if I have two points in our spatial domain or the x-y plane. So, again I go to our spatial domain or x-y plane. So, this is my x-axis and this is my y- axis and suppose I have two points, one is say (xi, yi) and the other point I have in this spatial domain is (xj, yj). Now, if I draw a straight line passing through these points (xi, yi) and (xj, yj). So, this is the straight line, which is passes through these two points (xi, yi) and (xj, yj) and we know that this straight line will have an equation of the form y equal to say m dash x plus c dash. So, what we have to do by using the Hough transformation is that, given these two points (xi, yi) and (xj, yj) we have to find out he parameters or the equation of the straight line which is passing through these two points (xi, yi) and (xj, yj). Now, as we have seen earlier that a point in the x-y plane is mapped to a straight line in the mc-plane or in the parameter space. So, here in this case since we have two points in the x-y plane this will be mapped to two different straight lines in the mc-plane. So, if I draw in the mc-plane if I find the mapping in the mc-plane it will be something like this. So, I have this parameter space or mc-plane. The first straight line, the first point (xi, yi) will be mapped to a straight line like this, where the equation of this straight line will be given by c = - mx i + yi and the second point will be mapped to another straight line say something like this, where the equation of this straight line is the given by c = - mx j + y j and you find that the point at which these two straight lines meet, that is this particular point this is the one which gives me the values of m and c. So, this will give me the value of m' and c' , and this m' and c' are nothing but the parameters of the straight line in the x-y plane, which passes through these two given points (xi, yi) and (xj, yj). Now, if I consider that there are infinite numbers of points or there are large number of points lying on the same straight line in the x-y plane, each of these points will be mapped to a particular straight line in the mc-plane like this. But each of these straight lines will pass through this single point (m', c') in the parameter space. So, by this what we seen is as we know that if there is a single point in the parameter in the spatial domain or in the x-y plane, that is mapped to a single straight line in the parameter Space that is mc-plane. So, if I have a set of collinear points in the x-y plane each of these collinear points will be mapped to a single straight line in the parameter space or in the mcplane. But all these straight lines corresponding to the points which are collinear in the x-y plane will pass through, will intersect at a single point and this point in this case is (m', c') the values of which, that is m' and c' , this give us the slope and intercept values of the straight line on which all those collinear points in the spatial domain lie. And this is the basic essence of using Hough transformation for linking the edge points. Now, how do you compute this Hough transformation? So, far what we have discussed, this in the continuous domain that is our assumption is the values of m and c, or m and c are continuous variable. But in our implementations since we are considering the digital cases we cannot have the continuous values of m and c. So, we have to see that how this Hough transformation can really be implemented. So, for implementation, what you have to do is? This entire mc space has to be subdivided into a number of accumulator cells. So, as has been shown in this particular figure. So, here you find that what we have done is this mc space mc-plane is divided into a number of smaller accumulator cells. So, here we have a range of the slops which are the expected range of slops in a particular application and the range is from a minimum, that is of the minimum slope to a maximum slope. So, this is the minimum slope value mmin and this is the maximum slope value, mmax. Similarly, for c this is also sub-divided and the total range is within an expected maximum value and a minimum value. So, cmin is the expected minimum value of the intercept c and cmax is the expected maximum value of the intercept c. So, within this minimum and maximum, within this range, this space is divided into a number of accumulator cells. And this array, this 2 dimensional array let us name is, name this as an array say A and each of these accumulator cell locations can be indexed by say index i and j. So, a cell location at location say (i, j) and I call this location as say cell location (i, j) and the corresponding accumulator cell will have a value say A(i, j). So, this A(i, j), this particular cell the (i, j)th cell corresponds to the parameter values, let us say mi and cj. So, an (i, j)th an accumulator cell (i, j) having an accumulated value A(i, j) corresponds to the corresponding parameter values mi and cj and for implementation what we do is? We initialize each of these accumulators to 0. So, initially A(i, j) is set to 0 and that is our initialization. Okay. So, once we have this array of accumulator cells then what we do in the spatial domain we have a set of boundary points and in the parameter space we have a 2 dimensional array of accumulator cells.   So, what we have to do is, we have to take a boundary point say (xk, yk), a single boundary point in the spatial domain and we have seen earlier that these boundary point (xk, yk) is mapped to a straight line in the parameter space that is in the mc-plane and the equation of this straight line is given by c = - mx k + y k . Okay. So, what we have to do is we have to find out the values of m and c from this particular equation. Now, in our case what we have is, the values of m and c are not continuous but they are discrete and as we have said that an (i, j)th accumulator cell corresponds to the corresponding parameter values mi and cj. So, to solve for values of m and c, our basic equation is c = - mx k + y k . So, here what we do is? We allow the value of m to vary from the minimum to the maximum as we have said that we have chosen a range mmin to mmax. So, we allow the value of m to take all possible values or all allowed values as specified in our accumulator cell ranging from the mmin to mmax. And for each of these values of m we solve for the corresponding value of c. Following this particular equation c = - mx k + y k . Now, the value of c that you get by solving this particular equation for a specific value of m that may be a real number, whereas we have to deal with the discrete case. So, whenever we get a value which may be a real number or may be the value of c that we get which is not allowed as per our definition of the accumulator cell then what we have to do is? This value of c has to be rounded off to the nearest allowed value of c as specified in the accumulator cell. So, if I have say m number of such possible values of m, I get the m number of corresponding values of c by solving this equation c = - mx k + y k . So, suppose by following this, a particular choice of m, say mk or we have already used k for something else instead of calling mk, let us call it say particular value of m, say mp. When I put this value of mp in this equation suppose the corresponding value of c, I get after solving this equation is say cq. And you remember that we have initialized our accumulator cells to all the cells having a value of 0. So, whenever for a particular value of mp, I get a value of cq, that is the intercept value then the operation I do is the corresponding accumulators cell A(p, q) is incremented by 1. So, I make A  p, q  = A  p, q  +1 . So, this I have to do for all the points all the boundary points in our spatial domain, that is in the x-y plane and for each of these boundary points I have to compute it for every possible or every allowed value of m as allowed in our parameter space. So, at the end you will find because for every computed value or computed values in mp, cq, I am incrementing the accumulator cell by 1, where the accumulator cells were initialized to 0. So, you find that at the end of the process if an accumulator cell A(i, j) contains a value say Q. So, we are considering this after we consider all the boundary points in the spatial domain. For each of these, we compute the corresponding say mp and cq for all allowed values of m, I find out the corresponding allowed values of c and for each such pair of mp and cq, I do this incrementation operation on the accumulator cell. So, at the end if an accumulator cell say, A(i, j) contains a value of say Q. This indicates that there are Q number of points lying on a straight line whose equation is given by y = mi x + c j because as we said that, this accumulator cell (i, j) corresponds to the slope value of mi and the intercept value of cj and for every point, wherever I get a corresponding value of m and c. The corresponding accumulator cell is incremented by 1. So, at the end of the process if a particular accumulator cell A(i, j) contains a value Q. This is an indication that in the spatial domain. I have capital Q number of points or boundary points which are lying on the straight line y = mi x + c j . Now, the question is what is the accuracy of this particular procedure, that is how accurate is this estimation mi and cj. That depends upon how many number of accumulator cells I will have in the accumulator array. So, if I have a very large number of accumulator cells in the accumulator array then the accuracy of the computed m and c will be quite high, whereas if I have small number of accumulator cells in our accumulator array then the accuracy of this computed values will be quite less. Now, the question is how can we use this to detect the number of straight lines present in the spatial domain. Let us consider a case like this. So, I have an image in the spatial domain and the boundary points of this image are something like this. So, these are the boundary points in the straight line. So, when I compute this Hough transformation I get an accumulator cell. So, you find that in this particular straight line there are 1, 2, 3, 4, 5, 6, 7, 8, Eight number of points on this side, on this straight line there are 1, 2, 3, 4, 5, Five number of points, on this part of the straight line there are again five number of points, on this part of the straight line there are 4 number of points. And if I consider that this part is also a straight line, So, on this straight line there are 3 number of points. So, in our accumulator cell at the end of this Hough transformation operation I will get one cell with value equal to 8, I will get another cell with value equal to 5, I will get one more cell with value equal to 5, I will get one cell with value equal to 4 and I get another cell with value equal to 3. And if I chose all these values. Say, if I say that I will consider a straight line to be significant if the corresponding accumulator cell contains a value greater than or equal to 3, then by this process the number of straight lines that I will be able to detect is this straight line, this straight line, then this straight line as well as this straight line. But if I say that I consider only those straight lines to be significant where the number of points, number of collinear points lying on the straight line is greater than or equal to 4, then I will be detecting only this straight line, this straight line, this straight line and this straight line. So, again here you find that by choosing that how many points to be I should consider to lying on a straight line so that the straight line will be significant and I consider this to be a boundary straight line that also can be varied or tuned depending upon the application by choosing the threshold on this number of points lying on the straight line. Now, here you find that in the mc-plane though we are able to find out the straight line segments but this particular formulation of Hough transformation that is mapping from x-y domain to the parameter domain that is the mc-plane has a serious problem. The problem is in mc-plane what we are trying to do is we are trying to find out the slope intercept value of the straight line in the spatial domain. Now, the problem comes when this straight line tries to be vertical, that is parallel to x-axis. If the straight line is parallel to x-axis, then the slope of the straight line that is the value of m tends to be ∞. And in this formulation we cannot tackle the value of m which becomes very large or which tends to become ∞. So, what should we do to solve this problem? So, to solve this problem instead of considering the slope intercept form, what we can do is we can make use of the normal representation of a straight line.   So, the normal representation of the straight line is given by this, the formula is ρ = x cosθ + ysinθ and what do we get in case of this normal representation? The line that we get is something like this. So, here again, I have this straight line in the x-y plane but instead of taking the slope intercept form, where the equation was given by y = mx + c , I take the normal representation, where the equation of the straight line is given by ρ = x cosθ + ysinθ . What is ρ? ρ is the length of the perpendicular, it is the length of the perpendicular dropped on the straight line drawn from the origin of the x-y plane and θ is the angle made by this perpendicular with the x-axis. So, you find that the parameters of the straight line which is defined in this normal form, ρ = x cosθ + ysinθ . The parameters are ρ, which is the length of the perpendicular drawn from the origin to the given straight line and θ, which is the angle formed by this perpendicular with the x-axis. So, unlike in the previous case where the parameters are the slope m and c now my parameters become ρ and θ, and when I have these two parameters ρ and θ then the situation is quite manageable that is I do not have the situation of leading to a parameter which can take an infinite value. So, we find that in this particular case what can be the maximum value of ρ and the maximum value of θ.   We consider the value of θ to be ranging in the range of ±90o and the value of ρ that is the length of the perpendicular to the straight line from the origin to be ρ =  M 2 +N 2 , where MxN is the image size. And this is quite obvious, because if I have an image of dimension say Mx N say here the image dimension is there are M number of rows and say N number of columns and this is the origin of the image plane. And in this you can find that I cannot draw any straight line for which the value of θ will be beyond this range ±90o and the value of ρ this should be ρ =  M 2 +N 2 . But what is the difference of our earlier formulation with this formulation.   Now, in this particular case our equation of the straight line is taken as ρ = x cosθ + ysinθ . Whereas in our earlier case our equation was y is equal to mx plus c. So, here you find that in this particular case given a single points say (x1, y1) in the spatial domain, in the parameter domain, in the mc-plane the corresponding equation becomes c = - mx1 + y1 , which is again the equation of a straight line. So, when we consider the parameter space to be mc-plane, mc-space or we represent a straight line in the slope-intercept form in that case a point is mapped to a straight line in the parameter space. Whereas in this particular case for the same given point (x1, y1), in the parameter space the equation that we get is ρ = x1cosθ + y1sinθ . So, here you find that this x1 and y1 these two are constants for a given point (x1, y1) in the parameter space, whereas the ρ and θ they are variables. So, a particular point in the spatial domain is now mapped to a sinusoidal curve in the parameter domain or in the ρ-θ space. However, if I have Q number of collinear points in the x-y plane which will be mapped to Q number of straight lines in the mc-plane, but all those straight lines will pass through a single point. In this case, the same Q number of collinear points in the x-y plane will be mapped to Q number of sinusoidal curve in the ρ-θ plane but all those sinusoidal curves will intersect at a particular point, at the single point, which gives us the values of ρ and θ, which are the parameters for the straight line on which all those Q number of collinear points lie. Now, this is the only difference between the mc-plane and the ρ-θ plane. Apart from this the formulation is exactly same as before. So, for computation again here what we have to do is, we have to define the ρ-θ space into a number of accumulator cells. So, the accumulator cells are given like this as in this figure. So, here again any (i, j)th accumulator cell, an accumulator cell say this (i, j)th accumulator cell which will have an accumulator value of say A(i, j) corresponds to our parameters θi and ρj. So, again as per as we have done in a previous formulation that for a given point as we have said seen that our equation becomes ρ = x cosθ + ysinθ , and for a given point say (xk, yk) in the spatial domain, our equation becomes ρ = x k cosθ + yk sinθ . What we do is, we allow the value of this variable θ to assume any of these allowed values as given in this accumulator cell. So, the θ can assume any of these allowed values between this given maximum and minimum and we solve for the corresponding value of the ρ. And because the solution of ρ that you get may not be one of the allowed values. So, what we have to do is, we have to round off the value of ρ to one of the nearest allowed values in our ρ axis.   So, again as before that at the end of the process if an accumulator cell say A(i, j) contains a value equal to Q. This means that, there are Q number of collinear points in the spatial domain, which in the spatial domain lying on the straight line, which satisfy the equation ρ j = xcosθi + ysinθ i . So, again as before depending upon the number of points, putting a threshold on the number of points which we consider to be a significant point or not, I can determine how many straight lines in the given boundary image, I will extract which will give me a meaningful boundary description.   Now, let see by applying this technique, what kind of result we can get here. Here what we have shown is as we have said that every point in the spatial domain is mapped to a sinusoidal curve in the parameter domain, in the ρ-θ plane. So, you find that this point 1, this point 1 over here has been mapped to a straight line. The point 2 has been mapped to a sinusoidal curve as has been given by this, the point 3 again has been mapped to this particular sinusoidal curve, 4 has been mapped to this particular sinusoidal curve and 5 has been mapped to this particular sinusoidal curve. And now we find that if I want find out the equation of the straight line passing through say 2, 3 and 4, these three points. Then you find the 2, 3 and 4 these three sinusoidal curves are meeting at this particular point in the ρ-θ plane. So, the corresponding cell will have a value equal to 3, indicating that there are three points lying on the straight line, which satisfy this particular value of θ and this particular value of ρ. So, from here I can get the parameters of the straight line on which these three points 2,3,4 they are lying and same is true for other cases as well, for example, 1, 3 and 5, So, we find that this is curve for 1, this is the curve for 3 and this is the curve for 5 and all of them are meeting at this particular point.   So, here whatever the value of θ and ρ, I get, that is the parameter of the straight line passing through these points 1, 3 and 5. So, by applying this you find that in one of our previous classes we had shown this image, and after edge detection operation what we get is the edge points as given on this right hand side. And now, if I apply the Hough transformation and try to detect the four most significant straight lines then by applying Hough transformation, I find these four straight lines which are most significant and the boundary which is specified by these four straight lines. So, I can always find out that what are these vertex locations and this is a rectangular region which is actually the boundary of this particular object region. So, with this we come to the end of our today’s discussion that is the global edge linking operation where the global edge linking is done by using Hough transformation and as we have said that this Hough transformation is nothing but a process of mapping from the spatial domain to the parameter space. Thank you.
Info
Channel: Digital Image Processing
Views: 8,851
Rating: undefined out of 5
Keywords: Image Segmentation Global Processing (Hough Transform)
Id: q1J0VAYFkHg
Channel Id: undefined
Length: 53min 17sec (3197 seconds)
Published: Fri Sep 30 2016
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.