CVFX Lecture 14: Epipolar geometry

Okay, so if you recall, last time we basically talked about optical flow. The main idea was that we want to produce a correspondence between images. So this is my image 1 and this is my image 2. The idea behind optical flow was that for every point (x, y) in image 1, I want to find a vector (u, v) that points to some corresponding point (x', y') in image 2. And just to give you a preview of what's going to happen: I talked a little bit about why we want to use optical flow for things like retiming, or moving patterns of texture from one place to another, and then two lectures from now I'll talk about more applications of all the stuff in this chapter, specifically for visual effects. So really the first few lectures in this chapter are more about fundamentals, and then I'll talk about the applications at the end. So in optical flow we basically don't put any sort of constraints on what that vector (u, v) could be. We're saying that I could hypothetically look anywhere in the second image to find the correspondence for this point. And when you think about it, that's not unreasonable. For example, if I have a bunch of objects and they're all independently moving, then in theory that correspondence could occur anywhere. Let's say I've got this guy and he's playing volleyball. If the camera is moving and the guy is moving and the ball is moving, then you can imagine that the correspondence for the ball, depending on how this guy hits it, could be anywhere in the image. So in the general case, when you've got not only camera motion but also object motion independent of the camera, you do have to search all over the image plane to look for that correspondence. However, there are some important situations where the
correspondence is extremely constrained and we don't have to look all over the image plane. That's the concept for today: it's called the epipolar geometry between a pair of images. The setup is that we have either a static scene where the camera moves, or two pictures of the same scene taken at exactly the same time. And if you think about it, these are really the same statement: the camera is moving but the scene is not. One example of the first case is if you're in an empty room and you're just moving your video camera around and nothing is changing. The second case is when you have a pair of cameras, for example mounted on the ends of a bar, and you watch the scene with this stereo camera. That's one way they produce a lot of stereo or 3D movies in Hollywood right now: using a stereo camera where both cameras are acquiring images of the scene at thirty frames per second, but they're offset by some physical distance. This case is called stereo, and we're going to talk more about it in detail next time, about how you estimate the correspondence. So the question for today is: why does this scenario make finding correspondences easier? Let's draw a picture to think about why this is true. Let me just say that up to this point in the class, almost everything I've talked about has been in 2D, in the sense that these are purely things we can look at based on the images themselves. For doing the matting and the compositing, all we were doing was looking at those images; there was no 3D information about the scenes that we used to solve any of those problems. Similarly, for feature detection we were just looking at good regions of pixels in the images by themselves; we weren't
trying to figure out where those features were in 3D. Even today we're going to mostly stay in 2D, but for the following chapters, six, seven, and eight, we're going to be moving into the world of 3D computer vision, where you really have to know how things are set up: where the cameras are and where the points are in the three-dimensional world. So that's a shift from 2D to 3D, and here is your first introduction to it. These images are produced by cameras. When you have a camera, it's got an aperture, a hole through which the light passes. It used to be that the light would pass through and hit a piece of film; now the light passes through the hole and hits a CCD array, but the principle of image formation is more or less the same. So you've got an image plane, and in this case I'm drawing the image plane as if it lies between the aperture and the scene. So this is the camera center and this is the image plane. In reality the image plane is actually behind the hole, because the light has to come through the center of the camera and hit whatever is behind it. But if you mentally flip where that plane is, you're not really changing anything mathematically; I'm just drawing it as if it lies in front of the camera center. We'll talk a lot more about this image formation process in chapter 6, so hold your horses on questions about this until we get a couple of weeks in. Now let's suppose that I see this point on the image plane. I think: okay, where could that point be in the three-dimensional world? The way we form this 2D image is by taking three-dimensional points and drawing a line between the point in the world and the camera center, and
where that line hits the image plane is where the pixel appears. So if I follow this out into 3D space, that point could be out here, or out here, or out here. Basically there's one degree of freedom in where that point could be in the world: anywhere along this ray that goes from the camera center through the point on the image plane out into the world. Now, over here in this other camera, I want to know where the correspondence of this point could possibly be. What I'm doing is projecting that ray in the world back onto the second image plane. It's like saying: if this was the point in 3D, it would appear here; if this was the point, it would appear here; if this was the point, it would appear here. So you can see that the image of this line in 3D projects down onto a line in the second image. That is, this point here has to have a correspondence in image 2 somewhere on this special line. And that means we can constrain our search to only this line. We don't have to look all over the image plane to find the correspondence; if we knew what this line was, we'd only have to search along it from left to right, trying to find the best match. As you can see, that would totally make our job easier. So going from optical flow, where the correspondences could fundamentally be anywhere, to the stereo problem, where the correspondences are really restricted: number one, it enables much more accurate correspondence, because things are much more constrained, and number two, it enables some computationally much faster algorithms, because suddenly we can pull in lots of concepts from the stuff we were talking about before. Question? So the
degenerate case, where the line doesn't project to a line, the stupid thing where they're pointing basically right at each other? Right, so there are a couple of cases. You're saying that you've got two cameras that are literally looking across the road at each other. That's a crazy case: you'd never actually get any correspondences, because you never see the same 3D points. You're seeing the back of a thing from one side and the front of it from the other. So let's not talk about that. But I will say that you do have to have the cameras physically separated. If the camera is just rotating, these epipolar lines don't exist. In fact, earlier in this chapter we learned that if the camera is purely rotating, like you're on vacation taking vacation pictures, then the images from any two camera positions are related to each other by a projective transformation. In that case life is even easier: there are only eight parameters of this transformation to estimate, and they put the entire image planes into correspondence. So when you're just rotating the camera, you should be estimating a projective transformation; when the cameras are physically separated, you've got this epipolar geometry. So that's a good question. There are some corner cases where the epipolar geometry doesn't exist, but any time the cameras are separated and you're not in this pathological case, you should be fine. Even here there's really only one bad line, and that's when you're pointing right down the middle. In theory this other line here would still project to a line somewhere on this camera. So it's not like the whole epipolar geometry is destroyed; it's just that you can't infer anything about one particular ray. So it's still more or less working, right,
even though in that case you would never be able to do anything useful from a computer vision standpoint. Okay, so let me pause: does this make sense? This is the key concept of the day, thinking about how these lines work. And I guess I have a nicer picture of this; it's basically the same idea. For every point in image 1, we know there is some ray out into space along which the 3D point could lie, and that produces a constrained set of possible correspondence locations in image 2. And actually, if you think about it, there is a mutual relationship between the two images. In the same way, I could say: for this point in image 2, I could shoot a ray out into space and look at its projection onto image 1. So for every point in either image there's a corresponding epipolar line in the other image. Oh, sorry. Nope. Yes, sorry. So I was drawing this point and showing you there was this line here. I could probably, if I'd had the foresight, have made a nice little animation of this, but think about how these lines are traced out. Here's image plane one, here's image plane two, here are the camera centers. Now think about the line connecting the two camera centers; call them camera one and camera two. This line segment in 3D space is usually called the baseline. I think I mentioned this earlier in the context of wide baseline, when we were talking about SIFT: such descriptors are designed for cases where the cameras are fairly far apart, which is the wide-baseline case; otherwise, when the cameras are close together, we could just use Harris corners, for example. So now you think about it: okay, this is
a plane in 3D space, and this is a plane in 3D space. Now I think about this line in 3D space, and I imagine that there's a family of planes that can spin around this line. The side view is that I have a bunch of planes that can pivot around this line. Think about a big piece of plywood pivoting around a pole in front of me, so I can twirl it in one direction. Every time I move that huge plane, it's going to intersect the image planes in these two places. For one position of the plane I get one pair of intersections. Now I move the plane that's hinging around the baseline a little more and I get some other pair of lines, and so on. So the idea is that you trace out the possible epipolar lines by moving this big 3D plane back and forth on its hinge. All I want to say is that from this picture here, maybe it wasn't immediately obvious that for this point I have this line and for this point I have this line, or that those two lines in image 1 and image 2 are related to each other. But this analogy shows me that they actually come in what I would call conjugate pairs. So the conclusion, if I go back to looking at these image planes as if they're flat, is that there's a line here and a line here, and every point on this line must have its correspondence on this line, and vice versa. So fundamentally I can constrain my search for correspondences by looking along these conjugate epipolar lines, as they're called. The word conjugate here just means that these guys come in pairs. And this is my better picture of that: on the left I have the picture of the planes twirling around the pole, the baseline, and on the right I have a better picture of what I tried to draw by hand. You guys see that? So another way
I think about this is that for any 3D point in space, I can draw a plane that goes through that point and the two camera centers, because three points in the 3D world define a plane. That plane intersects the image planes in two lines, and those are the epipolar lines. I know that the correspondence for any point on this plane has to lie on this pair of lines. So the big picture, what this boils down to, is that instead of having a fully general 2D-to-2D correspondence problem, I have a set of 1D correspondence problems. If I wanted correspondences for the whole image, I would say: first I'm going to search along this line and match it up with that line, and now I'm going to search along this line and match it up with this line, and so on. So I've made my life easier by turning my fully generic problem into a series of easier 1D problems. That's the way we really solve the stereo correspondence problem. One thing you might notice from my diagram is that I'm drawing these lines in a slanted way, and as you'll see in a minute, that really is the way these lines typically look. They're not nice and horizontal, and I'll show you in a second why that's true. So let's just say: generally slanted. As a preview, obviously life would be great if these lines actually were rows of the image. For those of you who are computer programming types, life would be great because you could just process along the rows of the image, finding correspondences along each corresponding row. And in fact that is something we like to do to images to make the correspondence problem easier. We're going to talk about this process, called rectification, in the second half of this lecture, where I say: okay, once we know what
these slanted epipolar lines are, how could we transform the images so that the epipolar lines become horizontal? And then you can apply your nice, you know, programming approach. Okay, let me go back to my picture here for a second. One reason these lines are slanted is something not pictured on this image: this point here, the camera center, belongs to this plane. So in theory, if this image plane were large enough, and let me draw it as an exaggerated thing over here; this is an extreme example. Let's say we had something like this. Here's the line connecting the two camera centers. In theory, if one camera was in the view of the other, I would actually physically see it. That would mean I have a point here where I see this camera: this is where the 3D point corresponding to the camera would project onto the first image plane, and vice versa. And since every one of these planes contains the baseline, this projection point has to appear on each of the epipolar lines. So what I get are sets of epipolar lines that all have to intersect at this point here. Again, if I'd had a little bit of foresight and made a nice computer-generated example, that might be better. But the idea is that as I twirl this plane around, I'm looking at where that plane cuts the two image planes, and I can see that if I happen to have both of these guys in each other's field of view, then all of those epipolar lines are going to intersect at one place that I can see on the image plane. This point, in each image, is called the epipole. And in some sense this is why all of the epipolar lines have this slanted look: they all have to converge at a point. Now, in typical images you don't
actually see that. If you think about a stereo camera pair, or taking two images from different perspectives, typically those images are not so extreme that the position of the second camera is visible from the first. So what often happens in real life is that your epipole is somewhere off camera. You can imagine extending the image plane: suppose that instead of finite image planes, these image planes could in theory go off to infinity. Then the projection of this camera onto the continuation of this plane would be somewhere out here, and all the epipolar lines would intersect out at this point. So you would still see this notion that they're all converging to something, but in most real-life cases the thing they're converging to is off to the side somewhere. Okay, I guess I can show you an example of this now, just to make it a little clearer, and then we'll go back to a little math. It's a bit hard to see in the abstract, so let's go over to MATLAB for a second. So let's see. Nope, of course I called it [inaudible] points instead of this. Okay, so I guess I called these c1 and c2. In this case I have two images of the same scene, nothing's moving, taken from different perspectives. As I'm going to talk about in just a second, what you can see I've done is label by hand some corresponding points between these images. It's a little confusing because this object is symmetric on four sides, but here, for example, this is the front view of this temple, and I can see that these four points on the balcony correspond to these four points; this is the left-hand tip, corresponding to this point over here. You can see I've marked some points on what from this perspective is the front face and the side face,
and I marked the corresponding points over here. We're going to talk later in the lecture about how these points are used. Basically, the way I usually estimate this epipolar geometry is by labeling some good correspondences, either with automatically generated features or by hand, and I use those to estimate where these lines should be. So let's suppose I've done that estimation problem, and now I want to actually show the epipolar lines. Now the question is, how did I phrase this? Sorry, let's see, the function is called, it is called show epilines: im1, im2, F. So I've previously estimated where these epipolar lines are, and now what you're going to see is that every time I click in the left image, it generates an epipolar line in the right image. Let me click, for example, on this point in image one. If you look at it, these correspondences make sense. It's easiest to see correspondences on the surface of the building. Here I can see that over here the epipolar line seems to go along this brown border of the building, and you can see that yes, all those points basically correspond to the border of the building over here. And if you look at how this line follows along the side wall, it goes a little bit above the brown border going around the other side, and here again that line goes a little bit above the corresponding edge. If I click on a few more points we can generate some more epipolar lines. It's kind of hard to read through these guys right now; I'm filling up the image with lines. So again, let's think about how these correspondences make sense. Let's think about this line here. Yes, I immediately find that the correspondence does lie on the corresponding epipolar line. What else is on that line? Well, for example, there's some point on the roof that's about a third of the way over. And again, I can see that yes, that correspondence is
basically, yeah, it's up over here. And this is actually a little bit disconcerting, because while the correspondence is there, the correspondences have switched places in some sense. Here the roof point and the corner point appear in the order one, two, and here they appear in the order two, one. So there's a switch of order in this case, which, as we'll see in the next lecture, can be complicated for finding correspondences. So while the correspondence does exist, it may not exist in such a nice left-to-right way. But you can see the rest of this stuff looks pretty good. For example, here's another epipolar line that goes almost along the brown lines on both the side and the front of the building, and you can see that yes, if I were searching for correspondences they would be in the right places. And in both of these images you can see that these epipolar lines are diagonal and somehow feel like they're converging to some point way off beyond the edge of the image; they would intersect if you drew them all out far enough. So this is a real example of what the epipolar lines look like for a real image. Let me pause and ask for comments or questions. Yes? If I understand correctly, these conjugate epipolar lines, if you knew them, for example when you have those correspondences between the two, could you... Right, okay, so the question is: if you knew the pairs of conjugate epipolar lines, couldn't you go further and estimate the 3D orientations of the cameras? We'll talk about that in the next chapter. It turns out that yes, you can, in certain circumstances. There are methods for bootstrapping your way from the correspondences to the epipolar geometry to the real camera positions. There are some caveats to how
that process can work, though. For one thing, one problem with that approach is that it turns out there's a whole family of possible camera locations that project to the same epipolar lines. So you're not going to get a unique solution; you can put some constraints on things to get your solution to be almost unique. One very obvious way to think about why it's not unique: suppose I were to take my stereo rig and move it around in 3D space. I've immediately got a rigid motion, a rotation and translation, that gives me six parameters of uncertainty. On top of that, if I were to keep the geometry the same but make the whole rig bigger, make the physical separation of the cameras larger but also change the angles, I could basically get a scale ambiguity. So there are definitely some complications. We'll talk about that some more, as well as, unfortunately, projective complications: not just rigid motions but also more complicated stuff. But yes, that is the fundamental idea behind some of the matchmoving stuff we'll talk about next. Other questions or comments? Okay, so what I want to talk about next is how we encapsulate or quantify these epipolar lines. The nice thing is, well, let me just start off by saying that there's this important quantity called the fundamental matrix. The fundamental matrix gathers together everything I need to know about these epipolar lines. It's a 3×3 matrix, and it ties together the image correspondences in a special way: for any correspondence (x, y) in one image and (x', y') in the other image, this matrix equation has to be true.
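The equation itself was written on the board and is not captured in the captions. A reconstruction, assuming the standard form in homogeneous coordinates (it matches the 1×3, 3×3, and 3×1 shapes described next), with $F_{ij}$ denoting the entries of the fundamental matrix and the double arrow marking corresponding points:

$$
\begin{bmatrix} x' & y' & 1 \end{bmatrix}
\begin{bmatrix} F_{11} & F_{12} & F_{13} \\ F_{21} & F_{22} & F_{23} \\ F_{31} & F_{32} & F_{33} \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = 0
\qquad \text{for every correspondence } (x, y) \leftrightarrow (x', y').
$$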
Okay, and this double-arrow thing I'm using to mean corresponding points. This is a really remarkable way of saying things. So how does this constraint give me epipolar lines, for example? Getting epipolar lines from F: suppose I fix the point in image one. The fundamental matrix equation is basically this. Again, this here is a 3×1 vector, this is a 3×3 matrix, and this is a 1×3 vector. If I fix (x, y), then I can take this part and just multiply the F matrix by (x, y, 1), and what I get is some other 3×1 vector; let's just call it (a, b, c). If I write out what this means, it's just saying that $a x' + b y' + c = 0$. And this is the equation of a line in image two: the epipolar line in image two for a fixed (x, y) in image one. So if I tell you the point in image one, I can use this fundamental matrix to give you the epipolar line in image two. And vice versa: if I switch the roles and fix (x', y'), then I could tie this together into one vector and I would find the corresponding line in image one. And that's exactly how I drew the lines in that MATLAB example: I knew this 3×3 matrix F, and every time I clicked on a point in image one, I did this process to figure out what the corresponding line I should draw over image two should be. Okay, any questions about that? Does that make sense? Now just a note: remember that we talked about these epipoles. The epipoles are the places where all of these epipolar lines intersect, so let me just say why that's true. All epipolar lines intersect at the epipole, e.g., in image 1. Let's define
this special point, which is on every epipolar line. One way to think about this is that no matter what the point is in image 2, I know that the epipole has to be on that epipolar line. So this is like saying I can change all the stuff over here, I can change the x' and y' however I want, but this matrix equation still has to be true, because I know that the epipole in image one has to be on this line. So the conclusion is that since this part is changeable and the whole thing has to equal zero all the time, this epipole can be found as an eigenvector of F that has a zero eigenvalue. That is: the epipole is an eigenvector of F with eigenvalue zero. And that tells us a couple of things. One thing to note, and I don't want to get too far into it: in theory F is a 3×3 matrix, and you might think that there are nine possible numbers I could put into that matrix. The fact that F has a zero eigenvalue means that there is actually one less degree of freedom than nine. Also, in theory I could multiply F by any nonzero scalar: I could multiply all the numbers by two and this equation would still be true. So there's one more degree of freedom that you lose. What this comes down to is that F is 3×3 but has 7 degrees of freedom. The reason I'm saying this is that when you estimate F, you need to make sure the F you estimate has this important property; you can't just estimate nine numbers and hope that it works. Okay, so how would we go about estimating the fundamental matrix? I've shown you, once I know it, how I use it and what the epipolar lines look like, but how do I get it in the first place? Okay, so: estimating the fundamental matrix.
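Before getting into estimation, the two uses of F described so far (mapping a point in image one to its epipolar line in image two, and finding the epipole as the vector that F sends to zero) can be sketched in a few lines. This is a minimal illustration in plain Python, not the MATLAB code used in the lecture; the helper names and the matrix `F_demo` are made up for the example, and `F_demo` is simply a rank-2 matrix chosen so the arithmetic is easy to follow.

```python
def mat_vec(F, v):
    """Multiply a 3x3 matrix (given as a list of rows) by a 3-vector."""
    return [sum(F[i][j] * v[j] for j in range(3)) for i in range(3)]


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def epipolar_line(F, x, y):
    """Coefficients (a, b, c) of the line a*x' + b*y' + c = 0 in image two
    on which the match of the image-one point (x, y) must lie."""
    return mat_vec(F, [x, y, 1.0])


def epipole_image1(F):
    """Homogeneous vector e with F e = 0, assuming F has rank 2.

    A vector orthogonal to two independent rows of a rank-2 matrix is
    orthogonal to all three rows, so a cross product of two rows works.
    The pair giving the longest cross product is used to avoid picking
    two rows that happen to be parallel.
    """
    candidates = [cross(F[0], F[1]), cross(F[0], F[2]), cross(F[1], F[2])]
    return max(candidates, key=lambda e: sum(c * c for c in e))


# Hypothetical rank-2 matrix used only for illustration.
F_demo = [[0.0, -1.0, 1.0],
          [1.0, 0.0, -2.0],
          [-1.0, 2.0, 0.0]]

a, b, c = epipolar_line(F_demo, 3.0, 4.0)
print("epipolar line in image two:", a, b, c)

e = epipole_image1(F_demo)
print("epipole in image one (pixels):", e[0] / e[2], e[1] / e[2])
print("F e =", mat_vec(F_demo, e))
```

If the third component of `e` were zero, the epipole would be at infinity and the epipolar lines in that image would be parallel, so the division in the usage lines assumes a finite epipole.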
There are many ways to do it, and there may be better ones than what I'm going to show you right now, but this is the basic idea. The way I think about this is: how did we estimate a projective transformation, for example? We got a bunch of point correspondences, and we said, okay, if I'm trying to estimate a projective transformation, there are eight degrees of freedom, so I need at least four point correspondences, hopefully more to give me some robustness. Then I solved this direct linear transform that we showed a couple of lectures ago to find the best set of eight parameters consistent with all the correspondences I created. So we do exactly the same thing here. The first step is to obtain feature correspondences, and again, this is one of the reasons we talked about that stuff in chapter four. If the cameras are not too far apart, these feature correspondences could come from Harris corners; if the cameras are further apart, they could come from SIFT features; or I could manually click on them if I want to be super-duper sure. So first I obtain feature correspondences. Usually there is then a step of normalizing them to have zero mean and standard deviation one, basically. What I mean is that if all your correspondences are way off in the left-hand corner of the image, or if they're not centered, I want to do a little normalization step, a linear transformation that brings all the correspondences to be centered at the origin and to have a nice spread of one. I think I assigned this as a homework problem, to show how this normalization process works. Again, all you're doing is applying a little linear transformation to the
correspondences before you put them into this process. Now, if you think about it, what is our fundamental matrix equation? It looks like this, so every correspondence gives me one constraint on F. By which I mean, think about multiplying out this product: this is a 1 by 3, this is a 3 by 3, this is a 3 by 1. When I start to multiply this out, what's one of these terms going to look like? I'm going to have [x' y' 1] times (F11 x + F12 y + F13, and some other similar stuff) equals 0, and when I multiply that out I get something that looks like x' F11 x + x' F12 y + x' F13 plus a bunch of other terms equals 0. Then I can organize that into stuff I know and stuff I don't. What I don't know are the F's; that's what I'm trying to find out. What I do know are all the x's and y's. So I make this into a matrix equation: I have a long matrix like this, here are my unknowns, F11 through F33, and every row looks like this: x1 x1' is the thing that multiplies F11, what multiplies F12 is y1 x1', and so on. So I get an N by 9 matrix, where N is the number of correspondences and 9 is the number of columns, times a 9 by 1 vector of unknowns. That's the way I would set up the linear system. This is one of the things I always like to emphasize while teaching this class: seeing a problem and setting it up as the corresponding linear system that you would then put into MATLAB. It's a good skill to be able
to transform problems into the corresponding linear system. Okay, so now how would I solve this problem? At this point what I have is a big linear system that looks like A f = 0, where A is some N by 9 matrix and f is some 9 by 1 vector. The way I solve this problem is to compute what's called the singular value decomposition of A. Show of hands, who's ever heard of the singular value decomposition? A few people, but not everybody. Basically, the SVD is kind of like a generalization of the eigenvalue-eigenvector decomposition that you can do when the matrix is not square. Unfortunately I don't have the time to teach you the whole SVD, but this is a very common linear algebra tool. The way this works is that I write A as U D V transpose, where D is going to be a 9 by 9 diagonal matrix, and the entries on its diagonal are what are called the singular values; they're kind of like the eigenvalues, but in a more generic way. The next step is to let f be the last column of V. What this means is that I want to find the lowest singular value. In theory, if the correspondences were perfect, one of these singular values would be exactly zero; that corresponds to the fact that the true F satisfies every one of these constraints exactly. In practice, the correspondences that I obtain or click on may not be exactly perfect, so what usually happens is that there's one very, very small singular value of A, and that's the one I want to find. So I pick f out of here, and then I reshape it into a 3 by 3 matrix; let's call that F hat. This goes from my vector back into a matrix. This is almost what I want, because what I want is this 3 by 3 fundamental matrix. The problem is that at this stage there's still no guarantee that the 3 by 3 matrix I obtain has exactly a zero eigenvalue. It
could have an eigenvalue that's very close to zero but not exactly zero. So the last step is to compute the SVD of F hat, which is again going to be some other U D V transpose; this D is a 3 by 3 matrix now, because F hat is 3 by 3. I zero out the lowest singular value in D and then recompute the product. So what I'm doing is forcing F to have a zero eigenvalue by zeroing this guy out. Then I usually have to do some sort of renormalization process: since I did some stuff to the correspondences initially, I have to undo that stuff at the end. Again, that's not very hard to do, and you can see in the book that it's easy to get the estimate back into the right space. Okay, so this is very easy to do in MATLAB; I think either I wrote or I pulled some code off the web that did it. This is a very straightforward algorithm. It's called the eight-point algorithm, a very famous computer vision algorithm. The reason it's called the eight-point algorithm is that in theory you only need eight points to get a solution, and if all the points are accurate it gives you the exact answer. Of course, in practice you generally want to have more than eight points; in my example with the temple I'm probably at about twenty or twenty-five points. So typically you'd like to get as many good points as you can and use those to estimate the epipolar geometry. Again, just like with the projective transformations, if you generate these correspondences automatically, some of them may be kind of crummy, so you probably need to use some sort of outlier rejection like RANSAC to robustly throw out the ones that don't really correspond to the right fundamental matrix. So in the same way that you can use RANSAC for feature matching, you can use RANSAC, well,
I guess it's the same kind of thing: you're using RANSAC to throw out features that are not consistent with the epipolar geometry. That's something we could add on to, for example, SIFT matching. SIFT matching has a bunch of outlier rejection heuristics that say, okay, these are reasons why I don't think these are corresponding points. Now we can layer on one more thing, to say I should never generate SIFT matches that are not consistent with some true fundamental matrix. In fact, that's kind of putting the cart before the horse, because some people use SIFT matches to estimate the fundamental matrix in the first place, so there's kind of a back and forth, but that's the basic idea. Question? My dimensions of V are wrong in step five? Let's see, A is an N by 9 matrix... Oh, so V itself is 9 by 9, right, and I'm taking the last column of that, which should be 9 by 1. Okay, yes, the last column is 9 by 1. Is that okay? Other comments or questions? Okay, so what I want to talk about quickly now: the subject of the whole next lecture is going to be, like I said, stereo correspondence. Once I have these epipolar lines, I want to search along them for correspondences. Those of you who are computer programmers can imagine that it would be kind of a pain to have to search along these non-horizontal lines; you'd have to be resampling the image along the way, and it would kind of suck. So what would we like to do? We'd like to turn these epipolar lines into horizontal lines. I want to take these two images and warp them in such a way that, after I've warped them, the epipolar lines are all horizontal. Then life would be very good, because I could just step along corresponding rows, and that's exactly what the next process does.
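Going back to the estimation procedure for a moment, its steps (normalize the correspondences, build the N by 9 system, take the last column of V, zero the smallest singular value, undo the normalization) can be sketched as follows. This is a minimal sketch assuming NumPy, not the MATLAB code mentioned in the lecture; the function names, the particular definition of "spread", and the synthetic two-camera setup (`K`, `R`, `t`) are all illustrative assumptions.

```python
import numpy as np

def normalize_points(pts):
    """Shift (N, 2) points to zero mean and scale them to unit spread.

    Returns the normalized homogeneous points (N, 3) and the 3x3 transform T.
    """
    mean = pts.mean(axis=0)
    spread = np.sqrt(((pts - mean) ** 2).mean())  # one possible choice of spread
    s = 1.0 / spread
    T = np.array([[s, 0.0, -s * mean[0]],
                  [0.0, s, -s * mean[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return homog @ T.T, T

def eight_point(pts1, pts2):
    """Estimate F with [x' y' 1] F [x y 1]^T = 0, pts1 in image 1, pts2 in image 2."""
    n1, T1 = normalize_points(pts1)
    n2, T2 = normalize_points(pts2)
    x, y = n1[:, 0], n1[:, 1]
    xp, yp = n2[:, 0], n2[:, 1]
    # One row per correspondence; columns multiply F11, F12, ..., F33.
    A = np.column_stack([xp * x, xp * y, xp, yp * x, yp * y, yp,
                         x, y, np.ones(len(x))])
    _, _, Vt = np.linalg.svd(A)          # Vt is 9x9; its last row is the last column of V
    F_hat = Vt[-1].reshape(3, 3)
    U, D, Vt_f = np.linalg.svd(F_hat)
    D[-1] = 0.0                          # force a zero singular value
    F_norm = U @ np.diag(D) @ Vt_f
    F = T2.T @ F_norm @ T1               # undo the normalization
    return F / np.linalg.norm(F)         # fix the arbitrary overall scale

def project(K, R, t, X):
    """Project (N, 3) world points with a hypothetical pinhole camera K [R | t]."""
    pix = (X @ R.T + t) @ K.T
    return pix[:, :2] / pix[:, 2:3]

# Synthetic, noise-free test scene (all values are made up for illustration).
rng = np.random.default_rng(1)
X = np.column_stack([rng.uniform(-1, 1, (20, 2)), rng.uniform(4, 6, 20)])
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-1.0, 0.1, 0.2])

pts1 = project(K, np.eye(3), np.zeros(3), X)
pts2 = project(K, R, t, X)
F = eight_point(pts1, pts2)

h1 = np.column_stack([pts1, np.ones(len(pts1))])
h2 = np.column_stack([pts2, np.ones(len(pts2))])
residuals = np.einsum("ij,jk,ik->i", h2, F, h1)  # x'^T F x for each correspondence
```

With perfect correspondences the residuals should vanish up to round-off. With real, noisy matches they will not, and an outlier-rejection wrapper such as RANSAC would sit around the call to `eight_point`.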
That process is called rectification: warping images so that conjugate epipolar lines correspond to image rows. Sometimes image rows are called scan lines, because it used to be that a raster display was constantly using an electron beam to produce the rows on the TV set; those are scan lines. So the idea is that I want to turn the epipolar lines into scan lines. This is a schematic of the process. In the middle are the two original images with their possibly slanted and diagonal epipolar lines. In this case I drew them so that the epipoles are visible in the image, which may not be true in the real world. Those two images are related to each other by the fundamental matrix F that I could estimate from correspondences. Now what I want to do is warp these images using transformations H1 and H2 so that the epipolar lines become horizontal. That's the goal. Going back to my notes: each image has a rectifying projective transformation H applied to it. That's like saying I have my original images, related to each other by the original fundamental matrix, and now I apply some weird transformations. These images are not going to be square anymore, they're going to look kind of strange, but after I apply H1 here and H2 here, if this is my warped image 1 and this is my warped image 2, then life should be good. So let's think about what the new fundamental matrix between these guys should be; write that down. What should the new fundamental matrix between the warped images be? It turns out to have a very simple form. So why does this work? Let's think about it. That's like saying I have a correspondence: this point in one image, here's my special fundamental matrix, and I
have the corresponding point here in the other image. Multiply this out; what does it say? This part is going to be, sorry, (0, 1, -y), and the whole product equals zero, so when I multiply it out I get y' - y = 0, or y' = y. That tells me what the fundamental matrix constraint is: if I have this point (x, y) here and I fix y, then the new epipolar line is the place where y' = y. Sorry, now the writing is getting kind of small. That's like saying the only constraint I have is that the two y coordinates have to be exactly the same. So instead of having this weird offset line, my lines are nice and parallel, nice and straight, and also the way I've done this lines them up one-to-one, so that line 30 of image 1 is the same as line 30 of image 2. In theory I could do the rectification in such a way that the epipolar lines were still horizontal but offset, so that line 5 of image 1 corresponds to line 30 of image 2. Really what I want is to be able to just loop over the entire height of the image and process every row with the same index, going from top to bottom. So this is really the best-case scenario. The process of how you actually do this is a little bit tedious, and I don't want to go through the gory details right now, but here is the basic idea behind the process. How do I make these epipolar lines horizontal? If you think about it, that in some sense contradicts what I told you earlier about all these lines meeting at the epipole. What I'm going to do is apply a projective transformation to each image, which is kind of like rotating the camera view, because we talked about how when I rotate the camera I get a projective transformation, and so
here, in the original setup where I've got my epipolar lines converging to this epipole over here, how would I get those lines to be horizontal, parallel to each other? I would have to rotate this image plane until it became parallel to the baseline. That would be like saying both of the images are pointed straight ahead and the image planes are in a configuration that looks like this. In that case there would never be any intersection of the baseline with the image plane, and that would mean that when I rotated my plywood around the baseline, it would intersect these image planes in nice horizontal lines. So that's what we're trying to do in the rectification process: we're taking the image plane of each camera and rotating it so that the image planes are parallel to the baseline. That way I never have any worries about those epipolar lines intersecting. Then I need to do a little bit of extra footwork to make sure that the epipolar lines are exactly how I want them. Looking from above, or, I mean, looking from the side, in this schematic view, I could have a case where I get parallel lines but in one of the images they aren't horizontal. So there's an extra degree of freedom to make sure that both of these guys are lined up this way. It's also possible, in the top view, that if one of these image planes was much further away than the other, then the corresponding epipolar lines would not be equally spread out. So there are a bunch of little homework problems, not that you're going to get them as homework, a bunch of little nitpicky things to make sure that in fact I
get the lines to be exactly lined up with each other. There are a bunch of degrees of freedom in these rectifying projective transformations that you use to make the epipolar lines line up. If you look in the book, there is one example of a rectification algorithm that tries to do as good a job as possible at, number one, satisfying this constraint that at the end of the day the row indices have to be the same, and number two, not distorting the images too much, because there are still some degrees of freedom left over. For example, I could have a situation like this, where these are my lines, and then suppose I stretched this image out like crazy after I did this, or made this guy super tiny. Those would still be technically rectified, but they would be really hard to work with. So in some sense I want to minimize the overall distortion of the image pair so that nothing is artificially squeezed or stretched. When I do this resampling, I have to go into the image and resample pixels to build the new rectified image, and I don't want to incur any overstretching or over-squishing of the image that will produce bad results for my stereo correspondence algorithm later. In MATLAB I've already put together an example of this; let me make sure I remember the indices right. I've already rectified a couple of images, the same images as before, and if I look at these two images you can see that they are kind of weirdly warped. The one on the right doesn't look so bad; the one on the left is definitely a little bit squished, and you could argue that maybe I could make that one a little bit wider. But now I've got my same setup where I can click on a point in the image, and you can see that this is actually an interesting choice, because my green line happens to be right
on top of one of these brown lines that bisects the front of the thing. You can see, if you look at the way the rectified epipolar line is running right parallel to this brown line along the side, that the correspondences look good. If I actually click on the brown line here, you can see that I've done this in such a way that now I could really search left to right along this pair of lines and obtain the correspondence. Even in the places where things are a little bit weird, for example, I believe this was the one we looked at earlier, the corner and the corresponding point on the roof now occur on the same line in the image. If I click on some more points, like here, you can see that this rooftop point and this weather vane or something sticking out, the tips of those two things, are on the same row here, and they're also both on the same row here, kind of obscured by the tree. So this is what you would call a rectified image pair. Your images now look a little bit goofy, but your correspondence problem is going to be a lot easier computationally. Okay, so let me pause and ask for any questions or comments. Okay, so where we're going next time is that we're going to start from the assumption that we have a stereo pair that is rectified like this. The question is then: for our optical flow problem, our correspondence problem, where we're trying to find a dense correspondence between all the points over here and all the points over here, how does knowing the epipolar geometry make our computation easier and more robust? That's going to lead us to some algorithms that are much easier to implement in terms of OpenCV and stuff like that. For example, a lot of optical flow started from this world of continuous-time
partial differential equations; you saw people solving partial differential equations in the methods we talked about. In stereo, we're going to be much more in the discrete world. I'm going to index the correspondence problem by inching along this line in the image and asking, what is my integer number of pixels of offset for the correspondence in the other image? So now I'm trying to solve an integer problem, or I could work in units of half pixels or quarter pixels, but the problem becomes discrete instead of continuous. So that's where we're going.
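To make the rectified constraint from earlier in the lecture concrete, here is a minimal sketch, assuming NumPy, that checks y' = y for a rectified pair and shows how warping by H1 and H2 relates the original and rectified fundamental matrices. The specific matrix `F_RECT` is an assumption (one standard choice, up to scale and sign, consistent with y' - y = 0), since the board contents are not in the captions; `H1` and `H2` are random stand-ins, not homographies computed by a rectification algorithm.

```python
import numpy as np

# Assumed form of the rectified fundamental matrix (up to scale and sign).
F_RECT = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.0, -1.0, 0.0]])

def epipolar_residual(F, p1, p2):
    """Evaluate [x' y' 1] F [x y 1]^T for p1 = (x, y) and p2 = (x', y')."""
    x = np.array([p1[0], p1[1], 1.0])
    xp = np.array([p2[0], p2[1], 1.0])
    return float(xp @ F @ x)

# Same row in both images: the constraint is satisfied for any x and x'.
same_row = epipolar_residual(F_RECT, (10.0, 30.0), (55.0, 30.0))
# Different rows: the residual is exactly y' - y.
other_row = epipolar_residual(F_RECT, (10.0, 30.0), (55.0, 35.0))

# Hypothetical rectifying transformations (random, close to the identity).
rng = np.random.default_rng(2)
H1 = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
H2 = np.eye(3) + 0.1 * rng.normal(size=(3, 3))

# If warped points are H1 @ x and H2 @ x', the original images are related by
# F_orig = H2^T F_RECT H1, because x'^T H2^T F_RECT H1 x is the rectified constraint.
F_orig = H2.T @ F_RECT @ H1

# Map a rectified correspondence (same row) back to the original images.
x_rect = np.array([10.0, 30.0, 1.0])
xp_rect = np.array([55.0, 30.0, 1.0])
x_orig = np.linalg.solve(H1, x_rect)
xp_orig = np.linalg.solve(H2, xp_rect)
residual_orig = float(xp_orig @ F_orig @ x_orig)
```

In this sketch `same_row` is zero, `other_row` equals the row difference, and `residual_orig` is zero up to round-off, which is the sense in which the warped pair has the simple fundamental matrix while the original pair has a general one.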
Info
Channel: Rich Radke
Keywords: visual effects, computer vision, rich radke, radke, cvfx, epipolar geometry, epipoles, fundamental matrix, rectification, projective rectification
Id: QzYn0OPO0Yw
Length: 63min 56sec (3836 seconds)
Published: Mon Mar 17 2014