OpenCV Python Pose Estimation

Captions
All right, in this video we're going to talk about pose estimation in OpenCV using Python. We'll start off by saying what it is, why we need it, and how it works, and then jump right into a coding example. By the end of this video we'll see how to draw this cube here on the right, right on top of our calibration images.

So what is pose estimation? The idea is finding both the position and the orientation of an object in an image or video.

Why do we need pose estimation? We have augmented reality, activity recognition, film and animation, and many other examples.

So how does pose estimation work? The idea is that world points are projected onto the image plane. Here are your world points, which get projected onto your image plane, and we're trying to find the position and orientation that minimize the reprojection error. You may have the true point here and a projected point here, and there could be some error between them. You might have errors between all the points, and we're trying to minimize that error.

So you have your image points, you have your world points, and you have some transforms that relate the two. You have your intrinsics, you have an additional matrix here, and then you have your rotation and translation, which capture your extrinsics. All of this combined gives you your projection matrix, which brings your world points to your image points, and from there you can rewrite it as a more compact expression, as you see down below. What we want to estimate is what OpenCV calls rvec and tvec; those are the actual parameters we're trying to find.

There are typically different solvers out there. Here we have SOLVEPNP_ITERATIVE, which is the default and the one we'll use. There are some other ones too, which we won't go into in much detail. For the iterative one, you have what's
called Levenberg-Marquardt. What this does is take the Jacobian and do a Newton-type method of error minimization. If you have a function like this, you're essentially trying to get lower and lower until you find the minimum. This is a two-dimensional case, but you expand it to the multi-dimensional case using what's called the Jacobian. Essentially it comes down to minimizing the error by looking at the gradient at each iteration. So that's the general idea.

You also have DLT methods, which try to solve the least-squares problem directly, and some methods use SVD and so on. You can see in part two there are some other methods you can explore, but usually the default one works fine, so that's the one we'll use in our example. Let's jump right to it.

Okay, so here's our pose estimation script. We import the modules we need, and we declare an enum with two types, because we're going to be drawing an axis and a cube later on. We have a draw-axis function here and a draw-cube function, both of which I'll explain in a bit. The main program we're running is the pose estimation function. It takes the draw option, based on what we're drawing.

What we want to do first is retrieve the camera calibration we obtained in our previous video: load the calibration.npz file, read in the values, and store them as the camera matrix and distortion coefficient variables. After that, we get all the paths for our images. I think this might make more sense if I say "obtain image paths" here. Then we set up our termination criteria, which we talked about before. Here we have the EPS and iteration criteria, that is, the error threshold and the maximum iteration count, and
then, if we take a look here, these are the error and the number of iterations that we set.

We take our world points here: we assume some locations in world space, and in this step we reshape them into an array with the structure that will work for our functions later on.

Next we define some axis points, which are arbitrary points in world space, and the cube corners, which are more points, the eight corners of the cube. Those will be projected from world space onto the image plane later on.

The steps here are similar to camera calibration. We read in the image, convert it to grayscale, and pass it into findChessboardCorners to find the corners. After we find the corners, which we call the original corners, we refine them using the cornerSubPix function that we talked about in the camera calibration video.

The new function here is solvePnP. What it does is find the rotation and translation vectors, which are here, and we'll use them for our projection later on. It takes in the world points, here called object points, which is an N-by-3 array; your refined corners, which is an m-by-n NumPy array; your camera matrix, which is 3 by 3; and your distortion coefficients, which is 1 by 5.

Based on our drawing option, we first handle the axis. We have the projectPoints function. It takes in our object points, which here we're calling the axis; your rvec and tvec; and then the camera matrix and distortion coefficients. It returns your image points, which will be an m-by-n array, and the Jacobian, which we're not using. Then we call the draw-axis function that we've defined
up on top. It takes your image, your corners, and your image points, and draws the lines onto the image. One thing to note is that we need to do some conversion to make sure all the points are integers, so that's what we're doing here. Then we do the RGB colors for the different axes, which is what these color codes are for.

If I go ahead and run this program, you can see the axes being drawn for each of the calibration images. You can see that my camera is rotating and the axes are adjusting accordingly, which tells us our calibration is looking pretty good.

Now let's take a look at our next step, the cube implementation. Everything else is the same; the only difference is that we have a different option here for drawing the cube. We project the points as before, but the draw-cube function is a little different: we read in our image points, create a contour, which is going to be the green plane, and then for each of the corners of our cube we connect the dots and draw the borders of the cube.

If I go ahead and run this, we see our cube being drawn with the plane sitting on the chessboard. You can see that in all the images it adjusts according to the translation and orientation, just like we want.

Okay, so if you found this video helpful, give it a like and subscribe, and I'll see you in the next one.
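
For reference, the projection and the error being minimized, as described in the "how does it work" part of the video, can be written roughly as follows. The slide itself is not reproduced in the captions, so the symbols here are assumed notation: K is the intrinsic matrix, [R | t] the extrinsics, P_i a world point, p_i its observed image point, and s a scale factor. Lens distortion is left out, and the "additional matrix" mentioned in the video is assumed to fold into [R | t].

```latex
% Pinhole projection of a world point (X, Y, Z) to pixel coordinates (u, v)
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  = K \, [\, R \mid t \,]
    \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}

% Pose estimation: choose the rotation and translation that minimize
% the summed squared reprojection error over all point pairs
(\hat{R}, \hat{t}) = \arg\min_{R,\, t} \sum_{i}
  \left\lVert\, p_i - \pi(K, R, t, P_i) \,\right\rVert^{2}
```

Here \pi denotes the projection in the first equation followed by division by s. In OpenCV terms, rvec is the axis-angle form of R and tvec is t.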
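
To make the reprojection error from the video concrete, here is a minimal pure-Python sketch. It is not the script from the video: the function names, the intrinsics, and the poses are made up for illustration, and lens distortion is ignored.

```python
import math

def rodrigues(rvec):
    """Convert an axis-angle rotation vector into a 3x3 rotation matrix."""
    theta = math.sqrt(sum(comp * comp for comp in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (comp / theta for comp in rvec)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def project(point, rvec, tvec, fx, fy, cx, cy):
    """Project one world point to pixel coordinates (no lens distortion)."""
    R = rodrigues(rvec)
    xc, yc, zc = (
        sum(R[i][j] * point[j] for j in range(3)) + tvec[i] for i in range(3)
    )
    return (fx * xc / zc + cx, fy * yc / zc + cy)

def rms_reprojection_error(world_pts, image_pts, rvec, tvec, fx, fy, cx, cy):
    """Root-mean-square pixel distance between observed and projected points."""
    total = 0.0
    for world_pt, (u, v) in zip(world_pts, image_pts):
        pu, pv = project(world_pt, rvec, tvec, fx, fy, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(total / len(world_pts))

# Hypothetical intrinsics and a square of four world points in the Z = 0 plane.
intrinsics = dict(fx=800.0, fy=800.0, cx=320.0, cy=240.0)
world_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]

# Synthesize "observed" image points from an assumed true pose.
true_rvec, true_tvec = (0.1, -0.2, 0.05), (0.0, 0.0, 5.0)
image_pts = [project(p, true_rvec, true_tvec, **intrinsics) for p in world_pts]

# The error is zero at the true pose and grows when the pose is wrong.
err_true = rms_reprojection_error(world_pts, image_pts, true_rvec, true_tvec, **intrinsics)
err_off = rms_reprojection_error(world_pts, image_pts, true_rvec, (0.1, 0.0, 5.0), **intrinsics)
```

A solver such as the iterative one described in the video searches over the rotation and translation to drive this kind of error down.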
Info
Channel: Kevin Wood
Views: 2,779
Keywords: pose estimation opencv, pose estimation python, opencv pose estimation, pose estimation computer vision, solvepnp
Id: bs81DNsMrnM
Length: 8min 12sec (492 seconds)
Published: Mon Sep 25 2023