Perspective Transformation | OpenCV in Python | Image Processing

Captions
Hello, and welcome to the tutorial on applying perspective transformation to an image using OpenCV in Python. Perspective transformation consists of two aspects: one is the coding aspect, the other is the digital image processing aspect. In this video we will deal with the coding aspect and completely disregard the digital image processing aspect. Therefore we have a prerequisite for this video and also a follow-up. The follow-up will be the perspective transformation of an image from the digital image processing point of view.

The prerequisite is understanding how to read a video file in OpenCV. For that we basically need 14 lines of code, which I have already typed on my screen. In case you are not able to understand what I have already typed on the screen, or you have never read a video in OpenCV, just watch that video, in which we code line by line and understand how reading a video is done in OpenCV.

So let's begin by understanding the very concept of perspective transformation. Look at this example. We have a view of a road; we select four corner points, draw an enclosing polygon, and convert it into a rectangle, that is, the frame. Here on the left side we have the input, in which we select a region, in this case a trapezoid drawn almost at the center of the road. Then we apply perspective transformation and obtain the output. In the output we can see that the top left corner of the trapezoid gets shifted all the way to the top left of the screen, and the top right gets shifted all the way to the top right of the screen.

Now, when we do that, there is a certain set of problems. How did the distance between the two points on the top of the trapezoid, let's say it was three centimeters in length, get converted to six centimeters or even nine centimeters on the output image without any distortion? Where did the extra six centimeters of pixel intensity values come from? There are all sorts of problems when we look at this simple transformation from the digital image processing point of view. But since we are not focusing on that, we'll just stick with the coding: we have the input, we select our four corner points, use OpenCV, and obtain the output as shown on the right side.

So the approach is very straightforward. To achieve this sort of transformation using OpenCV we don't need to be a very skilled coder, nor do we need a lot of knowledge about digital image processing. Everything is already there in the OpenCV library. We just need to understand the flow and the syntax, and make the right calls at the right times. In parallel we will understand what we are doing, and those calls will return us the desired output. We won't be doing any raw coding, and that's a plus point; that's the power of these open source libraries. We just make the calls on the surface; in the deeper layers they perform all the work and return the output back to the surface layer.

So let's begin the coding. First of all, we want to draw the four corner points that we want to transform. For us it is a trapezoid; for you it could be anything. So how do we select or draw those four corner points? First I'll store the coordinates of the corner points in variables that I am creating: tl for top left, bl for bottom left, each holding x, y in an array. Top right is 333, 333, so the coordinates are x = 333, y = 333. These are just random values that I'm assigning at the moment, but to give them a proper representation I'll make them a rectangle, so top left should be on the left and top, and bottom right should be on the bottom and right.
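The corner variables described above can be sketched as follows. The transcript only gives the variable names tl and bl and the idea of an upright rectangle, so the coordinate values, the tr and br names, and the helper function here are hypothetical and for illustration only.

```python
# Hypothetical corner coordinates as [x, y]; placeholder values, not the
# ones typed in the video.
tl = [100, 100]  # top left
bl = [100, 300]  # bottom left
tr = [400, 100]  # top right
br = [400, 300]  # bottom right


def is_upright_rectangle(tl, bl, tr, br):
    """Return True if the four corners form an axis-aligned rectangle.

    Image coordinates are assumed: x grows to the right, y grows downward.
    """
    left_edge = tl[0] == bl[0]      # left corners share the same x
    right_edge = tr[0] == br[0]     # right corners share the same x
    top_edge = tl[1] == tr[1]       # top corners share the same y
    bottom_edge = bl[1] == br[1]    # bottom corners share the same y
    ordered = tl[0] < tr[0] and tl[1] < bl[1]
    return left_edge and right_edge and top_edge and bottom_edge and ordered


print(is_upright_rectangle(tl, bl, tr, br))  # True
```

A check like this is optional; it only guards against mixing up the corner labels before the points are drawn.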
Now that we have these four coordinates, by themselves they are nothing; they are just values in an array assigned to a variable. How do we turn them into points on the image? I'll use the cv2.circle command to draw them on the image frame. We give these variables as input to the cv2.circle function, which is a built-in function of the OpenCV library, and it will draw circles with those coordinates as centers. For the radius I'll assign 5, which is pretty small; it will basically be a point-sized circle. I want the circle to be red; the color channel order is BGR, so I am giving R as 255. And I am giving the thickness as -1, so it will be a filled circle, not a hollow one; that is what thickness -1 denotes. Now I'll copy and paste it four times because I want four circles, each time with a different center: top left, bottom left, this time top right as my center, this time bottom right as my center.

Now let's see what effect this has on our video file. Great, we have the four coordinates in a rectangle pointing almost towards the sky in the scenery. We don't want that; we want to put them on the road and form a trapezoid, so next up we will do that. But the coordinates are working fine and we are able to draw them on the video.

So I'm wondering, what if I just apply the transformation to these four coordinates, which are pointing at nothing? The transformation might be a messy one, but it will be a transformation anyway. So let's do that. First of all I want to store all four corner points in a single array. I'll name the giant array points1, pts1, and it will store all four coordinates of the first selection, that is, the selection of coordinates on the input image. But then I think it's better to stick with the trapezoid idea. I'll comment this out; we'll come back to it later. I'll pause the video quickly and edit these coordinates so that they form a trapezoid on the road.

So welcome back. I have changed the coordinates, and I'll show you on the video that these coordinates form a trapezoid on the road, and we want to transform it into a rectangle that fits the entire screen. Now that we have the four coordinates, let's proceed with the transformation part, applying the geometric transformation. I'll uncomment the points1 variable that contains the four coordinates of the trapezoid. points2 will be a variable that contains the four coordinates of the transformed image. We want the transformed image to have top left at 0, 0, that is the absolute top left of the screen. We want the bottom left to be 0, 480, x = 0, y = 480, which is the absolute bottom left of the screen. Then we want top right at the absolute top right, that is 640, 0, and bottom right at the absolute bottom right, that is 640, 480.

Now we want to geometrically transform the coordinates in points1 to the corresponding coordinates in points2. How do we do that? I'll create a variable, matrix, which will be equal to cv2.getPerspectiveTransform. getPerspectiveTransform is a built-in function of the cv2 library; it takes points1 and points2 as arguments. Even from the coding aspect it is absolutely crucial that we understand what this line is doing, what this matrix is storing, and what cv2.getPerspectiveTransform is returning. It is returning us a matrix. What is a matrix? It is a 2D array. How it develops and returns that matrix is the digital image processing aspect; ignore it for now. What is the significance of that matrix? It is a matrix which, when you multiply it with the matrix that contains the input coordinates, gives the corresponding transformed coordinates. But if we do the simple matrix multiplication, we will again just obtain a matrix that stores the corresponding transformed coordinates for the set of input coordinates. We won't do that, because it's too much matrix work and we won't get any image output. We want to use this matrix and get the transformed image.

So we will again make a call to a cv2 function that will do the multiplication, convert the result into an image, and return us the transformed image. I'll call that the transformed frame, and the function is cv2.warpPerspective. The first input will be the frame, the second will be the matrix, and the third input to this function will be the size of the transformed frame, that is 640, 480. Now I'll quickly do the imshow for the transformed frame. This sort of transformation is also called the bird's eye view, because the output is a top view, as if a bird is flying and looking down on the road. It's just a standard term I came across somewhere on the internet, so I'll add it to the window name. We'll just make sure that both transformed frame variables match, that is, there is no spelling mistake.

So I run the program now, and it's an error, probably because we actually need to use NumPy to make the sets of points in points1 and points2 go well with the cv2 arguments. So I'll import the numpy library, pause the video, make the corresponding changes, and we'll be right back.

So I added numpy.float32 to points1 and points2. It will basically be more precise in type, and that will help us avoid the error in OpenCV. Now running the video, and it's not as expected. There must be some error or typing mistake in the code that is causing the problem. So the mistake is, okay, got it: bottom right. It should have been bottom left here; I have written bottom right twice. I'll make the change and run the program, and here we have our transformation, the bird's eye view. And that is how you do perspective transformation with OpenCV in Python.

Thank you for watching this video. Give it a thumbs up if you liked it, share it with your friends who are also working on image processing, and have a great
Info
Channel: AdiTOSH
Views: 18,129
Keywords: image processing, video processing, computer vision, opencv, python, image, transformation, image transformation, bird's eye view, bird eye view, top view, geometric transformation, lane detection, transform, wrap, perspective, programming, coding, project
Id: drp_mr2x6A8
Length: 13min 59sec (839 seconds)
Published: Mon Mar 14 2022