Text detection with Python and OpenCV | OCR using EasyOCR | Computer vision tutorial

Captions
Hey, my name is Philippe and welcome to my channel. In this video we are going to work with optical character recognition using Python, OpenCV and EasyOCR. This is exactly what we will be able to do with today's tutorial: we are going to take an image, any type of image containing text, detect every piece of text in it, and also make a very pretty drawing like the one in this picture. This is an ideal project for beginners in computer vision, and best of all, it is going to be a very quick tutorial: we are going to complete this project in only a few minutes. So let's get started.

This is the pipeline we are going to follow today. You can see that it is a very quick four-step process, so we are going to have this optical character recognition software up and running in only a few minutes by following these four steps.

Before starting, remember to install the project's requirements. These are the libraries we are going to use today: EasyOCR, Matplotlib, and the headless version of OpenCV. Let me also show you the data we are going to use to test this algorithm. These are three images I have prepared, and each one contains text that we will try to read with today's algorithm.

So let's start working on this project. I'm going to import cv2, I'm also going to import easyocr, and let's also import matplotlib.pyplot as plt. These are the three libraries we are going to use today. The first step is to read an image, and I'm going to
make this project work with one of these three images, and then I'm going to show you how it works with the remaining two at the end of this tutorial. So I'm going to select this one. I'm going to define the image path like this, then I'm going to read the image using cv2, and I'm going to name this object img. OK, we have completed the first step in our process, so we are one step closer to completing this optical character recognition algorithm.

Let's move to the next step. Now we have to create an instance of the text detector we are going to use to read the text from all the images. We are going to name it reader, and reader will be an easyocr.Reader. We need to input the language for which we want to create this reader, and then I am going to set another flag, gpu, to False, because I don't have a GPU and I'm running this algorithm on my local computer. If I'm not mistaken, False is the default value, so even if I don't do anything this is not absolutely needed, but I'm going to do it in order to show you. [Editor's note: EasyOCR's documented default is gpu=True; without CUDA it falls back to the CPU.]

Now we have to detect the text on the image we have read. We are going to use reader.readtext, input the image from which we want to read the text, and name the result text. Now let's see the structure of text, so we are going to print text and see what happens. I'm going to execute this code. You can see I'm getting this message because I don't have a GPU in my computer, and it's telling me: hey, if you run this code on a GPU it's going to be much, much
faster, but yeah, what can I do. You can see that this is the output we are getting. To make it a little cleaner, I'm going to iterate and say something like for t in text, print t, so I'm going to print exactly the same values but everything is going to look much prettier. You can see that we are getting three lines, and if I show you the image again, you can see that we are reading exactly the text in this image: road closed to all pedestrian and bike use. So everything seems to be going OK.

If you look at the structure of each one of these lines, you can see that we are getting something that looks like a bounding box, then the text, and then a float value. This is how we are going to unwrap each one of these lines: I'm going to say bbox, text and score will be equal to t. I am going to call the second one text_ (text underscore) so I don't have a problem with two objects having the same name. So we are unwrapping all the information, and all we have to do now is to draw the bounding box and the text on top of the image. We are almost there.

I'm going to put this for loop here, and I'm going to keep printing t, because it's going to be better if we can see t at every execution. Then I am going to call cv2.rectangle. I'm going to input my image, and then, remember, this is a bounding box. I'm going to execute it again so we can remember what it looks like. The bounding box is made of four items, and each one of these items is the x
and the y coordinate: this is an x coordinate, this is a y coordinate, and so on, x and y, x and y, x and y. These four items are the four corners of the rectangle that contains the text. To draw this rectangle we need the upper left corner and the bottom right corner. The first element is the upper left corner and the third element is the bottom right corner, so this is exactly what we will use: bbox[0], which is the first element, and bbox[2], which is the third element. That's pretty much all for the bounding box location. Now we have to choose a color, which I'm going to set to green, and a thickness value, which I'm going to set to 5. For now that's going to be all; let's draw the text in the next iteration, in a few minutes.

For now let's see how this looks. I'm going to call plt.imshow with the image and then plt.show, and let's see what happens. We have to wait a moment. We don't really need to see the print now, we want to see the image. You can see that we are getting something and everything seems to be OK, but this is not in the color space we were expecting. This is because we are using Matplotlib, and remember that when we plot an image using Matplotlib we need to convert it from BGR to RGB, so we need to say something like cv2.cvtColor with cv2.COLOR_BGR2RGB. Now everything will be OK. We have to wait a few seconds, and you can see that now we are getting exactly what we should be getting: the bounding boxes are drawn on top of the image, and this image looks exactly like the one I showed you on my computer. So everything seems to be going super well, and this is pretty much all for the bounding box drawing. Now we also need to draw the text, and this is how we are going to do it: we are
going to call cv2.putText. We need to input the image, the text we want to draw, which is text_, and then the location; we are going to draw this text at the upper left corner, so this is the upper left corner. Then I need to specify a font, so this is the font I want for my text. Then the size, which I'm going to set to 1, and the color, which I'm going to set to blue, so this is 255, 0, 0. Then the thickness value will be 1 too. We will adjust these two values in a few seconds, but for now let's draw the text like this and see what happens. You can see that everything seems to be going well, because we are definitely able to draw the text as we intended.

But let's make it a little nicer. I'm going to set the size to 0.65, because I have been doing some tests already and I saw this is a good value, and we are going to set the thickness to 2. Let's see now. You can see that now we are plotting the text on top of the image and it looks a little nicer. We are detecting exactly the value of this text: road closed to all pedestrian and bike use. So everything seems to be working super fine, and this will be all for this image.

Now let me show you how this pipeline works with the other two images, because we have already completed this pipeline. I'm going to load the second test image now. Remember, this is how it looks: we have this text, and let's see what text we are detecting. OK, you can see that we are definitely detecting all the text. We are getting a five dollar fee per usage for
non-members, sign, Harbor supervisor. So we are definitely detecting all the text, but we are still getting some noise over here: we are reading some signs, some stuff that is in the background. To fix this issue, we are only going to draw the rectangle and put the text on top of the image if the confidence value is greater than a threshold we will define; otherwise we will not do it. To set this threshold, I have been doing some tests already and I noticed that 0.25 is a good value. Ideally we would want a higher threshold, something like 80 percent, or 70 percent at the very least. We should not be setting the threshold at 0.25, because this is a very low value, but it doesn't matter for now; let's do it like this and it will work just fine.

Remember that the idea for today is a very quick, very short tutorial on optical character recognition. I want to give you the foundations, the basics you need to understand this technology. We are not doing a comprehensive tutorial or going deep into building an OCR, because there are many other things we could do to improve this detection and the confidence with which we detect all of these bounding boxes. We are not doing any of that today.

Anyway, we are getting a perfect detection now. We have fixed the noise we were detecting in the background. So let's move to the last image in our example, which is test
three. Let's remember how test three looks. This is how it looks: it's a social distancing sign, something like that, and let's see if we can detect all the text in this image. Let's leave it like this over here, and I'm going to enlarge this image. You can see that we are getting all the text as we are expecting it. We are reading questions, then maintain social distancing, at least six feet, and distance from others. We are reading exactly the value in each one of these bounding boxes, so everything seems to be working fine in this image as well.

This will be pretty much all for this tutorial. If you enjoyed this video, I invite you to click the like button and to tell me what you think about it in the comments below. My name is Philippe, I'm a computer vision engineer, and in this channel I make coding tutorials exactly like this one, and I also share my resources and my experience as a computer vision developer. If these are the type of videos you're into, I invite you to subscribe to my channel. This is going to be all for today, and see you in the next video.
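
The steps described in the captions above can be sketched in Python roughly as follows. This is a minimal sketch, not the exact code from the video: the language list ['en'], the font cv2.FONT_HERSHEY_COMPLEX, the helper names box_corners, keep_confident and annotate, and any image path passed in are assumptions made for illustration. The colors, thickness, font scale and the 0.25 threshold are the values mentioned in the captions.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (four corner points, recognized text, confidence score).
Detection = Tuple[Sequence[Sequence[float]], str, float]
Point = Tuple[int, int]


def box_corners(bbox: Sequence[Sequence[float]]) -> Tuple[Point, Point]:
    """Return integer (upper-left, bottom-right) points of a four-corner box.

    The first corner is the upper left and the third is the bottom right.
    Coordinates are cast to int because OpenCV drawing calls expect integers.
    """
    top_left = (int(bbox[0][0]), int(bbox[0][1]))
    bottom_right = (int(bbox[2][0]), int(bbox[2][1]))
    return top_left, bottom_right


def keep_confident(detections: List[Detection], threshold: float = 0.25) -> List[Detection]:
    """Keep only detections whose score is strictly greater than the threshold."""
    return [d for d in detections if d[2] > threshold]


def annotate(image_path: str, threshold: float = 0.25) -> None:
    """Read an image, detect text, draw boxes and labels, and display the result."""
    # Imported here so the two helpers above can be used without these packages.
    import cv2
    import easyocr
    import matplotlib.pyplot as plt

    # Step 1: read the image (OpenCV loads it in BGR channel order).
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)

    # Step 2: create the reader. 'en' is an assumed language; gpu=False forces CPU.
    reader = easyocr.Reader(['en'], gpu=False)

    # Step 3: detect text; each result is (bounding box, text, score).
    detections = reader.readtext(img)

    # Step 4: draw a green box and blue label for each confident detection.
    for bbox, text_, score in keep_confident(detections, threshold):
        top_left, bottom_right = box_corners(bbox)
        cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 5)
        # The font is an assumption; the captions do not name one.
        cv2.putText(img, text_, top_left, cv2.FONT_HERSHEY_COMPLEX, 0.65, (255, 0, 0), 2)

    # Matplotlib expects RGB, so convert from BGR before showing.
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.show()
```

Calling annotate('test1.png') (a hypothetical file name) would open a Matplotlib window with the boxes and labels drawn. Raising the threshold toward 0.7 or 0.8, as the captions suggest would be ideal, drops more low-confidence detections.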
Info
Channel: Computer vision engineer
Views: 17,104
Id: n-8oCPjpEvM
Length: 15min 39sec (939 seconds)
Published: Fri Dec 30 2022