Optical Character Recognition with EasyOCR and Python | OCR PyTorch

Captions
[Music]

What's happening, guys? My name is Nicholas Renotte, and in this video we're going to be taking a look at OCR. Let's take a deeper look at what we're going to be going through.

In this video we're going to be covering everything you need to get started with optical character recognition, also known as OCR. First up, we're going to start out by setting up EasyOCR, a Python package that makes it super easy to perform OCR. Then we're going to use the EasyOCR package to extract text from a bunch of different images, and we're going to visualize those results using OpenCV, so you'll actually be able to see the extracted text overlaid on top of the image.

Now let's take a look at how this is all going to fit together. In order to perform OCR we're going to be using the EasyOCR library. EasyOCR is powered by PyTorch, which is a deep learning library similar to TensorFlow. We're going to be making our detections inside of a Jupyter notebook, and this is obviously going to be coded using Python. Then, when we come to visualize our results, we're going to be using OpenCV to overlay our text, similar to what you can see on the slide on the screen right now. Ready to do it? Let's get to it.

Alrighty, so in this video we're going to be covering OCR, also known as optical character recognition. In order to do that we're going to be using the Python library called EasyOCR, which is going to make it a whole heap easier to perform optical character recognition on an image or on a document. In order to get through this tutorial there are four key things that we need to do. First up, we need to install and import our dependencies. Second, we need to read in our images or video. Third, we're going to draw our results, and for that we're going to be using OpenCV. And last but not least, we're going to take a look at how we can handle images that have multiple
lines of text, and how we can visualize that as well. Natively, EasyOCR is going to pull out all of the components of text, but when we want to map those onto our image we might need to loop through and print those out.

Alrighty, so let's get started. But first up, let's take a look at what our directory looks like. Within our directory we've got our Jupyter notebook, in this case called EasyOCR tutorial, or OCR tutorial, and we've also got a couple of images. We've got an out of service sign, and the cool thing about this is that it's got some really small text down the bottom, so we'll see if EasyOCR can pick that up. We've also got another image which just says "surf", so we should be able to pick that up using EasyOCR pretty easily.

Alrighty, now the first thing that we need to do is install and import our dependencies. The first dependency that you're going to need is PyTorch, because EasyOCR runs on PyTorch. Depending on what type of operating system you're running and whether or not you're using a GPU, the PyTorch installation is going to be slightly different, but the great thing is that if you go to pytorch.org it will automatically select the most appropriate install method for you. In this case, because I'm operating on my Mac, it's selected stable and Mac. Then there's the type of package installation that we want to use: I can use Conda (Anaconda) or I can use pip. I can choose the language I want to install it for and whether or not I want to install with GPU support. In this case I'm running it on my Mac, we want to use pip, and we're going to install it for Python, so the command that we need to run is just down here: pip install torch torchvision torchaudio. We can copy this and paste it into our notebook, so we're just going to include a pip install command. The second thing that we need to install is EasyOCR itself. Installing EasyOCR is pretty easy; again, it's just another pip
command. This is the EasyOCR documentation on GitHub, and I'll include links to all of these, as well as a link to the completed Jupyter notebook, in the description below, so be sure to check that out if you want a guided walkthrough. So again, to install EasyOCR we just need to use this pip install command: pip install easyocr. We can copy that as well and paste it into our notebook, and then we're just going to run that and install PyTorch and EasyOCR.

Alrighty, so those have finished installing. In this case I already had PyTorch and EasyOCR installed; if you're installing them for the first time it might take a little bit longer. The next thing that we need to do is import our dependencies, so let's go ahead and do that and then we'll take a look at what we've imported.

Okay, so we've imported four things. First up, we've imported easyocr; this is going to be the main library that we're using to perform optical character recognition. We've also imported OpenCV as cv2, which is going to allow us to import our image, show it, and visualize. We've also imported matplotlib, specifically pyplot, so to write that command we've written from matplotlib import pyplot as plt. And then we've also imported NumPy: import numpy as np. Now we can actually go and perform a little bit of optical character recognition. Step one is now done, so we can mark that off.

The next thing that we need to do is read in our images and video. The first image that we're going to work with is this surf image here; you can see it's pretty easy to work with. We're just going to include a variable to hold that image path. Alrighty, so you can see there that we've gone and created a variable called image_path, and that is the path to our image. If you had images sitting inside of different folders, you'd include the entire path to that image. Now the next thing that
we're going to go ahead and do is actually use EasyOCR to perform that optical character recognition, so let's go ahead and do it.

Alrighty, so that's our OCR done. You can see that in these three lines of code we've managed to extract the text that's within our image, and you can see it's done it with about 95.5 percent confidence, and it's accurately extracted "surf". In terms of the code that we wrote, we first defined an EasyOCR reader, and to this we passed the language that we want to use. In this particular case the Mac that I'm using doesn't have a GPU, so we've set gpu equal to False. Then, using the reader that we defined up here, we used the readtext method and passed through our image path, and then we displayed our result.

Our result is going to come back with a few different things. First up, this big array here defines where the text is in our image, and we'll be able to see this better when we visualize it. The second part is the text that's been identified, and the last bit is the confidence.

Now this is great, but it'd be nice to actually visualize this on our image, so let's go ahead and do that. First up, we're going to define a couple of key variables to determine where our different coordinates are. To plot these coordinates using OpenCV we just need the top left corner and the bottom right corner, which are this value here and this value here. So we're going to set up a couple of variables to hold those, as well as our text. Let's do that first and then we can visualize. Also, this component here is now done.

Alrighty, so let's set up our coordinate variables. Okay, so we've now gone and defined our coordinate variables; let's just make this a little bit easier to see. First up, we've defined our top left variable in here. In this case what we've done is traverse our result, so we've grabbed
our first result, and the first component within our first result, which is this whole array here, and then we've grabbed the first set of values, which is the 18 and 18. Then we've converted that to a tuple, because when we pass it to OpenCV it's expecting a tuple as an argument. We've done a similar thing to grab our bottom right variable, which is this value here: again we've grabbed our first detection, then the first value in our first detection, and then the component at index two from that array (so zero, one, and two), which is this value here. Then we've grabbed our text, which is just this "surf" component here. We've also defined the font that we're going to be using; in this case we're using OpenCV's FONT_HERSHEY_SIMPLEX.

The next thing that we need to do is actually go ahead and visualize, so let's go and do this. And there you go: we've now visualized the optical characters that have been detected from our image. Let's quickly take a look at the code, but you can clearly see the results there. It's drawn a box around the text and it's also printed out the text, so you can see it saying "surf".

In order to do this, we've first used OpenCV to read our image, and in this case we've passed our image path to that; that's been stored in a variable called img. Then we've overlaid a number of additional visualizations. We first overlaid our rectangle, which is this component here, and then we've overlaid our text. To draw our rectangle we've again used OpenCV, and for that we've used the rectangle method. We've passed our image, our top left coordinate, our bottom right coordinate, and then the color, which is just that really bright green there, and we've also passed through our line thickness. The next thing that we've overlaid is that text, so you can see "surf" is appearing
above our rectangle, and to that we've passed the image and the text that we want to use, in this case the "surf" detection which is up here. We've then passed through our top left coordinate, which is just where we want to position our text. We've passed through the font that we want to use, how big our font needs to be, and its color, in this case 255, 255, 255, which is white in terms of an RGB color code. We've also passed through our font line width, so you can see it's a little bit bigger than usual there, as well as the line style. Then to visualize it we've used matplotlib, so in this case we've used plt.imshow to show our transformed image. Remember, we've imported it from its raw format here, overlaid a rectangle, overlaid some text, and then we've used plt.show to pretty print it. And there you go: that's optical character recognition in a nutshell.

Now what happens if we had an image that had multiple lines of text, like that sign that we were taking a look at here? If we take a look at our out of service sign, you can see that there's a little bit more text here and we've got multiple lines, plus we've also got this little bit of text down here. In terms of how we handle this, it's pretty much the same. The only thing that really changes is how we visualize, because when you see the results you'll see that we have a number of different lines of results, so we need to loop through to visualize those.

So let's go ahead and bring this image in. We're just going to grab the name of the image, in this case out of service.jpg, and then we can replace our image path up here; we're just replacing our surf image with our out of service image. Then we can use the exact same reader code to process that image. In this case it's going to take a little bit longer to process, because there's a lot more text that it needs to pass through, as
well as all this tiny little text down here, but we'll see if it actually detects that. Let's go ahead and run that piece of code. Again, it's all the same; all we've had to do is change our image path.

Okay, so that's our detection done. As I was saying, you're going to get multiple results here. You can see that we've got a big array, and then we've got all of the different components in that, but it's still grabbed all of our text. It's got "out of service" and then "Australian Safety Signs", which matches if we go and take a look at the sign. And if we take a look at the rest, it's got "sticky stuff" as well as the phone number and the website, so it's actually pulled out all of that text.

If we wanted to take a look at a single result we can type result[0], and you can see the result there; in this case we can see "out of". If we take a look at result[1] you can see "service", and at result[2] you can see "Australian Safety Signs". If we keep going along we're going to get all of those different components, so we can see "by" there and then "sticky stuff" there.

So you've got all of those different components, but you can see now that when it comes to visualizing this there are multiple lines that we need to visualize. When we visualized up here we were just visualizing one detection; now what we need to do is loop through and plot each one of those detections individually. Let's go ahead and start doing that. In this case we can mark draw results as done, so we've completed that. Now all we need to do is visualize these multiple lines, so let's go ahead and write the code and then we'll take a look at what it's doing.

And there you go: we've now visualized that image. We've written a fair bit of code there, but you can see that we've printed everything out. You can tweak the look of this, so if
we wanted to make the font a little bit bigger we can adjust that, make it four, and you can see it's going to be a little bit bigger now. Because we've got a lot of text in this bottom right hand corner it's sort of overlapping, but you can play around with the formatting.

Basically what we've done is again just read in our image, similar to what we did up here, with cv2.imread. Then we're looping through each one of the detections that we had over here and establishing a tuple for the top left and the bottom right. Because some of the coordinates came in as floats rather than integers, we've done a little bit of pre-processing on those there, but again we're still using a tuple. Then we're extracting our text, setting up our font similar to what we did above, and using the exact same rectangle and putText methods from cv2 to plot these on our image. Last but not least, we've made our image a little bit bigger so we can see it, using plt.figure with figsize set. Then we're using plt.imshow to show that image and we're pretty printing it out. So you can see that we've now extracted all of that text out of our image and we can begin working with it.

We've done quite a fair bit in this tutorial. We started out by installing and importing our dependencies, PyTorch and EasyOCR. We then read in our images using the EasyOCR reader, we drew our results using cv2, or OpenCV, and then we took a look at how we can handle multiple detections as well; in this case we took a larger image and were able to detect all of the text within that image.

And that about wraps up this video. Thanks so much for tuning in, guys. Hopefully you found this video useful. If you did, be sure to give it a thumbs up, hit subscribe, and tick that bell so you get notified when I release future videos, and let me know in
the comments below what you're going to be using OCR for. Thanks again for tuning in. Peace.
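
The captions describe the reading step without showing the notebook code, so here is a minimal sketch of it, not the notebook itself. It assumes the English model and no GPU, as in the video. The helper names (read_text, unpack_detection) and the file name surf.jpg are made up for illustration. EasyOCR is imported inside the function so the coordinate helper can be used on its own.

```python
def read_text(image_path, languages=("en",), use_gpu=False):
    """Run EasyOCR on one image and return its list of detections."""
    import easyocr  # lazy import: only needed when OCR is actually run

    reader = easyocr.Reader(list(languages), gpu=use_gpu)
    # Each detection is (box, text, confidence); box holds four corner points.
    return reader.readtext(image_path)


def unpack_detection(detection):
    """Split one detection into the pieces needed for drawing."""
    box, text, confidence = detection
    # Corner 0 is the top left and corner 2 is the bottom right.
    # int() also covers the case where coordinates come back as floats.
    top_left = tuple(int(v) for v in box[0])
    bottom_right = tuple(int(v) for v in box[2])
    return top_left, bottom_right, text, confidence


def demo():
    """Hypothetical usage; 'surf.jpg' is an assumed file name."""
    result = read_text("surf.jpg")
    print(unpack_detection(result[0]))
```

Calling demo() in a folder that contains the image prints the two corner tuples, the text, and the confidence for the first detection.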
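
The loop over multiple detections can be sketched in the same spirit. This is an assumption-laden outline rather than the code from the video: the function names, the file name out_of_service.jpg, the line thicknesses, and the figure size are all chosen for illustration. The BGR to RGB conversion before plotting is an addition of this sketch (OpenCV loads images as BGR, matplotlib expects RGB); the captions only mention plt.imshow.

```python
def to_int_point(point):
    """Convert an (x, y) pair, possibly floats, to a tuple of ints for OpenCV."""
    return tuple(int(v) for v in point)


def draw_detections(img, detections, font_scale=0.5):
    """Draw a green box and white label for every (box, text, confidence)."""
    import cv2  # lazy import so to_int_point works without OpenCV installed

    font = cv2.FONT_HERSHEY_SIMPLEX
    for box, text, _confidence in detections:
        top_left = to_int_point(box[0])
        bottom_right = to_int_point(box[2])
        # Arguments: image, corner 1, corner 2, color, line thickness.
        img = cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 3)
        # Arguments: image, text, origin, font, scale, color, thickness, line type.
        img = cv2.putText(img, text, top_left, font, font_scale,
                          (255, 255, 255), 2, cv2.LINE_AA)
    return img


def demo():
    """Hypothetical end-to-end run; the image file name is an assumption."""
    import cv2
    import easyocr
    from matplotlib import pyplot as plt

    image_path = "out_of_service.jpg"
    reader = easyocr.Reader(["en"], gpu=False)
    result = reader.readtext(image_path)

    img = draw_detections(cv2.imread(image_path), result)
    plt.figure(figsize=(10, 10))
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.show()
```

Raising font_scale makes the labels larger, at the cost of more overlap where detections sit close together, as happens in the bottom right corner of the sign in the video.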
Info
Channel: Nicholas Renotte
Views: 119,154
Keywords: OCR, optical character recognition, optical character recognition using python
Id: ZVKaWPW9oQY
Length: 16min 0sec (960 seconds)
Published: Sat Nov 07 2020