OCR Model Comparison | Tesseract OCR, EasyOCR, Keras-OCR, Paddle OCR, MMOCR, OCR-SAM

Captions
Hello everyone. Today I'll show you which OCR model is best. Sometimes you get confused about choosing an OCR model to recognize text from an image. First I'll show you some basic ideas about OCR models, and finally I will show you, from a Google Colab notebook with source code, which OCR model performs best for which type of image. So let's get started.

The first question is: what is OCR? OCR is optical character recognition, and this kind of model is trained to recognize and extract information from images, scanned documents or other visual sources. An OCR model uses pattern recognition algorithms to interpret the visual data and convert it into machine-readable text. If I show you some example images, it will be clearer. Basically, an OCR model first detects the text, which means finding where the text is inside the image: it detects the bounding box area. Here you can see some objects inside the image, and there is also some text. So the OCR model first detects the bounding boxes where the text is, and then it recognizes the text, that is, what exactly is written there. So an OCR model uses two types of algorithm: one for the detection of the text and another for the recognition of the text.

If I show you some other images, it will be clearer. Here you can see handwritten text. Here you can see an example of document text; this type of complex-layout text is very hard to recognize exactly with every OCR model, and I will show you step by step. Here you can see simple document text. Here you can see handwritten text and symbol text. Here you can also see scene text over an image. So not every algorithm is suitable for every type of text recognition and detection. Here you can see a copy of document text, and here you can see a number plate; OCR models are also used for car number plate recognition. Here you can see an ID card and a passport. So not every OCR model is suitable for detecting this type of text. I'll
show you. Now I would like to share some purposes of OCR models. Here you can see: text recognition, document digitization, data entry automation, searchability, accessibility, translation services, text analysis and form processing. I'll also show you some practical use cases of OCR models: car number plate recognition, receipt and invoice processing, document scanning and archiving, passport and ID card scanning, handwritten text recognition, medical records digitization, boarding pass and ticket scanning, text-based image search, and text translation apps. There may be lots of other use cases as well.

Today I'll show you Tesseract OCR, EasyOCR, Keras-OCR, Paddle OCR, MMOCR and OCR-SAM. For two of these OCR models, MMOCR and OCR-SAM, I have already uploaded detailed videos, which you can watch on my channel.

Now I will show you some advantages and disadvantages of each OCR model. Here you can see Tesseract OCR. For structured text documents, printed text recognition and simple text extraction tasks, Tesseract OCR is best. It also has some disadvantages: for complex layouts, handwritten text recognition, and noisy or distorted images, this OCR model's performance is not so good.

Here you can see the EasyOCR model's advantages: document digitization, quick prototyping and multilingual support. It also has some disadvantages: complex layouts, handwritten text recognition and specialized fonts. For these cases this OCR model does not work well.

Here you can see Keras-OCR. The Keras-OCR model performs very well for complex-layout text recognition and detection. Here you can see its advantages: complex OCR tasks, deep learning customization and transfer learning capabilities. It also has some disadvantages, like complexity for beginners, customization complexity and limited language support.

Now I'll show you Paddle OCR. The Paddle OCR model has the capability to detect a table inside an image; it can detect the table area and the text. Its advantages are ease of use and simplicity, and high accuracy. Paddle OCR also has some
disadvantages: limited third-party integration and limited language support.

Finally, MMOCR. Here you can see some advantages. MMOCR has a modular architecture (I have a detailed video on MMOCR on my channel, if you are interested), a variety of pretrained models, and high accuracy for detection. The disadvantages are that it only supports English and Chinese, and its dependency on the OpenMMLab ecosystem. If you wish to learn about OCR-SAM and MMOCR, you can visit my channel and watch the detailed videos on these two cutting-edge OCR models.

Now I will show you, from the Google Colab notebook, the source code for each OCR model and how it performs on different types of images. First I'll show you the Tesseract model. Here you can see four images: image 1, 2, 3 and 4. Let's check them. This is image 1; here you can see some text. This is image 2; this is simple text, and there is also some handwritten text and some very small text. This is image 3; this is complex-layout text. And this is image 4.

First I will try the Tesseract OCR model on all four images. First you need to install it; it says the requirement is already satisfied because I have already installed it. In this way you can use different languages with OCR here: this is for the Bengali language, and this is for the English language. Tesseract OCR can work with different languages. Here I will install supervision to display the image. This is for getting the current working directory, and here you can see the image path. First I'll show image 1, so I just need to change it to image 1 and run this. I'm keeping a copy of this image so that I can use the same image for the EasyOCR model and the Keras-OCR model. So here you can see the image; this is the first image. If I run this, image_to_string on this image with the language set to English, here is the output. Tesseract OCR fails to recognize the exact text
from this image. Here you can see that nothing is detected by the Tesseract OCR model.

Now I will try image 1 with EasyOCR. We need to install EasyOCR first. Then you need to import it and assign the reader; here the language is English and GPU is true. Then here detail is 1 and paragraph is false. This is the bounding box information, and here is the EasyOCR result. If you plot this, you can see that it detected "guide dogs" and "for". So the EasyOCR model is working nicely here; you can also see the detected text.

Now I will try the Keras-OCR model. First we need to install it, and it shows the requirement is already satisfied. Now I need to import the Keras-OCR model and assign the pipeline. Here is the image; it says image number 2, so we need to change it to image number 1. Here is the Keras-OCR result; it's working. Here is the result with the bounding box information, and then we need to plot using these coordinates. I use supervision to plot all of the images. So this is the result: "guide dogs" and "for". The Keras-OCR model is working very nicely. Here you can see this is also recognized, and this text is also recognized, although it does not work perfectly in all cases.

Now I will show all of this for image 2, so we just need to change it here to image 2 and run it for Tesseract OCR. This is the image. I think the Tesseract OCR model will perform very well on this simple text. Yes, these three lines of text are recognized perfectly. But the Tesseract OCR model cannot recognize this handwritten text, and you can also see some text here that it cannot recognize; it only recognizes this part.

Now I will try the EasyOCR model on image 2. We just need to run this part and plot it. Here you can see the EasyOCR model working very nicely: this handwritten text is also recognized, and this small text is recognized perfectly as well. And here it detects another bounding box, "V"; maybe something looks like
this here, and that's why it is detected.

Now we just need to change this to image 2, and I will see the Keras-OCR model's performance on image 2. This OCR model also recognizes all of the text perfectly and detects separate bounding boxes for each piece of text, so this is very nice.

Now I will show you image 3, this type of complex-layout text, to see which OCR model works well for this complex type of layout. First, this is Tesseract OCR. Nothing is detected; here you can see nothing is detected by Tesseract OCR. Now I will check EasyOCR. It may have detected something, but the output is not so good; it cannot detect the text perfectly. Now I will see Keras-OCR; I just need to change it here to 3. It's working, and now I need to plot it. The Keras-OCR model recognizes all of the text perfectly: here you can see "crossing", "crossing", here is "road", and here you can see "rail". So for this type of complex layout, the Keras-OCR model's performance is fantastic, I think.

Finally I will check image 4 to see which OCR model performs well. First I need to check Tesseract OCR, so we just need to change it here. Here is the image. Now we'll check the output of Tesseract OCR; here the language is English. Nothing is detected by Tesseract OCR. Tesseract OCR is very good for document text, which is why this type of complex text or complex-layout text cannot be recognized by the Tesseract OCR model.

Now I will check the EasyOCR model. After changing the bounding box color, it shows up. Here you can see the bounding boxes: "sale", "sale", "sale", "sale". But "50%" is not perfectly detected by EasyOCR; here you can see it does not recognize the "50%" perfectly.

Now I will try Keras-OCR, so we just need to change it here to 4. "Sale" is perfectly detected, and here "50" is also detected. The percent sign is not detected, and this one is not perfectly recognized by Keras-OCR.

That is all about the performance of Tesseract OCR, EasyOCR and Keras-OCR. Now I will show
you Paddle OCR. First we need to install the Paddle OCR model; here you can see the requirement is already satisfied because I have already installed Paddle. Now we need to clone Paddle OCR from Git, then run these two lines, then install supervision. Then we need to import Paddle OCR and assign the language, English. Then we need to import cv2 and read the image, image number 1. Here you can see the result. Now we need to extract the coordinates. Here you can see the Paddle OCR result on the output image: "guide dogs", and "for" as well. So here you can see the detection; it detected this text perfectly.

Now we need to check another image, image 2. Run it and plot the output. Here you can see it detected all of the bounding boxes perfectly and recognized the text.

Now you can check image 3, so we just need to change it here. Here you can see Paddle OCR also detected this perfectly: this is "rail", here you can see "crossing", and this is "road". So Paddle OCR also performs very well.

Now we'll check the last image, image 4. This is also detected perfectly: here you can see "sale", "50%", "sale", "sale". But these two pieces of text, the "50%", are not recognized or detected.

For MMOCR and OCR-SAM you can watch the two videos on my channel. I will also upload separate videos for each of the OCR models: Tesseract OCR, EasyOCR, Paddle OCR and Keras-OCR. So please subscribe to my channel. Thank you, thank you all.
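The four engines run in the captions above can be called roughly as in the sketch below. This is not the notebook's source code, which the captions do not reproduce. The function names, the `quad_to_xyxy` helper and the `use_gpu` default are assumptions made for illustration, and the Paddle OCR call assumes the 2.x Python API that was current when the video was published. Each engine is imported inside its own function so that only the ones you have installed are needed.

```python
"""Sketch of the OCR calls described in the captions (not the video's notebook)."""


def quad_to_xyxy(points):
    """Turn a 4-point text polygon into an axis-aligned (x_min, y_min, x_max, y_max) box."""
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def run_tesseract(image_path, lang="eng"):
    # Needs the Tesseract binary plus the pytesseract and Pillow packages.
    import pytesseract
    from PIL import Image
    # Returns the recognized text as one string (no bounding boxes).
    return pytesseract.image_to_string(Image.open(image_path), lang=lang)


def run_easyocr(image_path, use_gpu=False):
    import easyocr
    reader = easyocr.Reader(["en"], gpu=use_gpu)
    # detail=1 yields (polygon, text, confidence); paragraph=False keeps lines separate.
    results = reader.readtext(image_path, detail=1, paragraph=False)
    return [(quad_to_xyxy(poly), text, float(conf)) for poly, text, conf in results]


def run_keras_ocr(image_path):
    import keras_ocr
    pipeline = keras_ocr.pipeline.Pipeline()  # downloads pretrained weights on first use
    image = keras_ocr.tools.read(image_path)
    predictions = pipeline.recognize([image])[0]  # list of (word, 4x2 box) pairs
    return [(quad_to_xyxy(box), word) for word, box in predictions]


def run_paddleocr(image_path):
    # Assumption: PaddleOCR 2.x API; later releases changed the interface.
    from paddleocr import PaddleOCR
    ocr = PaddleOCR(lang="en")
    result = ocr.ocr(image_path)
    lines = result[0] or []  # some 2.x releases return None when nothing is detected
    return [(quad_to_xyxy(poly), text, float(conf)) for poly, (text, conf) in lines]
```

With the relevant package installed, a call such as `run_easyocr("image1.jpg")` (a hypothetical path) would return one box, text and confidence per detected line. The axis-aligned boxes are the form a plotting library such as supervision typically expects.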
Info
Channel: SILICON VISION
Views: 1,718
Keywords: Tesseract OCR, EasyOCR, Keras-OCR, Paddle OCR, MMOCR, OCR-SAM, OCR, optical character recognition, what is ocr, ocr model comparison, best ocr model, car number plate recognition, which ocr model should i use
Id: svSwmklFb6Q
Length: 17min 57sec (1077 seconds)
Published: Sun Dec 10 2023