Text Detection using Neural Networks | OPENCV Python

Video Statistics and Information

Reddit Comments
  • Original Title: Digits Classification/Recognition Using Convolution Neural Networks CNN | OPENCV Python
  • Author: Murtaza's Workshop - Robotics and AI
  • Description: In this video we are going to learn how to create a Convolution Neural Network to Classify digits from 0 to 9. We will write the Training code from scratch and go ...
  • Youtube URL: https://www.youtube.com/watch?v=y1ZrOs9s2QA
Posted by u/aivideos, Mar 08 2020
Captions
Hey everyone, welcome to my channel. Today we are going to learn how to create a convolutional neural network to classify digits from 0 to 9. We will write the training code from scratch and go step by step to train on about 10,000 images of 10 different classes. Later we will create a testing script to use it along with the webcam. I will be sharing all the details, including the versions of the packages and the parameters for the training. I upload videos on a weekly basis, so don't forget to like, comment and subscribe to keep getting the latest updates. So let's get started. [Music]

We are going to start with importing the data, which is our images of the digits from 0 to 9. We will have a folder for each class, each containing about a thousand images. We will put all these images in a single list and create a corresponding list which will contain all the labels of the images. Then we will split the data into training, testing and validation. Later we will plot the distribution of the training images so that we can see if the data of each class is distributed evenly. The images will be processed next to optimize them for the training process. Then we will augment the images to make the data more generic; we will use functions such as rotation, translation, zoom and more. Next we will one-hot encode the matrix. Later we will create the model and start the training process. Once the process is done we will plot the results. Finally we will save the trained model and import it in a test script, so that we can test it with a webcam.

Before we begin we are going to install all our packages. One of the questions I have been asked quite a bit is which package versions I am using, so I will be sharing the details of those as well. Let's do it together. We go to settings, then to our project, and we start adding the relevant libraries. The first one is OpenCV, so let's type opencv and click on opencv-python. The version I am using is 4.2.0.32, which is the latest one, so we'll install that. Next we are going to install numpy; the numpy I am using is the latest one, 1.18.1. Then we will install sklearn. For this one we actually don't have a version listed, so it's just installed. Then we have Keras, and for Keras I am using 2.3.1. Then we have matplotlib, and for that we have version 3.1.3. And then we have what is probably the most controversial library, TensorFlow. The latest version here is 2.1.1, but we are going to install 2.0.0. Sometimes the install fails like this. You don't have to worry: just close it, open it again and go back to TensorFlow. As I mentioned before, we have to use 2.0.0, so we install the package. There we go, the package is now installed. If we go back to our settings we can see that TensorFlow is now installed. Sometimes it doesn't work the first time, so you can try it again.

We are going to start with the training code. The first thing we need is the training set. What we are looking at is the myData folder, and we have nine folders, actually ten folders. We need all the images in the form of a list, and later on we can use this list to send the data for training. So let's import the libraries required: we need numpy as np, we need cv2, and we also need os.

The first thing we will do is get the list of directories in our path. What that means is that we need the names of each one of these folders: we need to know how many folders there are and what they are called. To do that we create myList equals os.listdir, and we give it a path. We don't need to define the path right here; we can define it at the top, in a section for all the settings and initializations. So up there we can say path equals myData. Okay, so this will create a list of all the folder names. We can print it out and see what happens. Let's run that, and there you go: now we have the list of the names, 0, 1, 2, 3, up to 9. Next we need to know how many there are, so we print the length of myList and run it again, and it is 10. We have 10 folders, which means we have 10 different classes. So we can write number of classes equals len of myList, and we will use this later on. The spelling of classes was wrong; okay.

Next we are going to import all these images and put them in a list. What do we want to call this list? Let's call it images. So we will put all the images in one list, and for that we need to iterate. We can say for x in range from 0 to our length, which is the number of classes. Now we need to get the names of each of the images inside our folders. We have the list of the folder names, but inside these folders we have the names of the images. If we have the names of the images, we can use the imread function to read the images one by one and then store them in a list. To do that we use the same function as before, but this time we say myPicList equals os.listdir, and the directory will not be the main myData path. We are saying: in myData, go into the 0 folder, and whatever names you find inside, put them in a list. So we need to add a slash plus x, which we convert into a string. It will first go to 0, then to 1, 2, 3, 4, and it will keep iterating. Now, every time it iterates, we need to read our
image and store it in a list. To do that we create another loop: for y in myPicList. So for however many images we have, we iterate through them and store them in a variable. As we discussed, we can call that images, so let's write images here and create an empty list. Then we can say that our current image equals cv2.imread, and we define the file name. Our file name is our path, plus the string of x, plus the actual name of the image, which is y; so we add another slash and then y. Okay, this should work.

Then we are going to resize our image: the current image equals cv2.resize. We are resizing because right now our images are 180 by 180, and that is too large for the network. We could still run it, but it would be very computationally expensive, so we will use 32 by 32 instead. We have to define which image we want to resize and the size, so we resize the current image, and once we have done that we store it in images. Because it's a list we use append, so images.append with the current image. That is about it.

Next we have to save the corresponding ID, which is the class ID of each of these images. So we create another list with the name class number. For example, when x is 0 the class number will be 0, and when x is 1 the class number will be 1, so we store the value of x: class number dot append, and then the value of x. Now we will have all the images stored in a list, and a corresponding class list in which all the IDs are stored. We can also print out the x value so that we know which of the
class folders we are in and how far the import has got: 0, 1, 2, 3. Every time it loops it will give us that value. Let's run that and see what happens. So it goes through 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9. Once it is done we can check how many images we actually imported. Because it is a list we can check the length, so we print the length of images. If we do that, it imports all the images and then tells us we have imported 10,160 images. If we do the same thing with our class number we should get the same result, and there you go, we are getting the same result: 10,160.

Now, one thing that I don't like is that the output is scrolling down, so it is a bit harder to see. What we can do is print this with the end argument set, which should print it in the horizontal direction instead of vertical. After the loop we add an empty print, so that whatever we print afterwards does not end up on that same line. To make it more intuitive, we can first print the total number of classes detected, which gives us good feedback, and before it starts importing the images we can print "importing classes". I think that should look nice. Let's clear that out and run it again. There you go: total number of classes detected, 10; importing classes; and it imports them one by one. Let's add some dots here. Okay, now it's importing the classes: 1, 2, 3, 4, 5, 6, 7, 8, 9.

The next thing we have to do is convert the lists into numpy arrays. We say images equals np.array of images, and the same for class number: class number equals np.array of class number, as
simple as that. Now it is very easy to check the shape: we print images.shape, and then we print class number dot shape. Let's run that and see. There you go, now we can see what's actually going on behind the scenes. We have 32 by 32 images with three channels, which means they are colored, and we have 10,160 images of this size, all in one matrix. We have another array that holds only the class numbers, and we can see that it matches our images. The class number size and the images size should match.

To split the data we are going to create a set for training, a set for testing and a set for validation. To do that we need to import a different package by the name sklearn, and a very specific function from it: from sklearn.model_selection we import train_test_split. We are going to use it to split our data. Let's name the outputs first; the spelling's all right, I think. We write X_train, then X_test, then y_train, and then y_test. This is how you split using the train_test_split function. It asks you for the arrays that you want to split, and you give it a splitting ratio. So we pass images and it will split the images, then we pass class number and it will split that too, and you could keep adding arrays here. What we need to do now is write down the test ratio, so we write test_size equals, let's say, 0.2. What does 0.2 mean? It means that your training will be 0.8 and your test will be 0.2, so 20 percent testing and 80 percent training. Instead of writing 0.2 directly we can write test ratio, and define the test ratio in the settings so we can use it later on: test ratio equals 0.2.

Now it should split this up. To see whether it has split or not, we can print X_train.shape. So before the split this is your shape, and after the split this is your shape for the train set. We can also do it for the test set, so we copy this and change it to test. Let's run that. There you go: this is your main shape, the number of images we already had before, then this is our training set and this is our testing set.

You might ask why we need all of this. We need it because we need to shuffle all the classes. If we just took the first 80% of the images, that would reach up to about class 7. We would get all of classes 0 to 7 in the training set and nothing for 8 and 9, and the testing set would contain only 8 and 9, because that is where the split would fall. This is why we need an external package that shuffles all the data and splits it evenly for us, so that we can train and test accordingly.

Next we are going to split again, because we want to do validation as well. From X_train we will take some more data and split it off. So we need X_train, then X_validation, then y_train and then y_validation. We are going to use the same function, and this time we
are going to split X_train and y_train. For the ratio, instead of the test ratio we write validation ratio. We copy the setting, add another variable in the settings section and set it to 0.2. This will now split the data further. We can check X_validation.shape; let's run that, and there you go. This is your original set of data, and then it was split for training, then testing, and then validation. What we are doing is taking the original data and dividing it into 80 percent and 20 percent, and then taking that 80 percent and dividing it again into 80 percent and 20 percent. So the largest one is our training set, then we have our testing set, and then our validation set.

Next we are going to look at how many images we have for each class. This is useful because we don't want to favor any of the classes by having more data from that class. We can get that information from y_train, since it contains the IDs of all the images: X_train contains the actual images and y_train contains the ID of each image. For example, we have lots of zeros, ones and twos, so we can find out the number of zeros we have, and then we can say that for ID 0 this is how many images we have. To do that we use the numpy function np.where, and we look in y_train for, for example, the number 0. This gives us the indexes where 0 is present. What do I mean by that? Let's find out; we can just print it. There you go: these are all the index values at which 0 is present. For example, at index 7 the class is 0, at index 11 the class is 0, and so on. All we need to do is find how many of these we have in total, so let's find the total length of our IDs for 0. We take the first element of what np.where returns and wrap it in len, so that will give us, I think
I'm missing a bracket, and yes, okay. So it gives us 638 images, and that is for the ID 0. Again, what we can do is put this in a loop and print it out for each class. So we say for x in range from 0 to the number of classes, we print this out, and we change the 0 to x. There you go: for each class it is now giving us how many images we have. Next, instead of printing it out, we need to save it in a variable. We will call it number of samples and create it as an empty list, and then every time we get a count we append it: number of samples dot append, and what we append is this length. Once we are appending we do not need to print inside the loop. Just to confirm, once it's done we can print out the number of samples. Let's see how that works out; it should give us a list of values. There you go, here we have how many images we have for each class, so this is 0, 1, 2, 3 and so on. We have a total of 10 classes.

What we can do now is create a bar chart, which should give us good information about how the images are distributed. We can use matplotlib, so let's import that: import matplotlib.pyplot as plt. Matplotlib basically allows us to plot, as you can tell by its name. We will use plt.figure, and in the figure we define the figure size: figsize equals, let's say, 10 by 5. Then we call plt.bar and give it a range from 0 to the number of classes, and we plot the number of samples for each one of them. Now we can add the title and the labels. In plt.title we can say "Number of images for each class", in plt.xlabel we can add "Class ID", and in plt.ylabel we
can add the number of images, so we write "Number of images". At the end we have to show it as well, so plt.show, and let's run that. There you go: all these values are now plotted in a bar chart and we can see our distribution is quite even. I'm quite happy about the distribution. The number of images we have for each class is good enough to classify all of them evenly. Well, at least we can hope so. So let's move on.

The next step is to pre-process all the images. We are going to create a function for that; we define it as preProcessing, and it takes an image. Once we have that image, what are we going to do with it? The first thing we will do is convert it to grayscale: image equals cv2.cvtColor, and we pass our image and then cv2.COLOR_BGR2GRAY. The next step is to equalize our image, which basically makes the lighting of the image distribute evenly. We equalize it using the cv2.equalizeHist method, and we just pass in the image, that's it. Then we are going to normalize our values. In grayscale we have values from 0 to 255, and normalization here means we want to restrict them to the range 0 to 1, which is better for the training process. So image equals image divided by 255. That is about it, and at the end we return our image.

What we can do now is grab any of the images and test it out. Let's take one of the images and show it with cv2.imshow; the window will be called pre-processed image and it shows img. So we need to define img: img equals preProcessing of an image, which we pick from X_train, let's say number 30. These images are very small, 32 by 32, so what we
can do is resize them so we can at least see them clearly. We say img equals cv2.resize of img, with, let's say, 300 by 300. This will help us see more clearly. Then we add cv2.waitKey, because the window will close immediately if we don't put that. There you go: this is our distribution plot, so we close it, and this is our pre-processed image, with the number 7. Looks good. Do we need all of this? I don't think so, so we can just comment it out.

What we need to do now is pre-process all the images that we have in X_train. We can use the Python function called map, which allows us to run a function over a list or an array of elements. So we write map, then the name preProcessing, and then where the images are, which is X_train. It will take each element one by one and send it to this function. It gives us back a list, and we need to store this list back into our X_train array, so we need to convert it. If you remember, in the beginning, once we had created the list of images, we had to convert it into a numpy array. We are doing the same thing here: we create the list and then convert it with np.array. Once we do that we can store it back, so we say X_train equals this. If we run that, it will pre-process the images but we won't see any output. To see an output, let's copy the display code down here; we can remove the preProcessing call from it because we are already doing that, and then it will show us the pre-processed image. Let's run that. Okay, so this is our pre-processed image.

One way to confirm it is by looking at the shape. We can print out the shape of one of the images. Let's move
that. So we can say that, for example, this is our image and we want to check its shape. We can go back and print it here, so that we check before the pre-processing and after the pre-processing. In our pre-processing we are converting to grayscale, which means our image will have only one channel. If we run that, yes: here we have three channels and here we have only one. So that gives us the pre-processed images. We need to do this for all of them, so let's do it for the test and the validation sets as well. We copy that, change it to X_test for the test set, and then do the same for X_validation.

Okay, now that we have all of this done, the next step is to add a depth of one to our images. This is required for the convolutional neural network to run properly. To add the depth we are going to reshape our images: X_train equals X_train.reshape, and then we have to define the parameters, meaning what we want it to reshape to. Our initial parameters will remain the same. What do I mean by that? We take the shape of X_train: the element at index 0 remains as it is, and likewise the first, second and third elements remain the same, whatever the size is, and then after the third one we add 1, which creates that depth. Before we go forward, let's print it out and see if it works fine. We print the complete X_train.shape, once before the reshape and once after the reshape. Let's see, and there you go. Before, we did not have that depth of one; now we have that depth of one.
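The reshape step just described can be sketched as follows. This is only an illustrative sketch, not the code from the video: it assumes NumPy is available, the helper name add_channel_axis is hypothetical, and a small random array stands in for the pre-processed grayscale images.

```python
import numpy as np


def add_channel_axis(images):
    """Reshape a batch of grayscale images from (N, H, W) to (N, H, W, 1).

    Hypothetical helper for illustration; the first three dimensions are
    kept as they are and a trailing depth of 1 is appended.
    """
    return images.reshape(images.shape[0], images.shape[1], images.shape[2], 1)


# Stand-in data (an assumption): 8 pre-processed 32x32 grayscale images
# with values in [0, 1], in place of the real X_train array.
X_train = np.random.rand(8, 32, 32)

print(X_train.shape)  # shape before the reshape
X_train = add_channel_axis(X_train)
print(X_train.shape)  # shape after the reshape, now with a depth of 1
```

The same helper can be applied to any other array of pre-processed images that has the (N, H, W) layout.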
So now we need to do this for all of them, and by all of them I mean the test and the validation sets. Let's add those: we copy the line and change every X_train to X_test here, and in the next one to X_validation. So there you go.

Next we are going to augment our images, which means we will add some zoom, some rotation, some shifting and translation. All of this will make the dataset more generic. Let's see how we can do that. For that we have a module in Keras: from keras.preprocessing.image we import ImageDataGenerator. We are going to use that to generate images that will be augmented. We will call it dataGen: dataGen equals ImageDataGenerator, and then we need to define our parameters. The first one is the width: we can shift the width, so we have to define how much shift we want. For example, if we write 0.1, that gives us 10%. We can write the same for the height and put it at 10% as well. Then we add the zoom range. If we put, for example, 10%, it will zoom in 10% and zoom out 10%; let's put 0.2, so 20%. Then we add the shear range, a magnitude, which is again a percentage. And then we add the rotation range, which is in degrees, so let's say 10 degrees.

Next we are going to help our image generator calculate some statistics before it actually performs the transformations. To do that we call dataGen.fit and give it the data we will be using, which is our X_train. Basically, what happens is that we do not generate these images before starting the training; we generate them as we go along during the training process. So we want the generator to
know a little bit about the dataset before we actually send it to the training process. The images that we request will come in batches: we request a batch, and it augments the images and sends them back. We will look into this in more detail when we are actually requesting the images.

The next step is to one-hot encode our matrices. This is a necessary step for the network, so let's do that. We need to import the function for this; it is in keras.utils and it is called to_categorical. Then we go down and write y_train equals to_categorical, with y_train and then the number of classes. We do the same thing for validation and testing, so this one is validation and this one is test. If you are not familiar with what this means, I'll make another video explaining it in detail, but because this is an implementation video I'm assuming you are familiar with the concept.

Now we are going to create our model. First we define myModel, and in it we define the number of filters, which we set to 60, and then the size of filter 1, which we define as 5 by 5. This model is basically based on the LeNet model, so we are just writing down all the parameters that are required to create that model. You can create your own, but we are going to use the LeNet model. Then the size of filter 2 is 3 by 3, the size of the pool is 2 by 2, and then we have the number of nodes. All these parameters will be required once we start building our model, so let's start with that.

Model equals Sequential, and then we create our first layer with model.add. We are going to add a convolution layer, so Conv2D, and we define the number of filters and the size of the filter. Here we are using bigger filters, of size 5 by 5, and later on we are going to reduce the size to 3 by 3. Then we pass the input shape, which is the dimension of our image. We could write 32 directly, but let's declare a variable instead: image dimensions equals 32 by 32 by 3. I think we use these values somewhere else as well, yes, in the resize, so we can take them from this variable there too: the first element and then the second element. Then we go back down, and here we have our input shape. Okay, this is getting a little bit messy. The input shape is the first element of image dimensions, then the second element, and then 1, which is basically the depth. Let me check the brackets; let's put this up here and the filters down here. Then we add the activation function: activation, and let's set it to relu. I hope there is no bracket missing. Yes, it seems fine.

Next we are going to add another one of these convolutional layers. This time around we keep everything pretty much the same, but we don't need to define the input shape again, so I can leave that out and close the bracket. Then we are going to add a pooling layer: model.add with MaxPooling2D, and we define the pool size, so pool size equals the size of our pool. Then we will add another convolutional layer; in fact we will add two convolutional layers, one and two, and for both of them we are going to decrease the number of filters by half, so we divide by two in each, and we change the size of the filter to filter 2. So this time we are using smaller filters. Again we add the pooling layer, and then we are going to add our first dropout layer. So model.add, then
we are going to say Dropout, and there we put 0.5, which means 50%. Then we write model.add with Flatten; this is the flattening layer. Then we add our dense layer: model.add, and then Dense. In the dense layer we have to define the number of nodes, which, as we decided previously, will be 500, and we can keep the same activation. Then we add another dropout layer. The dropout layers basically help you reduce overfitting and make the model more generic. At the end we add another dense layer, which will be our final layer, so it has the number of classes: model.add, then Dense with the number of classes, and this time around we use the softmax function, so activation equals softmax.

Okay, this is our model. All we need to do now is compile it, so we write model.compile. We have to mention the optimizer, which is Adam, and the learning rate for it is 0.001. For the loss we write categorical_crossentropy. Again, you can play around with these parameters; if you are familiar with them you can change them as you wish. I'm assuming you are familiar with this stuff because this is more of an implementation video, but later on I am planning to create another tutorial explaining in depth how CNNs work and what these parameters are. For the metrics we write accuracy, if the spellings are correct, and then we return our model.

Now we just need to call this function and it will create our model. We say model equals myModel, and once we call it we can check the summary of the model by printing model.summary. This should give us the summary of the model, so if we run it now, hopefully everything
will work out fine we have this close it and there you go so here we have the parameters of our model now the last part will be to actually run the training and to do that we are going to use our model and then we are going to use fit generator so if you remember we discussed earlier in the augmentation part where is it yeah so in the augmentation part we are going to actually request the images from this data generator so this image generator will augment these images so now we are going to request in batches that we need these images so we are going to use data gen dot flow to actually get them and here we have to define our images and then our labels or our IDs so X train and y train and then we need to define the batch size so for the batch size let's put some variables here so we can use them later on batch size value so batch size value as 50 then we will also need epochs value is equals to let's say ten we will also need steps per epoch value and let's say 2000 so now we will define all of these here so we are writing this here so that we can keep all of this information up in our settings so that we don't have to go and look for it in the code so here we can write our batch size value so next we will add the steps per epoch so steps per epoch is equals to not this one so steps per epoch is equals to our steps per epoch value and then we have the epochs so epochs is equals to our epochs value and we should write value here as well to keep it consistent we can put that here and then we are going to define our validation data so validation data is equals to let's put it down see so our validation data is X underscore validation and y underscore validation then the last thing do we need shuffle yes we do so we will write here shuffle is equals to one okay so this will start the training for us and what do we need yeah we need to save it so we will write here history is equals to model 
dot fit generator so now this will save the parameters in the history variable so we can check out later what happened during the training process so before we plot anything let's see if this actually runs properly so by now it should start the training process and then we can stop it and continue later on so we have our figure we'll close it and there you go so the training process starts and it tells you it will take about six minutes to do one epoch and it's giving you the accuracy and the loss as it goes along so I'm going to stop it before my PC fries now we will plot once we are done with the training process we want to know how was the variation in the loss and how was the variation in the accuracy so to do that we are going to use our history that we stored and to use that we are going to write plt dot figure which is our matplotlib we are going to write this figure and then in our figure we are going to use let's write that down plt dot plot and we are going to write our history dot history of the loss and then we are also going to plot history dot history of the validation loss then we are going to add the legend so that we can differentiate so we are going to write training and validation so training and then we have validation so we are going to make it look a little more professional by adding a title put it as loss and then in the x label we can write the number of epochs or we can simply write epoch so that's about it and then we are also going to plot a figure two in which we are going to plot the accuracy so we will copy this and we'll paste it here we will write this as two and instead of the loss we are going to write accuracy and we will copy that and we'll paste it right here and then we are going to paste it right here okay so in the title we should have a capital letter accuracy and then to show it we have to use plt dot show so then we can 
check out the score of our model so we can write score is equals to model dot evaluate and then we can write the X test and the y test so if you remember we added these in the beginning of our code so that we can test it at the end and we'll put this as zero so to print the score out we can write print and then we can write our test score equals to score so the first element will be the test score and the second element will be our accuracy so we will print out the accuracy as well so test accuracy so remember this data has not been seen by the model while it was training so we should have a good picture of how our model would work from these score points so we will write 1 so this will give us all the things that we require to actually train our model and then to see the testing score but one thing is missing and that thing is the saving part so we need to save this file so that we can use it in our testing script using a webcam or some images so what we can do for that is we will use the pickle package so if we go here we can write here import pickle so if we go down we are going to store our model as a pickle object so we will say pickle underscore out this is just a variable name and we will say open we are going to say the file name and then what are we going to do with it so we are going to say that we want to save it as model underscore trained dot p and then we want to write it so we will say wb so basically that means write bytes and then we are going to write pickle dot dump and we will use our model and then where do we want to dump it in our pickle out well why are there two brackets we need to remove okay so then we need to close it pickle dot no pickle out dot close that should create our pickle object and it should save it in our directory so now we are going to so I already have one here I am going to move it and so what I'm going to do is I'm going to 
train this for one epoch and let's see how that runs so to see if everything is working properly where did I put the variables one thing is that we put our parameters somewhere yeah we put our parameters here now these parameters should be right up in the settings so that we can change them easily whenever we want so that's why we're gonna put them here and the batch size is 50 epochs let's put it one just to test it out and I'm gonna play it and we will see if it saves properly and if it runs properly and then we can go on the longer run so hopefully it's loading all the images it's showing us that data is distributed evenly we can close that then it starts the training process and I will see you once it's done so here are the results I actually went back and changed the epochs to two so that we can get at least a line not just some single points so here we can see our accuracy plot and the loss plot and here we have the final values now once we close it we should get yeah so there is your test score and here is your test accuracy and then at the end we should have our pickle file so here I have saved it in the folder so if we open that we have the model trained dot p so now we are heading towards the testing script we will write some code so that we can send our new image to our model and let it classify which category it belongs to in order to do that we are going to import our packages first so we need numpy cv2 and pickle so import numpy as np then we need import cv2 and then we need import pickle then we are going to use our webcam so we need the webcam object so cap is equals to cv2 dot video capture and my camera number will be 1 so we can define the width and the height before we do that we can add the area for our settings or the parameters up top so we can write width is equals to 640 and then we can write height is equals to 480 so here we can write cap dot set and our width is ID number 3 so we can write width and then cap dot set and 
ID number 4 is the height so height and that should do it so next we are going to unpickle our object and we are going to put it in the variable model so we will say that we are going to pickle in our model and then we will do the same as we did before we are going to write open and then we will write the name of the model file so model underscore trained dot p and then this time we are going to write that we are going to read the bytes so we'll write model is equals to pickle dot load and we are going to load our pickle in so now we have the model that was trained in the variable model and now we are going to send it an image from our webcam and see what class it classifies it into okay so we will write while true and then we are going to write the first one is the success and then the other one is the img original okay so we are going to write cap dot read and there you go so next we are going to convert our image into a numpy array so img is equals to numpy dot asarray that will convert it into an array of similar dimensions so img original and then img is equals to now we are going to resize our image because we want it 32 by 32 so we will write cv2 dot resize and we are going to write 32 by 32 and which image we want to resize is the image itself okay then we are going to send it to our pre-processing so if you remember if you go back we have our resizing after our resizing yeah this is our pre-processing so we are going to copy this here and now we are going to run the image through this pre-processing function so img is equals to pre-processing and img so now we have our pre-processed image that is ready to go almost ready to go for the prediction function so we can output this to see how it looks like cv2 dot imshow let's do this so that we can see if everything so far is going well so processed image and then we can write img so now one thing is that this image is very small so I 
don't want to run that like this because it would be very hard to see so I will make it a little bit bigger just for the sake of viewing and then we can scale it back down again so model trained so spelling mistake probably somewhere I think one n yeah okay so one thing I forgot is when you are using the webcam you have to add the if statement for breaking so when do we break so we need to add the cv2 dot wait key and then we are going to say that whenever we find the q key pressed we want to break so q and then we have so this is fairly common to use with webcams and you can see here and there you go so this is the image coming live from my webcam as you can see there you go so this is the one that we will try to detect number 5 okay so so far it's working well we will go back to 32 by 32 and now we are going to just reshape the image before we send it to the predictor so we have to reshape it img is equals to img dot reshape and then you will write 1 by 32 by 32 by 1 so once we have it in this form we can send it to the predictor so we can write here predict and the first thing we want to predict is the class ID so we will say our class index is equals to model dot predict and we want to predict the class through this image so that will give us that and let's convert it into an integer and now we can print it class index so let's run that and there you go so we are getting what we call the class ID so if I bring it in front of the number five you can see it changes to five now it's changing to eight so again we are using this model that was trained only for two epochs so we don't expect it to be very accurate it's just the concept later on I will train it again and we can check out the video of how it works with better accuracy so then once we have the class index we are going to predict what I call the probability of it so for that we will need prediction is equals to 
model dot predict and we are going to predict using our image and then we are going to get our probability value so prob let's write prob val probability value is equals to numpy dot amax and we are going to find the maximum of the predictions okay so it will give us the element with the highest number so in the predictions array it will give us the maximum one so if you do not understand that we can just print it out first so you can get the idea of predictions so let me remove that and let's run that so here it's giving us the prediction of how likely each one of the classes is so it will have ten values inside and the highest value will be the class that represents the current image so all we need to do is get the highest value so the highest value will be our probability value coming from the maximum and we can run that okay let's remove this one and there you go so it's giving us the probability and what we can do is we can put them together so it will be easier to see let's put the class index and next to it we will put the probability okay so now let me put it in front wait what happened it stopped can we run that again okay so now I will bring it in front of number five and there you go so you are getting five and it's 87 percent or around 90 percent sure that it is number five so that's how you can get the probability and later on we can actually write this down on the image itself so we can see what happens and we can also say that if the probability of all of them is very low then we can assume that none of the classes are present so to do that we can simply write if our probability value is greater than a threshold so that threshold we can define in our settings let's put it at 80% well let's put that at 65% because we are just using two 
epochs so it's not a good idea to go very high so what we can do is we can write that if the probability is above the threshold then only you need to display something or yeah we can put the text so we can write cv2 dot put text and let's put it on the original image img original and then we are going to write the string so let's write the class index and do we need to add something to it let's not add anything let's keep it very simple and then we can define the origin so let's say it is 50 and 50 okay so that is the start and then we have to define the font so cv2 dot font let's pick one of the fonts and then we have to define the scale let me bring it down so let's keep it at 1 the color let's put it red so 0 0 and then 255 and the thickness we can write as 1 as well so if we get enough probability we are going to show which class we found you know what let's put the probability as well plus the probability value as a string and we have the probability value here and let's put a space in between so it's easier to read yeah so it's getting very long let me put it down okay so let's run that and see how that performs so where is the image the original image are we printing the image no we are not printing the original image so we need to print it so we are going to show cv2 dot imshow and then we will write original image and then we are going to pass our img original so this is our image and there you go so if we go really close then it gives us the prediction that it is five and the probability is 0.7 0.9 sometimes it's also thinking that it is 8 so I also have 2 let's try that out so if I move it around oh there you go so we are getting 0.9 0.8 that it is 2 well that is good so I have trained the model for 10 epochs now and here are the results as you can see the accuracy is quite good and you also have the loss plot and you can see that is good as well here you can see 
the validation accuracy as 0.9969 which is again quite good so in order to run the test part we will close our windows so let's close both the windows so close this and this and then it will run the evaluation and it will give us the test scores so the test accuracy is 0.9975 which is fairly good and once we are done with this you can see we have now our model trained for 10 epochs so I like to write underscore 10 so that we know that this was trained for 10 epochs so now we are going to take this and we will run it in our test so if we go back here we will say it is in our main directory and it is called 10 so let me see my webcam and it is connected so let's run that and see so I have printed out a few digits and let's see how it performs on that so this one is handwritten and still it can detect but as you can see overall the accuracy is quite good it is able to detect much better than the previous two epochs so that's it for today's video I hope you have learned something new so if you haven't subscribed yet please do and hit that like button and if you have any comments if you would like to know something else please share that in the comments below and I will see you in the next video
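The thresholding step described in the captions (take the highest of the ten class scores, and only label the frame when that score clears a cutoff) can be sketched roughly as below. This is a minimal illustration in plain Python, not the code typed in the video: the function names `classify_with_threshold` and `overlay_text` are hypothetical, the example scores are made up, and only the 0.65 cutoff and the "class index, space, probability" label format come from the transcript. In the video the scores would come from the trained model and the label would be drawn on the webcam frame.

```python
def classify_with_threshold(predictions, threshold=0.65):
    """Return (class_index, probability), or None when the top score is too low.

    `predictions` is a plain sequence of per-class scores, for example the
    ten softmax outputs for the digits 0 to 9. The 0.65 default mirrors the
    threshold chosen in the video; the function itself is a hypothetical helper.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    # Index of the largest score, i.e. the predicted class.
    class_index = max(range(len(predictions)), key=lambda i: predictions[i])
    probability = predictions[class_index]
    if probability > threshold:
        return class_index, probability
    return None


def overlay_text(result):
    """Build the label for the frame: class index, a space, then the probability."""
    if result is None:
        return ""  # nothing is drawn when no class is confident enough
    class_index, probability = result
    return f"{class_index} {probability:.2f}"


# Made-up scores for illustration only; index 5 has the highest value.
example = [0.01, 0.01, 0.02, 0.01, 0.01, 0.87, 0.01, 0.01, 0.04, 0.01]
print(overlay_text(classify_with_threshold(example)))  # prints "5 0.87"
```

Returning `None` for low-confidence frames keeps the drawing code simple: an empty label means the frame is shown without any text, which matches the idea in the captions that nothing should be displayed when none of the classes is likely.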
Info
Channel: Murtaza's Workshop - Robotics and AI
Views: 73,875
Rating: 4.967742 out of 5
Keywords: python, opencv, machine learning, ai, text detection, neural networks, image processing, ocr, optical character recognition, text recognition, ocr api, python api, opencv text recognition, opencv ocr, Text Detection with OpenCV, OCR using Tesseract, Tesseract (2020), Text Detection with OpenCV in Python, image to text, how to convert image to text, recognize text, detect text opencv, extract character, character recognition, opencv tesseract
Id: y1ZrOs9s2QA
Length: 86min 23sec (5183 seconds)
Published: Tue Feb 18 2020