Intelligent Telegram AI Classifies Images in Python

Video Statistics and Information

Captions

[Music] What is going on, guys, welcome back. In today's video we're going to build an intelligent AI Telegram bot that is going to be able to recognize animals and vehicles inside of images that we send to it, so let us get right into it.

Now, before we get into the actual coding, I want to show you what this bot is going to look like in the end. I have it on my phone now, I have Telegram opened, and I can start the conversation by clicking on Start, and then it's going to say "Welcome to the NeuralNine Telegram bot". I can do something like "hi" and it says "send a photo". Okay, the first thing we're going to do here is call the /train command. This is a function that I have implemented in the bot, and once it's sent it's going to say "NeuralNine image classification: we are training the neural network, please wait". At this point my laptop, which the bot is running on right now, is starting to train the neural network based on training data. Once this is done I'm going to get a message that this is done, and then I can send images of horses, planes, cars, trucks, whatever, and the Telegram bot is going to give me a response: "what I see in this image is a truck", "what I see in this image is a plane", "what I see in this image is a deer" or a horse or whatever. We have 10 things that it can classify. Now we're going to skip to the part where the model is already trained.

All right, so the model is now done with training and we can try to send a picture. Let's go and send a picture of a car. There you go, and it says "in this image I see a car", so this one is correct. Let's go with a deer, and it says "in this image I see a deer". And let's go with this one of a horse: "in this image I see a horse". Now, it's not always going to be 100% accurate. By the way, those are all pictures that the model has never seen before, so they're not part of the training data set, but as you can see, those three at least were
classified correctly.

All right, so the first thing we're going to do here is go to Telegram, to the app or to web.telegram.org, and look for the BotFather. Once we have the BotFather, we're going to start a conversation with it and click on "new bot", or type /newbot down here. This is basically just going to start the process of creating a new bot. It's going to ask you for the name, you provide the name; for an identifier, you provide the identifier; and then it gives you the API key, and that's it. Then we proceed to the Python code. I'm not going to do this here again because I already have a bot created. It's not complex: just type /newbot and it's going to give you all the instructions. The important thing is that you copy the token, the API key. If you really want to see step by step how to do this, which is not really difficult, you can check out my video "Telegram Bot in Python", where I go through the very basics of creating a very simple Telegram bot in Python. This video is a little bit more advanced; it's like a second part where we make it intelligent using a neural network. So basically just /newbot, creating a bot, and then copying the API key: that's the first step.

Then we go to PyCharm, and the first thing we want to do is get the token into the script somehow. If you're not recording like I am, you can just type TOKEN equals and then a clear-text string where you paste the token. In my case I'm going to load it from a file so I don't have to show it to the viewers: with open token.txt in reading mode as f, and then TOKEN equals f.read().

Now, we're going to need a bunch of libraries for today's video, and the most important one is of course the Telegram library. We're going to install it, and we also need OpenCV, NumPy, and TensorFlow. We're also going to need the io module, but we're going to talk about that
when it's necessary. So first of all we're going to start CMD and install the necessary modules. We're going to start with TensorFlow: pip install tensorflow, then pip install numpy, then pip install opencv-python, and of course the Telegram library, which is called python-telegram-bot. Those are the libraries you need to install, and now we can go ahead and import them. We're going to say from telegram.ext import everything, then from io import BytesIO (we're going to talk about why this is important), then import cv2, import numpy as np, and import tensorflow as tf.

All right, so let's start with the basic structure of the bot and add the rest later on. First of all we're going to define the basic functions that we already know. So def start, the start function: what happens when we actually start the conversation with our bot. We're just going to pass update and context here, again like in the last video. When I say "again" I'm referring to the last Telegram bot video, which I recommend you watch if you haven't yet. Basically all we did there was say update.message.reply_text and then "welcome" or something. You can design the messages however you want, you can say "welcome to the classifier", welcome to whatever. I'm just going to say "Welcome!" here. Then we can also define a help function if we want to, with update and context, and we're just going to have a message as a multi-line string: /start starts the conversation, /help shows this message, and /train trains the neural network. That is going to be another function down here, the train function, with update and context as well, but we're going to pass for now because we don't have the model yet. Then we're going to have a basic handle_message and a basic handle_photo function. So the handle_message
function is just going to be what happens if we send a text message. We're going to call this handle_message, with update and context again, and we're just going to say update.message.reply_text "Please train the model and send a picture!". There you go. The important function now is the handle_photo function, again with update and context, and we're going to pass this for now as well.

So let's first complete the steps that we already know from the last video. We're going to say updater equals Updater, with a capital U, from the telegram module. We're going to pass the token here and use_context equals True. Then the dispatcher, dp, is going to be updater.dispatcher (not dispatch, dispatcher), as an object here. Now we can add the handlers that we already have. We're going to say dp.add_handler, and we're going to add a CommandHandler for the command start: if we see the command start, we're just going to call the start function. We're going to do the same thing with help and with train. There you go. Now we're going to add a message handler, or actually two message handlers, but they're going to have different filters. We're going to say dp.add_handler with a MessageHandler. This is what we already had in the beginner video, where we basically just say Filters.text: whenever text is coming in, it's going to be filtered for this message handler, and the handler itself is going to be handle_message. Now we're going to add a second message handler, which is not going to filter for text; it's going to filter for photos. So Filters.photo, and it's going to be handle_photo. And in the end, of course, updater.start_polling() and updater.idle(). There you go.

So what we need to do now is have a basic neural network model ready, and we need to be able to train it by calling the train command. Then, when we send a photo, we want to process that photo, scale it down, feed it into the neural network,
get a prediction, formulate a prediction response, and then send this to the user.

Now let's start with the model itself. We're going to build a convolutional neural network to classify the images, and as a training data set we're going to use the CIFAR-10 data set (I'm not sure how it's pronounced, it's C-I-F-A-R). We're going to use that to train our model, and it's going to have 10 classes, 10 image types it can classify. So let's start by loading the training and testing data. We're going to say (x_train, y_train), (x_test, y_test) is going to be tf.keras.datasets.cifar10.load_data(). Then we're going to normalize this: x_train and x_test are going to be x_train divided by 255 and x_test divided by 255, so that we have all the values between zero and one. Now we also want to have a list with the class names, because in the end the neural network is going to produce a numerical output and we want to map that to a class name. So we're going to say class_names is, and this is the right order, you need to have that order: the first one is plane, the second one is car, the third one is bird, then we have cat, deer, dog, frog, horse, ship, and truck. So as you can see, vehicles and animals. That's basically it.

Now we can start with the actual neural network. The model is going to be a basic sequential model, so tf.keras.models.Sequential(), a very simple sequential model, and then we're going to start by adding some layers. The first layer is going to be a two-dimensional convolutional layer, so we're going to pass tf.keras.layers.Conv2D. We're going to pass the filters, we're going to have 32 here, and the kernel size is going to be three by three. The activation function is going to be a rectified linear unit, so relu, and the input shape is going to be 32
times 32 with three channels. This is because the images in the training data set have 32 by 32 pixels and they're RGB, so they have three channels. Now what's the problem here? Oh, we need to say that this is the input_shape. There you go. So that's the first convolutional layer. I'm not going to go into the theory of convolutional neural networks here because that would deserve a video of its own. If you want to know how a convolutional neural network works, how a max pooling layer works, and so on, let me know in the comment section down below that you're interested in theoretical videos on neural networks, and then I may make some tutorials in the future.

After the convolutional 2D layer we're going to add a max pooling layer, so tf.keras.layers.MaxPooling2D, and we're going to pass a pool size of two by two. Then we're going to add another convolutional layer, so model.add tf.keras.layers.Conv2D, and this time we're going to have 64 filters. The kernel size is going to be three by three again, and the activation function is going to be the rectified linear unit once more. Now we can actually just go ahead and copy this: we're going to have another max pooling layer and another convolutional 2D layer. After this one, after the third convolutional layer, we're going to flatten the results. We're going to say model.add a Flatten layer, and we don't have to specify anything here. In the end we're going to have one more dense layer for complexity and then the final output layer. So we're going to say model.add tf.keras.layers.Dense with a hundred, or actually let's go with 64 units, and the activation function is going to be the rectified linear unit. The output layer is going to have the softmax activation, so we're going to say layers.Dense again with 10 outputs, because we have 10 possible classes, and the activation function is going to be softmax. For those of you who don't
know what softmax does: softmax basically just produces a result where, if you add up all the individual neuron activations, you're going to get one. They all add up to 100%, which basically means that each neuron indicates how likely it is to be the right classification. For example, if we have an obvious picture of a dog, the neuron that is responsible for the dog is going to be very highly activated, something close to 1, 0.9-something, and all the others are going to have a pretty low activation if the model works properly. So this basically gives us the likelihood for each neuron to be the correct solution.

All right, so this is the neural network itself, but it's not trained; it's not even compiled yet. This is what's going to happen in the train function. Once we call the train function, we want to tell the user that we are now training the model: "Model is being trained...". Then we're first of all going to compile the model. We're going to say model.compile, and we're going to have the Adam optimizer for this, sparse categorical cross-entropy as the loss function, and the metric that we're interested in, metrics equals, is going to be accuracy. There you go. And after that, do we have a mistake here? Oh, I still have the pass, right. Let's remove the pass. There you go. Then model.fit: we're going to fit on the training data with 10 epochs, which is going to be necessary for this one, and the validation data we provide here is going to be x_test and y_test. Let me just see what the problem is here. We don't have a problem, okay. So the model is being fit and then we can save it: model.save, and we're going to save it as, I don't know, cifar_classifier.model. Once this is done, we're just going to tell the user "Done! You can now send a photo!".

All right, so that's almost all of it. The only thing that's left now is to get the actual images from the user and then feed
them into the neural network, get a prediction, and then send a response to the user. First of all, through this Filters.photo we're only going to get photo messages in that function, so whenever we get something in that function it's going to be an image. Because of that we can just get this image by saying file equals context.bot (the context is this parameter here; we have use_context equals True, so that works), context.bot.get_file, and we're getting the file from update.message.photo. We're going to get a photo from the message, and we'll just take the last one if we have multiple of those, and we're going to get the file_id. We're actually going to get the whole file as a byte array. So what we're going to do here is say f equals BytesIO, which is why we had this import in the beginning, this from io import BytesIO, and inside of that we pass file.download_as_bytearray(). Then we get the bytes themselves: file_bytes is going to be np.asarray (it's a little bit technical here) of bytearray of f.read(), and the data type of this whole thing is np.uint8, so unsigned 8-bit integers. Those are the bytes, and we can now go ahead and load the image from those bytes using OpenCV. So this is, you could say, the pre-processing of the image, or actually not pre-processing: this is the getting of the image, the way we actually load the image into the script.

Now we can go ahead and actually get the image object. So cv2.imdecode: we're going to decode the image from the byte stream, so we pass the file_bytes here, and cv2.IMREAD_COLOR is going to be the flag that we pass. Now that we have that, we basically have the image in our script. The first thing we want to do is swap BGR to RGB, because by default OpenCV uses the BGR color scheme but the image has RGB. So what we're going to do is we're
going to say cv2.cvtColor of the image with cv2.COLOR_RGB2BGR. There you go. What we also need to do is resize the image. The best thing would be if we just passed images with the right resolution, but sometimes we're going to pass pretty high-resolution images, as we did in the demonstration in the beginning. So in this case we want to resize the images to be 32 times 32 pixels, and for this we say img equals cv2.resize, we resize the image, we pass (32, 32) here as a tuple, and the interpolation is going to be cv2.INTER_AREA. There you go. So we end up with a 32 by 32 pixel image.

Now we can go ahead and say prediction equals model.predict, and we need to pass the right format. First of all, we need to pass an np.array of a list containing the image. However, as we saw here, we divided the training data by 255, and because of that we want to do the same with the image for the actual prediction, because otherwise we're going to get inaccurate results. So divided by 255 here, and this is how we pass it. This is then going to be the prediction, and remember, the prediction is the activation of all the neurons. What we want to have is the neuron with the highest activation. In order to get that, we need to apply the NumPy argmax function to the prediction to get the index of the highest value. So what we're going to do is say update.message.reply_text with an f-string. I'm going to say "In this image I see a", or "an", whatever, and what we're going to pass here is the argmax, so np.argmax of the prediction. Now, the problem with that is that this is going to give us the index of the neuron with the highest activation, so what we want is the class name at that index: class_names at the index of the highest activation. That is it. Or do we have a problem here? No, we don't have a problem here, right: class_names, np.argmax. I don't think we should have a problem here.

All right, so that is
basically it. Let's review the code real quick. We have a model up here, this convolutional neural network structure. We load the training data and normalize it by dividing by 255, because the RGB values go from 0 to 255. We have the class names, we have the token here, we have a basic start and help function, and a basic handle_message function. Then we have this train function that actually trains the model by calling compile and fit and then saving the model, just in case we want to load it later on. And then we have this handle_photo function, where we get the image from the message, convert it to a byte stream, decode the byte stream, convert the color scheme from RGB to BGR (or you could also say from BGR to RGB), resize it, and then predict the actual class. In the end we say "In this image I see a" class_names at np.argmax of the prediction. np.argmax basically just gives us the index of the highest activation, and at this index we also have a class name, which is going to be the proper class name.

All right, so that's it. We need to run it, but of course this is going to take a long time. Let's just see if we have any mistakes here. If I run this, this should work, and then we're going to skip to the part where the model is already trained. Or actually, we need to train it first, so let's go to Telegram Web, because we need to go to the neural bot here. We don't need to delete all that; we can just go ahead and train the model again, because if we don't give the /train command we're not going to see any training happening. We can go to PyCharm: you can see nothing is happening, but if I say /train, then you can see "Model is being trained..." and the training started here in the script as well. Now we're going to skip to the part where the training is done already.

All right, so the training is now finished. We can look at PyCharm and see that 10 epochs are done, and we can now go ahead. It says done, you can now
send a photo. We can go ahead and send a photo, first of all a truck. Let's see if it classifies that one correctly. Let's send it, and it says "in this image I see a truck", so this is correct. Let's go ahead and try the second one, which is a bird. Send: "in this image I see a bird", so this works. Let's go with a frog. Now, those are all pretty obvious pictures; obviously I can see a frog. And last but not least I have a plane, which is a little bit less obvious in my opinion, so I'm not sure if it's going to classify it correctly, but it should be able to do that. No, it sees a cat. Okay, so probably because it only sees the shadow of a plane, this one is a little bit more tricky. So you can see the model is not perfect, but if you have an obvious image of a frog or a bird or a cat or a plane or a truck or a car, it works, and it can even distinguish a deer and a horse, which for a computer could seem quite similar actually. So it works decently, and this is how you build an intelligent image-classifying Telegram bot.

So that's it for today's video. I hope you enjoyed it and I hope you learned something. If so, let me know by hitting the like button and leaving a comment in the comment section down below, and of course don't forget to subscribe to this channel and hit the notification bell to not miss a single future video, for free. Other than that, thank you very much for watching, see you in the next video, and bye. [Music]
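
The bot structure the captions walk through (the handler functions and the dispatcher wiring) might look roughly like the sketch below. It assumes python-telegram-bot 13.x, whose Updater and Filters interface matches what is described; later major versions changed that interface. The name help_command, the HELP_TEXT constant, the main() wrapper, the import inside main(), and the .strip() on the token are choices made for this sketch, not something the video confirms.

```python
# Sketch of the bot skeleton described in the captions.
# Assumption: python-telegram-bot 13.x (Updater / Filters API).

HELP_TEXT = """/start - Starts conversation
/help - Shows this message
/train - Trains neural network"""


def start(update, context):
    # Called when the user sends /start.
    update.message.reply_text("Welcome!")


def help_command(update, context):
    # Called on /help; named help_command here to avoid shadowing the builtin.
    update.message.reply_text(HELP_TEXT)


def train(update, context):
    # Placeholder: the captions fill this in later (compile, fit, save).
    pass


def handle_message(update, context):
    # Called for plain text messages.
    update.message.reply_text("Please train the model and send a picture!")


def handle_photo(update, context):
    # Placeholder: the captions fill this in later (decode, resize, predict).
    pass


def main(token_path="token.txt"):
    # Imported here (a choice of this sketch) so the handlers above can be
    # exercised without the library installed.
    from telegram.ext import Updater, CommandHandler, MessageHandler, Filters

    # The token is read from a file, as in the video; .strip() is an addition
    # that guards against a trailing newline in the file.
    with open(token_path, "r") as f:
        token = f.read().strip()

    updater = Updater(token, use_context=True)
    dp = updater.dispatcher

    dp.add_handler(CommandHandler("start", start))
    dp.add_handler(CommandHandler("help", help_command))
    dp.add_handler(CommandHandler("train", train))
    dp.add_handler(MessageHandler(Filters.text, handle_message))
    dp.add_handler(MessageHandler(Filters.photo, handle_photo))

    updater.start_polling()
    updater.idle()
```

Calling main() starts polling and blocks until the process is interrupted. Because the handlers only touch update.message.reply_text, they can be tried out with a small stand-in object before a real token exists.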

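The network and the final class lookup described in the captions could be sketched as follows. This assumes TensorFlow 2.x with its bundled Keras API. The function names build_model and label_for are hypothetical, the lowercase spelling of the class names is a guess, and label_for is a pure-Python stand-in for the np.argmax lookup used in the video, written so the mapping step can be checked without NumPy.

```python
# Sketch of the CNN and the class lookup described in the captions.
# Assumption: TensorFlow 2.x; build_model and label_for are hypothetical names.

# Order matters: it has to match the CIFAR-10 label indices 0..9.
CLASS_NAMES = ["plane", "car", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]


def build_model():
    # Imported here (a choice of this sketch) so label_for can be used
    # without TensorFlow installed.
    import tensorflow as tf

    model = tf.keras.models.Sequential()
    # Three convolutional layers with max pooling in between; the input is a
    # 32x32 RGB image, matching the CIFAR-10 training images.
    model.add(tf.keras.layers.Conv2D(32, (3, 3), activation="relu",
                                     input_shape=(32, 32, 3)))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Conv2D(64, (3, 3), activation="relu"))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Conv2D(64, (3, 3), activation="relu"))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(64, activation="relu"))
    # One output per class; softmax makes the ten activations sum to 1.
    model.add(tf.keras.layers.Dense(10, activation="softmax"))

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def label_for(activations):
    # Index of the largest activation, then the class name at that index.
    best = max(range(len(activations)), key=lambda i: activations[i])
    return CLASS_NAMES[best]
```

In the video the compiled model is then fitted for 10 epochs on the normalized training data with the test split as validation data, and the reply text is built from the class name at the argmax of the prediction.
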
Info

Channel: NeuralNine

Views: 3,647

Rating: 4.9814816 out of 5

Keywords: telegram, ai, telegram ai, ai bot, telegram bot, image classifier, classify images, image classification, image recognition, python telegram, python telegram ai, python ai bot, neural network, neural networks, machine learning, deep learning, python neural networks, tensorflow, tutorial, python telegram ai bot

Id: b3dAioTSSa4

Length: 26min 1sec (1561 seconds)

Published: Mon Aug 30 2021