MobileNet Image Classification with TensorFlow's Keras API

Captions
Hey, I'm Mandy from deeplizard. In this episode we'll introduce MobileNets, a class of lightweight deep convolutional neural networks that are much smaller and faster than many of the mainstream, well-known models. MobileNets are small, low-power, low-latency models that can be used for classification, detection, and the other tasks that CNNs are typically good for, and because of their small size, these models are considered great for mobile devices, hence the name MobileNets.

To give a quick comparison in regard to size: the full VGG16 network that we've worked with in the past few episodes is about 553 megabytes on disk, so pretty large generally speaking, while one of the currently largest MobileNets is only about 17 megabytes. That's a pretty huge difference, especially when you think about deploying a model to run in a mobile app, for example. This vast size difference is due to the number of parameters, or weights and biases, contained in the model. VGG16, as we saw previously, has about 138 million total parameters, while the 17-megabyte MobileNet we just mentioned, the largest MobileNet, has only about 4.2 million parameters, so it is much, much smaller on a relative scale than VGG16 with its 138 million.

Aside from the size on disk, we also need to consider memory when comparing MobileNets to other, larger models: the more parameters a model has, the more space in memory it will take up as well. So while MobileNets are faster and smaller than big, hefty competitors like VGG16, there is a catch, or a trade-off, and that trade-off is accuracy. MobileNets are not as accurate as some of the big players like VGG16, but don't let that discourage you. While it is true that MobileNets aren't as accurate
as these resource-heavy models like VGG16, the trade-off is actually pretty small, with only a relatively minor reduction in accuracy. In the corresponding blog for this episode, I have a link to a paper that goes more in depth on this relatively small accuracy difference, if you'd like to check that out further. Let's now see how we can work with MobileNets in code with Keras.

All right, we're in our Jupyter Notebook, and the first thing we need to do is import all of the packages we'll be making use of, not only for this video but for the next several videos, where we'll be covering MobileNet. As I mentioned earlier in this course, a GPU is not required, but if you are running one, then you'll want to run the next cell. It's the same cell we've seen a couple of times already in this course: it just makes sure that TensorFlow can identify our GPU, if we're running one, and sets memory growth to true if we have one. Again, don't worry if you don't have a GPU, but if you do, run this cell.

Similar to how we downloaded the VGG16 model when we worked with it in previous episodes, we take the same approach to download MobileNet here: we call tf.keras.applications.mobilenet.MobileNet(). The first time we call it, it will download MobileNet from the internet, so you'll need an internet connection, but subsequent calls will just load the saved model from disk into memory. We do that now and assign the result to the variable mobile.

Now, MobileNet was originally trained on the ImageNet library, just like VGG16, so in a few minutes we'll be passing MobileNet some images that I've saved on disk. They're not images from the ImageNet library, but they are images of some general things, and we're going to get an idea of how MobileNet performs on these random images. But first, in order to be able to
pass these images to MobileNet, we're going to have to do a little bit of processing first. I've created a function called prepare_image that takes a file name. Inside the function, we have an image_path variable pointing to the location on disk where I've saved the image files we'll use to get predictions from MobileNet. We then load the image by taking image_path and appending the file name that we pass in, so if I pass in '1.PNG', we take that file path, append '1.PNG', and pass the result to the load_img function along with a target size of 224 x 224. This load_img function is from the Keras API, so what we're doing here is taking the image file and resizing it to 224 x 224, because that's the image size MobileNet expects. We then convert the image to an array format, and then expand the dimensions of that array, because that puts the image in the shape MobileNet expects. Finally, we pass this processed image to our last function, tf.keras.applications.mobilenet.preprocess_input. This is similar to a function we saw a few episodes back when we were working with VGG16, which had its own preprocess_input; MobileNet has its own preprocess_input function as well, which processes images in the way that MobileNet expects, and it's not the same way as VGG16's. It just rescales all the RGB pixel values from the 0-to-255 scale to a scale from -1 to 1.

So overall, this entire function is just resizing the image, putting it into array format with expanded dimensions, running MobileNet's preprocessing on it, and returning the processed image. That's kind of a mouthful, but that's what we've got to do to images before we pass them to MobileNet. All
right, we'll just define that function, and now we're going to display our first image, called 1.PNG, from the MobileNet-samples directory that I told you I set up with a few random images. We plot it in our Jupyter Notebook, and what do you know, it's a lizard. That's our first image.

Now we pass this image to the prepare_image function that we defined right above, so that the image gets preprocessed accordingly. Then we pass the preprocessed image returned by the function to our MobileNet model by calling predict on the model, just like we've done in previous videos when we've called predict on models to use them for inference. After we get the prediction for this particular image, we give the prediction to the imagenet_utils.decode_predictions function. This is a function from Keras that returns the top five predictions out of the 1000 possible ImageNet classes, telling us the top five classes MobileNet is predicting for this image. Let's run that and print out the results, and maybe you'll have a better idea of what I mean once you see the printed output.

We run this, and we have our output: these are the top five results, in order, from the ImageNet classes that MobileNet is predicting for this image. It's assigning a 58% probability to this image being an American chameleon, a 28% probability to green lizard, 13% to agama, and then some small percentages, under 1%, for the other two types of lizards. It turns out, if you're not aware (and I don't know how you could not be aware of this, because everyone should know this), that this is indeed an American chameleon. I've always called these things green anoles, but I looked it up, and they're also known as American chameleons, so MobileNet got it right. So, yeah, it assigned
a 58 percent probability, making that the highest, most probable class, and next was green lizard, so I'd say that's still really good for second place; I don't know if green lizard is supposed to be a more general category. Then comes agama, which is also a similar-looking lizard, if you didn't know. Between these top three predicted classes, we have almost 100% of the probability, so I'd say MobileNet did a pretty good job on this prediction. Let's move on to number two.

All right, now we plot our second image, and this is a cup of, well, I originally thought it was espresso, and then someone called it a cup of cappuccino, so let's say I'm not sure; I'm not a coffee connoisseur, although I do like both espresso and cappuccinos, and this looks like it has some cream in it. Now we go through the same process we just did for the lizard image: we pass this new image to our prepare_image function so that it undergoes all of the preprocessing, then we pass the preprocessed image to the predict function of our MobileNet model, and finally we get the top five results from the model's predictions with regard to the ImageNet classes. Let's see.

All right, according to MobileNet, this is an espresso, not a cappuccino. It predicts a 99% probability of espresso being the most probable class for this particular image, and I'd say that's pretty reasonable. Let me know in the comments what you think: is this espresso or cappuccino? I don't know if ImageNet even had a cappuccino class; if it didn't, then I'd say this is pretty spot-on. You can see that the other four predictions are all less than 1%, but they're reasonable: the second is cup, third eggnog, fourth coffee mug, fifth wooden spoon. That last one gets a little weird, but there is wood and there is a circular shape going on here, and these are
all under 1%, so they're pretty negligible. I'd say MobileNet did a pretty great job at giving a 99% probability to espresso for this image.

We have one more sample image, so let's bring that in. This is a strawberry, or multiple strawberries, if you consider the background. Same thing: we preprocess the strawberry image, then we get a prediction from MobileNet for this image, and then we get the top five most probable predictions among the 1000 ImageNet classes. We see that MobileNet, with 99.999 percent probability, correctly classifies this image as a strawberry, so very well done. The rest are well, well under 1%, but they're all fruits, which is interesting. Another really good prediction from MobileNet.

So even with the small reduction in accuracy that we talked about at the beginning of this episode, you can probably tell from just these three random samples that the reduction is maybe not even noticeable when you're just running tests like the ones we just ran through. In upcoming episodes, we'll actually be fine-tuning this MobileNet model to work on a custom data set, one that was not included in the original ImageNet library; it's going to be a brand-new data set, and we'll do more fine-tuning than we've done in the past, so stay tuned for that.

By the way, we're currently in Vietnam filming this episode, if you didn't know. We also have a vlog channel where we document our travels and share a little bit more about ourselves, so check that out at the deeplizard vlog on YouTube. Also, be sure to check out the corresponding blog for this episode, along with the other resources available on deeplizard.com, and check out the deeplizard hivemind, where you can gain exclusive access to perks and rewards. Thanks for contributing to collective intelligence, and I'll see you next time!
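For reference, the setup walked through at the start of the episode (optional GPU memory growth, then downloading MobileNet) might look roughly like the sketch below. This is a minimal sketch assuming TensorFlow 2.x; it is not the episode's exact notebook cell.

```python
import tensorflow as tf

# Optional GPU setup: let TensorFlow allocate GPU memory as needed
# instead of grabbing it all up front. This loop is a safe no-op on
# CPU-only machines, since the GPU list is then empty.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

# The first call downloads the pretrained ImageNet weights (~17 MB);
# subsequent calls load them from the local Keras cache on disk.
mobile = tf.keras.applications.mobilenet.MobileNet()
mobile.summary()
```

Note that set_memory_growth must be called before any GPU has been initialized, which is why this cell runs before anything else touches the model.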
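The prepare_image function described in the episode could be sketched as follows. The 'data/MobileNet-samples/' directory is an assumed location for the sample images (adjust the path for your own setup), and the final print just demonstrates the [-1, 1] rescaling without needing an image file.

```python
import numpy as np
from tensorflow.keras.applications import mobilenet
from tensorflow.keras.preprocessing import image

# Assumed directory holding the sample images (1.PNG, etc.);
# change this to wherever your images live.
image_path = 'data/MobileNet-samples/'

def prepare_image(file):
    # Load the image from disk, resized to the 224x224 input MobileNet expects
    img = image.load_img(image_path + file, target_size=(224, 224))
    img_array = image.img_to_array(img)            # shape (224, 224, 3)
    img_batch = np.expand_dims(img_array, axis=0)  # shape (1, 224, 224, 3)
    # MobileNet's preprocess_input rescales RGB values from [0, 255] to [-1, 1]
    return mobilenet.preprocess_input(img_batch)

# Quick sanity check of the rescaling, no image file needed:
print(mobilenet.preprocess_input(np.array([[0.0, 127.5, 255.0]])))
```

The extra batch dimension matters because predict expects a batch of images, not a single bare image tensor.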
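The prediction step from the episode can be sketched like this. Since the sample images aren't included here, a random tensor stands in for a preprocessed image (in the episode it would come from prepare_image('1.PNG')); the point is just to show the shapes involved and the decode_predictions call.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import imagenet_utils

# Load pretrained MobileNet (downloads the ImageNet weights on first call)
mobile = tf.keras.applications.mobilenet.MobileNet()

# Stand-in for a preprocessed image: one random 224x224 RGB tensor
# already scaled to [-1, 1], mimicking the output of prepare_image
preprocessed_image = np.random.uniform(-1, 1, (1, 224, 224, 3)).astype('float32')

predictions = mobile.predict(preprocessed_image)   # shape (1, 1000)

# Map the 1000-way softmax output to human-readable ImageNet labels,
# keeping the top 5 classes by default
results = imagenet_utils.decode_predictions(predictions, top=5)
print(results[0])   # list of (class_id, class_name, probability) tuples
```

With a real image instead of the random stand-in, results[0] would contain entries like the episode's American chameleon / green lizard / agama ranking.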
Info
Channel: deeplizard
Views: 40,852
Keywords: deep learning, pytorch, cuda, gpu, cudnn, nvidia, training, train, activation function, AI, artificial intelligence, artificial neural network, autoencoders, batch normalization, clustering, CNN, convolutional neural network, data augmentation, education, Tensorflow.js, fine-tune, image classification, Keras, learning, machine learning, neural net, neural network, Python, relu, Sequential model, SGD, supervised learning, Tensorflow, transfer learning, tutorial, unsupervised learning, TFJS
Id: 5JAZiue-fzY
Length: 14min 3sec (843 seconds)
Published: Thu Sep 03 2020