ml5.js: Image Classification with MobileNet

Captions
[BELL RINGING] Hello. Welcome to the first code tutorial that I am ever making. I did make this kind of long-winded, rambling introduction to what the ML5 library itself is, but in this video, I'm actually going to make a code example that does image classification. So this is going to be our "hello world," "hello to machine learning," first start. It's one of the easiest things you can do with the ML5 library, and it involves-- this is very key-- a pre-trained model. So this is not where everyone starts in tutorials about machine learning. A lot of tutorials about machine learning start with, like I said in the previous video, linear regression and the math and training this. And we're just going to use a pre-trained model. Somebody else made a machine learning model that knows how to recognize the contents of images-- they open-sourced it, and ML5 is providing us access to it within JavaScript to use with the p5 library. So I'm going to write the code for this and talk about how it works, but let's first actually-- if you just even want to just play around with this model to start with, you could do this right here on the ML5 home page. So right here, you can see this image of a macaw. And guess what? The MobileNet model labeled this image as a macaw with a confidence of 98.79%. Oh, this must be the best, smartest machine learning model ever. It's like magic. It just knows everything. And in fact, I can grab this toucan and I can drag it in here, and look at this. It's a toucan with a confidence of 99.99%. This is so smart. So smart. I can't believe it. The MobileNet model is amazing. Now let's get this puffin in here. It's got to know what a puffin is. Look at this. The MobileNet labeled this as an oystercatcher, and its confidence is kind of low. Huh. So what's going on here? Why is it not saying puffin? That's clearly a puffin. Well, first of all, guess what?
A pre-trained model for image classification-- forget about accuracy-- has a fixed number of even things in the world it knows about. And one of the things that's really important if you're working with something like a pre-trained model is actually to know what's in that set. Now, it's not the easiest thing to find. But right here on the ML5 GitHub, there's actually this page of JavaScript code that shows all the things that the MobileNet model happens to know about. In fact, there are 1,000 classes. And you could see this is crazy. It knows about a beagle and a bloodhound and a black and-- and an English foxhound. It's trained to know obscure dog breeds. But if I look for puffin, it's nowhere to be found. Oystercatcher, however, is in there. And if I were to look-- like, what does an oystercatcher look like? It kind of makes sense. That's the thing it knows about that's closest to puffin. So if your project is all about puffins, you're not out of luck. And I will show you ways to train with your own data, to retrain a pre-trained model. But right now, we're at this starting point. So birds aside-- it happens to know a lot about birds. It happens to know a lot about dogs. But let's just say, for the sake of argument, I'm going to now give it this coding train logo. It thinks that's a wall clock, and it has a low confidence. And that-- I don't know, maybe because of what's here, the flat nature of it. I can give it an image of myself. It thinks I'm a snorkel, which strangely is probably quite accurate. I'm kind of snorkel-like in more ways than one. But whatever. Maybe that's too weird. But the point is, it actually knows nothing about people, and it knows nothing about a lot of things that exist in our world. Who made these decisions? What's in the pre-- why are these classes written this way? And what was it even trained with? So first-- so let's talk about for a second supervised learning. What is supervised learning? So image classification.
What's happening with image classification? So the stage that we're going to do in our code, we're really just going to do the prediction stage. So the prediction stage-- we have this pre-trained MobileNet model. The model expects some input, and it expects that input to be an image. So that's the input to the model. Then the model generates an output, and that output is a list of labels-- cat, dog, puffin, et cetera-- and confidence scores-- 98%. Well, I guess the-- do the scores all add up to 100%? I think they should. 90%. 6%. 2%. Et cetera. So this is the stage that we're doing. But how did we even-- how do we get to the point where we could do this ourselves? So somebody had to do this with a process known as supervised learning. I'm going to write this down here-- supervised learning. So supervised learning involves a data set. And this is often referred to as a labeled data set, meaning-- you could think of it as a spreadsheet, a table, a table of images and labels. Essentially, there are two columns here. One column is a lot of examples of inputs, and the other column is a lot of examples of outputs. So I might have cat.jpeg. Oh, and that's a cat. And then I might have kitten.jpeg, and that's also a cat. And then I have puffin.jpeg, and that's a puffin. So the idea here is that there is this existing training data set. This is a training data set. The data set is used to train the MobileNet model. It just says over and over again, here's this image, here's its label. Here's this image. Here's this label. And the process of how that works involves looking at the error. The model tries to make a guess. Does it get it wrong? Since it knows the right answer, it can change its settings to try to get the right answer the next time. This is the process. Once that finishes, usually the model is tested with some other images that weren't used in the training set, and then a paper is written about it and it's published.
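The labeled data set described above can be sketched as a plain JavaScript structure. This is only an illustration of the two-column "inputs and outputs" idea; the file names are made up:

```javascript
// A toy "labeled data set" like the one described above: each row
// pairs an input (an image file name, invented here) with a label.
const trainingData = [
  { input: 'cat.jpeg',    label: 'cat' },
  { input: 'kitten.jpeg', label: 'cat' },
  { input: 'puffin.jpeg', label: 'puffin' },
];

// During training, the model would see each input alongside its known
// label, over and over. Here we just count the examples per label.
const counts = {};
for (const example of trainingData) {
  counts[example.label] = (counts[example.label] || 0) + 1;
}
console.log(counts); // → { cat: 2, puffin: 1 }
```

Real training sets like ImageNet have millions of rows of exactly this shape, which is why collecting and labeling them is so much work.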
Or it might be something that a company owns and keeps closed and doesn't provide access to, using it in its own proprietary software. MobileNet is one that is public. So the important question for us to ask, if we're going to use this pre-trained model and we're going to start trying to understand why it's giving us certain results and what is it good at, what is it not good at, is we need to know, what was that training set? In this case, the training set is a database of images called ImageNet. We can come back to the computer. Sorry. Whoops. We can look at the ImageNet website. And look at this. ImageNet is a database of almost 15 million images. And that seems pretty good, right? A machine learning system, for it to do a good job in the supervised learning process, it needs a lot of data. If we just have 10 images, it's not going to be able to do very much. There's some tricky things we can do that I'll show you later with small data sets, but in general, we need a large data set. So that seems like a good thing. But one thing you have to realize-- what are the motivations of this data set? Where are the images coming from? Well, most of the-- this data set was really put together for research into image classification. And in order to do that, it needs to use, well, images that are in the public domain, that it can get access to. And frankly, almost all of-- it's very likely that if you do a Google search for any of these-- so any of the-- let me just do this really quickly. So I go back to the labels. Let's look for king penguin. So I'm going to do a search for king penguin. I'm going to go to Images. I'm going to grab this image. I'm going to do Save Image As. I'm going to save it to the desktop as Penguin. Then I'm going to go back to ML5. And then I'm going to go here and say-- Look, 100%. Do you know why it's 100%? I am almost sure that that image is in-- I mean, I'm not sure about this, but go and look. That's probably in ImageNet.
So a lot of these things that come up in the first search result that are in the public domain for Wikipedia, they're actually in that image database. So it's going to work really well with something like that versus the image of a puffin that's not in the database. Doesn't even know what a puffin-- So maybe if we could find an image of a penguin that's not in the ImageNet database, it probably would guess it as a penguin. But we wouldn't necessarily see as much confidence. Maybe if we had a zoomed-out picture with a lot more background noise-- something that you could try that's super interesting is to just download this image and draw over it or change a couple of pixels. What happens to the image classification afterwards? All these things you could try. And better yet, you can try them by writing your own code. So I think it's time-- I've rambled on and on for long enough about what this model is. The last thing I'll mention is, if you really want to dive deeply into the MobileNet model, there is a whole paper written about it. There's a GitHub repository. I'll link to all that stuff. I encourage you to research this stuff and to think about it, to ask questions, and to be critical. What are the motivations behind the creation of a model? What data was used in that model? And where might you want to use it or not use it? So all that aside, let's actually write some code to use the model now. We're ready to write the code. Now, what I'm starting with is one JavaScript file with a little bit of code in it. Again, I mentioned I'm using the p5 library. The p5 library supports a setup function which executes when the web page loads. createCanvas makes a canvas. And now I color the background with the color 0, which is black. And if I go into the browser, I am able to see the results of this code because I also happen to be running a local server using a node server package.
Now, there are so many ways that you can run a web page that's running JavaScript. And you could use CodePen or OpenProcessing or something new that's going to be coming out soon, which is a p5 editor that you can use online; when it comes out, I'll link to it in the video description. But for right now, I'm going to use my workflow of having my own text editor on my computer and the p5 libraries imported in the HTML file. So here's the HTML file. The only thing it's doing right now is referencing the p5 library and something called the p5.dom library, which I need for some of the things I want to do. And then also, of course, my own sketch.js. I have a video where I go through my entire workflow in more detailed steps. I will also link to that in this video's description. Now that that's settled, though, I can actually start to use ML5. So how do I use ML5? So right here, I'm on the ML5 homepage. The first place I should go is just click on this big Get Started button. So I'm going to click on that. And I totally clicked around to the wrong place. Let's try that again. I'm going to click on the Get Started button. And I'm going to go right down here, and I'm going to look at this. This is what I need. Now, I can download the ML5 JavaScript library file itself and include it on my computer. And I might want to do that if I'm going to work offline. But I prefer to just reference this particular URL in a script tag. And this is-- at the time of this recording, this is the current version of the ML5 library-- 0.1.1. Now, by the time you're watching this, it might be like 7.8.6. And if that's the case, this video probably means absolutely nothing whatsoever. But if it's relatively close to that version, hopefully this video is still helpful to you. I will not make my joke about your brain being inside of the gelatinous thawing machine thing in this video. I've made it too many times. It's not even a-- it's not even funny. So I'm going to put that in here.
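For reference, an index.html along the lines of the workflow described might look something like this. The script file names and the ml5 CDN URL are illustrative assumptions, not copied from the video; the exact URL for your version comes from the ML5 Get Started page:

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- p5 core and the p5.dom add-on, stored locally -->
    <script src="p5.min.js"></script>
    <script src="p5.dom.min.js"></script>
    <!-- ml5 referenced from a CDN (version as of the recording; check
         the Get Started page for the current one) -->
    <script src="https://unpkg.com/ml5@0.1.1/dist/ml5.min.js"></script>
  </head>
  <body>
    <!-- your own sketch -->
    <script src="sketch.js"></script>
  </body>
</html>
```

Because ml5 comes from a script tag like this, the `ml5` object is simply available as a global inside sketch.js.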
I'm just going to go back to my web page, and I'm going to refresh. It's still working. Good. So now we have the ML5 library imported. We can start calling ML5 functions. So one thing I might actually do is just go back to the ML5 website-- oops, sorry-- which is here, and click on Examples. And I could go down to Image Classification, and then I could start looking at this code and copy-pasting some stuff. I'm going to type it out. And I have some of this in my head, so I can just type it out. But more importantly, probably-- there are these examples which you can use to get started, but I might also suggest just having the web page open to the reference page for a particular feature. So again, I want to have videos that go through all of these features, but let's go through the Image Classifier one. I'm going to click on that. So now, this is the documentation. There's a little bit of an example here, and there's some information about the syntax. And if I get confused, I can refer back to that. But since I know some of this having done this before, I'm going to just start writing directly into sketch.js itself. So the first thing that I need to do is I need to create an image classifier itself. So I'm going to make a variable, and I'm going to call it classifier. And actually, you know what? I'm going to call it mobilenet because I want to remind myself that this is not magic, that this is using a very specific pre-trained model called MobileNet. So I'm going to call my variable mobilenet. Then I'm going to say mobilenet equals ml5.imageClassifier. So this is a function that generates an image classification object. It's going to be stored in that variable mobilenet. And now it needs some arguments. So what goes in there? Now, the truth of the matter is, there might be a variety of things you can put in there. At the time of this recording, there's really only one option, which is the MobileNet model.
So I'm telling ML5 that I want to make an image classifier, and the first argument I'm giving it is a string with the name of the model. Now, in theory, as ML5 supports additional pre-trained models, I might get to type something in here, like UnicornClassifier. Maybe it classifies all different kinds of unicorns. MobileNet. And then I need something else really important. I need a callback. Deep breath, deep breath, deep breath, deep breath. So I've got to stop for a second. Oi. The ML5 library supports callbacks, which is what I'm going to use in these video tutorials, and something called promises. If you don't know what a JavaScript promise is, I will refer you to my playlist about JavaScript promises, and you might look at some of the documentation. But if you're someone who already uses JavaScript promises, it's another way of handling asynchronous events that's slightly different but very similar to callbacks. That's something you can go and do on your own, or I'll make some videos about that too. But right now, I'm going to use the callback methodology. So the idea here is I type in the name of a function. I could also put an anonymous function in there, but I'm going to use a named function called modelReady. And I'm going to write-- I'm just going to put that function up here in the global space because that's going to be very simple. And I'm just going to say console.log, "Model is ready!" So the idea here is I am now creating an image classifier with a MobileNet model. It's going to take some time for it to load that model. This is not a small thing. Now, it's called MobileNet because it's actually a tiny model that can even run on mobile phones. So this code would work well even in a mobile browser. But even the tiniest model is something that's got some size to it. It's going to take a little while to load. So let's go see how long. I go over here, into here. I'm going to hit Refresh. [WHISTLING] Model is Ready. So I don't know-- that was a few seconds.
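The step described so far can be sketched as a small p5 sketch. This assumes p5 and ml5 are loaded via index.html (in the browser, p5 calls setup for you); the canvas size is a guess, not taken from the video:

```javascript
// Create the image classifier from the pre-trained MobileNet model.
// modelReady fires once the model has finished loading -- the model
// file comes from a remote server, so this needs an internet connection.
let mobilenet;

function modelReady() {
  console.log('Model is ready!');
}

function setup() {
  createCanvas(640, 480);
  background(0);
  // First argument: the model name as a string. Second: the callback
  // to run when loading completes.
  mobilenet = ml5.imageClassifier('MobileNet', modelReady);
}
```

Nothing else can happen until modelReady runs, which is why the rest of the example hangs off that callback rather than running straight through setup.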
You can see-- now, one thing that's interesting-- where did it load the model from? Does the ML5 library have a copy of it loaded-- actually, no. So one thing that's important here is a lot of the pre-trained models-- this is going to be different, but a lot of the pre-trained models that you might make use of in ML5 are actually loaded from the cloud, meaning some underground bunker of servers where the model file is stored. And I believe it's coming from a Google server. So if you're not connected to the internet, this example won't even run. At some point, it would be nice to support ways of running the MobileNet model offline. It is certainly technically possible. But the ML5 library-- not all of the examples, but this example in particular requires you to be online. So the model is ready. So now what can I do? I can classify an image. This is fun for me. All right. So let's see. I already have in this directory a bunch of those images that I was experimenting with. Let's start by using the puffin, since that's one that we worked with earlier. Personally, I like to use in my examples stuff that doesn't work because it leads you to ask questions. If stuff just works, it can start to feel like magic, and it's important to think about why and how stuff is working. So what I'm going to do in setup is make a variable called puffin. And I'm going to say, puffin equals createImg. Now, this is a particular function in the p5 library. This is now-- like createCanvas is p5, createImg is p5. This makes a dom element, an image element on the web page that is that image. So if I were to reload this page, you're going to see-- OK. Ah, it's in the directory images, so I need to include that there. You're going to see there it is down there. Now, I can do things, like if I wanted to have this all happen in my canvas, I could say puffin.hide, and then I can actually say image, puffin.
And this won't work, but it'll be interesting to think why for a second. And I want to draw the puffin into canvas. The reason why it didn't work is, once again, things don't happen in JavaScript instantaneously. So it's going to take some time to create that image. I'm going to-- I could give that a callback called imageReady. A lot of this stuff is unnecessary to the example I'm building, but just to demonstrate the idea. And then, now I'm going to draw the image into the canvas. Ah, but that image is really big, so I could resize it. Or I could just force it to be the size of the canvas. And there we go. So I now have a p5 canvas displaying my puffin image, and the model is ready. So now, once the model is ready, what can I do? I can classify the image. I can try to ask MobileNet what's in it. And the way I do that is with a function called predict. So you'll see different names for different functions in the ML5 library. Predict will come up often. It might change to classify by the time you watch this video, but I think it's called predict right now, in this version of ML5. I'm going to say mobilenet.predict(puffin), meaning I want MobileNet to give me a prediction of what it thinks the content of the puffin image is. And then, guess what? That takes some time. It doesn't happen instantaneously. So I need yet again another callback. And I'm going to give it a callback called gotResults. And again, I could do this with a promise if that's something that you're more comfortable with, but I'm going to do it with a callback. And then I'm going to write this function called gotResults. And now, if you've used p5 before, this is very similar to loadJSON. Oh, I want to load the data from this JSON file. And then, you usually have an argument to the callback which has the stuff in it. The stuff that you want fills that argument. But here's something. ML5 works with error-first callbacks.
This is a JavaScript pattern where, when you write the callback, you always must include as the first argument an error argument, an error variable, meaning the library is forcing you to check for errors. So in this case, I might want to say something like, if (error)-- whoops, not a new promise-- console-- and by the way, I could say console.log, but I can actually say console.error(error). And then otherwise, console.log(results). So this is a little bit of extra error handling: if something went wrong in the prediction process, hopefully I would see that there. Otherwise, I'm going to see the results. Let's see if this works. Model is Ready. Model is Ready. And then-- oh, look at this. [MUSIC PLAYING] And here we go. Oystercatcher. And we can see this probability is-- that is crazy. So this is really interesting to me. This got 75%. Then it also thought it was albatross and drake, different probabilities. Now why-- 5%, 11%. Why-- why is this different than what I got with the oystercatcher on the ML5JS home page? I don't know the answer to that. I will have to think about that. It's a different probability. But nonetheless, I'm getting a pretty similar result. It might have to do with versions of something, and that the home page is running a different version of something. But we can see this is the idea. Now what I could do is go and say, all right, let me get the label from results. This, by the way, is an array with three objects in it. So I could say, I want to get the first object, index 0. And I want to get the class name, results[0].className. That's the label. And then I'm just going to draw-- I'm going to say fill(0), textSize(100)-- no, let's do 64. I don't know. And then I'm going to say text(label). And I'll say 10, and like height minus 50 or something. I don't know. I'm just putting it arbitrarily somewhere on the page, on the canvas. So let's run this again. We can see Model Loaded.
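The error-first pattern is easy to demonstrate outside the browser with a stand-in for predict. mockPredict and its hard-coded results are invented here to mimic the shape of ml5's output; real ml5 invokes the callback asynchronously, which is kept synchronous here for brevity:

```javascript
// The error-first callback pattern ml5 uses, shown with a mock
// classifier (not part of ml5; results are hard-coded to match the
// {className, probability} shape the library returns).
function mockPredict(imageName, callback) {
  if (typeof imageName !== 'string') {
    // First callback argument is always a possible error.
    callback(new Error('expected an image'));
    return;
  }
  // Second argument is the data: an array of label/confidence pairs,
  // sorted from most to least confident.
  callback(null, [
    { className: 'oystercatcher', probability: 0.75 },
    { className: 'albatross', probability: 0.11 },
    { className: 'drake', probability: 0.05 },
  ]);
}

function gotResults(error, results) {
  if (error) {
    console.error(error.message); // something went wrong: report it
    return;
  }
  console.log(results[0].className); // top label, e.g. "oystercatcher"
}

mockPredict('puffin.jpg', gotResults);
```

The point of the pattern is that you cannot forget error handling: the error slot is always there, so checking it becomes a habit.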
And then it says oystercatcher right there. How thrilling. We have now run the MobileNet model, gotten a prediction for our image, and there it is. Of course, I could also do something like create a dom element for the label. I could also get the probability by saying equals results[0].-- and what was it called? probability is the other property. And I don't know if I spelled that right. And I could create two dom elements using createP. And we could run this one more time. And now we're going to see that stuff down here. It's very, very small. Probability is not defined. Probability. Probability. Oh, prob. I named my variable prob because I got a lot of probs. So we can see here there's the probability. So this is the basics. Now you see you could actually just go and use this in a project right now. I have a couple of ideas for you. Number one is make a little interface, try different images. Can you make one where you can actually drag and drop your own image like it does on the ML5 home page? Could you make something where you draw on the canvas and it's trying to classify what you're drawing? Another thing you could try is-- can you get it to classify what the webcam is seeing? And guess what? That's what I'm going to do in the next video. So there are so many more things you could do with this, but this is the basic idea. Try your own version of this experiment. Play around. Try a lot of different images. But in the next video, I am going to basically take this exact example but hook it up to the live webcam feed, and we can see it classify images from the webcam in real time. All right. I'm doing this as a Livestream right now. If you're watching this recorded, it's recorded. But I'm going to check to see any questions, and I'll answer those also at the beginning of the next video. See you soon. [BELL RINGING] [MUSIC PLAYING]
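Putting all the pieces from this example together, the finished sketch might look something like this. It assumes p5, p5.dom, and ml5 are loaded via index.html and that an images/puffin.jpg file exists next to the sketch; the canvas size is a guess, and the rest follows the steps in the video:

```javascript
// Full sketch: classify a local image with the pre-trained MobileNet
// model via ml5, and display the top label and its probability.
let mobilenet; // ml5 image classifier backed by MobileNet
let puffin;    // p5 image element

function modelReady() {
  console.log('Model is ready!');
  // Ask MobileNet what it thinks is in the image; this is
  // asynchronous, so results arrive in the gotResults callback.
  mobilenet.predict(puffin, gotResults);
}

function imageReady() {
  // Draw the (hidden) image element onto the canvas, resized to fit.
  image(puffin, 0, 0, width, height);
}

// ml5 uses error-first callbacks: check the error before the results.
function gotResults(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  console.log(results);
  const label = results[0].className;  // top guess, e.g. "oystercatcher"
  const prob = results[0].probability; // its confidence, between 0 and 1
  fill(0); // as in the video; a light fill may be more visible
  textSize(64);
  text(label, 10, height - 50);
  createP(label); // dom elements below the canvas
  createP(prob);
}

function setup() {
  createCanvas(640, 480);
  background(0);
  puffin = createImg('images/puffin.jpg', imageReady);
  puffin.hide(); // keep only the canvas copy visible
  mobilenet = ml5.imageClassifier('MobileNet', modelReady);
}
```

Swapping 'images/puffin.jpg' for any other file is all it takes to experiment with different images, which is a good starting point for the exercises suggested above.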
Info
Channel: The Coding Train
Views: 125,914
Keywords: machine learning basics, machine learning beginner tutorial, machine learning beginner, ml5, ml5.js, machine learning js, machine learning javascript, machine learning javascript tutorial, javascript, itp, shiffman, coding train, machine learning, artificial intelligence, daniel shiffman, the coding train, itp nyu, machine learning tutorial, neural network, neural networks, what is machine learning, javascript (programming language), deep learning, mobilenet, image classification
Id: yNkAuWz5lnY
Length: 24min 20sec (1460 seconds)
Published: Wed Aug 01 2018