Machine Learning Zero to Hero (Google I/O'19)

Captions
[MUSIC PLAYING] LAURENCE MORONEY: So the first question that comes up, of course, is that whenever you see machine learning or you hear about machine learning, it seems to be like this magic wand. Your boss says, put machine learning into your application. Or if you hear about startups, they put machine learning into their pitch somewhere, and then suddenly, they become a viable company. But what is machine learning? What is it really all about? And particularly for coders, what's machine learning all about? Actually, quick show of hands if any of you are coders. Yeah, it's I/O. I guess pretty much all of us are coders. I do talks like this all the time, and sometimes when I ask how many people are coders, three or four hands show up. So it's fun that we can geek out and show a lot of code today. So I wanted to talk about what machine learning is from a coding perspective by picking a scenario. Can you imagine if you were writing a game to play rock, paper, scissors? You'd want to write something so that when you make a rock, a paper, or a scissors with your hand, the computer would recognize it and be able to play with you. Think about what that would be like to actually write code for. You'd have to pull in images from the camera, and you'd have to start looking at the content of those images. And how would you tell the difference between a rock and a scissors? Or between a scissors and a paper? That would end up being a lot of code that you would have to write, and a lot of really complicated code. And it's not only the difference in shapes -- think about the difference in skin tones, male hands and female hands, large hands and small hands, people with gnarly knuckles like me, and people with nice smooth hands like Karmel. So how is it that you would ever write all the code to do this? It would be really, really complicated and, ultimately, not very feasible to write. And this is where we start bringing machine learning into it. This is a very simple scenario, but you can imagine many scenarios where it's really difficult to write code to do something, and machine learning may help with that. And I always like to think of machine learning in this way. Think about traditional programming, something that has been our bread and butter for many years -- all of us here are coders. What we do is express rules in a programming language, like Java, or Kotlin, or Swift, or C++. Those rules generally act on data, and out of that, we get answers. In rock, paper, scissors, the data would be an image, and my rules would be all my if-thens looking at the pixels in that image to try to determine if something is a rock, a paper, or a scissors. Machine learning then turns this around. It flips the axes on this. Instead of having to think up all of these rules and express them in code, what if we could provide a lot of answers, label those answers, and then have a machine infer the rules that map one to the other? So for example, in rock, paper, scissors, we could say, these are the pixels for a rock -- this is what a rock looks like. And we could get hundreds or thousands of images of people making a rock, so we get diverse hands, diverse skin tones, those kinds of things.
And we say, hey, this is what a rock looks like. This is what a paper looks like. And this is what a scissors looks like. And if a computer can then figure out the patterns between these -- if it can be taught, and it can learn what the patterns between these are -- now we have machine learning. Now we have an application, and we have a computer that has determined these things for us. So if we take a look at this diagram again, and instead of us creating the rules we say, OK, this is machine learning: we're going to feed in answers, we're going to feed in data, and the machine is going to infer the rules -- what's that going to look like at runtime? How can I then run an application that looks like this? So this is what we're going to call the training phase. We've trained what's going to be called a model on this. And that model is basically a neural network. I'm going to be talking a lot about neural networks in the next few minutes. But that neural network -- we're going to wrap it and call it a model. And then at runtime, we're going to pass in data, and it's going to give us back something called predictions. So for example, if I've trained it on lots of rocks, lots of papers, and lots of scissors, and then I hold my fist up to a webcam, it's going to get the data of my fist, and it's going to give back what we like to call a prediction. That'll be something like, hey, there's an 80% chance that's a rock, there's a 10% chance it's a paper, and a 10% chance it's a scissors. Something like that. So a lot of the terminology of machine learning is a little bit different from traditional programming. We call it training, rather than coding and compiling. We call it inference, and we get predictions out of inference. So when you hear us using terms like that, that's where it all comes from. It's pretty similar to stuff that you've been doing already with traditional coding, just with slightly different terminology. So I'm going to kick off a demo now where I'm going to train a model for rock, paper, scissors. The demo takes a few minutes to train, so I'm just going to kick it off before I get back to things. So I'm going to start it here. And it's starting. And as it starts to run, I just want to show something as it goes through. If you can imagine a computer, I'm going to give it a whole bunch of data of rock, paper, and scissors, and I'm going to ask it to see if it can figure out the rules for rock, paper, and scissors. For any one individual item of data I give it, there's a one in three chance it gets it right the first time. If it was purely random, and I said, what is this, there's a one in three chance it would get it correct as a rock. So as I start training, one of the things I want you to see here is the accuracy. The first time through, the accuracy was exactly 0.333. Sometimes, when I run this demo, it's a little bit more. But the idea is that when it started training, it was guessing at random -- it's like, OK, I'm just throwing stuff out at random, making guesses -- and it was right about one in three times. As we continue, we'll see that it's actually getting more and more accurate. The second time around, it's now 53% accurate. And as it continues, it will get more and more accurate. But I'm going to switch back to the slides and explain what it's doing before we come back to see it finish. Can we go back to the slides, please? OK.
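The prediction step described above can be sketched in a few lines of code. This is a minimal illustration, not the talk's actual demo code; `model` is assumed to be a trained Keras classifier, and the alphabetical label order (paper, rock, scissors) follows the demo later in the talk:

```python
import numpy as np
import tensorflow as tf

def classify(model: tf.keras.Model, frame: np.ndarray) -> str:
    """Run inference on one 150x150x3 image and report the predictions."""
    labels = ["paper", "rock", "scissors"]  # alphabetical, matching the demo
    # The model expects a batch, so add a leading batch dimension.
    probabilities = model.predict(frame[np.newaxis, ...])[0]
    for label, p in zip(labels, probabilities):
        print(f"{p:.0%} chance it's a {label}")
    return labels[int(np.argmax(probabilities))]
```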
So the code to write something like this looks like this. This is a very simple piece of code for creating a neural network. And what I want you to focus on, first of all, are the things that I've outlined in the red box. These are the input to the neural network and the output coming from the neural network. That's why I love talking about neural networks at I/O -- because I/O: Input/Output. And you'll see I'll talk a lot about inputs and outputs in this. So the input to this is the size of the images. All of the images that I'm going to feed to the neural network of rocks, papers, and scissors are 150 pixels square, with a 3-byte color depth. And that's why you see 150 by 150 by 3. And then the output from this is going to be three things, because we're classifying for three different things: a rock, a paper, or a scissors. So whenever you're looking at a neural network, those are really the first things to look at. What are my inputs? What are my outputs? What do they look like? But then there's this mysterious thing in the middle, where we've created this tf.keras.layers.Dense, and there's a 512 there. And a lot of people wonder, well, what are those 512 things? Let me try to explain that visually. What those 512 things are, in the center of this diagram, is 512 functions. And those functions all have internal variables, which are just going to be initialized to some random state. But what we want is that when we start passing the pixels from the images into these functions, they figure out what outputs, based on those inputs, will give me the desired output at the bottom. So function 0 is going to grab all those pixels. Function 1 is going to grab all those pixels. Function 2 is going to grab all those pixels. And if those pixels are the shape of a rock, then we want the outputs of functions 0, 1, and 2, all the way up to 511, to feed the box on the left at the bottom -- to stick a 1 in that box. And similarly for paper: when the pixels look like this, we want the outputs of F0, F1, and F2 to go to this box. And that's the process of learning. All that learning is, when we talk about machine learning, is setting those internal variables in those functions so that we get the desired output. Now, those internal variables, just to confuse things a little bit more, tend to be called parameters in machine learning parlance. For me, as a programmer, that was hard to understand at first, because for me, parameters are something I pass into a function. But when you hear a machine learning person talk about parameters, they mean the values inside those functions that are going to get set and changed as the network learns how to match those inputs to those outputs. So going back to the code: remember my input shape that I spoke about earlier on, the 150 by 150 by 3 -- those are the pixels that I showed in the previous diagrams. I'm simulating them here with gray boxes. My functions are that dense layer in the middle, those 512. So that's 512 functions, randomly (or semi-randomly) initialized, that I'm going to try to train to match my inputs to my outputs. And then, of course, at the bottom, those three are the three neurons that are going to be my outputs.
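The slide code itself isn't reproduced in the transcript, but a network matching the description -- a 150x150x3 input, 512 dense "functions", and three outputs -- might look like this in Keras. The activation choices here are typical defaults, not taken from the slide:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # Input: 150x150 images with 3 color channels, unrolled into one vector.
    tf.keras.layers.Flatten(input_shape=(150, 150, 3)),
    # The 512 'functions' (neurons) whose internal variables get learned.
    tf.keras.layers.Dense(512, activation='relu'),
    # Output: one value per class (rock, paper, scissors).
    tf.keras.layers.Dense(3, activation='softmax'),
])
```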
And I've just said the word neuron for the first time. But ultimately, when we talk about neurons and neural networks, it's not really anything to do with the human brain. It's a very rough simulation of how the human brain does things. These internal functions that try to figure out how to match the inputs to the outputs, we call neurons. And on my output, those three at the bottom are also going to be neurons. That's what lends the name "neural networks" to this. It tends to sound a little bit mysterious and special when we call it that. But ultimately, just think of them as functions with randomly initialized variables that, over time, change the values of those variables so that the inputs match our desired outputs. So then there's this line, the model.compile line. What's that going to do? It's a kind of fancy term -- it's not really doing compilation, where we turn code into bytecode as before. But think about the two parameters to this, because these are the most important things to learn in machine learning: the loss and the optimizer. Remember, earlier on, I said it's going to randomly initialize all those functions. And if they're randomly initialized and I pass in something that looks like a rock, there's a one in three chance it gets it right as a rock, a paper, or a scissors. What the loss function does is measure the results of all the thousands of times I do that. It figures out how well or how badly the network did. And then, based on that, it passes that data to the other function, which is the optimizer. The optimizer then generates the next guess, where the guess is the set of parameters for those 512 little functions, those 512 neurons. And we keep repeating this: we pass our data in, we make a guess, we see how well or how badly we did, and based on that, we optimize and make another guess. And we repeat, repeat, repeat, until our guesses get better, and better, and better. And that's what happens in model.fit. Here, you can see I have model.fit where epochs -- "ee-pocks" or "epics," depending on how you pronounce it -- is 100. All that's doing is running that cycle 100 times: for every image, take a look at the parameters, fit those parameters, take a guess, measure how good or how bad the guess was, and then repeat. And the optimizer will make it better and better. So you can imagine the first time through, you get it right roughly one in three times. On subsequent passes, it gets closer, and closer, and better, and better.
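A sketch of the compile-and-fit step just described. The talk doesn't name the specific loss or optimizer on the slide, so the choices below are common ones for a three-class image problem; `train_images` and `train_labels` are placeholders for the rock/paper/scissors data:

```python
# The loss measures how well or badly each round of guesses did;
# the optimizer uses that measurement to generate the next guess.
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',  # three one-hot-encoded classes
    metrics=['accuracy'],
)

# Run the guess-measure-optimize cycle 100 times over the data.
model.fit(train_images, train_labels, epochs=100)
```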
Now, those of us who know a little bit about images and image processing will say, OK, that's nice, but it's a little naive. I'm just throwing all of the pixels of the image into a neural network -- and maybe a lot of those pixels aren't even set -- and having it try to figure things out from the raw pixel values. Can I do it a little bit smarter than that? And the answer is yes. One of the ways we can do it smarter is with something called convolutions. Now, convolutions is a convoluted term, if you'll excuse the pun. But if you've ever done any kind of image processing -- the way you sharpen or soften images with tools like Photoshop -- it's exactly the same thing. With a convolution, the idea is that you take a look at every pixel in the image. So for example, in this picture of a hand, I'm looking at one of the pixels on the fingernail, and that pixel has the value 192 in the box on the left here. You take every pixel in the image along with its immediate neighbors, and then you take something called a filter, which is the gray box on the right, and you multiply the value of each pixel by the corresponding value in the filter. You do that for the pixel and all of its neighbors, and sum the results, to get a new value for the pixel. That's what a convolution is. Now, if you've never done this before, you might be sitting there thinking, why on earth would I do that? The reason is that learned convolutions and filters become really, really good at extracting features from an image. Let me give an example. If I take the image on the left here and apply a filter like this one, I get the image on the right. What has happened is that I've thrown away a lot of the noise in the image, and I've been able to detect vertical lines. Simply by applying a filter like this, vertical lines survive the multiplication of the filter. And similarly, if I apply a filter like this one, horizontal lines survive. And there are lots of filters out there that can be randomly initialized and then learned, which do things like picking out items in an image -- eyes, or ears, or fingers, or fingernails, things like that. So that's the idea behind convolutions. Now, the next thing is: if I'm going to be doing lots of processing on my image like this, and I'm going to be training, and I'm going to need hundreds of filters to pick out different features in my image, that's a lot of data to deal with. Wouldn't it be nice if I could compress my images? Compression is achieved through something called pooling. And it's a very, very simple thing -- sometimes a complex-sounding term describes something simple. When I apply, for example, a 2 x 2 pool to an image, it takes the pixels in 2 x 2 blocks. If you look at my left here, where I've got 16 simulated pixels, from the four in the top left-hand corner it picks the biggest value. Then from the next four in the top right-hand corner, it picks the biggest value, and so on. So it effectively throws away 75% of the pixels and just keeps the maximum in each of these 2 x 2 little units. But the impact of that is really interesting when we start combining it with convolutions. If you look at the image I created earlier, where I applied the filter to that picture of a person walking up the stairs, and then I pool it, I get the image on the right, which is one quarter the size of the original image. But not only has it not lost any vital information, it has even enhanced some of the vital information that came out of the convolution. So pooling is your friend when you start using convolutions, because if you have 128 filters, for example, that you apply to your image, you're going to have 128 copies of your image -- 128 times the data. And when you're dealing with thousands of images, that's going to slow down your training time really fast. Pooling really speeds it up by shrinking the size of your images.
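To make the mechanics concrete, here is a small NumPy sketch of both operations: a 3x3 filter multiplied against each pixel's neighborhood and summed, followed by a 2x2 max pool. The vertical-edge filter is a classic example (a Sobel-style kernel), assumed here rather than copied from the slide:

```python
import numpy as np

def convolve(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Multiply each pixel's 3x3 neighborhood by the filter and sum."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))  # border pixels lack full neighborhoods
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = np.sum(image[y:y + 3, x:x + 3] * kernel)
    return out

def max_pool_2x2(image: np.ndarray) -> np.ndarray:
    """Keep the largest value in each 2x2 block, discarding 75% of pixels."""
    h, w = image.shape[0] // 2, image.shape[1] // 2
    return image[:h * 2, :w * 2].reshape(h, 2, w, 2).max(axis=(1, 3))

# A classic vertical-line-detecting filter.
vertical_edges = np.array([[-1.0, 0.0, 1.0],
                           [-2.0, 0.0, 2.0],
                           [-1.0, 0.0, 1.0]])

image = np.random.rand(150, 150)            # stand-in for a grayscale photo
features = convolve(image, vertical_edges)  # 148x148: vertical lines survive
smaller = max_pool_2x2(features)            # 74x74: compressed, features kept
```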
So now, when we want to start learning with a neural network, it's a case of: I've got my image at the top, and I can start applying convolutions to it. For example, my image might be a smiley face. One convolution will keep it as a smiley face. Another might keep the circle outline of the head. Another might change the shape of the head, things like that. And as I apply more and more convolutions and get smaller and smaller images, instead of having one big, fat image that I'm trying to classify -- trying to pick out the features to learn from -- I have lots of little images highlighting the features within it. So for example, in rock, paper, scissors, my convolutions might show, in some cases, five fingers, or four fingers and a thumb, and I know that's going to be a paper. Or they might show none, and I know that's going to be a rock. And that begins to make the machine's learning process much, much simpler. To show this quickly -- I've been putting QR codes on these slides, by the way, because I've open-sourced all the code that I'm showing here and talking through. This QR code links to a workbook where you can train a rock, paper, scissors model for yourself. Once we add convolutions -- earlier on the slide, you saw I had multiple convolutions moving down -- this is what the code for that looks like. I just have a convolution layer followed by a pooling layer, another convolution followed by a pooling, another convolution followed by a pooling, et cetera (see the sketch after the demo below). And remember, at the top, I have my input shape, and at the bottom I have my output, where the Dense equals 3. So I'm going to switch back to the demo now to see if it's finished training. And we can see it. We started off with 33% accuracy, but as we went through the epochs -- I did this one, I think, for 15 epochs -- it got steadily more and more accurate. After 15 loops, it's now 96.83% accurate. So, using these techniques, using convolutions like this, we've been able to train something in just a few minutes to be roughly 97% accurate at detecting rock, paper, and scissors. And if I take a quick plot here, this is a plot of that accuracy -- the red line showing the accuracy, starting at roughly 33% and getting close to 100%. The blue line is a separate data set of rock, paper, scissors that I tested with, just to see how well it's doing. And it's pretty close; I need to do a little bit of work in tweaking it. And I can try an example to show you. I'm going to upload a file -- I'm going to choose a file from my computer. I've nicely named that file Paper, so you can guess it's a paper. If I open that and upload it, it's going to give me an output. And the output is 1, 0, 0. So you might think, ah, it got it wrong -- it detected a rock. But actually, my neurons here, based on the labels, are in alphabetical order. So the order is paper, then rock, then scissors. It actually classified that correctly by giving me a 1 in the first position -- it's a paper. And we can try another one at random. I'll choose a file from my machine, choose a scissors, open it, and run it. And again -- paper, rock, scissors -- we see it classified that correctly.
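The convolution-and-pooling stack Laurence describes, with the dense layers from earlier underneath it, might look like the following. The filter counts (64, 64, 128, 128) are assumptions for illustration; the transcript doesn't state them:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # Convolution followed by pooling, repeated four times.
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Then the same dense classifier as before.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),  # rock, paper, scissors
])
```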
So this workbook is online if you want to download it and have a play with it, to do classification yourself and see how easy it is to train a neural network to do this. And once you have that model, you can implement it in your applications and maybe play rock, paper, scissors in your apps. Can we switch back to the slides, please? So just to quickly show how convolutions really help you with an image, this is what that model looks like when I defined it. And at the top here, it might look like a little bit of a bug at first, if you're not used to doing this. Remember, we said my image is coming in at 150 x 150 -- but it's actually saying, hey, I'm going to pass out an image that's 148 x 148. Anybody know why? Is it a bug? No, it's not a bug. The reason is that if my filter is 3 x 3, then for me to be able to process a pixel, that pixel has to have neighbors. So I have to start one pixel in and one pixel down, and as a result, I have to throw away all the pixels at the top, at the bottom, and on either side of my image. I lose one pixel on all sides, so my 150 x 150 becomes a 148 x 148. And then when I pool that, I halve each of the axes, so it becomes 74 x 74. Through the next iterations, it becomes 36 x 36, then 17 x 17, and then 7 x 7. So if you think about it, all of these 150-square images passing through all of these convolutions come out as lots of little 7 x 7 images. And each of those little 7 x 7 images should be highlighting a feature -- it might be a fingernail, it might be a thumb, it might be the shape of a hand. Those features that come through the convolutions are then passed into the neural network that we saw earlier to generate those parameters, and from those parameters, hopefully, it makes a guess -- a really accurate guess -- about something being a rock, a paper, or a scissors. So if you prefer an IDE instead of using Colab, you can do that also. I really like to use PyCharm for my development. Any PyCharm fans here, out of interest? Yeah, nice. A lot of you. So here's a screenshot of PyCharm from when I was writing this rock, paper, scissors code before I pasted it over to Colab, where you can run it. PyCharm is really, really nice, and you can do things like step-by-step debugging. If we can switch to the demo machine for a moment -- I'll do a quick demo of PyCharm doing step-by-step debugging. So here, we can see we're in rock, paper, scissors. If I hit Debug, I can set breakpoints. So now, I have a breakpoint in my code, and I can start taking a look at what's happening in my neural network code. Here, this is where I'm preloading the data. I can step through, and I can do a lot of debugging to really make sure my neural network is working the way I want it to work. One of the things I hear a lot from developers when they first get started with machine learning is that models seem to be very much a black box: you have all this Python code for training a model, and then you have to do some rough guesswork. With TensorFlow being open source, I can actually step into the TensorFlow code in PyCharm, like I'm doing here, to see how the training is going on and to help me debug my models. And Karmel, later, is also going to show how something called TensorBoard can be used for debugging models. Can we switch back to the slides, please?
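The shrinking shapes walked through above are exactly what model.summary() prints for a stack like the one sketched earlier. Abridged, and with the assumed filter counts from that sketch:

```python
model.summary()
# Layer (type)                  Output Shape
# conv2d (Conv2D)               (None, 148, 148, 64)   <- lose 1 pixel per side
# max_pooling2d (MaxPooling2D)  (None, 74, 74, 64)     <- halve each axis
# conv2d_1 (Conv2D)             (None, 72, 72, 64)
# max_pooling2d_1               (None, 36, 36, 64)
# conv2d_2 (Conv2D)             (None, 34, 34, 128)
# max_pooling2d_2               (None, 17, 17, 128)
# conv2d_3 (Conv2D)             (None, 15, 15, 128)
# max_pooling2d_3               (None, 7, 7, 128)      <- lots of little 7x7 features
```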
So with that in mind, we've gone from just beginning to understand what neural networks are all about and basic "Hello, world!" code, to taking a look at how we can use something called convolutions. They're something that sounds really complicated and really difficult, but once you start using them, you'll see they're actually very easy to use, particularly for image and text classification. And we saw how, in just a few minutes, we were able to train a neural network to recognize rock, paper, and scissors with 97%, 98% accuracy. So that's just getting started. But now, to show us how to actually stretch the framework, make it real, and do really cool, production-quality stuff, Karmel is going to share with us. Thank you. [APPLAUSE] KARMEL ALLISON: Hi. So, quick show of hands: for how many of you was that totally new, and now you're paddling as fast as you can to keep your head above water? All right, a fair number of you. I'm going to go over, now, some of the tools and features that TensorFlow has to take you from when you've actually got your model all the way through production. Don't worry, there is no test at the end. So for those of you who are just trying to keep up right now, track these words and store somewhere in the back of your head that this is all available. For the rest of you, where you've already got a model and you're looking for more that you can do with it, pay attention now. All right. So Laurence went through an image classification problem. In slides, we love image classification problems, because they look nice on slides. But maybe your data isn't an image classification problem. What if you've got categorical data or text-based data? TensorFlow provides a number of tools that allow you to take different data types and transform them before loading them into a machine learning model. In particular, for example, maybe we've got some user clickstreams, and we've got a user ID. Now, if we fed that directly into a deep learning model, our model would expect that it is real-valued and numeric. And it might think that user number 125 has some relation to user 126, even though in reality, that's not true. So we need to be able to take data like this and transform it into data that our model can understand. How do we do that? Well, in TensorFlow, one of the tools that we use extensively inside of Google is feature columns. These are configurations that allow you to define transformations on incoming data. So here, you can see we're taking our user ID and saying, hey, this is a categorical column when we pass in data for it. But we don't want the model to use it as a raw categorical value -- we want to transform it, in this case, into an embedding. You could do a one-hot representation; here, we're going to do an embedding that actually gets learned as we train our model. This embedding, and other columns that you have, can then get fed directly into Keras layers. So here, we have a DenseFeatures layer that's going to take all these transformations and run them when we pass our data through. And this feeds directly downstream into our Keras model, so that when we pass input data through, the transformations happen before we actually start learning from the data. And that ensures that our model is learning what we want it to learn, using real-valued numerical data.
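A sketch of that feature-column flow in TF 2.0-era code, with made-up column names. The hash-bucket handling of raw IDs is an assumption; the point is the categorical-to-embedding transformation feeding a DenseFeatures layer:

```python
import tensorflow as tf

# Declare the incoming 'user_id' field as categorical...
user_id = tf.feature_column.categorical_column_with_hash_bucket(
    'user_id', hash_bucket_size=10_000)
# ...and transform it into an embedding that is learned during training.
user_embedding = tf.feature_column.embedding_column(user_id, dimension=8)

# DenseFeatures applies the transformations before the model learns from the data.
model = tf.keras.Sequential([
    tf.keras.layers.DenseFeatures([user_embedding]),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
```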
And what do you do with that layer once you've got it in your model? Well, in Keras, we provide quite a few layers. Laurence talked you through convolutional layers and pooling layers; those are some of the popular ones in image models. But we've got a whole host of layers, depending on what your needs are -- so many that I couldn't fit them in a single screenshot here. There are RNNs, dropout layers, batch norm, all sorts of sampling layers. So no matter what type of architecture you're building -- whether it's your own small use case, an image classification model, or the latest and greatest research model -- there are a number of built-in layers that are going to make that a lot easier for you. And if you've got a custom use case that's not represented in one of those layers -- maybe you've got custom algorithms or custom functionality -- one of the beauties of Keras is that it makes it easy to subclass layers to build in your own functionality. Here, we've got a Poincare normalization layer, which implements a Poincare embedding. This is not provided out of the box with TensorFlow, but a community member has contributed this layer to the TensorFlow Addons repository, where we provide a number of custom, special-use-case layers. It's both useful, if you need Poincare normalization, and a very good example of how you might write a custom layer to handle your needs if we don't cover them out of the box. Here, you write the call method, which handles the forward pass of this layer. You can check out the TensorFlow Addons repository for more examples of layers like this. In fact, everything in Keras can be subclassed -- or almost everything. You've got metrics, losses, optimizers. If you need functionality that's not provided out of the box, we try to make it easy for you to build on top of what Keras already provides, while still taking advantage of the entire Keras and TensorFlow ecosystem. So here, I'm subclassing a model. If I need some custom forward pass in my model, I'm able to do that easily in the call method. And I can define custom training loops within my custom model. This makes it easy to do -- in this case -- a trivial thing, like multiplying by a magic number. But for a lot of models where you need to do something different from the standard fit loop, you're able to customize in this way and still take advantage of all the tooling that we provide for Keras.
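The subclassing pattern described above, sketched loosely. This is not the actual Poincare normalization code from TensorFlow Addons; it's a simpler stand-in (plain L2 normalization) plus a toy custom model showing where the forward pass goes:

```python
import tensorflow as tf

class SimpleNormalization(tf.keras.layers.Layer):
    """A custom layer: the forward pass lives in `call`."""
    def call(self, inputs):
        return tf.math.l2_normalize(inputs, axis=-1)

class MagicModel(tf.keras.Model):
    """A custom model with its own forward pass."""
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(3, activation='softmax')
        self.magic_number = 42.0  # the 'trivial' customization from the talk

    def call(self, inputs):
        # Any custom logic can happen here before the built-in layers run.
        return self.dense(inputs * self.magic_number)
```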
So one of the problems with custom models and more complicated models is that it's hard to know whether you're actually doing what you think you're doing and whether your model is training. One of the tools we provide for Keras, and TensorFlow more broadly, is TensorBoard. This is a visualization tool. It's web-based, and it runs a server that takes in the data as your model trains, so that you can see in real time, epoch by epoch or step by step, how your model is doing. Here, you can see accuracy and loss as the model trains and converges. This allows you to track your model as you train and ensure that you're actually progressing towards convergence. And when you're using Keras, you also get the full graph of the layers that you've used. You can dig into those and get the op-level graph in TensorFlow. This is really helpful in debugging, to make sure that you've correctly wired your model and you're actually building and training what you think you are. In Keras, adding this is as easy as a few lines of code. Here, we've got our TensorBoard callback that we define and add to our model during training. That's going to write out to the logs, on disk, a bunch of different metrics that then get read by the TensorBoard web GUI. And as an added bonus, you get built-in performance profiling with that. One of the tabs in TensorBoard is going to show you where all of your ops are being placed and where you've got performance bottlenecks. This is extremely useful as you begin to build larger and larger models, because you will see that performance during training can become one of the bottlenecks in your process, and you really want to make that faster. Speaking of performance, this is a plot of how long it takes ResNet-50, one of the most popular machine learning models for image classification, to train using one GPU. Don't even ask how long it takes with one CPU, because nobody likes to sit there and wait until it finishes. But you can see that it takes the better part of a week with one GPU. One of the beauties of deep learning is that it is very easily parallelizable, and so what we want to provide in TensorFlow are ways to take this training pipeline and parallelize it. The way we do that in TensorFlow 2.0 is with a series of distribution strategies. These make it very easy to take your existing model code -- here, we've got a Keras model that looks like many of the others you've seen throughout this talk -- and distribute it over multiple GPUs. So here, we add the mirrored strategy. With these few lines of code, we're now able to distribute our model across multiple GPUs. These strategies have been designed from the ground up to be easy to use, to scale across lots of different architectures, and to give you great out-of-the-box performance. What this is actually doing -- by building our model under the strategy scope, we've taken the model and copied it across all of our different devices. In this picture, let's say we've got four GPUs. We copy our model across those GPUs, and we shard the input data. That means you're actually processing the input in parallel across each of your devices, and in that way, you're able to scale model training approximately linearly with the number of devices you have. So if you've got four GPUs, you can run approximately four times faster. What that ends up looking like on ResNet: you can see that we get great scaling. And out of the box, your variables are getting mirrored and synced across all available devices, and batches are getting prefetched. All of this goes into making your models much more performant during training time, all without changing code when you're using Keras. All right. And mirrored strategy with multiple GPUs is just the beginning. As you scale models -- as we do at Google, for example -- you might want to use multiple nodes and multiple servers, each of which has its own set of GPUs. You can use the multiworker mirrored strategy for that, which takes your model and replicates it across multiple machines, all working synchronously to train your model, mirroring variables across all of them. This allows you to train your model faster than ever before. This API is still experimental as we develop it, but in TensorFlow 2.0, you'll be able to run this out of the box and get that great performance across large-scale clusters.
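Both ideas can be sketched together: a TensorBoard callback logging metrics during training, and a model built under a MirroredStrategy scope so it replicates across local GPUs. The dataset names are placeholders, and the tiny model is illustrative only:

```python
import tensorflow as tf

# Log metrics to disk for the TensorBoard web UI to read.
tensorboard = tf.keras.callbacks.TensorBoard(log_dir='./logs')

# Build and compile the model inside the strategy scope; variables are
# mirrored across devices, and each input batch is sharded between them.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5, callbacks=[tensorboard])
# Then run `tensorboard --logdir ./logs` to watch accuracy and loss live.
```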
All right. So everything I've talked about so far falls under the heading of training models. And you will find that a lot of model builders only ever think about the training portion. But if you've got a machine learning model that you're trying to get into production, you know that's only half the story. There's a whole other half, which is: how do I take what I've learned and actually serve that to customers, or to whoever the end user is? In TensorFlow, the way we do that is to serialize your model into a SavedModel. The SavedModel is the serialized format of your model that integrates with the rest of the TensorFlow ecosystem and allows you to deploy that model into production. So for example, we've got a number of different libraries and utilities that can take this SavedModel. With TensorFlow Serving, we're able to take that model and serve web-based requests. This is what we use at Google for some of our largest-scale systems. TensorFlow Lite is for mobile development. TensorFlow.js is a web-native solution for serving your models. I'm not going to have time to go over all of these in the next few minutes, but I will talk a little more about TensorFlow Serving and TensorFlow Lite. But first, how do you actually get to a SavedModel? Again, in TensorFlow 2.0, this is easy and out of the box: you take your Keras model and call .save. This writes out the TensorFlow SavedModel format -- a serialized version of your model that includes the entire graph, all of the variables and weights, everything that you've learned -- to disk, so that you can take it and, let's say, pass it to somebody else. You can load it back into Python and get all of that Python object state back, as you can see here. You can continue to train and use it; you could fine-tune from it; or you could take that model and load it into TF Serving. TensorFlow Serving responds to gRPC or REST requests. It acts as a front end that takes a request, sends it to your model for inference, and gets the result back. So if you're building a web app for our rock, paper, scissors game, you could take a picture and send it to your server. The server asks the model, hey, what is this?, and sends back the answer based on what the model found. And in that way, you get the full round trip. TensorFlow Serving is what we use internally for many of our largest machine learning models, so it's been optimized for low latency and high throughput. You can check it out at TensorFlow.org. There's an entire suite of production pipelining and processing components that we call TensorFlow Extended, or TFX. You can learn more about those at TensorFlow.org, using that handy-dandy QR code right there. And maybe you've got a model and you've got your web app, but really, you want it on a phone, right? Because the future is mobile, and you want to be able to take this anywhere. So TensorFlow Lite is the library that we provide for converting your SavedModel into a very small footprint, so that it can fit on your mobile device. It can fit on embedded devices -- Raspberry Pis, Edge TPUs. We now run these models across a number of different devices. The way you do this is you take that same SavedModel, from the same model code that you wrote originally, and you use the TF Lite converter, which shrinks the footprint of that model. And then it can be loaded directly onto the device.
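The export path just described, as a short sketch with placeholder paths:

```python
import tensorflow as tf

# Write out the SavedModel: the graph plus all learned variables and weights.
model.save('saved_model/rps')

# Load it back into Python with full object state, e.g. to fine-tune further.
reloaded = tf.keras.models.load_model('saved_model/rps')

# Or convert it with the TF Lite converter to shrink it for on-device use.
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model/rps')
tflite_model = converter.convert()
with open('rps.tflite', 'wb') as f:
    f.write(tflite_model)
```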
And this allows you to run on-device, without internet, without a server in the background, whatever your model is. And you can take it, take TensorFlow, wherever you want to be. Now, we've run through, really quickly, from some machine learning fundamentals, through building your first model, all the way through some of the tools that TensorFlow provides for taking those and deploying those to production. What do you do now? Well, there's a lot more out there. You can go to google.dev. You can go to TensorFlow.org where we've got a great number of tutorials. You can go to GitHub. This is all open source. You can see the different libraries there, ask questions, send PRs. We love PRs. And with that, I'd like to say, thank you. LAURENCE MORONEY: Thank you, very much. KARMEL ALLISON: Laurence, back out. [APPLAUSE] [MUSIC PLAYING]
Info
Channel: TensorFlow
Views: 1,252,698
Rating: 4.9517574 out of 5
Keywords: Conference Talk (Full production)
Id: VwVg9jCtqaU
Length: 35min 32sec (2132 seconds)
Published: Thu May 09 2019