[MUSIC PLAYING] LAURENCE MORONEY: So
the first question that comes out, of
course, is that whenever you see machine learning or you
hear about machine learning, it seems to be like
this magic wand. Your boss says, put machine
learning into your application. Or if you hear
about startups, they put machine learning into
their pitch somewhere. And then suddenly, they
become a viable company. But what is machine learning? What is it really all about? And particularly for
coders, what's machine learning all about? Actually, quick show of hands
if any of you are coders. Yeah, it's I/O. I guess
pretty much all of us, right, are coders. I do talks like
this all the time. And sometimes, I'll ask
how many people are coders, and three or four hands show up. So it's fun that we can geek out
and show a lot of code today. So I wanted to talk about
what machine learning is from a coding perspective
by picking a scenario. Can you imagine if you
were writing a game to play rock,
paper, and scissors? And you wanted to
write something so that you could move your
hand as a rock, a paper, or a scissors. The computer would
recognize that and be able to play that with you. Think about what that would
be like to actually write code for. You'd have to pull in
images from the camera, and you'd have to start looking
at the content of those images. And how would you
tell the difference between a rock and a scissors? Or how would you
tell the difference between a scissors and a paper? That would end up
being a lot of code that you would have
to write, and a lot of really complicated code. And not only the
difference in shapes-- think about the
difference in skin tones, and male hands and female hands,
large hands and small hands, people with gnarly
knuckles like me, and people with nice
smooth hands like Karmel. So how is it that
you would end up being able to write all
the code to do this? It'd be really, really
complicated and, ultimately, not very feasible to write. And this is where
we start bringing machine learning into it. This is a very simple
scenario, but you can imagine that there are
many scenarios where it's really difficult to write
code to do something, and machine learning
may help in that. And I always like to think of
machine learning in this way. Think about traditional
programming. And in traditional programming, which has been our bread and butter for many years-- all of us here are coders-- we think about expressing rules in
a programming language, like Java, or Kotlin,
or Swift, or C++. And those rules
generally act on data. And then out of
that, we get answers. Like in rock, paper, scissors,
the data would be an image. And my rules would
be all my if-thens looking at the pixels in that
image to try and determine if something is a rock,
a paper, or a scissors.
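[Editor's note: a sketch of that traditional, rules-based approach in code. The helper below is hypothetical-- and writing it is exactly the infeasible part being described.]

```python
# Traditional programming: rules we write, acting on data (the pixels),
# producing answers. The helper is hypothetical -- implementing it is
# the hard part.
def count_extended_fingers(pixels):
    raise NotImplementedError("hundreds of lines of fragile pixel logic")

def classify_hand(pixels):
    fingers = count_extended_fingers(pixels)
    if fingers == 0:
        return "rock"
    elif fingers == 2:
        return "scissors"
    elif fingers == 5:
        return "paper"
    # ...plus more rules for skin tones, hand sizes, lighting, angles...
```

Machine learning then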
turns this around. It flips the axes on this. And we say, hey, instead
of doing it this way where it's like we have to
think about all of these rules, and we have to write and express
all of these rules in code, what if we could provide
a lot of answers, and we could label those answers
and then have a machine infer the rules that map
one to the other? So for example, in something
like the rock, paper, and scissors, we
could say, these are the pixels for a rock. And this is what
a rock looks like. And we could get hundreds
or thousands of images of people doing a rock-- so we get diverse hands,
diverse skin tones, those kind of things. And we say, hey, this is
what a rock looks like. This is what a paper looks like. And this is what a
scissors looks like. And if a computer
can then figure out the patterns between
these-- if it can be taught, and it can learn what the patterns are between these-- now, we have machine learning. Now, we have an
application, and we have a computer that has
determined these things for us. So if we take a look
at this diagram again, and we replace what we've been talking about-- us creating rules-- and we say, OK, this is machine learning, we're going to feed in answers, we're going to feed in
data, and the machine is going to infer the rules-- what's that going to
look like at runtime? How can I then
run an application that looks like this? So this is what we're going
to call the training phase. We've trained what's going
to be called a model on this. And that model is
basically a neural network. And I'm going to be talking
a lot about neural networks in the next few minutes. But what that neural network
is-- we're going to wrap that. We're going to
call that a model. And then at runtime, we're
going to pass in data, and it's going to give us out
something called predictions. So for example, if I've
trained it on lots of rocks, lots of papers, and
lots of scissors, and then I'm going to hold
my fist up to a webcam, it's going to get
the data of my fist. And it's going to
give back what we like to call a
prediction that'll be something like, hey, there's
an 80% chance that's a rock. There's a 10% chance it's
a paper, and a 10% chance it's a scissors. Something like that.
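[Editor's note: a minimal sketch of that runtime flow in TensorFlow/Keras, which the talk gets to shortly. The untrained stand-in model and random frame are only there so the snippet runs end to end.]

```python
import numpy as np
import tensorflow as tf

# At runtime, we pass data into the trained model and get predictions back.
model = tf.keras.Sequential([                 # stand-in, untrained model
    tf.keras.layers.Flatten(input_shape=(150, 150, 3)),
    tf.keras.layers.Dense(3, activation='softmax'),
])
frame = np.random.rand(1, 150, 150, 3)        # stand-in for one webcam frame
print(model.predict(frame))  # e.g. [[0.8, 0.1, 0.1]] once trained:
                             # 80% rock, 10% paper, 10% scissors
```

So a lot of the terminology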
of machine learning is a little bit different
from traditional programming. We're calling it
training, rather than coding and compiling. We're calling it inference,
and we're getting predictions out of inference. So when you hear us
using terms like that, that's where it all comes from. It's pretty similar to stuff
that you've been doing already with traditional coding. It's just slightly
different terminology. So I'm going to
kick off a demo now where I'm going to train a model
for rock, paper, and scissors. The demo takes a few
minutes to train, so I'm just going to kick it
off before I get back to things. So I'm going to start it here. And it's starting. And as it starts to run, I
just want to show something as it goes through. So if you can
imagine a computer, I'm going to give it a whole
bunch of data of rock, paper, and scissors, and
I'm going to ask it to see if it can figure
out the rules for rock, paper, and scissors. So any one individual
item of data I give to it, there's
a one in three chance it gets it right the first time. If it was purely random,
and I said, what is this, there's a one in three chance it
would get it correct as a rock. So as I start
training, one of the things I want you to see here is the accuracy. The first time through this, the accuracy was actually-- it was exactly 0.333. Sometimes, when I run this
demo, it's a little bit more. But the idea is once it started training, it was just guessing at random. It's like, OK, I'm just throwing stuff out at random. I'm making guesses. And it was, like,
one in three right. As we continue, we'll
see that it's actually getting more and more accurate. The second time around,
it's now 53% accurate. And as it continues, it will
get more and more accurate. But I'm going to switch
back to the slides and explain what it's
doing before we get back to see that finish. Can we go back to
the slides, please? OK. So the code to be able to
write something like this looks like this. This is a very
simple piece of code for creating a neural network. And what I want you to
focus on, first of all, are these things that I've
outlined in the red box. So these are the input to the
neural network and the output coming from the neural network. That's why I love talking
about neural networks at I/O, because
I/O, Input/Output. And you'll see I'll talk a
lot about inputs and outputs in this. So the input to this is
the size of the images. All of the images
that I'm going to feed to the neural network of
rocks, papers, and scissors are 150 square, and they're
a 3-byte color depth. And that's why you
see 150 by 150 by 3. And then the output
from this is going to be three things,
because we're classifying for three
different things-- a rock, a paper, or a scissors. So always when you're
looking at a neural network, those are really the
first things to look at. What are my inputs? What are my outputs? What do they look like?
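[Editor's note: the slide's code isn't captured in the transcript; the following is a sketch reconstructed from the description-- 150 x 150 x 3 in, 512 in the middle, three classes out.]

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(150, 150, 3)),  # input: the image pixels
    tf.keras.layers.Dense(512, activation='relu'),       # the mysterious 512
    tf.keras.layers.Dense(3, activation='softmax'),      # output: rock, paper, scissors
])
```

But then there's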
this mysterious thing in the middle
where we've created this tf.keras.layers.Dense,
and there's a 512 there. And a lot of people
wonder, well, what are those 512 things? Well, let me try and
explain that visually. So visually, what's going on
is what those 512 things are in the center of this diagram-- consider them to
be 512 functions. And those functions all
have internal variables. And those internal
variables are just going to be initialized
with some random states. But what we want to
do is when we start passing the pixels from
the images into these, we want them to
try and figure out what kind of output,
based on those inputs, will give me the desired
output at the bottom? So function 0 is going
to grab all those pixels. Function 1 is going to
grab all those pixels. Function 2 is going to
grab all those pixels. And if those pixels are
the shape of a rock, then we want the output
of function 0, 1, and 2 all the way up
to 511 to be outputting to the box on the left at
the bottom-- to stick a 1 in that box. And similarly for paper. If we say, OK, when the
pixels look like this, we want your outputs of F0,
F1, and F2 to go to this box. And that's the
process of learning. So all that's happening--
all that learning, when we talk about machine learning-- is setting those internal
variables in those functions so we get that desired output. Now, those internal
variables, just to confuse things
a little bit more, in machine learning parlance,
tend to be called parameters. And so for me, as a programmer,
it was hard at first to understand that. Because for me,
parameters are something I pass into a function. But in this case, when you hear
a machine learning person talk about parameters,
those are the values inside those functions
that are going to get set and going to get
changed as it tries to learn how I'm going to match
those inputs to those outputs. So if I go back to
the code and try to show this again in action-- now, remember, my input shape
that I spoke about earlier on, the 150 by 150 by 3,
those are the pixels that I showed in the preview. I'm simulating them
here with gray boxes, but those are the
pixels that I showed in the previous diagrams. My functions, now, are that dense
layer in the middle, those 512. So that's 512 functions
randomly initialized or semi-randomly
initialized that I'm going to try to train to
match my inputs to my outputs. And then, of course,
the bottom-- those three are the three neurons that
are going to be my outputs. And I've just said the word
neuron for the first time. But ultimately, when
we talk about neurons and neural networks,
it's not really anything to do with the human brain. It's a very rough
simulation of how the human brain does things. And these internal functions
that try and figure out how to match the
inputs to the outputs, we call those neurons. And on my output, those
three at the bottom are also going to
be neurons too. And that's what lends the name
"neural networks" to this. It tends to sound a little
bit mysterious and special when we call it like that. But ultimately, just
think about them as functions with randomly
initialized variables that, over time,
are going to try to change the value
of those variables so that the inputs match
our desired outputs. So then there's this line,
the model.compile line. And what's that going to do? That's a kind of fancy term. It's not really
doing compilation where we're turning code
into bytecode as before. But think about the
two parameters to this. And these are the
most important part to learn in machine
learning-- and these are the loss and the optimizer. So the idea is the
job of these two is-- remember,
earlier on, I said it's going to randomly
initialize all those functions. And if they're
randomly initialized and I pass in something
that looks like a rock, there's a one in
three chance it's going to get it right as a
rock, or a paper, or scissors. So what the Loss
function does is it measures the results of
all the thousands of times I do that. It figures out how well
or how badly it did. And then based on that,
it passes that data to the other function, which is
called the Optimizer function. And the Optimizer function
then generates the next guess where the guess is set to be
the parameters of those 512 little functions,
those 512 neurons. And if we keep repeating
this, we'll pass our data in. We'll take a look. We'll make a guess. We'll see how well
or how badly we did. Then based on that,
we'll optimize, and we'll make another guess. And we'll repeat,
repeat, repeat, until our guesses get better,
and better, and better. And that's what happens
in the model.fit. Here, you can see I have
model.fit where epochs-- "ee-pocks," "epics," depending
on how you pronounce it-- is 100. All it's doing is doing
that cycle 100 times. For every image, take a
look at the parameters. Fit those parameters. Take a guess. Measure how good
or how bad you did, and then repeat and keep going. And the optimizer
then will make it better, and better, and better.
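[Editor's note: a sketch of the compile and fit steps as described, continuing from the model above. The specific loss and optimizer are typical choices, not necessarily the ones on the slide, and the training arrays are stand-ins for the dataset.]

```python
# The loss measures how well or how badly the guesses did; the optimizer
# uses that measurement to generate the next set of parameters.
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Repeat the guess-measure-optimize cycle 100 times over the data.
# `training_images` and `training_labels` stand in for the rock/paper/
# scissors dataset.
model.fit(training_images, training_labels, epochs=100)
```

So you can imagine the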
first time through, you're going to get it right
roughly one in three times. Subsequent times,
it's going to get closer, and closer, and
closer, and better, and better. Now, those of us who know
a little bit about images and image processing
go, OK, that's nice, but it's a little naive. I'm just throwing all of
the pixels of the image-- and maybe a lot of these
pixels aren't even set-- into a neural
network and having it try to figure out the patterns from those pixel values. Can I do it a little
bit smarter than that? And the answer to that is, yes. And one of the ways
that we can do it a little bit smarter than
that is using something called convolutions. Now, convolutions is
a convoluted term, if you'll excuse the pun. But the idea behind
convolutions is if you've ever done any
kind of image processing, the way you can sharpen
images or soften images with things like Photoshop,
it's exactly the same thing. So with a convolution, the
idea is you take a look at every pixel in the image. So for example, this
picture of a hand, and I'm just looking at one of the
pixels on the fingernail. And so that pixel is value 192
in the box on the left here. So you take a look at every pixel in the image, and you look at its immediate neighbors. And then you get something
called a filter, which is the gray box on the right. And you multiply out
the value of the pixel by the corresponding
value in the filter. And you do that for all
of the pixel's neighbors to get a new value
for the pixel. That's what a convolution is.
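[Editor's note: a minimal sketch of that multiplication, using a vertical-line-detecting filter like the one shown a moment later. The neighborhood values are made up.]

```python
import numpy as np

# One convolution step: multiply a pixel and its immediate neighbors by
# the corresponding values in a 3 x 3 filter, and sum the results to get
# a new value for that pixel.
neighborhood = np.array([[188, 190, 192],
                         [185, 192, 195],   # center: the pixel being computed
                         [183, 189, 194]])
filter_ = np.array([[-1, 0, 1],
                    [-2, 0, 2],             # a vertical-line-detecting filter
                    [-1, 0, 1]])
new_pixel = (neighborhood * filter_).sum()
print(new_pixel)  # doing this for every pixel produces the filtered image
```

Now, many of us, if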
you've never done this before, you might be sitting
around thinking, why on earth would I do that? Well, the reason for that is
that when you find the right filters, a convolution becomes really, really good at extracting
features in an image. So let me give an example. So if you look at the
image on the left here and I apply a filter
like this one, I will get the
image on the right. Now, what has happened
here is that, from the image on the left, I've thrown away a
lot of the noise in the image. And I've been able to
detect vertical lines. So just simply by applying
a filter like this, vertical lines are surviving
through the multiplication of the filter. And then similarly, if I
apply a filter like this one, horizontal lines survive. And there are lots
of filters out there that can be randomly
initialized and that can be learned that do
things like picking out items in an image, like
eyes, or ears, or fingers, or fingernails, and
things like that. So that's the idea
behind convolutions. Now, the next thing
is, OK, if I'm going to be doing
lots of processing on my image like this, and I'm
going to be doing training, and I'm going to have to
have hundreds of filters to try and pick out different
features in my image, that's going to be a lot of data
that I have to deal with. And wouldn't it be nice if
I could compress my images? So compression is
achieved through something called pooling. And it's a very,
very simple thing. Sometimes, it seems
a very complex term to describe something simple. But when we talk
about pooling, I'm going to apply, for example,
a 2 x 2 pool to an image. And what that's going
to do is it's going to take the pixels 2 x 2-- like if you look
at my left here, if I've got 16
simulated pixels-- I'm going to take the top four
in the top left-hand corner. And of those four, I'm going
to pick the biggest value. And then the next four on
the top right-hand corner, of those four, I'll
pick the biggest value, and so on, and so on. So what that's going to do
is effectively throw away 75% of my pixels and just keep
the maximums in each of these 2 x 2 little units.
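[Editor's note: a sketch of that 2 x 2 max pooling on 16 simulated pixel values.]

```python
import numpy as np

# 2 x 2 max pooling: keep the biggest value in each 2 x 2 block,
# throwing away 75% of the pixels.
pixels = np.array([[ 0,  64, 128,  32],
                   [48, 192, 144,  16],
                   [ 8,  26,  88,  96],
                   [72,  33,  44,  11]])
pooled = pixels.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[192 144]
               #  [ 72  96]]
```

But the impact of that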
is really interesting when we start combining
it with convolutions. So if you look at the image
that I created earlier on where I applied the filter
to that image of a person walking up the stairs,
and then I pool that, I get the image that's on
the right, which is 1/4 the size of the original image. But not only is it not
losing any vital information, it's even enhancing some of
the vital information that came out of it. So pooling is your friend when
you start using convolutions because, if you have 128
filters, for example, that you apply to
your image, you're going to have 128
copies of your image. You're going to have
128 times the data. And when you're dealing
with thousands of images, that's going to slow down your
training time really fast. But pooling, then,
really speeds it up by shrinking the
size of your image. So now, when we want to start
learning with a neural network, it's a case of, hey,
I've got my image at the top, I can start applying
convolutions to that. Like for example, my image
might be a smiley face. And one convolution will
keep it as a smiley face. Another one might keep the
circle outline of a head. Another one might kind
of change the shape of the head, things like that. And as I start applying more
and more convolutions to these and getting smaller and smaller
images, instead of me now having a big, fat image
that I'm trying to classify, that I'm trying to pick out
the features of to learn from, I can have lots of little images
highlighting features in that. So for example, in
rock, paper, scissors, my convolutions might show,
in some cases, five fingers, or four fingers and a thumb. And I know that that's
going to be a paper. Or it might show none, and I
know that's going to be a rock. And it then begins to make the machine's process of learning these much, much simpler. So to show this quickly-- I've been putting QR codes
on these slides, by the way. So I've open-sourced all the
code that I'm showing here and we're talking through. And this is a QR
code to a workbook where you can train
a rock, paper, scissors model for yourself. But once we do convolutions--
and earlier in the slide, you saw I had multiple
convolutions moving down. And this is what the code
for that would look like. I just have a convolution
layer, followed by a pooling. Another convolution,
followed by a pooling. Another convolution,
followed by a pooling, et cetera, et cetera. So the impact of that--
and remember, first of all, at the top, I have
my input shape, and I have my
output at the bottom where the Dense equals 3.
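[Editor's note: the slide's code isn't captured in the transcript; this is a sketch reconstructed from the description. The filter counts are typical choices, not taken from the slide, but the 3 x 3 filters and 2 x 2 pooling match the shape arithmetic discussed later in the talk.]

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),   # input shape at the top
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),      # output at the bottom
])
```

So I'm going to switch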
back to the demo now to see if it's
finished training. And we can see it. So we started off
with 33% accuracy. But as we went
through the epochs-- I just did this one, I
think, for 15 epochs-- it got steadily, and steadily,
and steadily more accurate. So after 15 loops of doing
this, it's now 96.83% accurate. So as a result, we can see,
using these techniques, using convolutions
like this, we've been actually able to train
something in just a few minutes to be roughly 97% accurate
at detecting rock, paper, and scissors. And if I just take a quick
plot here, we can see this is a plot of that accuracy-- the
red line showing the accuracy where we started at roughly 33%. And we're getting close to 100%. The blue line is where I
have a separate data set of rock, paper, scissors
that I tested with, just to see how well it's doing. And it's pretty close. I need to do a little bit
of work in tweaking it. And I can actually try
an example to show you. So I'm going to upload a file. I'm going to choose a
file from my computer. I've nicely named
that file Paper, so you can guess it's a paper. And if I open that
and upload that, it's going to upload that. And then it's going
to give me an output. And the output is 1, 0, 0. So you think, ah, it got it wrong. It detected it as a rock. But actually, my neurons
here, based on the labels, are in alphabetical order. So the alphabetical order
would be paper, then rock, then scissors. So it actually classified that
correctly by giving me a 1. So it's actually a paper. And we can try
another one at random. I'll choose a file
from my machine. I'll choose a scissors
and open that and run it. And again, paper,
rock, scissors, so we see it actually
classified that correctly. So this workbook
is online if you want to download it
and have a play with it to do classification
yourself and to see how easy it is for you to train
a neural network to do this. And then once you
have that model, you can implement that
model in your applications and then maybe play rock,
paper, scissors in your apps. Can we switch back to
the slides, please? So just to quickly show the
idea of how convolutions really help you with an image,
this is what that model looks like when I defined it. And at the top here, it might
look like a little bit of a bug at first, if you're
not used to doing this. But at the top here-- remember,
we said my image is coming in 150 x 150-- it's actually saying,
hey, I'm going to pass out an image that's 148 x 148. Anybody know why? Is it a bug? No, it's not a bug. OK. So the reason why
is that if my filter is 3 x 3, then for me to start on the image, I have to start one pixel in and one pixel down in order for each pixel to have neighbors. So as a result, I have to
throw away all the pixels at the top, at the bottom,
and either side of my image. So I'm losing one
pixel on all sides. So my 150 x 150
becomes a 148 x 148. And then when I pool that,
I halve each of the axes. So it becomes 74 x 74. Then through the next iteration,
it becomes 36 x 36, then 17 x 17, and then 7 x 7.
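[Editor's note: that shape arithmetic, traced in a few lines, assuming the 3 x 3 filters and 2 x 2 pooling from the model sketched earlier.]

```python
# Each 3 x 3 convolution trims one pixel from every edge (-2 per axis),
# and each 2 x 2 pooling halves both axes.
size = 150
for block in range(4):
    size = (size - 2) // 2
    print(f"after conv/pool block {block + 1}: {size} x {size}")
# -> 74 x 74, 36 x 36, 17 x 17, 7 x 7
```

So if you think about all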
of these 150 squared images passing through all
of these convolutions, they're coming up with lots
of little 7 x 7 things. And those little 7
x 7 things should be highlighting a feature-- it might be a fingernail. It might be a thumb. It might be a shape of a hand. And then those features that
come through the convolutions are then passed into
the neural network that we saw earlier on to
generate those parameters. And then from those
parameters, hopefully, it would make a guess, and
a really accurate guess, about something being a
rock, a paper, or scissors. So if you prefer an IDE
instead of using Colab, you can do that also. I tend to really like to use PyCharm for my development. Any PyCharm fans
here, out of interest? Yeah, nice. A lot of you. So here's a
screenshot of PyCharm when I was writing this rock,
paper, scissors thing before I pasted it over to Colab, where you can run it from Colab. So PyCharm is
really, really nice. And you can do things like
step-by-step debugging. If we can switch to the
demo machine for a moment. Now, I'll do a quick
demo of PyCharm doing step-by-step debugging. So here, we can see we're
in rock, paper, scissors. And for example,
if I hit the Debug, I can even set breakpoints. So now, I have a
breakpoint on my code. So I can start taking a
look at what's happening in my neural network code. Here, this is where I'm
preloading the data into it. And I can step
through, and I can do a lot of debugging
to really make sure my neural network
is working the way that I want it to work. One of the things that I hear a lot from developers when they first get started with machine learning is that models seem to be very much a black box. You have all this Python
code for training a model, and then you have to do
some rough guesswork. With TensorFlow
being open-sourced, I can actually step
into the TensorFlow code in PyCharm, like I'm
doing here, to see how the training is going on,
to help me to debug my models. And Karmel, later,
is also going to show how something called
TensorBoard can be used for debugging models. Can we switch back to
the slides, please? So with that in mind, we've
gone from really just beginning to understand what
neural networks are all about and basic
"Hello, world!" code to taking a look at how we
can use something called convolutions. And they're something that
sounds really complicated and really difficult. But
once you start using them, you'll see they're actually
very, very easy to use, particularly for image
and text classification. And we saw then how,
in just a few minutes, we were able to train
a neural network to be able to recognize rock, paper,
and scissors with 97%, 98% accuracy. So that's just getting started. But now, to show us how to
actually stretch the framework, and to make it real,
and to do really cool and
production-quality stuff, Karmel is going
to share with us. Thank you. [APPLAUSE] KARMEL ALLISON: Hi. So quick show of hands
for how many of you was that totally
new, and now you're paddling as fast as you can
to keep your head above water? All right, a fair number of you. I'm going to go over, now,
some of the tools and features that TensorFlow has to take
you from when you've actually got your model all the
way through production. Don't worry, there is
no test at the end. So for those of you who are just
trying to keep up right now, track these words,
store them somewhere in the back of your head
that this is all available. For the rest of you, who've already got a model and are looking for more
that you can do with it, pay attention now. All right. So Laurence went through an
image classification problem. In slides, we love image
classification problems, because they look
nice on slides. But maybe your data isn't an
image classification problem. What if you've got categorical
data or text-based data? TensorFlow provides
a number of tools that allow you to take
different data types and transform them
before loading them into a machine learning model. In particular, for
example, here, maybe we've got some user
clickstreams, right? And we've got a user ID. Now, if we fed that directly
into a deep learning model, our model would expect that
that is real valued and numeric. And it might think that user
number 125 has some relation to user 126, even though in
reality, that's not true. So we need to be able
to take data like this and transform it into data
that our model can understand. So how do we do that? Well, in TensorFlow,
one of the tools that we use extensively inside
of Google is feature columns. These are configurations
that allow you to configure transformations
on incoming data. So here, you can see we're
taking our categorical column, user ID, and we're
saying, hey, this is a categorical column
when we pass in data for it. And we don't want the model to
use it as a categorical column. We want to transform this, in
this case, into an embedding, right? So you could do a
one-hot representation. Here, we're going to do an
embedding that actually gets learned as we train our model. This embedding and other columns
that you have can then get directly fed into Keras layers. So here, we have a
Dense Features layer that's going to take all
these transformations and run them when we
pass our data through. And this feeds directly
downstream into our Keras model so that when we pass
input data through, the transformations
happen before we actually start learning from the data.
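[Editor's note: a sketch of that feature-column pattern. The key name, bucket count, embedding size, and downstream layers are illustrative assumptions, not taken from the slide.]

```python
import tensorflow as tf

# Declare user_id as categorical, transform it into an embedding that is
# learned during training, and feed it in through a DenseFeatures layer.
user_id = tf.feature_column.categorical_column_with_identity(
    key='user_id', num_buckets=100000)
user_embedding = tf.feature_column.embedding_column(user_id, dimension=8)

model = tf.keras.Sequential([
    tf.keras.layers.DenseFeatures([user_embedding]),  # runs the transformations
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
```

And that ensures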
that our model is learning what we want
it to learn, using real-valued numerical data. And what do you
do with that layer once you've got
it in your model? Well, in Keras, we provide
quite a few layers. Laurence talked you through
convolutional layers, pooling layers. Those are some of the
popular ones in image models. But we've got a
whole host of layers depending on what
your needs are-- so many that I couldn't fit them
in a single screenshot here. But there are RNNs,
dropout layers, batch norm, all sorts
of sampling layers. So no matter what
type of architecture you're building, whether
you're building something for your own small use case
-- an image classification model, whatever it is-- or the latest
and greatest research model, there are a number
of built-in layers that are going to make
that a lot easier for you. And if you've got a custom
use case that's actually not represented in
one of the layers, and maybe you've got
custom algorithms or custom functionality, one of
the beauties of Keras is that it makes it
easy to subclass layers to build in your
own functionality. Here, we've got a Poincare
normalization layer. This represents a
Poincare embedding. This is not provided
out-of-the-box with TensorFlow, but a community member
has contributed this layer to the TensorFlow
add-ons repository, where we provide a number of
custom special use case layers. It's both useful, if you
need Poincare normalization, but also a very good example
of how you might write a custom layer to handle
all of your needs, if we don't have that
out-of-the-box for you. Here, you write the call
method, which handles the forward pass of this layer.
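[Editor's note: not the actual Poincare layer from the add-ons repository-- just a minimal sketch of the subclassing pattern it illustrates, here doing a simple L2 normalization.]

```python
import tensorflow as tf

class SimpleNormalization(tf.keras.layers.Layer):
    """A custom layer: the call method handles the forward pass."""

    def __init__(self, epsilon=1e-6, **kwargs):
        super().__init__(**kwargs)
        self.epsilon = epsilon

    def call(self, inputs):
        # Scale each input vector to unit length.
        norm = tf.norm(inputs, axis=-1, keepdims=True)
        return inputs / (norm + self.epsilon)
```

So you can check out the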
TensorFlow add-ons repository for more examples
of layers like this. In fact, everything in
Keras can be subclassed, or almost everything. You've got metrics,
losses, optimizers. If you need functionality that's
not provided out-of-the-box, we try to make it easy for you
to build on top of what Keras already provides, while still
taking advantage of the entire Keras and TensorFlow ecosystem. So here, I'm
subclassing a model. So if I need some custom
forward pass in my model, I'm able to do that
easily in the call method. And I can define custom training
loops within my custom model. This makes it easy to do-- in
this case, a trivial thing, like multiplying by a magic number.
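[Editor's note: a sketch of that subclassed model, with the trivial magic-number forward pass described; the layer size and the number itself are made up.]

```python
import tensorflow as tf

class MagicModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(3, activation='softmax')
        self.magic_number = 42.0  # the "magic number"

    def call(self, inputs):
        # Custom forward pass: scale the inputs, then classify.
        return self.dense(inputs * self.magic_number)
```

But for a lot of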
models where you need to do something that's
different than the standard fit loop, you're able to
customize in this way and still take advantage
of all the tooling that we provide for Keras. So one of the problems
with custom models and more complicated
models is it's hard to know whether
you're actually doing what you
think you're doing and whether your
model is training. One of the tools we provide
for Keras, and TensorFlow more broadly, is TensorBoard. This is a visualization tool. It's web based, and
it runs a server that will take in the
data as your model trains so that you can see, in real
time, epoch by epoch, or step by step, how
your model is doing. Here, you can see accuracy
and loss as the model trains and converges. And this allows you to track
your model as you train and ensure that you're
actually progressing towards convergence. And when you're using
Keras, you can also see that you get the full graph
of the layers that you've used. You can dig into those and
actually get the op-level graph in TensorFlow. And this is really helpful
in debugging, to make sure that you've correctly
wired your model and you're actually
building and training what you think you are training. In Keras, the way
you add this is as easy as a few lines of code. Here, we've got our TensorBoard
callback that we define. We add that to our
model during training. And that's going to
write out to the logs, to disk, a bunch of
different metrics that then get read in by
the TensorBoard web GUI.
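[Editor's note: a sketch of those few lines of code. The stand-in model, data, and log directory are placeholders so the snippet runs end to end.]

```python
import numpy as np
import tensorflow as tf

# A stand-in model and data, just so this runs on its own.
model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation='softmax',
                                                   input_shape=(4,))])
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
data = np.random.rand(32, 4)
labels = tf.keras.utils.to_categorical(np.random.randint(3, size=32))

# The callback writes metrics to the logs on disk as the model trains;
# the TensorBoard web GUI reads them from there.
tensorboard = tf.keras.callbacks.TensorBoard(log_dir='./logs')
model.fit(data, labels, epochs=15, callbacks=[tensorboard])
# Then, from a terminal: tensorboard --logdir ./logs
```

And as an added bonus, you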
get built-in performance profiling with that. So one of the tabs
in TensorBoard is going to show you
where all of your ops are being placed, where you've
got performance bottlenecks. This is extremely
useful as you begin to build larger and more complex models, because you will see that performance
during training can become one of the
bottlenecks in your process. And you really want
to make that faster. Speaking of performance,
this is a plot of how long it takes ResNet-50,
one of the most popular machine learning models for
image classification, to train using one GPU. Don't even ask how long
it takes with one CPU, because nobody likes
to sit there and wait until it finishes. But you can see that
it takes a better part of a week with one GPU. One of the beauties
of deep learning is that it is very
easily parallelizable. And so what we want to
provide in TensorFlow are ways to take this training
pipeline and parallelize it. The way we do that in
TensorFlow 2.0 is we're providing a series of
distribution strategies. These are going to make it
very easy for you to take your existing model code. Here, we've got a
Keras model that looks like many of
the others you've seen throughout this talk. And we're going to distribute
it over multiple GPUs. So here, we add the
mirrored strategy. With these few
lines of code, we're now able to distribute our
model across multiple GPUs.
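[Editor's note: a sketch of those few lines-- the model inside the scope is a placeholder.]

```python
import tensorflow as tf

# Build and compile the model inside the strategy's scope, and training
# is mirrored across all locally available GPUs, with the input sharded.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation='relu', input_shape=(64,)),
        tf.keras.layers.Dense(3, activation='softmax'),
    ])
    model.compile(loss='categorical_crossentropy', optimizer='adam')
```

These strategies have been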
designed from the ground up to be easy to use and to
scale with lots of different architectures and to give
you great out-of-the-box performance. So what this is actually doing-- here, you can see that with
those few lines of code, by building our model under the
strategy scope, what we've done is we've taken the model,
we've copied it across all of our different devices. In this picture, let's
say we've got four GPUs. We copy our model
across those GPUs, and we shard the input data. That means that
you're actually going to be processing the
input in parallel across each of your
different devices. And in that way,
you're able to scale model training approximately
linearly with the number of devices you have. So if you've got four GPUs,
you can run approximately four times faster. What that ends up
looking like-- on ResNet, you can see that we
get great scaling. And just out-of-the-box, what
you're getting with that is that your variables are getting
mirrored and synced across all available devices. Batches are getting prefetched. All of this goes into
making your models much more performant during training
time, all without changing code when you're using Keras. All right. And mirrored strategy with multiple
GPUs is just the beginning. As you scale models, as we
do at Google, for example, you might want to use multiple
nodes and multiple servers, each of which has
their own set of GPUs. You can use the multiworker
mirrored strategy for that, which is going
to take your model, replicate it across
multiple machines, all working synchronously
to train your model, mirroring variables
across all of them. This allows you to train your
model faster than ever before. And this API is
still experimental, as we're developing it. But in TensorFlow 2.0, you'll be
able to run this out-of-the-box and get that great performance
across large scale clusters. All right. So everything I've
talked about so far falls under the heading
of training models. And you will find that
a lot of model builders only ever think about
the training portion. But if you've got a
machine learning model that you're trying to
get into production, you know that's
only half the story. There's a whole other
half, which is, well, how do I take what I've
learned and actually serve that to customers or to
whoever the end user is, right? In TensorFlow, the way
we do that is you're going to have to serialize
your model into a saved model. This saved model becomes the
serialized format of your model that then integrates with
the rest of the TensorFlow ecosystem. That allows you to deploy
that model into production. So for example, we've got a
number of different libraries and utilities that can
take this saved model. For TensorFlow
Serving, we're going to be able to take
that model and do web-based serving requests. This is what we use at Google
for some of our largest scale systems. TensorFlow Lite is for
mobile development. TensorFlow.js is a
web-native solution for serving your models. I'm not going to
have time to go over all of these in the
next few minutes, but I will talk about TensorFlow
Serving and TensorFlow Lite a little bit more. But first, how do you
actually get to a saved model? Again, in TensorFlow 2.0,
this is going to be easy and out-of-the-box where you're
going to take your Keras model, you call .save. And this is going to write
out the TensorFlow saved model format. This is a serialized
version of your model. It includes the entire graph,
and all of the variables, and weights, and everything
that you've learned, and it writes that out to
disk so that you can take it, pass it to somebody
else, let's say. You can load it
back into Python. You're going to get all of
that Python object state back, as you can see here. And you could continue to
train, continue to use that. You could fine-tune
based on that. Or you could take that model
and load it into TF Serving. So TensorFlow Serving responds
to gRPC or REST requests. It acts as a front end
that takes the requests, it sends them to your
model for inference. It's going to get
the result back. So if you're building a web
app for our rock, paper, scissors game, you
could take a picture, send it to your server. The server is going to ask
the model, hey, what is this? Send back the answer, based
on what the model found. And in that way, you get
that full round trip. TensorFlow Serving is
what we use internally for many of our largest
machine learning models. So it's been optimized to
have low latency and high throughput. You can check it out
at TensorFlow.org. There's an entire suite
of production pipelining and processing
components that we call TensorFlow Extended, or TFX. You can learn more about
those at TensorFlow.org, using that handy dandy
QR code right there. And maybe you've got a model,
and you've got your web app. But really, you want
it on a phone, right? Because the future is mobile. You want to be able
to take this anywhere. So TensorFlow Lite
is the library that we provide for converting
your saved model into a very tiny, small footprint, so that it can fit on
your mobile device. It can fit on embedded devices-- Raspberry Pis, Edge TPUs. We now run these models across
a number of different devices. The way you do this is you
take that same saved model from that same model code
that you wrote originally. You use the TF Lite
converter, which shrinks the footprint of that model. And then it can be loaded
directly onto device.
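[Editor's note: a sketch of that conversion-- the paths are arbitrary, reusing the SavedModel written out in the earlier sketch.]

```python
import tensorflow as tf

# Convert the saved model into the small-footprint TF Lite format, then
# write it out so it can be loaded directly onto a device.
converter = tf.lite.TFLiteConverter.from_saved_model(
    'saved_model/rock_paper_scissors')
tflite_model = converter.convert()
with open('rock_paper_scissors.tflite', 'wb') as f:
    f.write(tflite_model)
```

And this allows you to run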
on-device, without internet, without a server in the
background, whatever your model is. And you can take
it, take TensorFlow, wherever you want to be. Now, we've run through,
really quickly, from some machine
learning fundamentals, through building your
first model, all the way through some of the
tools that TensorFlow provides for taking those and
deploying those to production. What do you do now? Well, there's a
lot more out there. You can go to google.dev. You can go to
TensorFlow.org where we've got a great
number of tutorials. You can go to GitHub. This is all open source. You can see the different
libraries there, ask questions, send PRs. We love PRs. And with that, I'd
like to say, thank you. LAURENCE MORONEY:
Thank you, very much. KARMEL ALLISON:
Laurence, back out. [APPLAUSE] [MUSIC PLAYING]