[MUSIC PLAYING] SARA ROBINSON: Hello, everyone. Welcome to Intro
to Machine Learning on Google Cloud Platform. My name is Sara Robinson. I'm a developer advocate on
the Google Cloud Platform Team, and I focus on machine learning. You can find me on
Twitter at @SRobTweets. So let's dive right
in by talking about-- what is machine learning? So at a high level,
machine learning involves teaching computers
to recognize patterns in the same way
that our brains do. So as humans, it's
really easy for us to distinguish between
a cat and a dog, but it's much more
difficult to teach a machine to be able to do the same thing. In this talk, I'm going
to focus on what's called supervised learning. This is when,
during training, you give your model labeled input. And we can think of almost
any supervised learning problem in this way. So we provide labeled
inputs to our model, and then our model
outputs a prediction. Now, the amount that we know
about how our model works is going to depend
on the tool we choose to use for the job and
the type of machine learning problem
we're trying to solve. So that was a high
level overview, but how do we actually get
from input to prediction? Again, this is going to
depend on the type of machine learning problem
we're trying to solve. So on the left-hand
side, let's say you're solving a generic
task that someone has already solved before. In that case, you don't
need to start from scratch. But on the other
side of the spectrum, let's say you're solving
a custom task that's very specific to your data set. Then you're going to
need to build and train your own model from scratch. So more specifically,
let's think of this in terms of
image classification. So let's say we want
to label this image as a picture of a cat. This is a really common
machine learning task. There's tons of
models out there that exist to help us label images. So we can use one of
these pre-trained models. We don't need to
start from scratch. But let's say, on
the other hand, that this cat's name is
Chloe, this is our cat, and we want to
identify her apart from other cats across
our entire image library. So we're going to need to train
a model using our own data from scratch so that it
can differentiate Chloe from other cats. More specifically, let's
say that we want our model to return a bounding box showing
where she is in that picture. We're also going to need to
train a model on our own data. Let's also think about this
in terms of a natural language processing problem. So let's say we have this
text from one of my tweets, and I want to extract the
parts of speech from that text. This is a really common natural
language processing task, so I don't need to
start from scratch. I can utilize an existing
NLP model to help me do this. But let's say, on
the other hand, that I want to take
the same tweet. And I want my model
to output these tags. I want my model
to know that this is a tweet about programming,
and more specifically is a tweet about Google Cloud. I'm gonna need to train my
model on thousands of tweets about each of these
things so that it can generate these predictions. So many people see the
term, machine learning, and they're a little
bit scared off. They think it's something
that's only for experts. Now, if we look back
about 60 years ago, this was definitely the case. This is a picture of the
first neural network invented in 1957 called a perceptron. And this was a device that
demonstrated an ability to identify different shapes. So back then, if you wanted
to work on machine learning, you needed access to extensive
academic and computing resources. But if we fast
forward to today, we can see that just in the
last five or 10 years, the number of products at Google
that are using machine learning has grown dramatically. And at Google, we want
to put machine learning into the hands of any
developer and data scientist with a computer and a
machine learning problem that they want to solve. That's all of you. We don't think machine
learning should be something only for experts. So if you think about how you're
doing machine learning today, maybe you're using a
framework like Scikit-learn, XGBoost, Keras, or
TensorFlow, maybe you're writing your code in
Jupyter Notebooks, maybe you're just experimenting
with different types of models, building proofs of
concept, or maybe you've already built a model
and you're working on scaling it for production. So what I want you to take
away from this talk is that no matter what your existing
machine learning toolkit is, we've got something to help
you on Google Cloud Platform. And that's what I'm gonna
talk to you about today. So we have a whole spectrum
of machine learning products on GCP. On the left-hand side,
we have products targeted towards application developers. To use these, you need little to
no machine learning experience. On the right-hand
side of the spectrum, we have products that
are targeted more towards data scientists and
machine learning practitioners. So the first set of
products I'll talk about is our machine learning APIs. These are APIs that give you
access to pre-trained models with a single REST API request. As we move toward the middle,
we have a new product, which I'm super
excited about, that we announced earlier this year
in January called AutoML. It's currently an alpha,
and the first AutoML product we've announced
is AutoML Vision. So this lets you build a
custom image classification model trained on your own
data without requiring you to write any of the model code. As we move further to the right,
towards more custom models, if you want to build
your model in TensorFlow, we have a service called
Cloud Machine Learning Engine to let you train and
serve your model at scale. Then a couple of months ago,
we announced an open source project called Kubeflow. This lets you run your
machine learning workloads on Kubernetes. And then finally, our most
do-it-yourself solution is-- let's say you
have a machine learning framework other than TensorFlow
and you want to run it on GCP. You can do this using Google
Compute Engine or Google Kubernetes Engine. But we don't have
tons of time today, so we're going to focus
just on three of these. And let's dive right in,
starting with machine learning as an API. On Google Cloud Platform,
we have five APIs that give you access
to pre-trained models to accomplish common
machine learning tasks. So they let you do things like
analyzing images, analyzing videos, converting
audio to text, analyzing that text further,
and then translating that text. I'm going to show you
just two of them today. So let's start
with Cloud Vision. This is everything the
Vision API lets you do. So at its core, the Vision
API provides label detection. This will tell you-- what
is this a picture of? So for this image, it might
return elephant, animal, et cetera. Web detection goes
one step further. It will search the web
for similar images. And it will give you labels
based on what it finds. OCR stands for Optical
Character Recognition. This will let you find
text in your images, tells you where that text is
and what language it's in. Logo detection will identify
common logos in an image. Landmark detection will
find common landmarks. Crop hints will suggest crop
dimensions for your photos. And then explicit
content detection will tell you-- is this
image appropriate or not. This is what a request to
the Vision API looks like. So, again, you don't
need to know anything about how that pre-trained
model works under the hood. You pass it either a URL to
your image in Google Cloud Storage or just the
base64 encoded image, and then you tell it what
types of feature detection you want to run. Now, it's just a REST
API, so you can call it in any language you'd like. I have an example in Python here using our Google Cloud client library for Python. I create an ImageAnnotatorClient, and then I run label detection and analyze the response.
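Roughly, that call looks something like this (the bucket path here is just a made-up example, and the exact types module can vary by client library version):

    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    # Point the API at an image in Cloud Storage (hypothetical path)
    image = vision.types.Image()
    image.source.image_uri = 'gs://my-bucket/hamilton-selfie.jpg'

    # Run just the label detection feature and print what comes back
    response = client.label_detection(image=image)
    for label in response.label_annotations:
        print(label.description, label.score)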
I'm guessing many of you heard about ML Kit, announced a couple of days ago. So if you want to use
ML Kit for Firebase, you can easily
call the Vision API from your Android or iOS app. And this is an example
of calling it in Swift. So I don't like to get
too far into a talk without showing a live demo. So if we could
switch to the demo. And what we have here
is the product page for the Vision API. Here we can upload
images and see how the Vision API responds. So I'm going to
upload an image here. This is a selfie that
I took seeing Hamilton. I live in New York. I was super excited to have
scored tickets to Hamilton. And I wanted to
see what the Vision API said about my selfie. So let's see what
we get back here. It's cranking. Live demos, you never
know how they'll go. There we go, OK. So we can see that it
found my face in the image. So it's able to identify
where my face is, different features in
my face, and it's also able to detect emotion. So we can see that joy
is very likely here. I was super excited
to be seeing Hamilton. What I didn't notice at
first about this image is that it has some text in it. When I sent it to
the Vision API, I didn't even notice
that there was text here, but we can see that
the API, using OCR, is able to extract that
Playbill text from my image. And then finally
in the browser, we can see the entire JSON
response we get back from the Vision API. So this is a great way
to try out the API. Before you start
writing any code, upload your images
in the browser, see the type of
response you get back, and if it's right for your app. And I'll provide a link
to this at the end. So that is the Vision API. If we could go
back to the slides. Next, I'm gonna talk
about the Natural Language API, which lets you analyze text
with a single REST API request. So the first thing
it lets you do is extract key
entities from text. It also tells you whether your
text is positive or negative. And then if you want to get
more into the linguistic details of your text, you can use
the syntax analysis method. And finally, the newest
feature of this API is content classification. It will classify
your text into one or more of over 700 different categories
that we have available. Here is some Python code to call the Natural Language API. It's going to look pretty similar to the Vision API code we saw on the previous page. And again, we don't need to know anything about how this model works. We just send it our text and get back the result from the model.
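As a rough sketch with the google-cloud-language client (the text is just an example, and details vary by library version):

    from google.cloud import language

    client = language.LanguageServiceClient()

    document = language.types.Document(
        content='I loved Google I/O, but the ML talk was just OK.',
        type=language.enums.Document.Type.PLAIN_TEXT)

    # Entity and sentiment analysis are each a single call
    entities = client.analyze_entities(document=document).entities
    sentiment = client.analyze_sentiment(document=document).document_sentiment
    print(sentiment.score, sentiment.magnitude)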
Now, let's jump to a demo of the Natural Language API. So again, this is our products
page for the Natural Language API. And here we can enter text
directly in this textbox and see how the Natural
Language API responds. So I'm going to say,
"I loved Google I/O, but the ML talk was just OK." We'll see what it says. This is a review I
might find on a session. And let's say that I didn't want
to go through all the sessions, but I wanted to
extract key entities and see what the sentiment was. So we can see here it's
extracted two entities, Google I/O and ML talk, and we get
a score for each entity. The score is a value
from negative one to one that'll tell us overall
how positive or negative the sentiment is for this entity. So Google I/O got 0.8,
almost fully positive. We even get a link to the
Wikipedia page for Google I/O. And then ML talk got
a neutral score right in the middle of zero
because it was just OK. We also get the aggregate
sentiment for this sentence. And we can also look at
the syntax and details, see which words depend on other
words, get the parts of speech for all the words in our text. And then if our text was
longer than 20 words, we can make use of this
content categorization feature, which you can also
try right in the browser. And you can see a list
of all the categories that are available for that. So that is the
Natural Language API. If we could go
back to the slides. So I'm gonna talk briefly
about some companies that are using these APIs in production. Giphy is a website that
lets you search for GIFs and share them across the web. They use the Vision API's
optical character recognition feature to add search
by text functionality to all of their GIFs. So as you know, lots of
GIFs have text in them. Before they used the Vision API,
you couldn't search by text, and now they've dramatically
increased the accuracy of their search results. Hearst is a large
publishing company. And they're using the Natural
Language API's content categorization
method across over 30 of their media properties to
categorize their news articles. Descript is a new app
that lets you transcribe meetings and interviews. And they're using the Speech
API for that transcription. So all of these three companies
are using just one API. We have also seen
a lot of examples of companies combining
different machine learning APIs. So Seenit is a crowdsourced video platform that gets thousands of videos uploaded daily, so they don't have time to manually tag those videos. So they're using a combination
of Video Intelligence, Vision, Speech, and Natural Language
to automatically tag all of that video content that
they're seeing on the platform. Maslo is a new mobile app. It's an audio journaling app. So you can enter your journal
entries through audio. And they're using the Speech
API to transcribe that audio. And then they're using
the Natural Language API to extract key
entities, give you some insights about
your journal entries, and they're doing all that
processing with Cloud Functions and storing the
data in Firestore. So those are the machine learning APIs. So all of the products
I talked about so far have kind of abstracted
that model for you. And a lot of times when
I present on the APIs, people ask me, those
APIs sound great, but what if I want to train
them on my own custom data? So we have this new
product that I mentioned, called AutoML Vision. It's currently an
alpha, so you need to be whitelisted to use it. This lets you train an
image classification model on your own image data. And this is best
seen with a demo, so I'm gonna introduce it first. So for this demo, let's say
that I'm a meteorologist. Maybe I work at a
weather company. And I want to predict
weather trends and flight plans from images of clouds. So the obvious next question
is, can we use the cloud to analyze clouds? And as I've learned, there's
many, many different types of clouds, and they all indicate
different weather patterns. So when I first started
thinking about this demo, I thought, maybe I should
try the Vision API first and see what I get back. So as humans, if we look
at these two images, it's pretty obvious to us
that these are completely different types of clouds. The Vision API,
we wouldn't expect it to know specifically what
type of cloud these images are. It was trained across a
wide variety of images to recognize all sorts of
different products and categories, but nothing as specific as
cloud types for these images. So you can see, even
for these images of obviously different
clouds, we get back pretty similar labels-- sky, cloud, blue, et cetera. So this is where AutoML
Vision comes in really handy. AutoML Vision provides
a UI to help us with every step of training
our machine learning model, from importing the data,
labeling it, training it, and then generating
predictions using a REST API. So the best way to see it is
by, again, jumping to a demo. So here we have the
UI for AutoML Vision. Again, it takes us through every
step of building our model. So the first step here
is importing our data. To do that, we put our data
in Google Cloud Storage, we put our images in
Google Cloud Storage, and then we just
create a CSV, where the first column of the CSV is going to be the URL of our image, and then the next column will be the label or labels associated with that image. One image can have multiple labels.
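As a quick hypothetical sketch, that CSV might look something like this (made-up bucket and file names):

    gs://my-bucket/clouds/cloud1.jpg,cirrus
    gs://my-bucket/clouds/cloud2.jpg,cumulus
    gs://my-bucket/clouds/cloud3.jpg,cumulus,altocumulus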
So then we upload our images here. I've already done
that for this example. Let me move over to
the Labeling tab. And in this model, I've got
five different types of clouds that I'm classifying. We can see them here, and
we can see how many images I have for each one. AutoML Vision only requires
10 images per label to start training, but
they recommend at least 100 for high quality predictions. So the next step is to
review my Image labels. So I can see all my
cloud pictures here. I can see what label they are. If this one was incorrect,
I could go in here and switch it out. Now, let's say that I didn't have time to label my data set, or I have a massive image data set and don't have time to label it. I can make use of this human
labeling service, which gives you access to a set of
in-house human labelers that will label your images for you. And then in just
a couple of days you'll get back a
labeled image data set. So the next part, once you've
got your labeled images, is to train your
model, and you can choose between a base
or advanced model. I'll talk about that
more in a moment. And to train it, all you do
is press this Train button. It's as simple as that. Once your model is trained,
you'll get an email. And the next thing
you want to do is you want to see how
this model performed using some common
machine learning metrics. So I'm not going to go
into all of them here, but I do want to highlight
the confusion matrix. It may look confusing, and it is called a confusion matrix after all, but let's take a look
at what it means. So for a good confusion matrix,
what we ideally want to see is a strong diagonal
from the top left. So what this is telling
us is when we uploaded our images to AutoML,
it automatically split our images for us
into training and testing. So it took most of our images,
used those to train the model, and then it reserved
a subset of our images to see how the model performed
on images that it had never seen before. So what this is telling us is
that for all of our altocumulus cloud images in our
test set, our model was able to identify
76% of them correctly, which is pretty good. Now on the Train tab, you saw
the base and advanced models, and I've actually trained both. So we're looking at the
advanced model here. I can use the UI to see how
different versions of my model performed and compare it. So we would expect that the
advanced model would perform better across the board. So let's take a look. So it looks like it did, indeed,
perform a lot better for almost all of the categories here. You can see a 23% increase for
this one, 11% for this one, but, hey, wait. If we look at our
altostratus images, it looks like our advanced model
actually did worse on that. What this is
actually pointing out is that there may be some
problems with our training images here. So the advanced model
has done a better job of identifying where
there are potentially problems with our training
data set, because, remember, our model is only as good as the
training data that we give it. So if we look here, we see that
14% of our altostratus clouds are being mislabeled
as cumulus clouds. And if I click on this,
I can look and see where my model is
getting confused. And it turns out
that these are images that I could have potentially
labeled incorrectly. I'm not an expert
on actual clouds. Some of these images
have multiple types of clouds in them,
so they actually are pretty confusing images. So what the confusion
matrix can help me do is identify where I need to go
back and improve my training data. So that's the Evaluate tab. The next part, the most
fun part in my opinion, is generating
predictions on new data. So I'm going to take this
image of a cirrus cloud, and we're going to see
what our model predicts. Again, it has never
seen this image before. It wasn't used during training. And we'll see how it performs. There we go. So we can see that our
model is 99% confident this is a cirrus cloud, which
is pretty cool considering it's never seen this image before. So the UI is one way that
you can generate predictions once your model
has been trained. It's a good way to try it out
right after training completes. But chances are
you probably want to build an app that's going
to query your trained model. And there's a couple
different ways to do this. I want to highlight
the Vision API here. So if you remember
the Vision API request from a couple
of slides back, you'll notice that this
doesn't look much different. All I need to add is this custom
label detection parameter. And then once my model trains,
I get an ID for that trained model that only I have
access to or anybody that I've shared
my project with. So if I have an app,
let's say it's just detecting whether or not
there is a cloud in an image. Then let's say I want
to upgrade my app. I want to build a custom model. I don't have to change much at
all about my app architecture to now point it to
my custom model. I just need to modify the
request.JSON a little bit. So that is AutoML Vision. If we could go
back to the slides. A little bit about
some companies that are using AutoML Vision
and have been part of the alpha. Disney is the first example. They trained a model on
different Disney characters, product categories, and colors. And they're integrating that
into their search engine to give users more
accurate results. Urban Outfitters is
a clothing company. They've got a similar use case. They trained a
model to recognize different types of clothing
patterns and neckline styles. And they're using that to create
a comprehensive set of product attributes. And they're using it to
improve product recommendations and give users better
search results. The last example is the
Zoological Society of London. They've got cameras
deployed all over the wild. And they built a model
to automatically tag all of the wildlife that they're
seeing across those cameras so that they don't need
somebody to manually review it. So the two products I've
talked about so far, the APIs and AutoML,
have entirely abstracted the model from us. We don't know anything
about how that model works under the hood. But let's say you've got
a more custom prediction task that's specific to
your data set or use case. So one example is-- let's say you have just launched
a new product at your company, you're seeing it posted
a lot on social media, and you want to identify
where in an image that product is located. You're gonna need to train
a model on your own data to do that. Or let's say you've got
a lot of logs coming in, and you want to analyze
those logs to find anomalies in your application. These would all require a custom
model trained on your own data. We've got two tools
to help you do this-- TensorFlow for building your
models and Machine Learning Engine for training and
serving those models at scale. So from the beginning,
the Google Brain team wanted everyone in
the industry to be able to benefit from all of
the machine learning products they were working on. So they made TensorFlow an
open source project on GitHub. And the uptake has
been phenomenal. It is the most popular machine
learning project on GitHub. It has, I believe, over
90,000 GitHub stars. I actually need to
update this slide. It just crossed over
13 million downloads. And because it's
open source, you can train and serve your
TensorFlow models anywhere. So once you've built
your TensorFlow model, you need to think about
training it and then generating predictions at
scale, also known as serving. So if your app
becomes a major hit, you're getting thousands of
prediction requests per minute, you're going to
need to find a way to serve that model at scale. And again, because
TensorFlow is open source, you can train and serve your
TensorFlow models anywhere. But this talk is about
Google Cloud Platform, so I'm gonna talk about our
managed service for TensorFlow. So on Cloud Machine
Learning Engine, you can run distributed
training with GPUs and TPUs. TPUs are custom hardware
designed specifically to accelerate machine
learning workloads. And then you can also
deploy your trained model to Machine Learning
Engine, and then use the ML Engine API to access
scalable online and batch prediction for your model.
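As a rough sketch, calling the online prediction API from Python looks something like this (the project and model names are placeholders, the instance format depends entirely on your deployed model, and application default credentials are assumed):

    from googleapiclient import discovery

    service = discovery.build('ml', 'v1')
    name = 'projects/my-project/models/my_model'  # hypothetical project and model names

    # The instances you send must match whatever inputs your model expects
    response = service.projects().predict(
        name=name,
        body={'instances': [{'input': 'some model-specific input'}]}
    ).execute()

    print(response['predictions'])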
One of the great things about ML Engine is that there's no lock-in. So let's say that I want
to make use of ML Engine for training my model, but then
I want to download my model and serve it somewhere else. That's really easy to do, and
I can do the reverse as well. So I'm going to talk about
two different types of custom models using TensorFlow running
on Cloud Machine Learning Engine. The first type of custom
model I'm gonna talk about is transfer learning,
which allows us to update an existing trained
model using our own data. And then I'm gonna talk about
training a model from scratch using only your data. So transfer learning is great
if you need a custom model, but you don't have quite
enough training data. So what this lets us do is it
lets us utilize an existing pre-trained model that's
been trained on hundreds of thousands, maybe
millions, of images or text to do something similar to
what we're trying to do. And then we take the weights
of that trained model and update the last couple
of layers with our own data, so that the model generates predictions on our own training data. So I wanted to build an
end-to-end example showing how to train a model and then
go all the way to serving it and building an app
that generates predictions against it. I know a little
bit of Swift, so I decided to build an iOS app that
could detect breeds of pets. This is a GIF of what
the app looks like. So you upload a
picture of your pet, it's able to detect where
the pet is in an image, and what type of breed it is. To build that model, I used the
TensorFlow Object Detection API, which is a library that's
built on top of TensorFlow. It's open source, and it lets you do object detection specifically, which is identifying
a bounding box of where something is in an image. I trained the model on
Machine Learning Engine. I also served it on
Machine Learning Engine. And then I used a
couple of Firebase APIs to build a frontend for my app. So this is a full
architecture diagram. And the iOS client is
actually a pretty thin client. What it's doing is
it's uploading images to Cloud Storage for Firebase. And then I've got a
Cloud Function that's listening on that
bucket, so that function is going to be triggered
anytime an image is uploaded. It then downloads the
image, base64 encodes it, sends it to Machine Learning
Engine for prediction. And then the
prediction that I get back is going to be a
confidence value, a label, and then bounding box data
for where exactly that pet is in my image. So then I write that new
image with the box around it to Cloud Storage,
and then I store the metadata in Firestore. So here's an example showing
how the frontend works. So this is a screenshot
of Firestore, and we can see that whenever
my image data is uploaded to Firestore, I write the new
image with the box around it to a Cloud Storage bucket. So that was an example
of transfer learning. But let's say that we
have a custom task, and we've got enough
data to build and train a model entirely from scratch. For this, I'm going to show you
a demo I built of a model that predicts the price of wine. So I wanted to see, given a
wine's description and variety, can we predict the price? So this is what an example input
and prediction for our model would be. So one reason this is
well-suited for machine learning is I can't
write rules to determine what the price of
this wine should be. So I can't say that
any wine with vanilla in the description
and Pinot noir is going to be a mid-price wine,
whereas a fruity chardonnay is always going to be a cheap wine. I can't write rules to do that. So I wanted to see if I could
build a machine learning model to extract insights
from this data. As I mentioned, because
I'm training this model from scratch, there's no
existing model out there that does exactly this task
that I want to perform. I'm gonna need a lot of data. I use Kaggle to get the data. Kaggle is a data
science platform. It's part of Google. If you're new to machine
learning and looking for interesting data
sets to play around with, I definitely recommend
checking out Kaggle. So Kaggle has this
wine reviews data set. I found this data set on
150,000 different kinds of wine. For each wine, it has a lot
of data on the description-- the points rating,
the price, et cetera. So in this I'm just using
the description, the variety, and the price for this model. So the next step is
to choose the API I'm going to use to
build my model, which TensorFlow API, and then the
type of model I want to use. So I chose to use the
tf.keras API here. So Keras is an open-source
machine learning framework created by Francois Chollet,
who works at Google, and TensorFlow includes
a full implementation of Keras. So Keras is a higher-level
TensorFlow API that makes this easy for us to
define the layers of our model. You can also use a
lower-level TensorFlow API if you'd like more control
over the different operations in your model graph. So I chose to use
the tf.keras API, and then I chose to build
a wide and deep model to solve this task. Now wide and deep is
basically just a fancy way of saying I'm going to represent
the inputs to my model in two different ways. I'm going to use a
linear model, the wide part, which is good
at memorizing relationships between different inputs. And then I'm going to use
a deep neural net, which is really good at
generalizing based on data that it hasn't seen before. Put another way, the input to our wide model is going to be sparse, wide vectors, and the input to our deep model is going to be dense embedding vectors. Now let's take a look at
what the code for our model will look like. So the first step, this is all
using the Keras functional API, is to define the wide model. And to represent my
text in my wide model, I'm gonna use what's called a
bag of words representation. So what this does is it takes
all the words that occur across all my wine descriptions, and
I chose to use the top 12,000. This is kind of a hyperparameter
that you can tune. So each input to my model is
going to be a 12,000-element vector with ones and zeros,
indicating whether or not the word from my vocabulary
is present in that specific description. So it doesn't take into
account the order of words or the relationship
between words. This wide model just takes
into account whether or not this word from my
vocabulary is present in that specific description. So this type of input is
going to look like this. It's going to be a
12,000-element bag of words vector. The way I'm representing
my variety in my wide model is pretty simple. I've got 40 different wine
varieties in my data set, so this is just going to be
a 40-element one-hot vector, with each index in the
vector corresponding to a different variety of wine. Then I'm going to concatenate
these input layers. And the output of my
model is just a float indicating what the
price of that wine is. So if I just wanted
to use the wide model, I could take the wide
model that I have here and run training and
evaluation on it using Keras, but I found that I
had better accuracy when I combined wide and deep. So the next step is to
build my deep model. And for my deep
model, what I'm using is called an embedding layer. What word embeddings
let us do is they define the relationships
between words in vector space. So words that are
similar to each other are going to be closer
together in vector space. There's lots of reading out
there on word embeddings, so I'm not going to focus
on the details here. The way it works is I can
choose the dimensionality of that space. Again, that's another
hyperparameter of this model, so that's something
I should tune and see what works best for my data set. So in this case, you can see
I use an eight dimensional embedding space. And obviously I can't feed the
text directly into my model. I need to put it into a format
that the model can understand. So what I did was I encoded
each word as an integer. So this is what the
input to my deep model is going to look like. And obviously the inputs
need to be the same length, but not all of my descriptions
are the same length, so I'm using 170 as a length. None of my descriptions
are longer than 170 words. So if they're shorter,
I'll just pad that vector with zeros at the
end, as I did here. The output is going
to be the same. We're still
predicting the price. And then using the Keras functional API, it's relatively straightforward to then combine the outputs from both of these models and define my combined model.
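As a rough sketch (this isn't the exact code from the talk, and everything other than the 12,000, 40, 170, and 8 mentioned above is an assumption), the wide and deep setup with the Keras functional API looks something like this:

    from tensorflow import keras
    from tensorflow.keras import layers

    # Wide model: bag-of-words description + one-hot variety
    bow_input = layers.Input(shape=(12000,))
    variety_input = layers.Input(shape=(40,))
    wide_merged = layers.concatenate([bow_input, variety_input])
    wide_output = layers.Dense(1)(wide_merged)

    # Deep model: integer-encoded description, padded to 170 words, fed through an embedding
    deep_input = layers.Input(shape=(170,))
    embedding = layers.Embedding(12000, 8, input_length=170)(deep_input)
    flattened = layers.Flatten()(embedding)
    deep_output = layers.Dense(1)(flattened)

    # Combine the outputs of both models and predict a single price
    merged = layers.concatenate([wide_output, deep_output])
    price_output = layers.Dense(1)(merged)

    model = keras.Model(inputs=[bow_input, variety_input, deep_input], outputs=price_output)
    model.compile(loss='mse', optimizer='adam', metrics=['mae'])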
And so now it's time to run training and evaluation on that model. I chose to do that using
Machine Learning Engine. So the first step here
is to package my code and put it in Google
Cloud Storage. This is pretty simple. So I put my model code
in a trainer directory, I put my wine data
in a data directory, and then I have
this setup.py file, which is defining the dependencies that my model is going to use.
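A minimal setup.py for a packaged trainer might look roughly like this (the package name and dependencies here are just placeholders):

    from setuptools import setup, find_packages

    setup(
        name='trainer',
        version='0.1',
        packages=find_packages(),
        # Hypothetical dependencies -- whatever your model code imports beyond TensorFlow
        install_requires=['keras', 'h5py'],
    )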
And then to run that training job, I can use gcloud, which
is our Google Cloud CLI, for interacting with a
number of different Google Cloud products. So I run this gcloud command to kick off my training job, and then I save my model file, which is a binary file, to Google Cloud Storage.
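The job submission might look roughly like this (job name, bucket, region, and runtime version are placeholders, and the exact flags depend on your setup):

    gcloud ml-engine jobs submit training wine_price_job_001 \
        --package-path trainer/ \
        --module-name trainer.task \
        --job-dir gs://my-bucket/wine-model \
        --region us-central1 \
        --runtime-version 1.8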
And let's see a demo of generating some predictions on that model. So for this demo, I'm
going to use this tool called Colab, Google Colab. I'll have a link
to it at the end. And this is a cloud-hosted
Jupyter Notebook that comes with a free GPU. And so here, I've
already trained my model, and what I've done
is I saved that model file in Google Cloud Storage here. And now I'm going to generate
some predictions to see how it performs on some raw data. Let me make this a
little bit bigger. So here, we're just
importing some libraries that we're going to
use, and loading my model. The next step is to load the
tokenizer for my model. So this is just an index
associating all the words in my vocabulary with a number. And then I'm loading
my variety encoder too. I'm gonna ignore that warning. And here, I'm going to
load in some raw data. So what I have here is data
on five different wines. I've got the description,
the variety, and then the associated price
for each of these wines. And I wanted to show
you what the input looks like for each of these. So now we need to
encode each of these into the wide and deep format
that our model is expecting. So to do that, we have
this vocabulary look-up. I'm just printing a
subset of it here. So this is going to be
12,000 elements long for the top 12,000
words in our vocabulary. And Keras has some utilities to help us extract those top words.
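As a hedged sketch, the Keras text utilities I'm referring to look something like this (the variable names are just for illustration, and descriptions is assumed to be the list of wine description strings):

    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    # Build a vocabulary of the top 12,000 words across all descriptions
    tokenizer = Tokenizer(num_words=12000)
    tokenizer.fit_on_texts(descriptions)

    # Wide input: a binary bag-of-words matrix, one 12,000-element row per description
    bow_matrix = tokenizer.texts_to_matrix(descriptions, mode='binary')

    # Deep input: each word as an integer, padded with zeros to a fixed length of 170
    sequences = tokenizer.texts_to_sequences(descriptions)
    padded_sequences = pad_sequences(sequences, maxlen=170, padding='post')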
And then my variety encoder is just a 40-element array that looks like this, with
each index corresponding to a different variety. So let's see what the input
to our wide model looks like. So the first thing is our text. And this is a bag
of words vector. So it's a 12,000-element vector
with zeros and ones indicating the presence or absence
of different words in our vocabulary. And then the variety matrix-- I'm just printing it
out for the first one. So I believe that first one was
a Pinot noir, which corresponds with this index here. And if we look up here,
we can confirm that, yes, that was a Pinot noir. So that's the input
to our wide model. And then our deep model, we're
just encoding all of the words from that first
description as integers and padding it with
zeros since this wasn't a very long description. And then all I need to do
to generate predictions on my Keras model
is call .predict. And what I'm doing here is I'm now going to loop through those predictions and see how the model performed compared to the actual price.
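A rough sketch of that prediction loop (the input names and their ordering are assumptions made to match the wide and deep inputs described earlier, and actual_prices is a hypothetical list of the true prices):

    # Wide inputs (bag of words + variety) and deep input (padded word sequences)
    predictions = model.predict([bow_matrix, variety_matrix, padded_sequences])

    for i, prediction in enumerate(predictions):
        print('Predicted price: %.2f, actual price: %.2f' % (prediction[0], actual_prices[i]))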
So we can see that it did a pretty good job on the first one,
46 compared to 48. This one was about $30
off, but it was still able to understand that this
was a relatively higher priced wine. And it did pretty well on
the rest of these as well. So I'll have a link
to this at the end. Colab requires zero setup, so
you can all go to this URL, enter your own
wine descriptions, and see how the model performs. If we can go back to the slides. I'm going to talk briefly
about a few companies that are using TensorFlow and
Machine Learning Engine in production to
build custom models. The first example
is Rolls-Royce. They're using a custom
TensorFlow model in their marine division
to identify and track the objects that a vessel
can encounter at sea. Ocado is the UK's largest
grocery delivery service. So as you can imagine,
they get tons and tons of customer support
emails every day. And they built a custom model. It's able to take those emails
and predict whether an email requires an urgent response or maybe doesn't require a
response at all. And they've been able to save
a lot of customer support time using that model. And then finally, Airbus
has built a custom image classification model to identify
things in satellite imagery. So I know I covered a ton
of different products. I wanted to give you a
summary of what resources all of these products require. So these are just
four resources that I thought of that
you need to solve a machine learning problem. There's probably more
that I don't have on here. First is training data. How much training
data do you need to provide to train your model? How much model code do you need? How much training or
serving infrastructure do you need to provision? And then overall, how long is
this task going to take you? So if we look at
our Machine Learning APIs, great thing
about these-- you don't need any training data. You can just start generating
predictions on one image. You don't need to write
any of the model code. You don't need to provision
any training or serving infrastructure. And you can probably get started
with these in less than a day. Very little time. If we move to AutoML, the
cool thing about AutoML is that you will provide some
of your own training data, because you're going
to be creating a more customized model. And it will take
a little more time because you will need to
spend some time processing those images, uploading
them to the Cloud, and maybe labeling them. You still don't need any model
code or any training or serving infrastructure. And then finally, if we think
about a custom model, built with TensorFlow, running
on Cloud ML Engine, you will need a lot
of training data. You will have to write a lot
of the model code yourself. And you'll need to
think about whether you want to run your training
and serving jobs on premise, or if you want to use a managed
service like Cloud ML Engine to do that for you. And obviously this process will
take a little bit more time. Finally, so if you
remember only three things from this presentation--
so I know I covered a lot. First thing, you can
use a pre-trained API to accomplish common machine
learning tasks, like image analysis, natural language
processing, or translation. Second thing, if you want to
build an image classification API trained on your own
data, use AutoML Vision. I'm really interested to hear
if you have a specific use case for AutoML Vision. Come find me after. I will be in the
Cloud Sandbox area. Last thing, for custom
machine learning tasks, you can build a TensorFlow
model with your own data and train and serve it on
Machine Learning Engine. So here is a lot of great
resources covering everything I talked about today. I'll let all of you take
a picture of that slide. The video will also
be up after, so you can grab it from there as well. And that's all I've got. Thank you. [MUSIC PLAYING]