[MUSIC PLAYING] SARA ROBINSON: Hello, everyone. Welcome to Intro
to Machine Learning on Google Cloud Platform. My name is Sara Robinson. I'm a developer advocate on
the Google Cloud Platform Team, and I focus on machine learning. You can find me on
Twitter at @SRobTweets. So let's dive right
in by talking about-- what is machine learning? So at a high level,
machine learning involves teaching computers
to recognize patterns in the same way
that our brains do. So as humans, it's
really easy for us to distinguish between
a cat and a dog, but it's much more
difficult to teach a machine to be able to do the same thing. In this talk, I'm going
to focus on what's called supervised learning. This is when,
during training, you give your model labeled input. And we can think of almost
any supervised learning problem in this way. So we provide labeled
inputs to our model, and then our model
outputs a prediction. Now, the amount that we know
about how our model works is going to depend
on the tool we choose to use for the job and
the type of machine learning problem
we're trying to solve. So that was a high
level overview, but how do we actually get
from input to prediction? Again, this is going to
depend on the type of machine learning problem
we're trying to solve. So on the left-hand
side, let's say you're solving a generic
task that someone has already solved before. In that case, you don't
need to start from scratch. But on the other
side of the spectrum, let's say you're solving
a custom task that's very specific to your data set. Then you're going to
need to build and train your own model from scratch. So more specifically,
let's think of this in terms of
image classification. So let's say we want
to label this image as a picture of a cat. This is a really common
machine learning task. There's tons of
models out there that exist to help us label images. So we can use one of
these pre-trained models. We don't need to
start from scratch. But let's say, on
the other hand, that this cat's name is
Chloe, this is our cat, and we want to
identify her apart from other cats across
our entire image library. So we're going to need to train
a model using our own data from scratch so that it
can differentiate Chloe from other cats. More specifically, let's
say that we want our model to return a bounding box showing
where she is in that picture. We're also going to need to
train a model on our own data. Let's also think about this
in terms of a natural language processing problem. So let's say we have this
text from one of my tweets, and I want to extract the
parts of speech from that text. This is a really common natural
language processing task, so I don't need to
start from scratch. I can utilize an existing
NLP model to help me do this. But let's say, on
the other hand, that I want to take
the same tweet. And I want my model
to output these tags. I want my model
to know that this is a tweet about programming,
and more specifically is a tweet about Google Cloud. I'm gonna need to train my
model on thousands of tweets about each of these
things so that it can generate these predictions. So many people see the
term, machine learning, and they're a little
bit scared off. They think it's something
that's only for experts. Now, if we look back
about 60 years ago, this was definitely the case. This is a picture of the
first neural network invented in 1957 called a perceptron. And this was a device that
demonstrated an ability to identify different shapes. So back then, if you wanted
to work on machine learning, you needed access to extensive
academic and computing resources. But if we fast
forward to today, we can see that just in the
last five or 10 years, the number of products at Google
that are using machine learning has grown dramatically. And at Google, we want
to put machine learning into the hands of any
developer and data scientist with a computer and a
machine learning problem that they want to solve. That's all of you. We don't think machine
learning should be something only for experts. So if you think about how you're
doing machine learning today, maybe you're using a
framework like Scikit-learn, XGBoost, Keras, or
TensorFlow, maybe you're writing your code in
Jupyter Notebooks, maybe you're just experimenting
with different types of models, building proofs of
concept, or maybe you've already built a model
and you're working on scaling it for production. So what I want you to take
away from this talk is that no matter what your existing
machine learning toolkit is, we've got something to help
you on Google Cloud Platform. And that's what I'm gonna
talk to you about today. So we have a whole spectrum
of machine learning products on GCP. On the left-hand side,
we have products targeted towards application developers. To use these, you need little to
no machine learning experience. On the right-hand
side of the spectrum, we have products that
are targeted more towards data scientists and
machine learning practitioners. So the first set of
products I'll talk about is our machine learning APIs. These are APIs that give you
access to pre-trained models with a single REST API request. As we move toward the middle,
we have a new product, which I'm super
excited about, that we announced earlier this year
in January called AutoML. It's currently an alpha,
and the first AutoML product we've announced
is AutoML Vision. So this lets you build a
custom image classification model trained on your own
data without requiring you to write any of the model code. As we move further to the right,
towards more custom models, if you want to build
your model in TensorFlow, we have a service called
Cloud Machine Learning Engine to let you train and
serve your model at scale. Then a couple of months ago,
we announced an open source project called Kubeflow. This lets you run your
machine learning workloads on Kubernetes. And then finally, our most
do-it-yourself solution is-- let's say you
have a machine learning framework other than TensorFlow
and you want to run it on GCP. You can do this using Google
Compute Engine or Google Kubernetes Engine. But we don't have
tons of time today, so we're going to focus
just on three of these. And let's dive right in,
starting with machine learning as an API. On Google Cloud Platform,
we have five APIs that give you access
to pre-trained models to accomplish common
machine learning tasks. So they let you do things like
analyzing images, analyzing videos, converting
audio to text, analyzing that text further,
and then translating that text. I'm going to show you
just two of them today. So let's start
with Cloud Vision. This is everything the
Vision API lets you do. So at its core, the Vision
API provides label detection. This will tell you-- what
is this a picture of? So for this image, it might
return elephant, animal, et cetera. Web detection goes
one step further. It will search the web
for similar images. And it will give you labels
based on what it finds. OCR stands for Optical
Character Recognition. This will let you find
text in your images, tells you where that text is
and what language it's in. Logo detection will identify
common logos in an image. Landmark detection will
find common landmarks. Crop hints will suggest crop
dimensions for your photos. And then explicit
content detection will tell you-- is this
image appropriate or not. This is what a request to
the Vision API looks like. So, again, you don't
need to know anything about how that pre-trained
model works under the hood. You pass it either a URL to
your image in Google Cloud Storage or just the
base64 encoded image, and then you tell it what
types of feature detection you want to run. Now, it's just a REST
API, so you can call it in any language you'd like. I have an example in Python here using our Google Cloud client library for Python. I create an ImageAnnotatorClient, and then I run label detection and analyze the response.
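Roughly, that call looks something like this (the bucket path here is just a made-up example, and the exact types module can vary by client library version):

    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    # Point the API at an image in Cloud Storage (hypothetical path)
    image = vision.types.Image()
    image.source.image_uri = 'gs://my-bucket/hamilton-selfie.jpg'

    # Run just the label detection feature and print what comes back
    response = client.label_detection(image=image)
    for label in response.label_annotations:
        print(label.description, label.score)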
I'm guessing many of you heard about ML Kit, announced a couple of days ago. So if you want to use
ML Kit for Firebase, you can easily
call the Vision API from your Android or iOS app. And this is an example
of calling it in Swift. So I don't like to get
too far into a talk without showing a live demo. So if we could
switch to the demo. And what we have here
is the product page for the Vision API. Here we can upload
images and see how the Vision API responds. So I'm going to
upload an image here. This is a selfie that
I took seeing Hamilton. I live in New York. I was super excited to have
scored tickets to Hamilton. And I wanted to
see what the Vision API said about my selfie. So let's see what
we get back here. It's cranking. Live demos, you never
know how they'll go. There we go, OK. So we can see that it
found my face in the image. So it's able to identify
where my face is, different features in
my face, and it's also able to detect emotion. So we can see that joy
is very likely here. I was super excited
to be seeing Hamilton. What I didn't notice at
first about this image is that it has some text in it. When I sent it to
the Vision API, I didn't even notice
that there was text here, but we can see that
the API, using OCR, is able to extract that
Playbill text from my image. And then finally
in the browser, we can see the entire JSON
response we get back from the Vision API. So this is a great way
to try out the API. Before you start
writing any code, upload your images
in the browser, see the type of
response you get back, and if it's right for your app. And I'll provide a link
to this at the end. So that is the Vision API. If we could go
back to the slides. Next, I'm gonna talk
about the Natural Language API, which lets you analyze text
with a single REST API request. So the first thing
it lets you do is extract key
entities from text. It also tells you whether your
text is positive or negative. And then if you want to get
more into the linguistic details of your text, you can use
the syntax analysis method. And finally, the newest
feature of this API is content classification. It will classify
your text into one or more of over 700 different categories
that we have available. Here is some Python code to call the Natural Language API. It's going to look pretty similar to the Vision API code we saw on the previous page. And again, we don't need to know anything about how this model works. We just send it our text and get back the result from the model.
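As a rough sketch with the google-cloud-language client (the text is just an example, and details vary by library version):

    from google.cloud import language

    client = language.LanguageServiceClient()

    document = language.types.Document(
        content='I loved Google I/O, but the ML talk was just OK.',
        type=language.enums.Document.Type.PLAIN_TEXT)

    # Entity and sentiment analysis are each a single call
    entities = client.analyze_entities(document=document).entities
    sentiment = client.analyze_sentiment(document=document).document_sentiment
    print(sentiment.score, sentiment.magnitude)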
Now, let's jump to a demo of the Natural Language API. So again, this is our products
page for the Natural Language API. And here we can enter text
directly in this textbox and see how the Natural
Language API responds. So I'm going to say,
"I loved Google I/O, but the ML talk was just OK." We'll see what it says. This is a review I
might find on a session. And let's say that I didn't want
to go through all the sessions, but I wanted to
extract key entities and see what the sentiment was. So we can see here it's
extracted two entities, Google I/O and ML talk, and we get
a score for each entity. The score is a value
from negative one to one that'll tell us overall
how positive or negative the sentiment is for this entity. So Google I/O got 0.8,
almost fully positive. We even get a link to the
Wikipedia page for Google I/O. And then ML talk got
a neutral score right in the middle of zero
because it was just OK. We also get the aggregate
sentiment for this sentence. And we can also look at
the syntax and details, see which words depend on other
words, get the parts of speech for all the words in our text. And then if our text was
longer than 20 words, we can make use of this
content categorization feature, which you can also
try right in the browser. And you can see a list
of all the categories that are available for that. So that is the
Natural Language API. If we could go
back to the slides. So I'm gonna talk briefly
about some companies that are using these APIs in production. Giphy is a website that
lets you search for GIFs and share them across the web. They use the Vision API's
optical character recognition feature to add search
by text functionality to all of their GIFs. So as you know, lots of
GIFs have text in them. Before they used the Vision API,
you couldn't search by text, and now they've dramatically
increased the accuracy of their search results. Hearst is a large
publishing company. And they're using the Natural
Language API's content categorization
method across over 30 of their media properties to
categorize their news articles. Descript is a new app
that lets you transcribe meetings and interviews. And they're using the Speech
API for that transcription. So all of these three companies
are using just one API. We have also seen
a lot of examples of companies combining
different machine learning APIs. So Seenit is a crowdsourced video platform that gets thousands of videos uploaded daily, so they don't have time to manually tag those videos. So they're using a combination
of Video Intelligence, Vision, Speech, and Natural Language
to automatically tag all of that video content that
they're seeing on the platform. Maslo is a new mobile app. It's an audio journaling app. So you can enter your journal
entries through audio. And they're using the Speech
API to transcribe that audio. And then they're using
the Natural Language API to extract key
entities, give you some insights about
your journal entries, and they're doing all that
processing with Cloud Functions and storing the
data in Firestore. So those are the machine learning APIs. So all of the products
I talked about so far have kind of abstracted
that model for you. And a lot of times when
I present on the APIs, people ask me, those
APIs sound great, but what if I want to train
them on my own custom data? So we have this new
product that I mentioned, called AutoML Vision. It's currently an
alpha, so you need to be whitelisted to use it. This lets you train an
image classification model on your own image data. And this is best
seen with a demo, so I'm gonna introduce it first. So for this demo, let's say
that I'm a meteorologist. Maybe I work at a
weather company. And I want to predict
weather trends and flight plans from images of clouds. So the obvious next question
is, can we use the cloud to analyze clouds? And as I've learned, there's
many, many different types of clouds, and they all indicate
different weather patterns. So when I first started
thinking about this demo, I thought, maybe I should
try the Vision API first and see what I get back. So as humans, if we look
at these two images, it's pretty obvious to us
that these are completely different types of clouds. The Vision API,
we wouldn't expect it to know specifically what
type of cloud these images are. It was trained across a
wide variety of images to recognize all sorts of
different products and categories, but nothing as specific as
cloud types for these images. So you can see, even
for these images of obviously different
clouds, we get back pretty similar labels-- sky, cloud, blue, et cetera. So this is where AutoML
Vision comes in really handy. AutoML Vision provides
a UI to help us with every step of training
our machine learning model, from importing the data,
labeling it, training it, and then generating
predictions using a REST API. So the best way to see it is
by, again, jumping to a demo. So here we have the
UI for AutoML Vision. Again, it takes us through every
step of building our model. So the first step here
is importing our data. To do that, we put our data
in Google Cloud Storage, we put our images in
Google Cloud Storage, and then we just
create a CSV, where the first column of the CSV is going to be the URL of our image, and then the next column will be the label or labels associated with that image. One image can have multiple labels.
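As a quick hypothetical sketch, that CSV might look something like this (made-up bucket and file names):

    gs://my-bucket/clouds/cloud1.jpg,cirrus
    gs://my-bucket/clouds/cloud2.jpg,cumulus
    gs://my-bucket/clouds/cloud3.jpg,cumulus,altocumulus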
So then we upload our images here. I've already done
that for this example. Let me move over to
the Labeling tab. And in this model, I've got
five different types of clouds that I'm classifying. We can see them here, and
we can see how many images I have for each one. AutoML Vision only requires
10 images per label to start training, but
they recommend at least 100 for high quality predictions. So the next step is to
review my Image labels. So I can see all my
cloud pictures here. I can see what label they are. If this one was incorrect,
I could go in here and switch it out. Now, let's say that I didn't have time to label my data set, or I have a massive image data set and don't have time to label it. I can make use of this human
labeling service, which gives you access to a set of
in-house human labelers that will label your images for you. And then in just
a couple of days you'll get back a
labeled image data set. So the next part, once you've
got your labeled images, is to train your
model, and you can choose between a base
or advanced model. I'll talk about that
more in a moment. And to train it, all you do
is press this Train button. It's as simple as that. Once your model is trained,
you'll get an email. And the next thing
you want to do is you want to see how
this model performed using some common
machine learning metrics. So I'm not going to go
into all of them here, but I do want to highlight
the confusion matrix. It may look confusing, and it is called a confusion matrix after all, but let's take a look
at what it means. So for a good confusion matrix,
what we ideally want to see is a strong diagonal
from the top left. So what this is telling
us is when we uploaded our images to AutoML,
it automatically split our images for us
into training and testing. So it took most of our images,
used those to train the model, and then it reserved
a subset of our images to see how the model performed
on images that it had never seen before. So what this is telling us is
that for all of our altocumulus cloud images in our
test set, our model was able to identify
76% of them correctly, which is pretty good. Now on the Train tab, you saw
the base and advanced models, and I've actually trained both. So we're looking at the
advanced model here. I can use the UI to see how
different versions of my model performed and compare it. So we would expect that the
advanced model would perform better across the board. So let's take a look. So it looks like it did, indeed,
perform a lot better for almost all of the categories here. You can see a 23% increase for
this one, 11% for this one, but, hey, wait. If we look at our
altostratus images, it looks like our advanced model
actually did worse on that. What this is
actually pointing out is that there may be some
problems with our training images here. So the advanced model
has done a better job of identifying where
there are potentially problems with our training
data set, because, remember, our model is only as good as the
training data that we give it. So if we look here, we see that
14% of our altostratus clouds are being mislabeled
as cumulus clouds. And if I click on this,
I can look and see where my model is
getting confused. And it turns out
that these are images that I could have potentially
labeled incorrectly. I'm not an expert
on actual clouds. Some of these images
have multiple types of clouds in them,
so they actually are pretty confusing images. So what the confusion
matrix can help me do is identify where I need to go
back and improve my training data. So that's the Evaluate tab. The next part, the most
fun part in my opinion, is generating
predictions on new data. So I'm going to take this
image of a cirrus cloud, and we're going to see
what our model predicts. Again, it has never
seen this image before. It wasn't used during training. And we'll see how it performs. There we go. So we can see that our
model is 99% confident this is a cirrus cloud, which
is pretty cool considering it's never seen this image before. So the UI is one way that
you can generate predictions once your model
has been trained. It's a good way to try it out
right after training completes. But chances are
you probably want to build an app that's going
to query your trained model. And there's a couple
different ways to do this. I want to highlight
the Vision API here. So if you remember
the Vision API request from a couple
of slides back, you'll notice that this
doesn't look much different. All I need to add is this custom
label detection parameter. And then once my model trains,
I get an ID for that trained model that only I have
access to or anybody that I've shared
my project with. So if I have an app,
let's say it's just detecting whether or not
there is a cloud in an image. Then let's say I want
to upgrade my app. I want to build a custom model. I don't have to change much at
all about my app architecture to now point it to
my custom model. I just need to modify the
request.JSON a little bit. So that is AutoML Vision. If we could go
back to the slides. A little bit about
some companies that are using AutoML Vision
and have been part of the alpha. Disney is the first example. They trained a model on
different Disney characters, product categories, and colors. And they're integrating that
into their search engine to give users more
accurate results. Urban Outfitters is
a clothing company. They've got a similar use case. They trained a
model to recognize different types of clothing
patterns and neckline styles. And they're using that to create
a comprehensive set of product attributes. And they're using it to
improve product recommendations and give users better
search results. The last example is the
Zoological Society of London. They've got cameras
deployed all over the wild. And they built a model
to automatically tag all of the wildlife that they're
seeing across those cameras so that they don't need
somebody to manually review it. So the two products I've
talked about so far, the APIs and AutoML,
have entirely abstracted the model from us. We don't know anything
about how that model works under the hood. But let's say you've got
a more custom prediction task that's specific to
your data set or use case. So one example is-- let's say you have just launched
a new product at your company, you're seeing it posted
a lot on social media, and you want to identify
where in an image that product is located. You're gonna need to train
a model on your own data to do that. Or let's say you've got
a lot of logs coming in, and you want to analyze
those logs to find anomalies in your application. These would all require a custom
model trained on your own data. We've got two tools
to help you do this-- TensorFlow for building your
models and Machine Learning Engine for training and
serving those models at scale. So from the beginning,
the Google Brain team wanted everyone in
the industry to be able to benefit from all of
the machine learning products they were working on. So they made TensorFlow an
open source project on GitHub. And the uptake has
been phenomenal. It is the most popular machine
learning project on GitHub. It has, I believe, over
90,000 GitHub stars. I actually need to
update this slide. It just crossed over
13 million downloads. And because it's
open source, you can train and serve your
TensorFlow models anywhere. So once you've built
your TensorFlow model, you need to think about
training it and then generating predictions at
scale, also known as serving. So if your app
becomes a major hit, you're getting thousands of
prediction requests per minute, you're going to
need to find a way to serve that model at scale. And again, because
TensorFlow is open source, you can train and serve your
TensorFlow models anywhere. But this talk is about
Google Cloud Platform, so I'm gonna talk about our
managed service for TensorFlow. So on Cloud Machine
Learning Engine, you can run distributed
training with GPUs and TPUs. TPUs are custom hardware
designed specifically to accelerate machine
learning workloads. And then you can also
deploy your trained model to Machine Learning
Engine, and then use the ML Engine API to access
scalable online and batch prediction for your model.
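As a rough sketch, calling the online prediction API from Python looks something like this (the project and model names are placeholders, the instance format depends entirely on your deployed model, and application default credentials are assumed):

    from googleapiclient import discovery

    service = discovery.build('ml', 'v1')
    name = 'projects/my-project/models/my_model'  # hypothetical project and model names

    # The instances you send must match whatever inputs your model expects
    response = service.projects().predict(
        name=name,
        body={'instances': [{'input': 'some model-specific input'}]}
    ).execute()

    print(response['predictions'])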
One of the great things about ML Engine is that there's no lock-in. So let's say that I want
to make use of ML Engine for training my model, but then
I want to download my model and serve it somewhere else. That's really easy to do, and
I can do the reverse as well. So I'm going to talk about
two different types of custom models using TensorFlow running
on Cloud Machine Learning Engine. The first type of custom
model I'm gonna talk about is transfer learning,
which allows us to update an existing trained
model using our own data. And then I'm gonna talk about
training a model from scratch using only your data. So transfer learning is great
if you need a custom model, but you don't have quite
enough training data. So what this lets us do is it
lets us utilize an existing pre-trained model that's
been trained on hundreds of thousands, maybe
millions, of images or text to do something similar to
what we're trying to do. And then we take the weights
of that trained model and update the last couple
of layers with our own data, so that the model generates predictions on our own training data. So I wanted to build an
end-to-end example showing how to train a model and then
go all the way to serving it and building an app
that generates predictions against it. I know a little
bit of Swift, so I decided to build an iOS app that
could detect breeds of pets. This is a GIF of what
the app looks like. So you upload a
picture of your pet, it's able to detect where
the pet is in an image, and what type of breed it is. To build that model, I used the
TensorFlow Object Detection API, which is a library that's
built on top of TensorFlow. It's open source, and it lets you do object detection specifically, which is identifying
a bounding box of where something is in an image. I trained the model on
Machine Learning Engine. I also served it on
Machine Learning Engine. And then I used a
couple of Firebase APIs to build a frontend for my app. So this is a full
architecture diagram. And the iOS client is
actually a pretty thin client. What it's doing is
it's uploading images to Cloud Storage for Firebase. And then I've got a
Cloud Function that's listening on that
bucket, so that function is going to be triggered
anytime an image is uploaded. It then downloads the
image, base64 encodes it, sends it to Machine Learning
Engine for prediction. And then the
prediction that I get back is going to be a
confidence value, a label, and then bounding box data
for where exactly that pet is in my image. So then I write that new
image with the box around it to Cloud Storage,
and then I store the metadata in Firestore. So here's an example showing
how the frontend works. So this is a screenshot
of Firestore, and we can see that whenever
my image data is uploaded to Firestore, I write the new
image with the box around it to a Cloud Storage bucket. So that was an example
of transfer learning. But let's say that we
have a custom task, and we've got enough
data to build and train a model entirely from scratch. For this, I'm going to show you
a demo I built of a model that predicts the price of wine. So I wanted to see, given a
wine's description and variety, can we predict the price? So this is what an example input
and prediction for our model would be. So one reason this is
well-suited for machine learning is I can't
write rules to determine what the price of
this wine should be. So I can't say that
any wine with vanilla in the description
and Pinot noir is going to be a mid-price wine,
whereas a fruity chardonnay is always going to be a cheap wine. I can't write rules to do that. So I wanted to see if I could
build a machine learning model to extract insights
from this data. As I mentioned, because
I'm training this model from scratch, there's no
existing model out there that does exactly this task
that I want to perform. I'm gonna need a lot of data. I use Kaggle to get the data. Kaggle is a data
science platform. It's part of Google. If you're new to machine
learning and looking for interesting data
sets to play around with, I definitely recommend
checking out Kaggle. So Kaggle has this
wine reviews data set. I found this data set on
150,000 different kinds of wine. For each wine, it has a lot
of data on the description-- the points rating,
the price, et cetera. So in this I'm just using
the description, the variety, and the price for this model. So the next step is
to choose the API I'm going to use to
build my model, which TensorFlow API, and then the
type of model I want to use. So I chose to use the
tf.keras API here. So Keras is an open-source
machine learning framework created by Francois Chollet,
who works at Google, and TensorFlow includes
a full implementation of Keras. So Keras is a higher-level
TensorFlow API that makes this easy for us to
define the layers of our model. You can also use a
lower-level TensorFlow API if you'd like more control
over the different operations in your model graph. So I chose to use
the tf.keras API, and then I chose to build
a wide and deep model to solve this task. Now wide and deep is
basically just a fancy way of saying I'm going to represent
the inputs to my model in two different ways. I'm going to use a
linear model, the wide part, which is good
at memorizing relationships between different inputs. And then I'm going to use
a deep neural net, which is really good at
generalizing based on data that it hasn't seen before. Put another way, the input to our wide model is going to be sparse, wide vectors, and the input to our deep model is going to be dense embedding vectors. Now let's take a look at
what the code for our model will look like. So the first step, this is all
using the Keras functional API, is to define the wide model. And to represent my
text in my wide model, I'm gonna use what's called a
bag of words representation. So what this does is it takes
all the words that occur across all my wine descriptions, and
I chose to use the top 12,000. This is kind of a hyperparameter
that you can tune. So each input to my model is
going to be a 12,000-element vector with ones and zeros,
indicating whether or not the word from my vocabulary
is present in that specific description. So it doesn't take into
account the order of words or the relationship
between words. This wide model just takes
into account whether or not this word from my
vocabulary is present in that specific description. So this type of input is
going to look like this. It's going to be a
12,000-element bag of words vector. The way I'm representing
my variety in my wide model is pretty simple. I've got 40 different wine
varieties in my data set, so this is just going to be
a 40-element one-hot vector, with each index in the
vector corresponding to a different variety of wine. Then I'm going to concatenate
these input layers. And the output of my
model is just a float indicating what the
price of that wine is. So if I just wanted
to use the wide model, I could take the wide
model that I have here and run training and
evaluation on it using Keras, but I found that I
had better accuracy when I combined wide and deep. So the next step is to
build my deep model. And for my deep
model, what I'm using is called an embedding layer. What word embeddings
let us do is they define the relationships
between words in vector space. So words that are
similar to each other are going to be closer
together in vector space. There's lots of reading out
there on word embeddings, so I'm not going to focus
on the details here. The way it works is I can
choose the dimensionality of that space. Again, that's another
hyperparameter of this model, so that's something
I should tune and see what works best for my data set. So in this case, you can see
I use an eight dimensional embedding space. And obviously I can't feed the
text directly into my model. I need to put it into a format
that the model can understand. So what I did was I encoded
each word as an integer. So this is what the
input to my deep model is going to look like. And obviously the inputs
need to be the same length, but not all of my descriptions
are the same length, so I'm using 170 as a length. None of my descriptions
are longer than 170 words. So if they're shorter,
I'll just pad that vector with zeros at the
end, as I did here. The output is going
to be the same. We're still
predicting the price. And then using the Keras functional API, it's relatively straightforward to then combine the outputs from both of these models and define my combined model.
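As a rough sketch (this isn't the exact code from the talk, and everything other than the 12,000, 40, 170, and 8 mentioned above is an assumption), the wide and deep setup with the Keras functional API looks something like this:

    from tensorflow import keras
    from tensorflow.keras import layers

    # Wide model: bag-of-words description + one-hot variety
    bow_input = layers.Input(shape=(12000,))
    variety_input = layers.Input(shape=(40,))
    wide_merged = layers.concatenate([bow_input, variety_input])
    wide_output = layers.Dense(1)(wide_merged)

    # Deep model: integer-encoded description, padded to 170 words, fed through an embedding
    deep_input = layers.Input(shape=(170,))
    embedding = layers.Embedding(12000, 8, input_length=170)(deep_input)
    flattened = layers.Flatten()(embedding)
    deep_output = layers.Dense(1)(flattened)

    # Combine the outputs of both models and predict a single price
    merged = layers.concatenate([wide_output, deep_output])
    price_output = layers.Dense(1)(merged)

    model = keras.Model(inputs=[bow_input, variety_input, deep_input], outputs=price_output)
    model.compile(loss='mse', optimizer='adam', metrics=['mae'])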
And so now it's time to run training and evaluation on that model. I chose to do that using
Machine Learning Engine. So the first step here
is to package my code and put it in Google
Cloud Storage. This is pretty simple. So I put my model code
in a trainer directory, I put my wine data
in a data directory, and then I have
this setup.py file, which is defining the dependencies that my model is going to use.
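A minimal setup.py for a packaged trainer might look roughly like this (the package name and dependencies here are just placeholders):

    from setuptools import setup, find_packages

    setup(
        name='trainer',
        version='0.1',
        packages=find_packages(),
        # Hypothetical dependencies -- whatever your model code imports beyond TensorFlow
        install_requires=['keras', 'h5py'],
    )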
And then to run that training job, I can use gcloud, which
is our Google Cloud CLI, for interacting with a
number of different Google Cloud products. So I run this gcloud command to kick off my training job, and then I save my model file, which is a binary file, to Google Cloud Storage.
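The job submission might look roughly like this (job name, bucket, region, and runtime version are placeholders, and the exact flags depend on your setup):

    gcloud ml-engine jobs submit training wine_price_job_001 \
        --package-path trainer/ \
        --module-name trainer.task \
        --job-dir gs://my-bucket/wine-model \
        --region us-central1 \
        --runtime-version 1.8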
And let's see a demo of generating some predictions on that model. So for this demo, I'm
going to use this tool called Colab, Google Colab. I'll have a link
to it at the end. And this is a cloud-hosted
Jupyter Notebook that comes with a free GPU. And so here, I've
already trained my model, and what I've done
is I saved that model file in Google Cloud Storage here. And now I'm going to generate
some predictions to see how it performs on some raw data. Let me make this a
little bit bigger. So here, we're just
importing some libraries that we're going to
use, and loading my model. The next step is to load the
tokenizer for my model. So this is just an index
associating all the words in my vocabulary with a number. And then I'm loading
my variety encoder too. I'm gonna ignore that warning. And here, I'm going to
load in some raw data. So what I have here is data
on five different wines. I've got the description,
the variety, and then the associated price
for each of these wines. And I wanted to show
you what the input looks like for each of these. So now we need to
encode each of these into the wide and deep format
that our model is expecting. So to do that, we have
this vocabulary look-up. I'm just printing a
subset of it here. So this is going to be
12,000 elements long for the top 12,000
words in our vocabulary. And Keras has some utilities to help us extract those top words.
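As a hedged sketch, the Keras text utilities I'm referring to look something like this (the variable names are just for illustration, and descriptions is assumed to be the list of wine description strings):

    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    # Build a vocabulary of the top 12,000 words across all descriptions
    tokenizer = Tokenizer(num_words=12000)
    tokenizer.fit_on_texts(descriptions)

    # Wide input: a binary bag-of-words matrix, one 12,000-element row per description
    bow_matrix = tokenizer.texts_to_matrix(descriptions, mode='binary')

    # Deep input: each word as an integer, padded with zeros to a fixed length of 170
    sequences = tokenizer.texts_to_sequences(descriptions)
    padded_sequences = pad_sequences(sequences, maxlen=170, padding='post')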
And then my variety encoder is just a 40-element array that looks like this, with
each index corresponding to a different variety. So let's see what the input
to our wide model looks like. So the first thing is our text. And this is a bag
of words vector. So it's a 12,000-element vector
with zeros and ones indicating the presence or absence
of different words in our vocabulary. And then the variety matrix-- I'm just printing it
out for the first one. So I believe that first one was
a Pinot noir, which corresponds with this index here. And if we look up here,
we can confirm that, yes, that was a Pinot noir. So that's the input
to our wide model. And then our deep model, we're
just encoding all of the words from that first
description as integers and padding it with
zeros since this wasn't a very long description. And then all I need to do
to generate predictions on my Keras model
is call .predict. And what I'm doing here is I'm now going to loop through those predictions and see how the model performed compared to the actual price.
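A rough sketch of that prediction loop (the input names and their ordering are assumptions made to match the wide and deep inputs described earlier, and actual_prices is a hypothetical list of the true prices):

    # Wide inputs (bag of words + variety) and deep input (padded word sequences)
    predictions = model.predict([bow_matrix, variety_matrix, padded_sequences])

    for i, prediction in enumerate(predictions):
        print('Predicted price: %.2f, actual price: %.2f' % (prediction[0], actual_prices[i]))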
So we can see that it did a pretty good job on the first one,
46 compared to 48. This one was about $30
off, but it was still able to understand that this
was a relatively higher priced wine. And it did pretty well on
the rest of these as well. So I'll have a link
to this at the end. Colab requires zero setup, so
you can all go to this URL, enter your own
wine descriptions, and see how the model performs. If we can go back to the slides. I'm going to talk briefly
about a few companies that are using TensorFlow and
Machine Learning Engine in production to
build custom models. The first example
is Rolls-Royce. They're using a custom
TensorFlow model in their marine division
to identify and track the objects that a vessel
can encounter at sea. Ocado is the UK's largest
grocery delivery service. So as you can imagine,
they get tons and tons of customer support
emails every day. And they built a custom model. It's able to take those emails
and predict whether an email requires an urgent response or maybe doesn't require a
response at all. And they've been able to save
a lot of customer support time using that model. And then finally, Airbus
has built a custom image classification model to identify
things in satellite imagery. So I know I covered a ton
of different products. I wanted to give you a
summary of what resources all of these products require. So these are just
four resources that I thought of that
you need to solve a machine learning problem. There's probably more
that I don't have on here. First is training data. How much training
data do you need to provide to train your model? How much model code do you need? How much training or
serving infrastructure do you need to provision? And then overall, how long is
this task going to take you? So if we look at
our Machine Learning APIs, great thing
about these-- you don't need any training data. You can just start generating
predictions on one image. You don't need to write
any of the model code. You don't need to provision
any training or serving infrastructure. And you can probably get started
with these in less than a day. Very little time. If we move to AutoML, the
cool thing about AutoML is that you will provide some
of your own training data, because you're going
to be creating a more customized model. And it will take
a little more time because you will need to
spend some time processing those images, uploading
them to the Cloud, and maybe labeling them. You still don't need any model
code or any training or serving infrastructure. And then finally, if we think
about a custom model, built with TensorFlow, running
on Cloud ML Engine, you will need a lot
of training data. You will have to write a lot
of the model code yourself. And you'll need to
think about whether you want to run your training
and serving jobs on premise, or if you want to use a managed
service like Cloud ML Engine to do that for you. And obviously this process will
take a little bit more time. Finally, so if you
remember only three things from this presentation--
so I know I covered a lot. First thing, you can
use a pre-trained API to accomplish common machine
learning tasks, like image analysis, natural language
processing, or translation. Second thing, if you want to
build an image classification API trained on your own
data, use AutoML Vision. I'm really interested to hear
if you have a specific use case for AutoML Vision. Come find me after. I will be in the
Cloud Sandbox area. Last thing, for custom
machine learning tasks, you can build a TensorFlow
model with your own data and train and serve it on
Machine Learning Engine. So here is a lot of great
resources covering everything I talked about today. I'll let all of you take
a picture of that slide. The video will also
be up after, so you can grab it from there as well. And that's all I've got. Thank you. [MUSIC PLAYING]