Introduction to Amazon SageMaker

Video Statistics and Information

Captions

Amazon SageMaker helps data scientists and developers to prepare data and build, train, and deploy machine learning models quickly by bringing together purpose-built capabilities. These capabilities allow you to build highly accurate models that improve over time without all the undifferentiated heavy lifting of managing ML environments and infrastructure. Let's see how SageMaker helps you develop great machine learning models by way of an example where we build a model that creates a musical playlist, curated to the listener's taste.
First, you need lots of data. SageMaker lets you connect and load your data from sources such as Amazon S3 and Amazon Redshift in just a few clicks from SageMaker Studio. You can use this data to train your model. Models learn complex and subtle patterns to let you map inputs to predicted outputs. So you'll need a large quantity of metadata about songs, such as length, beats per minute, genre, and rating.
Next, you'll need a strong set of features. Data in its raw form usually doesn't provide enough or optimal information to train a model. And so to maximize the signal and reduce the noise in the data, you need to convert and transform it into features through a process known as feature engineering. For instance, beat and genre could be combined into a feature called "danceability". Creating features can take over 80% of model development time. Instead, you can use SageMaker Data Wrangler to convert, transform, or combine raw tabular data into features in a fraction of the time it typically takes.
With a single click, you can save these features to SageMaker Feature Store, which lets you check in and check out features. The service lets you create multiple versions of features, and you can add descriptions and search for features, which helps your teams understand and use them for other models. You can retrieve an entire data set for training.
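As a rough illustration of the feature engineering step described above, the "danceability" idea can be sketched in plain Python. This is not SageMaker Data Wrangler or Feature Store code: the `Song` type, the `GENRE_WEIGHT` table, and the scoring formula are all hypothetical, chosen only to show two raw columns (beats per minute and genre) being combined into one feature.

```python
from dataclasses import dataclass

# Hypothetical genre weights, invented for this sketch only.
GENRE_WEIGHT = {"disco": 1.0, "pop": 0.8, "rock": 0.5, "blues": 0.3}


@dataclass
class Song:
    title: str
    bpm: float   # beats per minute (raw metadata)
    genre: str   # genre label (raw metadata)


def danceability(song: Song) -> float:
    """Combine tempo and genre into a single score in [0, 1].

    Assumed formula: the tempo score peaks at 120 BPM and falls
    linearly to 0 once the tempo is 60 BPM or more away from that.
    """
    tempo_score = max(0.0, 1.0 - abs(song.bpm - 120.0) / 60.0)
    # Genres missing from the table get a neutral weight of 0.5.
    genre_score = GENRE_WEIGHT.get(song.genre, 0.5)
    return round(tempo_score * genre_score, 3)


if __name__ == "__main__":
    # Illustrative values, not real song metadata.
    for song in (Song("track A", 120, "disco"), Song("track B", 90, "blues")):
        print(song.title, danceability(song))
```

In a real workflow the resulting column would be one of many engineered features, and the transformation would be defined once and reused for both training and inference.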
Once your model is deployed, you can retrieve individual features to make low latency predictions, such as predicting in real time that the user wants to listen to more songs with "danceability", like Abba's "Dancing Queen."
Next, great models can be used in different situations if they are trained on a balanced set of features and data. You use SageMaker Clarify to help ensure that training data is well-balanced, which means that the possible values for the features are well-represented in the data and that the accuracy of the trained model is roughly the same across different subsets of the data, such as different musical genres. For example, if you had a preponderance of blues music in your training set, the model would probably create a lot of blues playlists. That's fine if all you do is listen to the blues. But your model will be even more accurate if you use an evenly balanced set of features representing dozens of different genres for training. So you can use SageMaker Clarify to identify potential bias in your training data. This will help you ensure your model is trained across a range of musical genres, leading to more accurate predictions.
You can also use SageMaker Clarify to inspect individual predictions to understand how each feature plays a role in the prediction. This allows you to check that the model isn't overly reliant on features that are underrepresented in the data.
One of the great things about machine learning is that models can improve over time, not just based on new data as it becomes available, but also by incorporating the learnings from tools like SageMaker Clarify and SageMaker Debugger to systematically identify sources of error or slowness and remove them from your model. With this approach, you can condense hundreds of thousands of hours of real-world experience into just a few retraining iterations, so your models can improve quickly.
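The balance check described above, where a training set dominated by blues skews the playlists, can be approximated with a few lines of plain Python. This sketch does not call SageMaker Clarify. The function names and the sample data are hypothetical, and the imbalance score is a simple one-genre-versus-rest ratio used here only for illustration.

```python
from collections import Counter


def genre_shares(genres):
    """Return each genre's fraction of the training set."""
    if not genres:
        raise ValueError("genres must not be empty")
    counts = Counter(genres)
    total = len(genres)
    return {genre: n / total for genre, n in counts.items()}


def class_imbalance(genres, genre):
    """Score one genre against all others as (n_in - n_out) / (n_in + n_out).

    Values near +1 mean the genre dominates the data, values near -1
    mean it is rare, and 0 means an even split.
    """
    if not genres:
        raise ValueError("genres must not be empty")
    n_in = sum(1 for g in genres if g == genre)
    n_out = len(genres) - n_in
    return (n_in - n_out) / (n_in + n_out)


# Made-up training set with a preponderance of blues.
training_genres = ["blues"] * 8 + ["pop"] + ["disco"]

if __name__ == "__main__":
    print(genre_shares(training_genres))
    print(class_imbalance(training_genres, "blues"))
```

A strongly positive score for one genre, as with blues in this made-up data, is the kind of signal that would prompt collecting or resampling data for the other genres before retraining.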
And since you want to continually improve the model by rebuilding it regularly, you can take advantage of Amazon SageMaker Pipelines, which provides continuous integration and continuous deployment capabilities to automate the entire machine learning development process and replay it with a single click. This decreases the time between model improvements and delivers better models more quickly. Amazon SageMaker provides tools that every developer is familiar with: visual editors, debuggers, profilers, and CI/CD, all wrapped into the Amazon SageMaker Studio integrated development environment for machine learning. Get started right away with your machine learning project from SageMaker Studio.

Info

Channel: Amazon Web Services

Views: 136,942

Keywords: AWS, Amazon Web Services, Cloud, AWS Cloud, Cloud Computing, machine learning, deep learning, artificial intelligence, AI, ML, amazon sagemaker, recommendations, personalization, ml platform

Id: Qv_Tr_BCFCQ

Length: 4min 47sec (287 seconds)

Published: Fri Feb 19 2021