Deep Learning Pipelines: Enabling AI in Production

Video Statistics and Information

Captions
Yeah, so my name is Sue Ann, as Maddy introduced me, and today I'll talk a little bit about deep learning, the state of the world a little bit, and about Deep Learning Pipelines, which is Databricks' answer to the problems people face when they're doing deep learning. So as an overview: we'll talk a little about the current state of the world, the philosophy behind Deep Learning Pipelines, the tools in there that help you do end-to-end workflows with deep learning, and a little bit about the future.

We can start with: what is deep learning, actually? How many people here have maybe played with deep learning? OK. So deep learning is really just a set of machine learning techniques that use layers that transform data numerically, so it can be used for any normal machine learning task like classification, regression, or learning arbitrary function mappings. It was really popular in the 80s, when these models were called neural networks, and they came back in these massive forms. The reason they have become feasible now is thanks to advances in being able to collect a lot of data, store it, and use it with better computation techniques and hardware. We've seen a lot of success with deep learning in fields that have to deal with complex data: game playing, like AlphaGo with its complex state space, or working with things like images, text, and speech.

But deep learning can often be challenging, and to summarize, there are three parts to that. One is that training a single model can take a lot of compute resources and time, compounded by the fact that it takes a lot of engineering hours to design a good network, iterate on it, and so on; combine those two, and it could take months to build a great model for your task. And because these models are really complex, it can take a lot of (often labeled) data to train a great model, which you may not have access to depending on your circumstances. What makes these parts worse is a set of smaller problems. One is that it's actually tedious, or difficult, to distribute these computations in a lot of the deep learning frameworks. Another is that there is just no exact science around deep learning, so it takes a lot of tweaking and experimentation to get a good model. And third, there are many different frameworks out there for doing deep learning, like TensorFlow, MXNet, and PyTorch, and they tend to be low-level APIs with quite a learning curve before you really know how to use them. And as I mentioned earlier, complex models mean you need a lot of data. These things have led to limited adoption of deep learning in industry, and when you hear about successes of deep learning, it's usually the industrial giants like Google or Facebook or Amazon, because they have those resources, time, and data.

So the question we asked was: how can we accelerate this road to better availability of deep learning for everyone? That's why we built Deep Learning Pipelines, and the theme we went with is a focus on ease of use and integration (it's an open-source Databricks library; I'll have some links for how to access it), without sacrificing performance while keeping that ease of use.
So if we go back to this slide of the challenges and look at each item: what we really want, at least for the common workflows and tasks that benefit from deep learning, is that it shouldn't be difficult to distribute these computations; it should be easy to scale. Tweaking takes a lot of time, so hopefully you shouldn't have to do much tweaking, and the APIs should be simpler and higher-level, at least for common things. And we don't want to always need a lot of data to solve these problems, because you may have problems that are new in your application.

Here is how we address these things in Deep Learning Pipelines. For scaling, we use, as you might imagine, Apache Spark to scale out the computation. For common tasks that should require little tweaking, we try to leverage as many well-known model architectures for specific problems as possible. To make workflows easier to write, we integrated with the MLlib Pipelines API, using the pipeline concepts we'll talk about a little later, to capture these workflows in very few lines of code. And finally, there are lots of pre-trained models out there for specific deep learning tasks, and we can leverage them and essentially tweak the model a little bit (we'll see examples of this) to get great models with little data.

To demonstrate these points, I'm going to do a little demo. Sorry, I need to type a little. OK. We're going to build a visual recommendation AI (fancy animation), and this recommender takes ten minutes to train, uses seven lines of code and Spark, and requires zero labels; we'll see how that happens. The particular recommendation task we're dealing with today is visual recommendation of shoes: given a particular pair of shoes, we find shoes that are similar visually. Partly because I like shoes, and because this is a real, actual task that is used in the world.

OK, so let's see how we can do this. Let's see if this works... OK, it is mirrored... all right, now that I'm out of presentation mode I can actually see it. In this notebook I've loaded a dataset of shoes. There are 34,000 of them; they come from a website with a lot of shoes. I actually used Spark to scrape the data, which was fun; that's a separate story. We can look at the shoes in there, and there's a pretty good variety, from heels to athletic running shoes to tall boots and so on.

Now, as a data scientist, given a pair of shoes, I want to find other shoes that are similar. How would I go about doing that? One natural algorithm to consider is nearest neighbors. For those who are maybe not as familiar: nearest neighbors works by assuming that your data points live in a reasonable representation space, such that things that are closer in that space are more similar than things that are further away. This requires no labels (that's how we got away with zero labels), but you do need a good representation of the data points such that "close means similar" actually holds, and that's going to be key to our success today.

So let's look at what we have here. I've also loaded a vector representation of the images, which is just a pixel vector, with 255s for all the white space around the shoes. One important thing is that these shoes live in a roughly five-hundred-thousand-dimensional space, which is pretty large, and one consequence is that comparisons between points can be a little expensive. What's worse, given a pair of shoes, you have to look at all 34,000 shoes in the dataset to find, say, the top ten most similar ones, and that can be very expensive. Luckily, MLlib has an approximate nearest neighbor algorithm called locality-sensitive hashing, or LSH, which essentially finds approximate neighborhoods for each data point, so you only have to search within that neighborhood to find your top-k similar shoes instead of scanning the entire dataset. That's pretty simple to write, so let me start running it. Here we've written a small helper function to do this. We use an LSH variant called BucketedRandomProjectionLSH, which helps you compute Euclidean distances between things quickly. We just make the object and call fit, and then we call approxNearestNeighbors with the set of candidates (which is all the shoes) and the key (the array representation of the image), and it finds the top ten closest shoes.
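A minimal sketch of what that helper might look like; the DataFrame and column names here (shoes_df, features) and the LSH parameters are my assumptions, not the notebook's exact code:

```python
from pyspark.ml.feature import BucketedRandomProjectionLSH

# Assumed: shoes_df has a "features" column holding each image as a
# flattened pixel vector (a pyspark.ml.linalg.Vector).
brp = BucketedRandomProjectionLSH(inputCol="features", outputCol="hashes",
                                  bucketLength=2.0, numHashTables=3)
lsh_model = brp.fit(shoes_df)

def top_k_similar(key_vector, k=10):
    # Searches only the approximate neighborhood of the key,
    # not all 34,000 shoes.
    return lsh_model.approxNearestNeighbors(shoes_df, key_vector, k)
```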
Note that what we're running this on is the raw images; we haven't done any deep learning, we're just trying it on raw pixels to see how reasonable that is. It seems like it could be reasonable, because the picture has all of the data we need visually. Here is the result from that. The top image is the original shoe I was looking at, and these are the results. They're not horrible: they're kind of the same color-ish, and we didn't get any boots, so that's good. But I started with summer sandals and got winter slippers, and that's probably not productionizable yet.

So what are we going to do? Obviously, we're going to use deep learning. I'll explain exactly what's going on in this deep learning section a little later in the slides, but essentially we're going to use deep learning to change the representation of the shoes from the raw pixel space to something more reasonable that deep learning models can capture. To do that, Deep Learning Pipelines provides a tool called DeepImageFeaturizer, which takes an image and outputs a higher-level representation of it, one that contains more useful things like edges, textures, and so on. So we have this featurizer featurize the image data (I'm not going to run it here, because this is the step that takes the ten minutes), and then we run the exact same LSH algorithm to find nearest neighbors on these higher-level deep learning features.
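In code, the change is essentially just swapping in the featurized column; here is a sketch that reuses the LSH helper above, with image_df standing in for the loaded image DataFrame (loading is shown later):

```python
from sparkdl import DeepImageFeaturizer

# Map each image to a higher-level representation (edges, textures, ...)
# produced by a pre-trained network; this is the ten-minute step.
featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features",
                                 modelName="InceptionV3")
featurized_df = featurizer.transform(image_df)

# The exact same LSH search as before, now over deep learning features.
lsh_model = brp.fit(featurized_df)
neighbors = lsh_model.approxNearestNeighbors(featurized_df, key_features, 10)
```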
OK, so here's the result, and this looks much better. Now we actually have other sandals, and not only are they kind of the same shape, we can see it finds shoes that are maybe shaped a little differently but have the same stylistic details, like this crochet, and they're similarly heeled sandals and so on. So that was cool. (I actually have no idea how much time I have... OK.) And here are some other examples from the sample dataset we had in the beginning; it's really pretty good. I did no selection of those shoes at all, just picked random ones, and they were this good.

OK, so that was cool. Now I'm going to talk a little bit about the different parts of Deep Learning Pipelines and a little more about what actually happened there. To do that, it's useful to look at the typical deep learning workflow, which is really similar to the typical machine learning workflow. First you load the data, but in deep learning a lot of the time it will be more complex data like images, text, or time series. Then, as usual, you might do some interactive work to figure out what's in this dataset and what's going on. Then you probably want to start training some models; for deep learning, you usually have to choose an architecture, create a network, and then optimize the weights in it. Then you evaluate the results and potentially retrain based on what you see and how much you like it, and finally you apply the model to the unseen data you actually want to use it on.

So we have these steps, and in Deep Learning Pipelines we have tools for each of them. Currently the library focuses a lot on images; I think of it as showcasing one vertical, with tools that can be generalized to others, and I'll talk about that a little later. For example, for images it provides image loading functionality in Spark, which Spark is missing. For interactive work, it has built-in ways to apply pre-trained models, for images for example. For training, there are tools for transfer learning and hyperparameter tuning, and for applying models, it has easy ways to apply them in batch using Spark, or to deploy them in SQL. Today I'll talk about most of these parts. We did have a deep-dive session at the last Spark Summit in Europe, so if you're interested you can look at that; it covers everything in more detail, and I'll have the link at the end.

OK, so let's talk about these steps in a bit more detail, starting with loading data. As I mentioned, the library has support for images in Spark, for loading and dealing with images. Loading is just a simple function: you call readImages with the directory that contains the images, and you get back a DataFrame that has the image data in it. What the library provides is an image schema, a way to represent these images in Spark directly, as well as this reader and conversion functions to and from numpy, so you can work with Python libraries in between if you need to. Most of the tools I'll describe in the later steps of the workflow operate on these images. And I think this is good news: the image support from this library will actually go into Spark 2.3, so it should be native to Spark from then on. This is joint work with Microsoft.

OK, so now that you've loaded this data, what can you do? You can do some interactive work, which can take the form of applying popular pre-trained models. Essentially, the library makes some of these popular pre-trained models accessible through MLlib Transformers. If you're not familiar with MLlib Transformers, a Transformer is just a thing that applies some function to the data, and we'll see more of that later. Here, all you have to do is make this object, the DeepImagePredictor, specify a model name (there's a set of nice models, especially for image classification), and then just call it on the image DataFrame that you got from loading. What this returns is, for example, a set of labels for things that could be in each image. This might be a nice way to explore your dataset, or, if you happen to have a problem that's already covered by the pre-trained network, for example classifying ants versus bees versus ladybugs, you could just use it right away.
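Per the library's README from around this time, loading images and applying a pre-trained predictor together look roughly like this (the directory path is a placeholder):

```python
from sparkdl import readImages, DeepImagePredictor

# Read a directory of images into a DataFrame with an "image" column.
image_df = readImages("/data/shoes")

# Apply a well-known pre-trained network and decode its predicted labels.
predictor = DeepImagePredictor(inputCol="image", outputCol="predicted_labels",
                               modelName="InceptionV3",
                               decodePredictions=True, topK=10)
predictions_df = predictor.transform(image_df)
```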
OK, so next I'll talk about training and transfer learning. What we just saw was applying pre-trained models that are out there, developed by the community of researchers and engineers working on deep learning, and one of those could be applicable to your dataset, but in most cases it's probably not exactly your task, right? You probably want to classify shoes or some other thing. In that case, training from scratch, as we talked about before, requires a lot of data, a lot of time, and a lot of engineering hours and resources. I think one of the really powerful ideas in deep learning is transfer learning, which comes from the observation that the intermediate representations learned in deep learning for one task can be useful for other, related tasks.

Briefly, here is how transfer learning works. Here's a representation of a neural network; let's say it does image classification, classifying things like pandas and raccoons and cars. The truth is, there are a lot of great models out there for this particular task, trained on millions of images over months and years and decades of research. And the really powerful part is that the models have gotten so good that the intermediate layers have come to represent higher-level concepts about the image: edges or textures, or even whole objects in the later layers of the model. So what we can do is get rid of the final layer that tells you what the labels are, take instead one of those intermediate layers that has a really good representation of the image data, and then, for a new task, stick a probably much simpler classifier on top, using the output of the network as the new set of features, and train it on your new dataset. For example, the new task might be flower classification, which is more specific than what the model was originally trained to do.

In Deep Learning Pipelines, this is represented as an MLlib Pipeline (that may not make much sense if you don't know MLlib, and we're a little short on time), but essentially all you have to do is create a featurizer, which featurizes the image using a model of your choice (there are models other than InceptionV3 in this library, but we like that one), stick on a classifier like logistic regression, put both into a Pipeline (another MLlib concept), and then train it. So now you can train a new image classifier for your task in just four lines of code, which is pretty awesome.
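Those four lines look roughly like the transfer learning example in the library's README; train_df, a DataFrame of images with a numeric label column, is assumed:

```python
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from sparkdl import DeepImageFeaturizer

# Featurize with a pre-trained network, then fit a simple classifier
# on top of that intermediate representation.
featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features",
                                 modelName="InceptionV3")
lr = LogisticRegression(maxIter=20, regParam=0.05, labelCol="label")
model = Pipeline(stages=[featurizer, lr]).fit(train_df)
```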
Transfer learning in its usual definition tends to be about classification tasks, for things that are maybe a similar task to image classification but in a new domain, but I think of other ways of leveraging these learned representations as transfer learning as well. That's the shoe example we saw earlier, where we took the representation that was learned in deep learning and then used it not just for classification at the end, but in other similarity-based machine learning algorithms like clustering or distance computation. This lets you do other things like duplicate detection or anomaly detection, and use it for recommendation or search-result diversification kinds of things. So there are, I think, wider applications of it.

OK, so as I was saying, we'll skip hyperparameter tuning and evaluation, and as the last set of tools, I want to talk about what happens after you have a model: how do you apply that model to your data? We'll focus here mostly on batch prediction, which is just offline prediction: you have a very large dataset and you have this deep learning model, so how do you apply it efficiently? Spark should be really good at this; it's an embarrassingly parallel task where you just apply the model to the data in a distributed manner, in parallel. One useful concept to finally actually know in MLlib is that a model is a Transformer in MLlib, and a Transformer really just takes a DataFrame and then outputs a DataFrame with another column attached to it. It looks something like this: you create a transformer object; you tell it which column to apply its computation to, whatever is happening inside; you tell it where to put the result of that computation; and in this case we might have a transformer that also takes a model specification, that is, which deep learning model to use. Then you can just apply that to any DataFrame that contains the input column, and you get the result in the output column.
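As a concrete sketch, applying the fitted transfer learning pipeline from above in batch is just a transform call; unseen_df is a stand-in for whatever large image DataFrame you want to score:

```python
# A fitted model is itself a Transformer: DataFrame in, DataFrame out,
# with the predictions attached as a new column. Spark distributes the work.
predictions = model.transform(unseen_df)
predictions.select("prediction").show()
```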
So I think that's a simple enough concept. What I think is interesting about this kind of batch prediction and model application in Deep Learning Pipelines is that there's sort of a hierarchy of these transformers in the library, and they all take an image as input and output some vector. (It will hopefully become clear soon why that's interesting.) The DeepImagePredictor and DeepImageFeaturizer we talked about earlier, which gave you the pre-canned answers or the featurization mechanism, are also just transformers that take an input image and output a vector, either the label probabilities or the features. Internally, they're actually represented by a call to another transformer, (weirdly) named NamedImageTransformer, which just takes a model name and instantiates that model inside; those tools just call it with the model name. And underneath that, it's powered by other transformers that are closer and closer to the deep learning frameworks actually used underneath, like Keras or TensorFlow. The reason I think this is interesting is that it exposes these tools at different levels of abstraction: you can go from not knowing anything about deep learning and just applying models at the top level, all the way down to the bottom, where you're probably training your own models and then feeding them into these transformers to apply them. (Am I severely over time? OK.)

And this is just for images, but we're actually trying to build similar things in other domains, like text, where you might see the same kind of pattern, ultimately all backed by the same back-ends underneath. Some of these are things that are being built right now, and you can check them out. (Is the animation working? OK.) Again, these levels represent one of the philosophies that Deep Learning Pipelines tries to address: common tasks should be easy, using pre-built solutions or things that are 80% of the way there, like transfer learning; and advanced tasks, where you need to do more modeling, should also be possible, for example by being able to distribute easily using Spark. I already talked about some of the in-progress things for text and so on, so you can imagine the direction of Deep Learning Pipelines, where we apply this philosophy (common things should be easy, advanced things should be possible): we're considering non-image domains like video, text, and speech; we're working on distributed training to make that advanced part more possible; and we're also considering support for back-ends beyond TensorFlow, like MXNet and PyTorch. If you have any feedback about these directions, we'd love to hear it. And that's it; there are a bunch of links on this slide. Any questions?

[Audience Q&A]

Yeah, so in this particular case we put in, for example, logistic regression, but you could really put anything there that can do classification, so potentially, using deep learning again could give you a more powerful classifier. I think in a lot of these cases, where the base model is really, really good at representing that space, you don't necessarily need a more complex model like deep learning afterwards, but it's something you could try.

Mm-hmm. Yeah, we don't have anything in the works right now. Those are definitely common models where it would be nice to have better support, for either training them or using them. There are some embedding PRs out there that are currently being worked on; you can check those out. I think we could discuss it a little more if you have some ideas around that.

Yeah, the question was whether applying something like PCA before applying the deep learning model would help. I guess it kind of depends on your domain, but the general philosophy around deep learning is really that it should try to help you so you don't have to do things like dimensionality reduction beforehand; ideally the architecture is good enough that it can do those kinds of dimensionality-reduction things inside. Any other questions?

[Host] To all our speakers here tonight: they have amazing accomplishments and have made significant contributions to technology, computer science, diversity, and of course Apache Spark. So thank you so much, and thank you all for sharing your evening with us. We hope to see you at future events, so please come back.
Info
Channel: Databricks
Views: 6,488
Rating: 4.92 out of 5
Keywords: apache spark
Id: PHVtFQpAbsY
Length: 32min 52sec (1972 seconds)
Published: Mon Nov 20 2017