Deep Learning Frameworks 2019

Video Statistics and Information

Captions
Deep learning frameworks, assemble! Hello world, it's Siraj. There are so many deep learning frameworks out there, so how are you supposed to know which one to use? I'm going to compare ten of the most popular deep learning frameworks in this video, across a wide variety of metrics from ease of installation to performance to popularity on GitHub. After reviewing the merits and drawbacks of each of them, we'll be able to come to some kind of reasonable conclusion at the end.

Let's start with the obvious one: TensorFlow. Out of all the deep learning frameworks, TensorFlow is without a doubt the most popular in terms of developer activity on GitHub. Google created it to help power almost all of its massively scaled services, like Gmail and Translate, then open sourced it for the rest of us. Nowadays, recognizable brands like Uber, Airbnb, and Dropbox have all decided to leverage this framework for their own services. Currently its best-supported client language is Python, but there are also experimental interfaces available in C++, Java, and Go, and because it's so popular, it has bindings for other languages like C# and Julia, created by the open source community. Having such a massive developer community has resulted in TensorFlow having rich, detailed documentation, not only from its official website but from various third-party sources around the web. This documentation covers its various features, like TensorBoard. TensorBoard lets developers monitor the model training process via various visualizations, and it's a crucial part of the suite. Another crucial part is TensorFlow Serving, which allows developers to easily serve their models at scale in a production environment, and it includes distributed training. TensorFlow Lite even enables on-device inference with low latency for mobile phones. But despite all of this, TensorFlow is pretty low level. You have to specify a lot of magic numbers, like the number of layers in your network and the dimensions of your input data, and this requires a lot of boilerplate coding on the developer's part, which can be tedious and difficult. By default, TensorFlow lets developers create static computation graphs at compile time: we must define the graph, then run it, meaning all the conditions and iterations in the graph structure have to be defined before it's run. If we want to make any changes to the neural network structure, we have to rebuild it from scratch. It was designed this way for efficiency, but a lot of the newer neural architectures change dynamically, so this default define-and-run mode of TensorFlow is counterintuitive and can make debugging difficult. They did add a define-by-run option called eager execution later on, but it's not native, and it's expected to be even better in TF 2.0, which is about to release.

Most of the time, TensorFlow is compared to the PyTorch library, a native define-by-run framework. PyTorch was created by Facebook to help power its services, and it's now used by brands like Twitter and Salesforce. Unlike TensorFlow, PyTorch's default define-by-run mode is more like traditional programming. While training a PyTorch model, a computational graph is created for each iteration in an epoch; after each iteration, the graph is freed, meaning more available memory. Because it defines the graph in a forward pass (versus a define-then-run framework like TensorFlow), backpropagation is defined by how the code is run, and every single iteration can be different. PyTorch records the values as they happen in our code to build the dynamic graph as the code is run. PyTorch also nails debugging: we can use common debugging tools like pdb or PyCharm, and the modeling process is simple and transparent. PyTorch has declarative data parallelism, features a lot of pre-trained models, and has modular parts that are relatively easy to combine, and just like TensorFlow, it allows for distributed training. On the flip side, however, PyTorch lacks model serving in the well-thought-out way that TensorFlow does, and it lacks interfaces for monitoring and visualization like TensorBoard, though you can connect PyTorch to TensorBoard via third-party libraries like tensorboardX. If we look at various papers from NeurIPS, the biggest AI summit of the year, it's clear that researchers tend to prefer PyTorch to TensorFlow. That's because it's best for prototyping or small-scale projects; when it comes to larger cross-platform deployments, TensorFlow seems to be the better option. But I should also note that the popular Caffe2 framework, introduced by Facebook in 2017, is built for mobile and large-scale deployments in production environments, and it was recently merged into PyTorch. This gives PyTorch production-grade scalability.

Curiously, DeepMind, perhaps the most prominent AI research lab in the world, doesn't use PyTorch. They use their own framework called Sonnet, which is built on top of TensorFlow. DeepMind's developers spent a lot of time having to acquaint themselves with the underlying TensorFlow graphs in order to correctly architect their applications, but with Sonnet, the creation of neural networks was made easy, because it first constructs Python objects which represent some part of a neural network, then separately connects these objects into the computation graph. These modules simplify the training process and can be combined to implement higher-level networks, and developers can also easily extend Sonnet by implementing their own modules. This makes switching between models easier.

But let's put the research-versus-production-pipeline debate aside for a second. What if you're just a beginner and just want to learn how all this stuff works? The minimalist Python-based library called Keras can be run on top of TensorFlow or Microsoft's CNTK. Keras has support for a huge range of neural network types, makes prototyping dead simple, and the code is very readable. That's the reason I use it as a teaching tool so often in my videos: it's really easy on the eyes, and building a massively complicated deep learning model can be done in just a few lines of code. It has built-in support for training on multiple GPUs, and it can be turned into TensorFlow Estimators and trained on clusters of GPUs on Google Cloud. But the downside of it being so high level is that it's not as customizable, and it's also constrained to the libraries it's built on, like TensorFlow and CNTK. So, less functionality than a lower-level library like TensorFlow, but easier to learn: Keras is the best learning tool for beginners.

All right, let's move on to MXNet, Jeff Bezos' (I mean, Amazon's) deep learning framework. MXNet has been adopted by AWS, parts of Apple are rumored to be using it, and it offers APIs in a huge variety of languages natively, even Perl. Where MXNet excels is in its ability to scale linearly, more so than TensorFlow. The CTO of Amazon published benchmarks for MXNet's training throughput using the Inception v3 image analysis algorithm and claimed that the speedup obtained by running it across multiple GPUs was very linear: across 128 GPUs, MXNet performed 100 times faster than a single GPU. MXNet has a high-performance imperative API, which is pretty awesome. It's got the simplicity of Keras, and it's dynamic like PyTorch, which makes debugging a lot easier. Unlike PyTorch, however, MXNet supports hybridization as part of its Gluon interface. The HybridBlock class seamlessly combines declarative programming (like TensorFlow) and imperative programming (like PyTorch) to offer the benefit of both: users can quickly develop and debug models with imperative programming, then switch to efficient declarative execution by simply calling HybridBlock's hybridize method. You'll notice MXNet's advantage in symbolic APIs when training on many GPUs; in some specific cases, Gluon is 3x faster than PyTorch, but take this with a grain of salt, as benchmarks depend on so many factors. And its integration with AWS is unbeatable, because it's Amazon's own pipeline.

Let's not forget Microsoft, though. CNTK, or the Microsoft Cognitive Toolkit, is a DL framework that supports Python, C++, C#, and Java. It's got support for CNNs and RNNs, and it's used in Skype, Xbox, and Cortana. It's targeted toward letting developers easily build models for speech and image problems, and it offers support for Apache Spark. It's the easiest of all the frameworks to integrate into Azure, Microsoft's cloud offering. One thing in particular I like about CNTK is that it handles passing sequences of varied length better than the other frameworks. In TF, you have to do padding, masking, and sometimes even write your own softmax function that ignores masked elements. In PyTorch, the scenario is less painful with functions like pack_padded_sequence, but you still have to pad at the beginning, and masking in general makes your model vulnerable to errors. In CNTK, you just pass the sequence, without any padding or requiring a mask later on, and everything is taken care of: it handles sequences of variable length internally. Some of the criticisms of CNTK include its strict license, as they have not adopted conventional open source licenses like GPL, ASF, or MIT, and the community seems to consist mostly of Windows developers who would like to include machine learning models in desktop or mobile applications.

Also, shout-out to Chainer, a framework created by a Japanese startup. It's similar to PyTorch in that it has a native imperative API, but it's difficult to debug. The community is relatively small, but it's supported by giants like IBM, Intel, and, in my fantasies, Mechagodzilla. It can be run on multiple GPUs with little effort, and the main use cases we've seen of it thus far are in speech recognition, machine translation, and sentiment analysis.

If your core programming language is Java, definitely take a look at Deeplearning4j. It's written mainly for Java and Scala, supports a huge variety of neural networks, was made for enterprise scale, and works with Apache Hadoop and Spark on distributed CPUs and GPUs. Also, their documentation is stellar. Java isn't very popular among machine learning projects, though, so it's hard to integrate it with other ML libraries. Perhaps the main utility here is that Android apps are usually written in Java, so this would be a good choice if you'd like to build a full-stack Java pipeline which includes Android devices.

And speaking of mobile, shout-out to Core ML. It's not a framework that's made to build models, necessarily, but it does help you bring existing models built in other frameworks to Apple devices.

And last but not least, let's talk about ONNX: a Pokémon with a pretty high HP, but also the Open Neural Network Exchange format. It was developed in partnership between Microsoft and Facebook. They both decided there was a need for interoperability in the AI tools community, since developers often find themselves locked into one framework or ecosystem. ONNX enables more of these tools to work together by allowing them to share models. The idea is that you can train a model with one tool stack and deploy it using another for inference and prediction. To ensure this kind of interoperability, we export our model into the ONNX format, which is a serialized representation of the model in a protobuf file.

Overall, choosing the perfect framework for a DL project can be hard. You have to take into account many factors, like the type of architecture you'll be developing, which programming language you're going to use, the number of tools you need, et cetera. Here are my conclusions. If you're a beginner to programming in general, use Keras, as it's still the easiest library to learn from. If you'd like to build a production-grade application and deploy it to Google Cloud, use TensorFlow. If you'd like to do research, use PyTorch, but also check out Sonnet. If you prefer deploying to AWS, use MXNet. If you want to deploy to Azure, use CNTK. If you're a Java developer, Deeplearning4j is your best bet. I don't think Chainer's got anything unique compared to the other frameworks. Once you've already started building a model, use ONNX to use tools from other framework ecosystems with it. Oh, and for anything iOS related, you can leverage Core ML. What's your favorite framework? Let me know in the comments section, and please subscribe for more programming videos. For now, I've got to define and run, so thanks for watching!
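To make the define-then-run versus define-by-run distinction from the video concrete, here's a minimal sketch in plain Python. This is not real TensorFlow or PyTorch code; the `Node` class and `run` function are made up for illustration. A static graph is built as data first and executed later, while eager execution just computes values immediately.

```python
# Toy illustration of the two execution styles discussed in the video.
# Static ("define, then run"): the whole graph is data, built before any
# numbers flow through it -- in the spirit of TensorFlow 1.x sessions.
class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

def run(node, feed):
    """Execute a static graph given a feed dict of placeholder values."""
    if node.op == "placeholder":
        return feed[node.inputs[0]]
    args = [run(i, feed) for i in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError(node.op)

# Build the graph first (no computation happens on these lines)...
x = Node("placeholder", "x")
y = Node("mul", Node("add", x, Node("placeholder", "b")), x)
# ...then run it, possibly many times with different inputs.
static_result = run(y, {"x": 3.0, "b": 1.0})   # (3 + 1) * 3 = 12.0

# Eager ("define by run"): ordinary code, values computed immediately,
# so plain `if`/`for` and a debugger work as usual -- the PyTorch style.
def eager(x_val, b_val):
    return (x_val + b_val) * x_val

eager_result = eager(3.0, 1.0)                 # 12.0
```

Notice why debugging differs: in the static version an error only surfaces inside `run`, far from where the graph was defined, while the eager version fails on the exact line you wrote.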
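The claim that PyTorch "records the values as they happen in our code to build the dynamic graph" describes tape-based reverse-mode autodiff. Here's a tiny sketch of that idea in plain Python; the `Scalar` class is invented for illustration and is not the PyTorch API.

```python
# Minimal record-as-you-run reverse-mode autodiff sketch (illustrative
# only -- not PyTorch, just the mechanism the video alludes to).
class Scalar:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents      # (input Scalar, local gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        # Recorded at the moment the op runs: d(a+b)/da = 1, d(a+b)/db = 1
        return Scalar(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Scalar(self.value * other.value,
                      ((self, other.value), (other, self.value)))

    def backward(self, upstream=1.0):
        # Walk the graph that was built during the forward pass.
        self.grad += upstream
        for parent, local_grad in self.parents:
            parent.backward(upstream * local_grad)

a = Scalar(3.0)
b = Scalar(4.0)
loss = a * b + a          # the graph is created by running this line
loss.backward()           # d(loss)/da = b + 1 = 5.0, d(loss)/db = a = 3.0
```

Because the graph is rebuilt on every forward pass, each iteration can take a different path (different branches, different loop lengths), which is exactly why define-by-run suits dynamically changing architectures.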
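The "few lines of code" praise for Keras comes from its layers-compose-like-Lego API. This sketch is NOT the Keras API, just plain Python showing the shape of that style (the `Sequential` class and the lambda "layers" are stand-ins; real layers hold trainable weights).

```python
# Sketch of the high-level sequential style described for Keras.
class Sequential:
    def __init__(self, layers):
        self.layers = layers

    def predict(self, x):
        # Feed the input through each layer in order.
        for layer in self.layers:
            x = layer(x)
        return x

# "Layers" here are just callables chained together.
model = Sequential([
    lambda x: [v * 2.0 for v in x],        # stand-in for a Dense layer
    lambda x: [max(0.0, v) for v in x],    # ReLU activation
])

out = model.predict([-1.0, 2.0])           # [0.0, 4.0]
```

The trade-off mentioned in the video follows directly: stacking pre-built blocks is readable and fast to prototype, but anything the blocks don't expose requires dropping down to the lower-level library underneath.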
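The variable-length sequence pain described for TF and PyTorch boils down to padding plus masking. Here's a rough pure-Python sketch of what a "softmax that ignores masked elements" has to do; the function names are made up, and real frameworks do this vectorized on tensors.

```python
import math

def pad_batch(sequences, pad_value=0.0):
    """Pad variable-length sequences to equal length and return a mask."""
    max_len = max(len(s) for s in sequences)
    padded = [s + [pad_value] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return padded, mask

def masked_softmax(scores, mask):
    """Softmax over real positions only; padded positions get probability 0."""
    exps = [math.exp(s) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Two sequences of different lengths become one rectangular batch + mask.
batch, mask = pad_batch([[1.0, 2.0, 3.0], [4.0, 5.0]])
probs = masked_softmax(batch[1], mask[1])   # the padded slot stays at 0.0
```

Forgetting the mask here would let the pad value leak probability mass into the output, which is the "vulnerable to errors" point the video makes; a framework that accepts ragged sequences natively removes that entire failure mode.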
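The ONNX idea (train in one stack, serialize the model to a neutral format, run it in another) can be sketched with a toy round-trip. This uses JSON instead of protobuf and an invented two-op "format" purely for illustration; it is not the real ONNX schema.

```python
import json

# Toy "exchange format": describe a tiny model as plain data, the way
# ONNX describes compute graphs in a protobuf file.
model = {
    "format_version": 1,
    "inputs": ["x"],
    "layers": [
        {"op": "scale", "factor": 2.0},
        {"op": "shift", "offset": 1.0},
    ],
}

exported = json.dumps(model)            # "export": serialize the model
reloaded = json.loads(exported)         # another tool "imports" it

def run_model(spec, x):
    """Interpret the serialized graph: the importing tool's inference step."""
    for layer in spec["layers"]:
        if layer["op"] == "scale":
            x *= layer["factor"]
        elif layer["op"] == "shift":
            x += layer["offset"]
    return x

result = run_model(reloaded, 3.0)       # (3 * 2) + 1 = 7.0
```

The key property is that `run_model` never needed the code that built `model`, only the serialized description, which is what lets one framework's training output become another framework's deployment input.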
Info
Channel: Siraj Raval
Views: 163,382
Keywords: siraj raval, deep learning, tensorflow, pytorch, chainer, deeplearning4j, sonnet, deepmind, google, coreML, onnx, facebook, programming, coding, github, developer, code, education, framework, library, computation, machine learning, artificial intelligence, AI
Id: SJldOOs4vB8
Length: 13min 8sec (788 seconds)
Published: Mon Jan 21 2019