Research to Production: PyTorch JIT/TorchScript Updates - Michael Suo

Captions
Alright, hey everyone. I'm Michael, I work on the PyTorch JIT, and I'm here to talk about what we've been working on in the last year and what we're releasing with PyTorch 1.3.

So let's start with some background: what is the PyTorch JIT? Well, in a sentence, it's a compiler and language infrastructure for machine learning. That's a bit of a mouthful, and most of the time when I say that, people don't really know what I'm talking about. So I think a better way of describing the JIT is functionally, by what it delivers to you all: it's our technical strategy for delivering PyTorch to a production audience. We chose this path because we wanted to take the best aspects of PyTorch and adapt them for a new audience, a new group of users with a new set of requirements.

When we looked at the requirements that production users came to us with, it boiled down to two main elements. The first is portability: PyTorch programs should be decoupled from any specific runtime environment. We love the flexibility, the ease of use, the debuggability that Python gives you as a host language for PyTorch, but running Python is difficult or impossible in a number of places that are really important to production users, like multi-threaded inference servers where the GIL will be a problem, or on a phone, or in a car. The second requirement is performance. When we have access to the full model, we can perform a variety of optimizations: layer fusion, quantization, sparsification. These optimizations are possible to implement imperatively in PyTorch's eager mode, but they can often be done automatically, or with very little effort, when you have access to a fully structured representation of your model.

Translating these requirements into a technical problem, we found we needed a system that could, first, capture the structure of PyTorch programs while preserving as much of the flexibility and power of Python as we could, and second, we wanted to use that
structure to perform optimizations in a scalable way. We wanted to give users the ability to write their own transformations and optimizations on top of a general-purpose platform.

Those two problems correspond to the two main components of the PyTorch JIT. For capturing PyTorch programs, we have TorchScript, a static subset of the Python language specialized for machine learning applications. And for optimizing the structure of models, we have a just-in-time compiler that can navigate the expressiveness and dynamism of PyTorch code and use runtime information to optimize your models.

So let's take a look at TorchScript first. As I said, it's a subset of Python. One difference is that it's statically typed, although if you don't add a type annotation, our really sophisticated type inference algorithm will say, "I guess it's a tensor." Our general workflow is depicted on the slide here: you just prototype your model in regular PyTorch. It's a regular nn.Module; you have access to all your standard Python debugging and development tools, any Python libraries. And once you're ready, you simply call torch.jit.script on an instance of your module to convert it into a TorchScript model. If you've used the JIT before 1.3, you might notice that this is a new API. It requires a lot less manual work and annotation from you: you just develop your model in PyTorch, and we'll take care of parsing and compiling your code into TorchScript with that single call to script.

We try really hard to support Pythonic language constructs. If statements are just if statements; you don't need to do anything special. If statements across Python values will work, side effects will be preserved as you wrote them, lists should behave like lists, and so on. You don't need to do anything special to get them to work.

Once you have your model in TorchScript, it looks something like what's on the right. You will never really see this, but this is our structured intermediate
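To make that workflow concrete, here is a minimal sketch of the torch.jit.script call described above. The module and its contents are hypothetical examples, not code from the talk; the point is that data-dependent control flow in forward is compiled as written:

```python
import torch
import torch.nn as nn

class MyModule(nn.Module):
    # A hypothetical module with data-dependent control flow,
    # the kind of behavior a static graph export struggles with.
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        if x.sum() > 0:  # an ordinary Python if statement
            return self.linear(x)
        else:
            return -x

# A single call converts the eager module into a TorchScript model.
scripted = torch.jit.script(MyModule())
out = scripted(torch.ones(2, 4))
```

Note that no type annotations are needed here: the untyped argument x is inferred to be a tensor, exactly as the talk describes.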
representation of a PyTorch model. It preserves all the behavior you wrote in the model, but in a form that's a little easier for us to optimize and otherwise manipulate. You can run this intermediate representation in our lightweight, thread-safe interpreter, available anywhere you can run C++, and the regularity and structure of our intermediate representation make it easy to write custom transformations and pattern matchers over your model. And we want to call out that this stuff is not just for inference: if your input tensors require grad, great, we'll make sure our optimizations preserve the autograd semantics, and your backwards pass will give you correct results, just as if you had used Python as your host language.

To give you a sense of what kinds of models are expressible in TorchScript, we can use recurrent neural network grammars (RNNGs) as a case study. RNNGs are used to perform semantic parsing of sentences for task-oriented dialogue, the kind a virtual assistant or conversational AI might perform. Here we have a paper that one of our teams presented at NeurIPS last year, which proposed improvements on top of the base RNNG that achieve state-of-the-art results on task-oriented parsing. This model is a really great example of what TorchScript can provide, because it's tremendously complex: it has highly dynamic behavior based on the input, and our team generally found it extremely difficult to rewrite or port to a static graph language. So when shipping to production, our assistant team basically managed two versions of the model. They had one written in PyTorch that the researchers experimented on and improved constantly, and one written in C++ that was a copy of the Python one but had all the pitfalls you might expect: the implementations were hard to keep in sync, it was sometimes hard to debug semantic differences between the models, and as the research team pushed forward it was hard to get
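The portability described above comes from serializing the compiled program and loading it wherever the C++ interpreter runs. A small sketch of that round trip, under the assumption of a toy scripted function standing in for a full model (the function, file name, and values are illustrative, not from the talk):

```python
import torch

# A tiny scripted function standing in for a full model (hypothetical example).
@torch.jit.script
def scale_and_sum(x: torch.Tensor, k: float) -> torch.Tensor:
    return (x * k).sum()

# Serialize the compiled program to a single self-contained archive.
torch.jit.save(scale_and_sum, "scale_and_sum.pt")

# In C++, this same file could be loaded with torch::jit::load("scale_and_sum.pt");
# here we round-trip it in Python to show the archive stands on its own.
loaded = torch.jit.load("scale_and_sum.pt")
result = loaded(torch.ones(3), 2.0)
```

The saved archive contains the code and any parameters together, which is what lets the model run without the original Python source.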
research results directly into production. The team also had to own the deployment of the C++ model, which was a huge pain point as well. So this was a really great use case for TorchScript and the PyTorch JIT: it allowed the team to write their model once, in a single form that was easy to experiment on and easy to productionize. And I'd like to give a shout-out to Shisong Zhou, who actually did the work of porting this model to TorchScript.

I'll show you a little bit of it here. Here's an abridged version of the forward pass. It's not really important to understand the details of the code, so don't worry too much about reading it, but the important takeaway is that it's just PyTorch code. It has some small tweaks here and there, and some type annotations, but fundamentally it's the same thing you would write in PyTorch eager mode.

Over the last year we've been adding to and expanding TorchScript's support for the Python language. We know that a really big part of why people like PyTorch is how straightforward and simple it is to express your ideas using a regular imperative programming model. It's just Python, right? And we want to preserve that property throughout the whole workflow, from research to production to deployment. The experience of using PyTorch should feel unified and consistent across all our APIs, even as the way you're using PyTorch changes. You may have to tighten the screws a little bit, add a type annotation here, tweak the model a bit there, but the path should be smooth and well-defined. It shouldn't be a jump of "okay, we have to go to production, time to rewrite our model."

Diving in a bit, I'm highlighting some of the language features this model uses that are fully supported by TorchScript and the JIT runtime. For example, we have complex control flow here: if statements nested in for loops nested in while loops. We put a list append inside the loop, and the side effect works exactly as you would expect, as if you were
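As an illustration of the kind of nested control flow and in-loop list mutation just described (a hypothetical sketch, not the RNNG code itself):

```python
import torch
from typing import List

@torch.jit.script
def collect_positive_rows(x: torch.Tensor) -> List[torch.Tensor]:
    # Nested control flow with a side-effecting list append:
    # TorchScript compiles this exactly as written.
    rows: List[torch.Tensor] = []
    i = 0
    while i < x.size(0):
        row = x[i]
        if row.sum() > 0:
            rows.append(row)  # side effect preserved, just like Python
        i += 1
    return rows

out = collect_positive_rows(
    torch.tensor([[1.0, 2.0], [-3.0, 1.0], [0.5, 0.5]])
)
```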
writing Python. Speaking of which, we have first-class support for common Python data structures like lists and dictionaries. Here you see we're keeping a list of plans, sorting them, and pulling plans out to execute using a slice; all of the common operations work. Similarly, you can define your own classes, which helps you organize your data in a regular object-oriented way. Here the plan object is presented as a bag of data: it's got some state and a less-than operator defined. Magic methods also work as you would expect.

Now, if you're used to using PyTorch, as most people in this audience are, everything I showed you might seem obvious. Yeah, Python has had loops and lists and classes since the very beginning. But talking to our production users, who are used to cramming models into a really restrictive graph-based paradigm, the idea that you can use these features is really mind-expanding. You really can get the expressivity and flexibility of PyTorch with the portability and performance that a graph-based approach can give you.

So what's next for the JIT? Well, we'll definitely keep pushing on language features. Python is a big language, and even the subset used in PyTorch programs will take some time to fully support. But beyond that, we want to make sure the JIT is available as a platform for building new tools and features for PyTorch and the community in general. You'll hear about some of these efforts today: Lin and Dmytro will get up here after me and tell you about quantization, which will soon use the JIT to do automatic quantization of your model. David will be here later talking about our mobile launch, which uses the JIT as the basis for a lightweight interpreter that can go on device. And we're in the early stages of building out extension points to lower TorchScript models to graph compilers and accelerators like TVM, Glow, and XLA, which will help us deliver bigger performance wins and target new
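A sketch of what a user-defined class with a magic method might look like in TorchScript. The Plan class here is a hypothetical stand-in for the one on the talk's slides, assuming only features the talk names: plain state plus a less-than operator that sorting can use:

```python
import torch
from typing import List

@torch.jit.script
class Plan:
    # A hypothetical "bag of data" class: some state plus a
    # less-than magic method so plans can be compared and sorted.
    def __init__(self, priority: int, name: str):
        self.priority = priority
        self.name = name

    def __lt__(self, other: "Plan") -> bool:
        return self.priority < other.priority

@torch.jit.script
def best_plan(plans: List[Plan]) -> str:
    # sorted() works on script classes that define __lt__.
    ordered = sorted(plans)
    return ordered[0].name

name = best_plan([Plan(3, "slow"), Plan(1, "fast"), Plan(2, "medium")])
```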
hardware. So please try it, give us feedback; it's all available today. TorchScript is pretty widely used in places like Microsoft and Facebook, and later Sidney from Uber will be giving a talk about how they use TorchScript to put their full prediction model on the car. But even so, it's still early days, and every bit of feedback, every bug report, every "this was frustrating and weird and we didn't really understand why it happened" is really valuable to us. We have tutorials and documentation up on the PyTorch website to get you started, and we're always monitoring the JIT label on GitHub for feature requests, bug reports, and, you know, interested queries. So thank you. [Applause] [Music]
Info
Channel: PyTorch
Views: 6,913
Keywords: AI, Artificial Intelligence, Machine Learning, ML, Facebook, PyTorch, PyTorch 1.3, PyTorch Developer Conference, PTDC19, PTDC, Developer Conference, Michael Suo, TorchScript, JIT Compiler, Production AI, Production ML, AI deployment, ML deployment
Id: St3gdHJzic0
Length: 10min 5sec (605 seconds)
Published: Wed Nov 06 2019