Talk: Dustin Ingram - Static Typing in Python

Video Statistics and Information

Captions
Hi, I'm Dustin. I'm a developer advocate at Google. I'm also the organizer of the PyTexas conference, which will be in Austin, Texas, October 24th and 25th this year, if you want to come. I also work on a bunch of Python things, including the Python Package Index, but I'm not here to talk to you about any of that today. Instead, I'm here to talk to you about static typing, and I want to start with a pop quiz: is Python dynamically or statically typed?

Maybe you think, "Yeah, Python is a dynamically typed language, so it's dynamically typed." Maybe you think, "This is a talk about static typing in Python, so it's probably statically typed." Or maybe you think this is actually a trick question, and it is. The answer is that Python can be dynamically typed, but can optionally be as statically typed as you want it to be. That answer might not make sense to you, and if it doesn't, that's okay: you're watching the right talk.

The steps to understanding the answer to that question are: first we need to talk about types in Python, then type systems in general, dynamic typing in Python, and static typing in Python. Once we understand that, we can talk about how to use static typing, when you should use static typing, and also maybe when you shouldn't.

So let's talk about types, and specifically about the type built-in in Python. In Python, if I type type(42), I get the response: class int. I can do type(42.0) and I'll get class float; type('foo') is class str; and the type of a list containing 'foo' and 'bar' is class list. You might say, "Oh, I recognize these: these are the built-ins that I can use to change one type into another type," and you'd be right. We could set a variable a = 42, change it into a float by calling float on it, cast it to a string, and then cast it to a really ugly list. Maybe you've had a weird bug where a string actually got turned into a list that looks like this. Guess what: that is a type error, and you are in the right place. It
would seem like these are the only types available to us, right? int, float, str. But these are actually just classes that have corresponding built-ins in Python. Basically we're doing class matching here: when we do something like isinstance(42, int), we're seeing if 42 is an instance of the int class, and int is just a built-in here.

There are other type classes, too, that don't correspond to built-ins, like, for example, NoneType. I'm sure we've all gotten that a lot in our Python code. NoneType is the type of None, but we don't use NoneType to describe it. Similarly, there's a function type, but we don't use function to define a function; we say def func, etc. There are also lots of other ones, like the ellipsis type, and none of these correspond directly to built-ins. The place we get them from is the types module: if we import types, we can get this giant list of all the types that are available to us. Many of them live in the types module, and all of these can actually be used to instantiate a new object of that type. So, for example, we could instantiate a new function with the function type, open and close parens, with some arguments in there, but we don't really do that, and for a good reason: it would end up looking really messy. Instead we do it the normal way, with def and a function name.

So when we say that Python is a dynamically typed language, it means a few things. First of all, it means that variables can be any type, and that the type of a variable can change over the course of the runtime of your program. For example, here I can import random and set a equal to the value of random.choice called with a list of three things: the integer 42, the float 42.0, and the string '42'. Now if I call type(a) on the result, what type is a? The answer is that we don't know: this is non-deterministic. It could be a string, it could be an integer, or it could be a float, depending on what random.choice decided to choose. Dynamic typing also means that the arguments and the return types
of a function can be any type as well. In Python, if we write a function like this, frobnicate of a, b, c, which returns a + b + c, how do we know that we're getting the types that we're expecting? If you saw this function and you didn't know what it does, what would you expect the argument and return types to be? You might guess that they're integers, and that would work: we could call frobnicate with 1, 2, and 3 and it would return the sum of those three numbers. But so would strings, lists, and basically anything else that supports the addition operator. We could pass it three strings and it would just concatenate them together and return that value. What we can't do here is mix integers and strings: we can't pass 1 and 2 as integers and 'foo' as a string, because that's a type error. We can't use the plus operator to combine integers and strings.

All right, so that function is confusing. How could we possibly fix it? One thing we could do is write really long and detailed docstrings that describe the parameters and the return type of the function. If you do this, I would ask yourself: is your company paying you enough to do this? Because this is a lot of work, and at the end of the day it doesn't have any control over the runtime of our program. It's just documentation. It doesn't ensure that our developers are actually calling this function with the parameters that you described in the docstring.

Another thing we could do is assert on the argument types and the return type of everything that's passed to our function. For each of the parameters, we assert that the type is the type we're expecting, we do our business logic, and then we assert on the return type and return it. We don't do this either. Actually, I gave this talk once before and I said nobody does this at all, and someone in
the audience said, "Well, yeah, actually, we do do this." So while this is valid, and this is type checking, there are some problems with it. First of all, there's a little bit of overhead: every assertion that we have to do here slows down our program just a little bit. You can optimize that out at runtime, but it's not that common. The other problem is: what if you forget to write an assertion? Then your program is missing some type checking.

So we don't do this. What do we do instead? What we do instead is called duck typing. Duck typing means that if it walks like a duck and it quacks like a duck, it is probably a duck. This means that we determine the type of a function or a variable based on how it's used. In each of these, you can kind of guess what foo and bar are based on what's being done with them. In the first example, bar is some kind of iterable: it might be a list or a string or possibly even a dict, and foo is a list comprehension, so that's going to come out being a list. In the second example, bar is being compared to zero, so it's probably something like an int or a float. And the last example is actually a little bit ambiguous: is bar a function or a class? What is foo going to be here? It could end up being anything.

So now let's talk about static typing. Static typing means that a variable's type is defined and is not going to change. There are actually lots of statically typed languages, and here are some examples of the same function we were just looking at in those languages. Here is an example of frobnicate in C. Here it is in Java; it's a dead giveaway with the public static int, blah blah blah. This one is Rust; you can sort of tell it's Rust because Rust has really fine-grained control of the integer types: the u8 here is an unsigned 8-bit integer. And this last one is TypeScript. JavaScript itself doesn't have static types, but TypeScript is an implementation of typed JavaScript, and this is a dead giveaway that it is
TypeScript or JavaScript, because in JavaScript all numbers are just the same type: they are a number.

So we can kind of put languages into two categories: dynamically typed languages and statically typed languages. We have to put a little asterisk next to Python, because that's the content of this talk. And technically Ruby is going to get optional static type checking when Ruby 3 comes out later this year, but I'm going to leave it off the list for now.

Earlier I said Python is dynamically typed, but can optionally be as statically typed as you want it to be. The thing to note is that this wasn't always true, and the story of static typing in Python is also kind of the story of static typing at Dropbox. Dropbox is a little company with millions and millions of lines of Python code, and at some point in their history they decided that having that much untyped Python code was actually a liability. It was slowing down their developers and making it harder for them to create new features and do the actual development. But there were some other things that led up to it as well, and I'll talk about those here.

The first was PEP 3107, Function Annotations. We got this in 2006 with Python 3, and it let us write a function like this and add some extra metadata to annotate the arguments and the return value of the function. The thing to note is that this has zero effect on the execution of the function, and we can put whatever we want in these annotations. All this gives us is an attribute on the function called dunder annotations (__annotations__) that holds the result of evaluating all of these annotations. So here a would be 'x', b would be the sum of 5 and 6, so 11, c would be the empty list, and the return would be 9. Anything compiling or interpreting this function has access to these annotations, but this maybe isn't super useful. The PEP itself listed a whole lot of uses for this annotation thing;
they basically all boil down to "we can do something that looks like static typing with this," and also maybe we could stick some documentation in there. But basically the purpose was: let's do something that looks like static typing. If we think about this through the lens of static typing, we could write annotations like this, where the annotation itself is the type of the argument that we're expecting and the type of the return value. But this still doesn't give us any way to actually evaluate whether this function is being used correctly elsewhere; it's still just metadata. It also doesn't give us a way to annotate variables: we can only annotate functions, their arguments, and their return types.

Around the same time as this PEP was authored, Jukka Lehtosalo, a PhD candidate at the University of Cambridge, was doing his PhD research, and his research was on the unification of statically typed and dynamically typed languages. Sounds pretty interesting. He wanted to use the same language for everything from a very tiny script to a sprawling multi-million-line code base. His research also focused on the gradual growth from an untyped prototype to a statically typed product, meaning you don't have to do it all at once: you could start with some small parts of your code base statically typed, and slowly grow and add static typing to the rest of your code base. Sounds pretty cool.

He published his research in 2011, and his thesis was basically this: adding a static type system to a dynamically typed language can be an invasive change that requires coordinated modification of existing programs, virtual machines, and development tools. However, optional pluggable type systems do not affect the runtime semantics of programs, and thus they can be added to a language without affecting existing code and tools. This sounds really great.

So at PyCon US in 2013, Jukka introduced mypy, and if you've heard of mypy before, this is probably not what you're thinking of. In his abstract, Jukka described mypy as an
experimental variant of Python that supports writing programs that seamlessly mix dynamic and static typing. This is not really true when we talk about mypy today. In his research he wasn't really able to use an existing language to determine how to do this; he couldn't use Python. What he did was create his own language instead, which isn't really that crazy for theoretical research. This is what the variant looked like: it could be compiled to Python, and it actually looked a little like Python if you squint at it. The issue is that, even with function annotations, which existed in Python at the time, Python couldn't support everything that was necessary to be completely statically typed itself.

Jukka eventually presented his project at PyCon, and afterwards he talked about it with Guido van Rossum, the BDFL of Python. Guido convinced him to drop the custom syntax and just stick to straight Python 3: let's just do this in Python 3. mypy also included a static type checker for this Python variant, which was modified to check Python instead, and that's what we actually think of as mypy today.

The next thing that happened, in 2014, was that we got PEP 483, The Theory of Type Hints. This was Guido putting down some ideas about how static typing should work in Python. One thing this PEP describes is optional typing, which means that adding annotations shouldn't affect the runtime of your program: an annotated function should run the same as an unannotated function. I think this comes from some lessons we learned from the Python 2-to-3 migration: basically, adding static typing shouldn't get in the way of your program. Another thing it describes is gradual typing. The idea is, let's not try to do this all at once: gradual typing allows one to annotate only part of a program, and thus leverage the desirable aspects of both dynamic and static typing at the same time. In addition, it describes a way to do variable annotations, building
on PEP 3107, to give us a way to annotate more than just functions. This means that in our original function we could add a type annotation to the actual variable here. It introduced type hinting for Python 2, because function annotations didn't exist before Python 3, but even those stuck in the past deserve static typing. So in Python 3 we can write a nice function like this where all the argument types and the return type are annotated, but in Python 2 we have a way to write the same function with the same type annotations, just using type comments instead.

Finally, it also introduced some special type constructs. These are fundamental building blocks that we need to do static typing, and they build on existing types like int, float, str, and NoneType to give us some new types: Any, which is consistent with any type at all; Union, which is the combination of one or more types; Optional, which is an alias for the union of a type and NoneType; Tuple, which is a tuple whose items have the given types; Callable; etc. This lets us write a function like this, where frobnicate takes first an integer, then an integer, and then a third argument that could be an integer or a float, depending on how the user calls it.

The PEP also gives us container types. Container classes are things like lists and dictionaries that contain other objects, and container types are really important because they let us define the type of what's inside the container class. For example, I could create a user IDs variable here that's typed as containing only integers. I can add integers to it and that's fine, but if I try to append something like a string, it will fail type checking. Similarly, for a dictionary I can type the key and the value: here, examples contains keys that are strings and values that are integers, so I can set some key equal to 42 and that's fine, but I can't use an integer as the key.

This PEP also gives us generic types, for when a class or function behaves in a generic manner. So if I
wanted to type a function as taking an iterable, and I don't care if it's a string or a list or basically anything I can iterate over, I can use the Iterable generic here. And finally we have type aliases, which allow us to be more succinct with our types. This means that if we wanted to be more like JavaScript, we could create a Number type that's the union of all the number types available to us, and annotate our function with it.

The next PEP we get is PEP 484, Type Hints, in 2014, and this standardizes everything in PEP 483 and mypy's behavior. It introduces the typing module to provide standard definitions of fundamental building blocks and tools. It also goes into a lot of detail about edge cases and specific use cases. Essentially it's how to build a type checker for Python, and it leans really heavily on what mypy was already doing at the time. In Python 3.5 we got PEP 484 support and the typing module.

In 2016 we got PEP 526, Syntax for Variable Annotations. The existing comment-style annotations were great for Python 2, but not great elsewhere. This lets us take a comment-style annotation like this and move it into the variable declaration itself. One of the problems with comment-style annotations was that we couldn't define a type for a variable without specifying an initial value. Now, with PEP 526, we can just say that the captain variable is a string and has no initial value. Similarly, for class variables, we can now do those inline like this. So with Python 3.6 we got PEP 526 support, and we almost had everything we needed to do static type checking in Python, except for a type checker.

Now, there are two types of type checkers: static and dynamic. Static type checkers don't actually run your code; they look at it at rest, when it's not being evaluated. Dynamic type checkers, however, check types while your application is actually running. At this point mypy had transitioned from being a Python variant to just a type checker, and it was available on PyPI; you
could just pip install it. So you could pip install mypy, write some type-annotated function, run mypy over it, and mypy would tell you if you were using your types incorrectly.

There are actually a bunch of type checkers besides mypy at this point. mypy is mostly owned by Dropbox, Google has pytype, Facebook has Pyre, Microsoft has Pyright, and PyCharm actually has a type checker built into it. Most of these can be integrated with your editor, so you can mix and choose as you'd like. There are also others that don't support the PEP 484 spec, and a bunch of dynamic type checkers that run at runtime.

So, disclaimer: I work at Google, and the inevitable question I get here is, if they all support PEP 484, what's the difference between them? Why would I choose one over another? I've mostly used mypy and pytype, so let's talk about the differences between those two. They come down to two areas: one is cross-function inference and the other is runtime lenience, and I'll talk about these in a little more detail.

This is what I'm talking about with cross-function inference. Here we have a function g, and when we call that function, it calls another function f. If we try to run this example, we get a TypeError, because we're trying to combine a string and an integer. If we run this through mypy, it doesn't produce an error, and the reason is that mypy doesn't have the ability to infer types across multiple function calls like this. Whereas if we run this through pytype, we get the error that we expect, and we see that this is going to produce a runtime error.

The other difference, runtime lenience, comes down to a difference in philosophy between these two tools. pytype is going to allow any operation that will succeed at runtime and doesn't contradict any existing annotations. Here we have a function f that's annotated as returning a list of strings, and in that function we create a
list that contains a string, and we add an integer to it, which would be sort of like mixing types, but then we do return a list of strings. If we run this, we get the answer we expect: there's no runtime type error, and we get PyCon 2020. If we run this through pytype, it says that there are no errors: this isn't going to produce a type error at runtime. But if we run this through mypy, it says that we're trying to append an integer to a list that's supposed to contain strings, and this fails.

So you might be asking: why, though? When and why should I use static typing? First I'll say when you shouldn't use static typing, and the answer is basically never. You should use static typing liberally, as much as possible; it's never really going to hurt. One thing I will say is that static typing is not a replacement for unit tests. There is a bit of an argument here, because sometimes when you look at unit tests, they kind of look like you're just testing the input and output types of your function. But really, unit tests are kind of a bad replacement for a type system, and in reality you probably need both unit tests and static typing.

So when should you use static typing? Basically, as much as possible. You should use static typing when you're at millions-of-lines scale. If you're Dropbox, Google, or Facebook, you've probably already invested a lot of money in static typing, because, like I said before, the lack of static typing is a serious liability for Python at scale. Jukka said that at Dropbox's scale, which was millions of lines of Python, the dynamic typing in Python made code needlessly hard to understand and started to seriously impact productivity.

So I made a little graph for you here. As the lines of code in your code base increase, your desire for annotations is also going to increase, but the ease of adding type annotations is going to go down. You're probably here; this is kind of where you should migrate; and this is probably where you're actually going to migrate. So keep that in mind. You
should use static typing when your code is confusing. Let's be honest, we've all written confusing code. You can kind of think of annotations as machine-verified documentation, so if you feel the need to document the inputs and the output of a function that you're writing, it should probably just be statically typed.

You should also use static typing when your code is for public consumption, for example if it's a module on PyPI. Adding type annotations helps developers know how to use your API and helps IDEs understand your APIs. Also, if your users are already using static typing, they will love you for it.

You should use static typing before migrating or doing a big refactor. Basically, add static types to all the mission-critical parts of your application, then go and do your migration or refactor, and see if any of your type checks start to fail. That would be a place where you'd probably find some bugs.

You can also use static typing just to experiment with static typing. It doesn't hurt. Start with some small dusty corner of your code base, or start with the most mission-critical part of your entire application. Add a little bit of static typing, add a static type checker, and see what happens.

So how do you use static typing in Python? In just five easy steps. Step one: migrate to Python greater than or equal to 3.6. This is optional; you can do type comments in any version of Python, but you should probably migrate anyway, let's be honest. Step two: install a type checker locally and integrate it into your editor. I don't care which one, and you can even install more than one if you really want to. Step three: start optionally typing your code base. Start with your hairiest files, or with the easiest files to type. Don't try to do it all at once, and remember it's gradual for a reason: pick critical areas and start there. Step four: run the type checker with your linting, and run linting in CI, which you should probably be doing anyway. And step five: convince all your co-workers to join you. If you
need help convincing them, you can just show them this talk on YouTube.

Thanks for watching. You can follow me on Twitter at @di_codes. I also want to give a huge thank you to the PyCon staff for everything they've dealt with and everything they've done to bring this conference online. They absolutely deserve your thanks as well, so be sure to let them know you appreciate all the hard work they're doing. See you next year.
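The captions do not include the code shown on the slides. As a rough sketch of the opening examples on the type built-in, dynamic typing, and assertion-based checking, something like the following might apply. Only type, isinstance, the types module, random.choice, and the frobnicate name come from the talk; the exact statements and the assertion messages are assumptions.

```python
import random
import types

# type() reports the class of any object.
assert type(42) is int
assert type(42.0) is float
assert type("foo") is str
assert type(["foo", "bar"]) is list

# The same classes double as conversion built-ins.
ugly = list(str(float(42)))  # 42 to 42.0 to "42.0" to a list of characters
print(ugly)

# Some classes have no built-in name; they live in the types module.
def func():
    pass

assert isinstance(func, types.FunctionType)
assert type(None).__name__ == "NoneType"

# Dynamic typing: the type of `a` is only known once the program runs.
a = random.choice([42, 42.0, "42"])
print(type(a))


def frobnicate(a, b, c):
    # Runtime checking with assertions (hypothetical messages).
    # These checks are skipped when Python runs with the -O flag.
    assert type(a) is int, "a must be an int"
    assert type(b) is int, "b must be an int"
    assert type(c) is int, "c must be an int"
    result = a + b + c
    assert type(result) is int, "result must be an int"
    return result


print(frobnicate(1, 2, 3))
```

Calling frobnicate("a", "b", "c") on this version raises an AssertionError at runtime, which is the overhead-for-safety trade the talk describes.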
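The annotation features walked through in the talk (PEP 3107 function annotations, the typing constructs, and PEP 526 variable annotations) can be sketched together as below. This is a minimal illustration under assumptions: the Number alias, user_ids, examples, and captain follow names mentioned in the talk, but the exact slide code is not in the captions.

```python
from typing import Dict, Iterable, List, Union

# Type alias: one name for several numeric types.
Number = Union[int, float]


def frobnicate(a: int, b: int, c: Number) -> Number:
    """Add three numbers; the annotations are metadata only."""
    return a + b + c


def total(values: Iterable[int]) -> int:
    """Generic type: accept anything that can be iterated over."""
    return sum(values)


# Container types: the element types go inside the brackets.
user_ids: List[int] = []
user_ids.append(42)

examples: Dict[str, int] = {}
examples["some key"] = 42

# PEP 526: a variable can be annotated without an initial value.
captain: str

# Annotations are stored on the function and never enforced at runtime.
print(frobnicate.__annotations__)

# This call contradicts the annotations, yet Python still runs it;
# only a static type checker would flag it.
print(frobnicate("a", "b", "c"))
```

The last call shows the "optional typing" idea from PEP 483: an annotated function runs exactly like an unannotated one, so the checking has to come from a separate tool.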
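The two mypy versus pytype differences described in the talk can be sketched as follows. The function bodies are assumptions modeled on the spoken description (the slides are not in the captions), and get_list is a hypothetical name used here to avoid reusing f for both examples.

```python
from typing import List


# Cross-function inference: neither function is annotated.
def f():
    return "PyCon"


def g():
    # Combining str and int raises TypeError once g() is called.
    # Per the talk, pytype reports this and mypy does not.
    return f() + 2020


# Runtime lenience: the annotation only promises the return type.
def get_list() -> List[str]:
    lst = ["PyCon"]
    lst.append(2020)  # an int added to what began as a list of str
    # Per the talk, mypy reports the append and pytype accepts it,
    # because the value actually returned is a list of str.
    return [str(x) for x in lst]


print(get_list())

try:
    g()
except TypeError as exc:
    print("TypeError:", exc)
```

Both behaviors are attributed to the talk rather than verified here; checker versions and configuration flags may change what each tool reports.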
Info
Channel: PyCon US
Views: 13,109
Rating: 4.967051 out of 5
Id: ST33zDM9vOE
Length: 21min 49sec (1309 seconds)
Published: Wed Apr 22 2020