Ned Batchelder - Facts and Myths about Python names and values - PyCon 2015

Captions
Hello, y'all. Ready for the next talk? Whoo! All right. Our next speaker, probably familiar to many of you, is Ned Batchelder. You probably know him from his multitude of other awesome PyCon talks. He's the driving force behind the Boston Python meetup group, and he is a huge presence on IRC. He's going to make faces at me as I go down all the awesome stuff he's done. He wrote coverage.py, and he's now responsible for teaching billions of people awesome stuff at Open edX.

Thank you. I'm going to try to break with the tradition of a few of my previous PyCon talks, and of most of our keynoters, and not swear on stage. That's going to be my goal; we'll see if I can do it. I'm here to talk to you about facts and myths about names and values in Python. First, a quick word about Open edX, which Ned helpfully mentioned. Our session chair is also named Ned, confusingly. Open edX is educating the world. It is all open source. You can find out more at openedx.org, and we're having an open space today at 3:45 in room 510B.

But on to Python names and values. Python is a very simple language. When you come to it, it often works just as you'd expect if you've worked in other languages. It works like those other languages do, until it doesn't. So there are surprises that will get you if you come to Python with an intuitive understanding of what might happen. The underlying mechanisms are very simple, but the effects can be surprising. I know that personally I used Python for about 10 years without being able to describe some of the things I'm going to describe to you now. You can get along for quite a long while without understanding precisely what's going on. What I'm hoping to give you today is an understanding of some fundamental mechanisms that will help you reason about your code and understand, or hopefully even prevent, some of those surprises. The things we're going to be talking about are names, values, assignment, and mutability. As we go through this talk, the facts in
the headers of these slides are 100% facts: simple statements about how Python works.

The very first one is extremely simple: names refer to values. The way variables work in Python is that a variable is a name that refers to a value. That sounds like any other language. An assignment simply makes the name refer to that value. For instance, when we execute x = 23, 23 is an integer object, which is drawn in this diagram with a circle around the 23. That tag-shaped symbol with the x in it is the name. It's like a tag, but it's also kind of like an arrow, and it points to the 23. So x refers to the 23. The next time in your program you use the name x, Python goes and finds the value it refers to and uses that value. So when you print x, it prints 23. Very simple, right? I hope there's no one in the room who is baffled by this. I'm starting very simple; let's just get the facts laid out in a line.

Now, many names can refer to one value. Once we've assigned x = 23, we can also say y = x. Note that when we say y = x, y is not referring to x; y is referring to the value that x refers to. So y and x are both names for the 23. Neither one of them is the real name; they are both equally valid names for the 23. And of course you can have many more names: any value can be referred to by as many names as would like to refer to it.

Names are reassigned independently. If we have x = 23 and we say y = x, then when we say x = 12, it makes x refer to 12, but y is still 23. If anyone in the room had looked at these three lines of code (and I should have put a print y there), we know that y is still 23. Making x be 12 doesn't somehow make y also be 12. I know this seems very simple, but there's going to be a point later in this talk where this fact is going to be very important, and you're going to remember this fact, and it's going to explain a thing
that surprises lots of people in Python.

Memory is managed dynamically, which means that values exist until there are no more references to them. If we have x referring to the string "hello", then when we make x refer to the string "world", the "hello" value has no names referring to it, so it is going to be reclaimed and removed from the process. Exactly how and when that happens doesn't matter. The important thing to a Python programmer is that as long as names are still referring to values, the values are still there, and once all the names are gone, the value is completely inaccessible.

Now here's the fact that many people don't know about Python, and it starts to bring in the surprises: assignment never copies data. There's no asterisk on this slide; it's really true. It never copies data. I've rehearsed this talk a number of times, and there is always a person who, when I chat with them beforehand, makes me think: this person knows all of this stuff, they're not going to get anything out of this; I can entertain them with some of my clowny jokes, but they know all this stuff. Then afterwards they come up to me and say, "Yeah, you know, there's that one case where it isn't quite true." No: it is always true. It never copies data.

So here's nums referring to a list, [1, 2, 3]. When we say other = nums, we didn't make a copy of the list. Just as before, when we had x and y both being names referring to the same integer, now we have other and nums, two names referring to the same list. There is only one list. As a result, when we get to that same state, other = nums, and we use a method on nums, appending a 4 to it, we've changed the list. When we print other, other is also [1, 2, 3, 4], because there has only ever been one list. If we modify it with append, all the names that were referring to that list are going to see
that change.

This is where people get surprised. Lots of people will write a simple program and think: I want to keep that old list, so I'm going to give it a new name, and then I'm going to modify the new name, and the old name will still have the old data. But there has only ever been one list, because assignment never copies data. This is what's known as mutable aliasing, and it happens when you have a mutable value. Lists have methods on them that let you change the value in place. When we used .append on the list, we didn't make a brand new list; we had one list object, which changed its value. We had more than one name, the value changed, and all the names see the change. In the four little lines I've got here, it's easy to see what's going on because the two names are right next to each other. But as we'll see later, it's very easy in a larger program to have the two or three or four names be widely spread apart, for the change to happen in one place and the use of the data to happen in another. So the surprise can be much more long-distance than I can show on this slide.

There are also immutable values. I'm from New York City, and people told me that you can't understand me when I say "immutable values" like that, so I'm going to try to slow it down: im-mutable values are values that cannot be changed in place. In Python, the numbers (ints and floats), strings, and tuples are immutable. There are no methods on them that let you change their value, and if you think there are, you should go back and look again, because there aren't. So when we say x = "hello", x is referring to "hello", and y is also referring to "hello". Now when we say x = x + " there", what we're really doing is building an entirely new string, "hello there", and then making x refer to it. As a result, y is still "hello", because there is no way to change the value in place. We didn't have all the conditions necessary for mutable aliasing, so you
don't get aliasing here. This is one of the reasons that people really like immutable values, and languages that support them much more strongly than Python have some real advantages over Python. But you can still make use of these values in Python, and if you understand where you're mutating values and where you're not, you can build programs that are easier to reason about.

Part of the problem with talking about this code is that I've been using the word "change", but "change" really has a couple of meanings. It's unclear what you mean, or you can get sloppy in your thinking about what you mean, and it can really help to drill in on exactly what you mean by change and use different words. When we say we're changing an int, x = x + 1, as you know, ints are immutable; they cannot change. What we mean by x = x + 1 is that we are going to rebind the name x. x refers to an int (sometimes that's called being bound to an int), and we are going to rebind x: we take x + 1, which gives us a brand new integer, and we make x refer to that new object. We're rebinding x. When we change the list by appending to it, we are actually mutating the list. So this statement changes x and this statement changes nums, but this one is rebinding x and this one is mutating nums. They're very different: nums.append(7) doesn't make you a new object, but x = x + 1 makes you a new int. We'll use "change" informally when we talk about things, but when you have to reason about these things, it can be helpful to use those specialized words. You can also rebind lists: we can say nums = nums + [7]. nums + [7] gives us a brand new list, and then nums = rebinds nums to it. But there is no way to mutate an int, because ints are immutable.

Now, some people will look at these problems and try to explain to you that mutable objects and immutable objects are assigned differently. That is not true. Assignment works exactly the same for all
values; it's changing that's different for them, and so you can get different effects. The aliasing that you can get from the differences in how you change values can give you different behavior, and for some reason people like to attribute that change in behavior to the assignment statement. But it's not the assignment; it's the change.

There are some other assignment variants. When we say x += y, conceptually that's the same as x = x + y, but what it actually does is call a method on x called __iadd__, passing it y, which means that the type of x can decide what actually happens. As it happens, list implements __iadd__ by using its own extend method to extend itself, mutating itself in place, and then returning itself. That means that when you say num_list += [4, 5], what you're really doing is extending it with [4, 5] and then doing a no-op assignment of itself onto itself. So you have to know what you're doing. This operation, num_list +=, looks like an assignment statement, which is a rebinding, and technically it is, but you're rebinding the name to the value it already had, so you're really just mutating something in place. By the way, the Python docs don't mention this behavior at all; at least, I can't find it. So that's a challenge to you: find me where in the Python documentation it explains that lists do this.

Now, references can be more than just names. For instance, list elements are references. I've been drawing the list nums as a row of boxes with numbers in them, but really each of those boxes is itself a reference to a value, and the integers are floating around in the matrix somewhere, being referred to by the list elements. The same things that you can do with names you can do with list elements. So when we say x = nums[1], that makes x refer to the same thing that the middle
element of nums refers to. So now nums[1] refers to 2, and x also refers to that 2. Lots of different things are references: object attributes are references, keys and values in dictionaries are references, list elements are references, and it all nests together in very complicated ways. Anything that can appear on the left-hand side of an assignment statement is a reference, and everything we're learning about how names behave applies to those references too. You can have many references to objects, they don't refer to each other, reassigning one doesn't reassign the others, all those sorts of things.

By the way, lots of things are assignments. The assignment statement is the obvious way to make a name have a new value, but in Python lots of statements behave exactly like assignments. This is the one we've been talking about: x = some value. But when you do a for loop, for x in something, that's actually assigning to x over and over again, and I'll show you some examples. When you define classes or functions, you are assigning to that name. This is the most important one, and we're going to cover it in quite some depth: all of the arguments to functions are assignments to the local names in the function. And so on and so forth. You don't get into too many cases with import x where you're worried about how assignment works, but it is exactly the same.

So let's talk about for loops a little bit. When you see a for loop like this, for x in sequence, do something with x, what's really happening is that x is assigned the first element in the sequence. I'm waving my hands over the whole subscript-of-0 thing; the important thing is that a value comes out of the sequence and gets assigned to x, just as if we had said x = that value. Then we do the thing with x, then we assign to x again, then we do the thing with x, and that's how for loops work. I don't mean it's kind of like assignment: it really is
an assignment to the name x, and whatever behavior you would have gotten by literally writing out all the assignment statements is the behavior you're going to get from the for loop.

To give you a concrete example (by the way, one of my previous talks was Loop Like a Native, which I did two years ago at PyCon and which you can look up; it goes into for loops and iteration in much more detail): let's say we have a list of numbers, and I want to change that list so that all of the values are ten times what they had been. So I'll make my list of numbers, [1, 2, 3], and then I'll start iterating over my list: for x in nums. Now x refers to the first element of nums. Then I'll compute x * 10 and assign that to x. You can see what just happened here. That integer 1 was referred to by two separate names: it was referred to by nums[0] and it was referred to by x. When we reassigned x, that only modified one of the names, because (remember that early slide you thought was so simple you didn't have to pay attention to it) reassigning one of the names doesn't reassign the other names. Here we had two names referring to the integer, we only reassigned one of them, and the one we actually cared about was left behind. If we go through this, we can see we're computing all these nice ten-times numbers, but nothing is changing the list, and at the end when we print nums, it's still [1, 2, 3]. You might have written this code on your first day of Python, or maybe in your second week of Python, who knows when, and been confused by it. But now you understand enough to see all those really simple statements I made at the beginning pile up to explain exactly why this is happening.

But let's talk about functions, because functions are where you're actually going to get into trouble. Some of you in the audience are thinking: yeah, there was that weird time that that list
changed and I didn't know why. I'm telling you why: it's because of function arguments.

So num = 17. Functions have formal parameters in their definitions, so x here is a formal parameter, and they have actual arguments; num in this function call is the actual argument. What happens during a function call is that the actual arguments are assigned to the formal parameters. So these two statements together effectively do x = num, and when I say "effectively", I should really say it's actually an assignment: the name x is assigned the value of num. I've drawn here in a dotted line the stack frame, which holds the local names of the function. As you know, when you call a function you get a bunch of local names that fall out of scope when the function returns; I'm drawing that as a dotted line around those names. So now we have num and x both referring to 17. When we print x, of course it will print 17. When we return from the function, the stack frame goes away, taking with it any names that were in that stack frame. Those names are removing references from values, which may or may not go away, depending on whether they were the only names referring to the values. When we get back out of here, the only thing that's left is num, which is still 17, exactly as you should expect.

Let's talk about a more complicated example. Let's say I want to write a function that's going to append a value twice to the end of my list. I get my usual nums, [1, 2, 3], and I'm going to call append_twice. The formal parameter a_list of append_twice is going to get assigned the value from nums, so now we have two names referring to the list, and we're passing in 7 as the value. When we append to a_list, we add a 7 to the end of the list. We add a 7 to the end of the list again. When we return, the frame is gone, the names go away, we come back out, and nums has been modified: [1, 2, 3, 7, 7]. So our function works great. We've appended to the list
twice, and you can see why it works: the list didn't get copied in and then copied out. It was just a reference that was used, from the function to the caller.

Let's write it again, a bad way. Here's a list of numbers. We're going to call append_twice_bad exactly the same way, and then in here we're going to use a_list + [val, val], which makes us a new list, [1, 2, 3, 7, 7]. We assign that to a_list, and then we return. The frame goes away, the names go away, any of the values that only had one reference go away, and we come back out, and we've done a lot of work for nothing. We did all that work in there, the local name went away, and we've accomplished nothing.

So how can we fix that? Here we can call append_twice_good. We do the same thing, we make a new list for a_list, but now we return the value. We've made a new value inside the function, and the return statement gives us a sort of little reference there on the frame so it can hold on to the value that's coming back. We assign the return value to nums so that we get that new list, and now when we print nums, we have what we want.

So we've made three stabs at writing this very simple function to append to a list, and here they are. The first one mutates its argument: this is a mutating version of append_twice, and the list you pass in is mutated in place. This one is useless; don't worry about it. This one makes a whole new list and returns it. The best advice I can give you from all this is that the best way to avoid the surprise of mutable aliasing is: don't mutate values. Write functions like this one that make new lists. As Ned mentioned, I hang out in the Python IRC channel, and a lot of the time people come in with questions like this, and the advice is always: just make a new list. It will make your code a lot easier to reason about.

Now a few more facts about names and values, just to
cover everything. Python is dynamically typed, at least until whatever we hear about from Guido tomorrow afternoon; we're all afraid about that. But any name can refer to any value at any time. Names have no types associated with them, so x can be 12, and then it can be a string, and then it can be a list, and the list doesn't even have to be homogeneous: it can hold an int and a string and an int. It wasn't until I was preparing this talk that I realized there's this very interesting duality: names have a scope (they come and go with functions) but they have no type, and values have a type (the int is always an int) but they have no scope. So you've got all these values that, in C terms, live on the heap, and all these names that come and go in your stack frames. In other languages, names have both scope and type, but Python splits it cleanly in two. Keeping in mind that values created way above you in the call stack can be used by functions down below, and vice versa, can help you understand how data flows through your program and help you write programs that won't have big surprises.

So here are some other topics that wouldn't have fit. In all honesty, this entire talk was spite-driven development, from seeing people say that Python has no variables. I'm hoping that most of you are baffled by this statement, which means you haven't heard it, which means it's falling out of favor. When people have said this, what they mean is that the variables don't work like they do in C. This is a silly way to explain Python, for a number of reasons, and I would be glad to rant about it over a beer with any one of you.

Another question that comes up from people coming from other languages is: is Python call-by-value or call-by-reference? There are a couple of answers. One is neither, one is kind of both, and one is, you know, why are you worried about it? Some people explain it by saying that Python is
call-by-value but all the values are references, which I think is just making matters worse. I like to say it's call-by-assignment, because that's what it is: when you call the function, what you're doing is assigning to that name. If you understand assignment, then you understand how the call works.

If you saw Amy's talk earlier, she did the tic-tac-toe example much better than this, but making a 2D list is complicated, and people will often get it wrong at first. It has to do, again, with sharing values rather than making new values.

Lastly, if you're having questions about how your code works, there's an awesome site called Python Tutor, where you can literally type in your own Python code and step through it, and it will make drawings like the ones I've got in this talk. They're not as pretty, but they work on your code and they happen automatically, whereas untold man-hours went into the ones in this talk.

That's what I've got. I'm happy to take your questions. This is an awesome comic for those of you who don't have a question. There's one loop that goes 0 through 9 through that linked list, and the other goes through the digits in another determined but mysterious order, and you'll be interested to figure out what it is. And don't forget about the Open edX open space.

Awesome. If you have questions for Ned, go ahead and line up at that microphone right there in the center.

Ask me a question.

Hello. Really wonderful talk, thank you. I had just a quick question, to make sure I understood correctly. If you have a list, and you assign a member of that list to a variable, and then later on you modify that member of the list independently, then if you print out the variable, you will actually get the modified version, not the old value?

You'll get the old one, because it's the same situation of having two names referring to a value: you've changed one of the names, and it doesn't change the other name.

Okay.
There's no way in Python to have a name refer to another name. You can only have names referring to values. That's the other thing I should have put on this slide: the arrows all have to go from the left side to the right side. There are no arrows in and among the left side.

Any other questions? Did I nail it? Really?

You nailed it.

I nailed it. Steph, I'm waiting. I could have slowed down and used the five minutes for more slides. 10.0; the Russian judge loved it. Wait, here.

So with those dotted lines around the boxes there, you show the values that are in the function definition going out of scope as soon as the function exits.

Yeah.

What happens when there's the, I'm sure everybody has experienced the pitfall of a mutable default value for a keyword arg?

Yeah. Default values are held on to by the function itself. If you use a list as a default value for a function argument, that is a value that is referred to by the function object itself, so it will stick around forever, and it will be used for every function call that gets made, and therefore it will grow indefinitely and surprise you.

Thank you.

Sure.

All right, thank you. Enjoy the rest of the talks. Thank you, Ned. Thank you, everyone.
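The examples walked through in the talk, together with the mutable default value raised in the last question, can be sketched in a few lines of Python. The slides are not part of these captions, so this is a reconstruction under assumptions rather than the speaker's exact code: the names append_twice, append_twice_bad, append_twice_good, a_list, val, and num_list follow the spoken description as best it can be read, and add_item is a hypothetical function made up here to show the default-value pitfall.

```python
# Assignment never copies data: both names refer to the same list.
nums = [1, 2, 3]
other = nums
nums.append(4)            # mutates the one list in place
print(other)              # [1, 2, 3, 4]

# Rebinding an immutable value: x gets a new string, y is untouched.
x = "hello"
y = x
x = x + " there"
print(y)                  # hello

# += on a list mutates in place, so an alias sees the change.
num_list = [1, 2, 3]
alias = num_list
num_list += [4, 5]
print(alias)              # [1, 2, 3, 4, 5]

# A for loop assigns to x each time; rebinding x leaves the list alone.
nums = [1, 2, 3]
for x in nums:
    x = x * 10
print(nums)               # [1, 2, 3]


def append_twice(a_list, val):
    """Mutating version: changes the caller's list in place."""
    a_list.append(val)
    a_list.append(val)


def append_twice_bad(a_list, val):
    """Rebinds only the local name, so the caller sees nothing."""
    a_list = a_list + [val, val]


def append_twice_good(a_list, val):
    """Builds a new list and returns it; the argument is not touched."""
    return a_list + [val, val]


a = [1, 2, 3]
append_twice(a, 7)
print(a)                  # [1, 2, 3, 7, 7]

b = [1, 2, 3]
append_twice_bad(b, 7)
print(b)                  # [1, 2, 3]

c = [1, 2, 3]
c = append_twice_good(c, 7)
print(c)                  # [1, 2, 3, 7, 7]


# Hypothetical example (not from the talk): the default list is created
# once, when the def statement runs, and is held by the function object.
def add_item(item, items=[]):
    items.append(item)
    return items


first = add_item(1)
second = add_item(2)
print(first is second, second)   # True [1, 2]
```

Following the talk's advice, the version that returns a new list is usually the easiest to reason about. A common way around the default-value pitfall, which the talk does not cover, is to use None as the default and build a fresh list inside the function.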
Info
Channel: PyCon 2015
Views: 75,634
Id: _AEJHKGk9ns
Length: 25min 20sec (1520 seconds)
Published: Sat Apr 11 2015