Transforming Code into Beautiful, Idiomatic Python

Reddit Comments

Raymond Hettinger. Here's a helpful life rule, when this guy gives a talk, go to it. Stupid title? Don't worry, his content is consistently awesome.

👍︎︎ 28 👤︎︎ u/sittingaround 📅︎︎ Mar 22 2013 🗫︎ replies

Love this guy, it's great that he can use his time to both develop and teach Python.

👍︎︎ 25 👤︎︎ u/[deleted] 📅︎︎ Mar 21 2013 🗫︎ replies

What an amazing speaker!

Also, I had no idea that generators were actually faster. I thought they just spared memory. Themoreyouknow.

👍︎︎ 10 👤︎︎ u/[deleted] 📅︎︎ Mar 22 2013 🗫︎ replies

very informative and nice examples.

it seems the talk is not up on pyvideo.org (yet?) could someone plz link the slides?

edit helped myself -> https://speakerdeck.com/pyconslides/transforming-code-into-beautiful-idiomatic-python-by-raymond-hettinger

👍︎︎ 8 👤︎︎ u/DukeNucleus 📅︎︎ Mar 22 2013 🗫︎ replies

TIL about 'reversed'. Wow, this is a great video. Thanks!

👍︎︎ 9 👤︎︎ u/metaobject 📅︎︎ Mar 21 2013 🗫︎ replies

I'm pretty sure that around 7:25, instead of

for i, color in enumerate(colors):
    print i, colors[i]

he meant

for i, color in enumerate(colors):
    print i, color

Good video btw.

👍︎︎ 11 👤︎︎ u/adamnemecek 📅︎︎ Mar 22 2013 🗫︎ replies

Oh man, that simultaneous updating of multiple variables .. I was bashing my head thinking that there MUST be a better way of solving and updating physics variables, and there it is!

I feel like I've learned a new language, I've been holding onto C for so long. Are there any books or more videos with this sort of stuff?

👍︎︎ 4 👤︎︎ u/[deleted] 📅︎︎ Mar 22 2013 🗫︎ replies

This was probably my favorite talk of the whole conference. So much knowledge packed into such a short presentation.

👍︎︎ 4 👤︎︎ u/LinFTW 📅︎︎ Mar 22 2013 🗫︎ replies

After this talk, he became my personal hero ;-)

👍︎︎ 5 👤︎︎ u/Lisuyo 📅︎︎ Mar 22 2013 🗫︎ replies
Captions
you can be sure you're getting the straight story, or they will correct me right in the middle. So this was labeled a novice talk, an intermediate talk, and an advanced talk all together; I've put some of each in. It's got a lot of code in it. You don't have to memorize it as it goes by; I'll actually give you all my slides, they're already uploaded. They are meant to be something to take back to work and use immediately: go through your whole code base, find everywhere I said "don't do this, do this instead," and swap it, and your code will be more beautiful, faster, more idiomatic and more Pythonic, and less like C, Java, C++ and other languages. Not that those languages are bad; they just have different idioms than we have.

I'm curious: how many people in the audience have taken a class from me at one point? Wow. I've taught over a thousand engineers in the last year, and I've gained an appreciation for what things are really beautiful in Python and what works great for people. So it looks like I've got a number of former students here. How many prospective students? None? Oh, OK, a lot of them. Good enough. And as all of my students know, practically every example starts out with raymond = 'red', rachel = 'blue', matthew = 'yellow'. In case you're wondering who they are, they're right at the back door, right there: the lovely lady and little boy, that is Matthew and Rachel.

All right, I was supposed to introduce myself: Raymond Hettinger, Python core developer. If you would like to have fun doing this, we can play a game: whenever I put up a construct and say "this is awesome," clap if you know whether or not I wrote it. I get to talk about some of the things I created; there are also some awesome things in there I didn't create. Your game is to decide which is which, so clap if you think you saw one that is neat and is something I wrote.

The other interesting thing on this slide is the @raymondh. I use my Twitter account differently than other people: I try to teach Python through Twitter. I don't
tweet when I arrive at an airport or when I leave or anything like that. It is technical tweets, so I don't waste your time. I teach Python a little 140 characters at a time, which is a very interesting challenge, because you can get in one little example and sure enough someone will tweet back "but in Python 3 it does..." or "in version 2.4...". It's 140 characters; you don't get to put in all the footnotes. Just saying.

So without further ado, let's start at the beginning, the novice part, and work our way up. In pretty much every other language I've used, you use indices quite a bit to do array lookups. When you use an index in Python, unless it's a fairly exotic circumstance, you're almost always doing it wrong. We've worked really hard to make better ways to do it so you don't have to manipulate indices. I'm also going to show you something more advanced. Very few of you probably know about the else clause on for loops; I'll show you what it's for. Very few of you probably know that the iter built-in function can take two arguments; I'll show you what that is for. But overall our goal is to aim for fast, clean, idiomatic Python.

How do you loop over a range of numbers? Simple enough: you make a list and loop over the list. What is important about that example? Python's for is not the same as it is in other languages. If someone comes to my class and says "I know what a for loop does because I know C," that is not what this does. We probably should have named ours "foreach." What it does is loop over collections; it uses the iterator protocol. It is in no way like the for that you grew up with.

Is there a better way? Well, we can use the range function. The output of the range function is that list up above; in other words, these two things do the same thing in exactly the same way. Many people come to Python, see the second one, and say to themselves "it's the same as the for loop that I learned in C or BASIC or some other language." It's not. What happens here is range
produces that list, then the for loop loops over the list. I know what you're thinking: if I do range(1000000), that list is going to be kind of big, and in fact on a 64-bit build it will consume 32 megabytes of memory. Is that awesome? Not awesome. So there must be a better way. It's called xrange. xrange creates an iterator over the range, producing the values one at a time. Which is better, range or xrange? xrange. Which is ugly? xrange. xrange is a horrifically bad name. We didn't know how bad until Python 3 came along. In Python 3 we got rid of the old range and renamed xrange to range, so I started going through my programs where I used xrange everywhere and I took off the x, and they were profoundly more beautiful. I didn't know how ugly the x was until I got rid of it. You'll like Python 3 better; it looks better. Remember: beautiful is better than ugly.

Looping over a collection: how would a C programmer do it? Well, they would say for i = 0; i < n; i++, look up the i-th color. Then they get to Python and say "how do you do that in Python?" and they do it this way. Some of you already know this. How are you supposed to use this information? After this talk, you're supposed to go back, look through your entire code base, and do a grep for that. Whenever you see that, don't do that; do this instead. Even in professional code bases with really good programmers, as I go from company to company looking at their code, I see the first one all over the place. Just fix it. It's simpler, easier to read, better, and in Python it's faster.

How to loop backwards... oh, by the way, somebody clapped for xrange. That predates me. I go back almost 13 years, or 12 years, as a core developer, somewhere in there; xrange was just before me. I'm not responsible for the x. OK, how to loop backwards. A C programmer knows how to do this: for i = n - 1; i > -1; count down by minus ones. That translates directly into Python. The idiom that you learned on the first day of class in C works equally
well in Python, and it's grotesque. It's horrific, and it's what we had to do until we introduced a better way. The better way is to use reversed. Which is faster, the first one or the second one? The second one. Which one's more beautiful? The second one. Which do I see in a lot of code bases? The first. Why would people do that? The answer is they're gravitating back toward the mothership, and the mothership is where they came from prior to Python. That was the way to loop backwards in almost every other language you know, and because you've learned it, you gravitate toward it. You use indices all over the place and you try to write for loops as if Python's for meant the same thing. I bet if we renamed it "foreach" it wouldn't be as pretty, but everybody would use it correctly. That's how you loop backwards. I heard no claps. What's special about reversed? I did that.

OK, looping over a collection and the indices at the same time. C programmers have no problem, because they are already looping over i; they can print the i-th position and the i-th color. So the output of this would be 0 red, 1 green, 2 blue, 3 yellow. How do you do it in Python without indices? The answer is: use enumerate. Good call, Larry; he wins. I did enumerate. So enumerate is a simple, clean way to do it. It's fast, it's beautiful, and it saves you from tracking the individual indices and incrementing them. What's the cue here? Whenever you're manipulating indices directly, you're probably doing it wrong. Just saying. Go scan your code base; you'll find this somewhere. Take it out and replace it with enumerate. It's fast and beautiful and readable.

How to loop over two collections at once: every C programmer knows what to do. Take the shorter of the two lists, the minimum of the lengths, loop over the indices, and look up the i-th name and the i-th color. Why would they do such a thing? Because it works in every other language they've ever learned. What's the Python way? zip. Is it really the Python way? Actually, it goes back 50 years; it was
in the very first version of Lisp. If you read the original paper that came out on Lisp, it was in there. zip has a deep history; it is a proven, winning performer. It's what you want to use. Do you love it? Yeah, I think it's clean and beautiful. So is anything wrong with it? What's wrong with it? In order to loop over this, it manifests... we started with two lists, and it manifests a third list in memory. That third list consists of tuples, each of which is its own separate object with pointers back to the original. In other words, it takes far more memory than the original two lists combined. This is no fun; it doesn't scale.

What's this whole scaling and speed thing? It used to be that if you asked me how to make any program go fast, I'd teach you about loop unrolling and remembering previous calculations and whatnot. But on modern processors only one thing matters: is the code running in L1 cache? Because if you have a cache miss... the Intel optimization guide has this horrifying line in it that says the cost of a cache miss is that a simple move becomes as expensive as a floating-point divide. It can go from half a clock cycle to 400 to 600 clock cycles. You can lose two and a half orders of magnitude by not being in cache. If these lists are really big, do you think that zip is going to fit into cache? I don't think so. There must be a better way, and it is izip. So izip is better. Yep, that was me, I did that one. I got in at just the right time; iterators had just been created, and I said "I'll make an iterator out of everything." People said, "Wow, that was really smart." No, Guido made the iterators; I just put them everywhere. So it was his brilliant idea, and it's gone very far. It's one of the things that makes Python beautiful and fast.

Looping in sorted order: we can loop over a collection by doing sorted(colors). It's pretty easy to take any for loop and just drop sorted in it, and now you loop over it in sorted order. OK, yes, I did sorted too. So how do you loop backwards? reverse=True.
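The slides are not reproduced in these captions, so here is a minimal sketch of the looping idioms described so far. The list contents are assumptions chosen to match the spoken output (0 red, 1 green, 2 blue, 3 yellow). The code is written for Python 3, where range and zip are already lazy; in the Python 2 of the talk, the lazy spellings are xrange and itertools.izip.

```python
# Illustrative data (assumed; the slide contents are not in the captions).
colors = ['red', 'green', 'blue', 'yellow']
names = ['raymond', 'rachel', 'matthew']

# Loop over a range of numbers without building a list first.
squares = [i ** 2 for i in range(6)]

# Loop backwards without index arithmetic.
backwards = list(reversed(colors))

# Loop over a collection together with its indices.
numbered = [(i, color) for i, color in enumerate(colors)]

# Loop over two collections at once; zip stops at the shorter input.
pairs = list(zip(names, colors))

# Loop in sorted order, forwards and backwards.
ascending = sorted(colors)
descending = sorted(colors, reverse=True)

for result in (squares, backwards, numbered, pairs, ascending, descending):
    print(result)
```

None of these forms touches an index directly, which is the point the talk keeps returning to.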
Simple enough. How do you do a custom sort order? This was the traditional way: you made a custom comparison function that compared two keys and returned either -1, 1, or 0 depending on whether it's less than, greater than, or equal. Harold is just grimacing, but there are others who are not grimacing. There are others who say, "That's the way I learned it in C; that's the way qsort works." Those of you who are older, who learned with qsort and comparison functions, are going to have a hard time letting go of this. In fact, you'll fight with me. You'll come over to me and try to invent examples where you have to have a custom comparison function, and you might not even listen to me when I tell you this is horrifyingly slow. It's no fun to write a function like that. You can write a shorter one, but this gets the job done. And how many times will this function be called? Well, if you have a million items in a list and you're doing a sort, the number of comparisons is n log n. The log of a million, base 2, is 20, so that's 20 million comparisons. Is there a better way? That's 20 million calls to that compare function, which is long and slow.

Here's a better way: sorted(colors, key=len). The key function gets called exactly once per key. Which is better, 20 million or 1 million? 1 million. Is the function shorter for key? It's almost always shorter. Those of you who grew up on comparison functions will probably argue with me and say, "I can invent a comparison function where you can't make a key function." And if you get really creative and work really, really hard at it, after a hundred tries you're going to find one that I can't do, or can't do easily, although I do have a function that will convert back if necessary. How do we know that key functions are sufficient? Who likes to sort all the time? SQL people; they sort all the time. They ORDER BY this, that, and the other. Do they pass in custom compare functions? No, they have key functions: order by some relative
frequency, order by this field plus that field. If they can get by with key functions, you can too. This code is shorter, more beautiful, and faster. Should you abandon your comparison functions? Absolutely. In fact, I've abandoned them for you: I ripped them out, and they're no longer in Python 3. Goodbye, comparison functions. I'll live.

How many of you knew all of that stuff already? OK, let's see if I can take you someplace you haven't been before. The traditional way to loop over a function call that has a sentinel is a while True loop. We do an f.read, reading a block of 32 bytes at a time. Eventually we run out of data, and when we do, what f.read returns is a sentinel value, an empty string. Whenever it's an empty string, you can break out of the loop. So we build up the blocks one at a time. By the way, the output of that is a big list of strings. How should you connect them together? join. How should you not connect them together? Plus. Hey, you guys are all on top of it.

So this is the traditional way; you will see this code all over the place. Did you know that the iter function can take two arguments, where the first argument is a function that you call over and over again and the second argument is a sentinel value? This says it will call read(32) over and over again, looping one block at a time. We get to use for loops, which should have been called "foreach," which are fast and beautiful, instead of the while. Now, some people would argue that because I had to use a partial in here, it is slightly less readable than the original. But remember what I made here. These two are equivalent in terms of what they do, but I didn't have to use this with a for loop, as in the first one. The moment you've made something iterable, you've done something magic with your code. What have you done? As soon as something is iterable, you can feed it to set, you can feed it
to sorted, min, max, heapq, sum; many of the tools in Python consume iterators. As soon as you've made something iterable, it works with all the rest of the Python toolkit. So the part to concentrate on is not the for loop part; it is the two-argument form of iter. Add that to your toolkit. If you've not seen it before, it's a good time to learn. In order to make it work, the first argument has to be a function of no arguments. How many arguments does f.read take? One. How do you go from one to zero? partial. partial takes a function of many arguments and turns it into a function of fewer arguments. If you haven't tried this before, go home and play with it and learn something new. Welcome to the world of functional programming.

The magic of this is that there are lots and lots of functions, especially in older APIs, that are intended to be called over and over again until they give you a sentinel value. It's called a control-break style of programming. It used to be very widely practiced, until there was a certain little hiccup. There was an insurance company that was processing big decks of punch cards for insurance claims. They'd run the deck, then stick in another deck, more claims in other decks, and at some point they needed to tell it to stop, so they put in a control-break field, a sentinel value that told it when to stop. One day they ran through a deck of cards and it stopped right in the middle. They reran the deck; it stopped right in the middle. It stopped at the same card every time. This is a true story; I got it from Programming Pearls. The cause was that the claim came from Ecuador. The capital is Quito, and QUIT was the control-break symbol, and when it hit the city and it said "quit," it did. There's a reason we don't do this anymore. It's the same reason that we don't terminate our strings with nulls anymore: sometimes we might stick nulls inside the string. Fair enough. But if you encounter an API like that, the two-argument form of iter takes it out of the old world and into the new world of
iterators. Who learned something new?

All right: distinguishing multiple exit points in loops. I didn't come up with this. Guido didn't come up with this. This goes back to the goto wars. Lots of people hung on to goto and wouldn't let it go until every known use case could be replaced. So Donald Knuth sat down and itemized the most common use cases of gotos, and he came up with structured equivalents that would do the same. One problem was that when you do something like a for loop, you need a flag variable to say whether something has been found or not found. Now keep in mind, with this example: one, we already have a built-in find, so you don't need this code to begin with; and two, I could have exited early with a return. So I know it's a little simplistic, although it is Knuth's example. His point was that code like this typically occurs intermeshed with other, more complex code and other operations, so that there is not a shortcut out. The usual solution to the problem is to put in flag variables, which slow down your code and make it less readable. We start with found = False; if we find the target value, we change the flag to True, and then act on the flag at the end.

There's a better way, a way that is shocking and jarring to most people coming from other programming languages, because they've never seen it before. It was Knuth's idea, not Guido's, and now we actually have else clauses on for loops. Remember, the for loop has essentially got an if inside, and it's saying: if I haven't finished the loop, keep doing the body; if I haven't finished the loop, keep doing the body. What construct is normally associated with if? else. So what the else means is: I finished the body; is there any more of the loop to do? No? else. You can think of it that way: inside every for, internally, there's an if and a goto, and this is the else associated with that if. Some people have a hard time remembering it that way. If I could go back in time and talk to Guido, if you give me the
keys to the time machine, I would say: "Back when you first made this language, else was exactly the right term, because it's what Knuth used, and people at that time knew all fors had an embedded if and goto underneath, and they expected the else. But in the future no one will know that, because we'll all be using structured programming already. So why don't you call it nobreak?" If it was called nobreak, everybody would know what it did. There are two ways to exit this loop: you can finish it normally, or you can break out. Search your house for the keys. There are two outcomes: you find the keys and come out, or you search all the rooms and there are no more. Two possible outcomes, and they're distinguished with the else. If that was called nobreak, you'd know what it did. If we finish the loop and didn't encounter a break, return -1; if we broke out of the loop, return i. Who learned something new? Now you know where it came from: Donald Knuth. Guess what Guido was reading when he came up with this? Donald Knuth. Had he been thinking about the future, he would have called it nobreak, at which point everyone would know what it did. Just like if we called lambda "makefunction," no one would say "what does lambda do?" It would be called makefunction.

All right, dictionary skills. Those of you who have been in my classes before know I start out with dictionaries at the beginning, and I cover them on the second day, the third day, and the last day, because there are two kinds of people in the world: people who've mastered dictionaries, and total goobers. They are the fundamental tools for expressing relationships, linking, counting, and grouping. Here are your core dictionary skills. How to loop over the keys: for k in d. Nobody clapped; I didn't do that one, Guido did that one. But I got to help; he was sitting on the fence about what the for loop should do with a dictionary. Half of the people wanted the for loop to return the key and the value at the same time; the other
half just wanted the key. I went and researched what other programming languages did, went back into Smalltalk, and grepped through a lot of existing code, doing counts to see what people most needed most of the time when they looped. I looked to see what was consistent if you wanted to treat a list as a dictionary: the indices of a list are parallel to the keys in a dictionary. I laid out an argument and it tipped the scales, and that's why it's for k in d.

Is there another way to loop over the keys? Yes, you could just ask for the keys and loop over them. When should you do the second and not the first? When you're mutating the dictionary. With the first way, if you want to mutate the dictionary, you can't do that while you're iterating over it. In fact, in any programming language, for the most part, if you mutate something while you're iterating over it, you're living in a state of sin and you deserve whatever happens to you. In this case, though, d.keys() calls the keys method; it makes a copy of all the keys and stores them in a list, at which point you are free to go mutate the dictionary and delete all the keys that start with "r," leaving just Matthew. That's kind of the way it is around the house: all the keys that start with "r" are gone, and now it's just Matthew. I brought him to PyCon; people just come up to me and say "oh," and they look right at the baby. I deleted all the keys starting with "r."

OK, next one. (I'm standing quite far forward; if I pitch forward, you'll know what happened.) Looping over the keys and values at the same time: one way is to loop over the keys and then look up the value. Is it very fast? No, because it has to rehash every key and go do a lookup on it. If you actually need the values, there's a better way: items. We're using tuple unpacking here. If you need both, loop over them directly; no lookups are involved. Is there a better way? Yeah, because items makes a big huge list. The better way is iteritems. So iteritems
will return an iterator. You might clap on that... I didn't do that. I think that one was Barry, Barry Warsaw. OK, yay for Barry.

OK, another one: constructing a dictionary from pairs. zip is fantastic because it assembles the pairs. The dictionary constructor, you might not have known, will accept a list of pairs or any iterable of pairs, so the easiest way to assemble these two into a dictionary is to izip them together. If you marvel at this one, and I think you should, the thought to go away with is that the parts in Python fit beautifully together. How do you take two lists, join them together seamlessly, and construct a dictionary? It is one, two, three, four words of Python. It doesn't get much more beautiful than that. Look back through how you're building up your dictionaries; if you already have the inputs available, izip is a fantastic way to do it. Is izip a better way to do this than zip? Yes. Now, it still has to make a tuple on each iteration, right? No. I went in and put a check inside: I check the reference count, and after the dictionary has consumed the tuple and we loop back around to make the next one, we reuse the previous one. So it can actually build this without any intervening calls to the allocator. It just takes one tuple and reuses it over and over again. In other words, is this fast? Yeah, absolutely.

Well, I've still got 15 minutes left; that's great. How to count with a dictionary: you guys probably know a number of ways to count. When you teach people Python dictionaries, show them the most basic methods first. So this is the most basic way: loop over the colors and check to see if the color's not there. If it's not there, add it in. Now, a square-bracket lookup is conditional; it can fail if the key is not there, raising a KeyError. But in this case we've just put the key in, so we know it's there, and this last
line will always succeed. This is a simple, basic way to count; everyone should know how to do it. Don't immediately start them with the most advanced thing, because if a person can't do this, they will be helpless on any more complex problem with dictionaries. Should you start them right away with defaultdicts and whatnot? No, you start them this way.

What's the next level of improvement over this, if I don't want to use anything exotic and I want to use the core dictionary API? Those of you in my classes, I threaten you all the time that Matthew will remain fatherless unless you know this particular method; I threaten to keep people an extra day if they don't know this. The method is get. Yeah, setdefault sets; in the case of counting, all we need is a get. And so the code up above simplifies to this: we get the color; if the color's missing, return zero; and add one to it. We don't need setdefault; all we need is the zero. We need the lookup to not fail.

Is there a more modern way? Yeah. What is it? defaultdict. When I answer a question on Stack Overflow and I put one of these first two, please don't immediately go change it to defaultdict. All you're doing is taking somebody who couldn't count in the first place and putting them where they have to import collections and learn the distinction between a regular dictionary and a default dictionary. They have to know about factory functions. They have to know that int can be called with no arguments, producing the value 0. Oh, and then when they get something back, it's not actually a dictionary; it's a defaultdict, and it needs to be converted back for some use cases. In other words, if you hand this to a beginner, you've usually made them worse off. Make sure they know the first idioms before they drive on. But that said, what I use is this, or I use collections.Counter.

OK, how to group with dictionaries. This is an example I've used over and over again. How do I group these names together? I forget whether I've grouped them by their length or their first
letter. The traditional way, the one a person should learn first, is to start with an empty dictionary. The key of the dictionary is what you're grouping by: Raymond is of length 7, so that will be the key, and the value will be a list of all of the names of length 7. If you'd like to accumulate a lot of points on Stack Overflow, know this, because this question gets asked about once a week. What should you immediately take them to when someone wants to group? The collections module? No. How about showing them setdefault? The output of this one, by the way, is: Roger has five letters; Raymond, Matthew, Melissa, and Charlie all have seven; Rachel and Judith have six. By the way, if you're grouping by anything else, you only need to change the key line. Maybe the key is name[0]; that will group people by the first letter. Or name[-1]. The key could be the number of e's in the name. You can group by almost anything using this idiom.

There's a better way, though: setdefault. We actually need setdefault because we want it to return the list so we can append to it, but we also need the list to be inserted. setdefault is just like get, but it has the side effect of inserting a missing key. For a long time this was the idiom for grouping in Python. I think it is not particularly beautiful Python, though. The word setdefault is really bad, and everybody thinks it's awful, but no one can think of a better name. Every other name we've ever experimented with had about 50 letters in it: "well, this goes into a dictionary, it looks to see if a key's there, if it's not it takes the default value and stores it and returns it so you can group with it." That would actually be the best name, but it's a little long.

And now the modern way: transform your code into this, defaultdict(list), which will create a new list, and it is far more beautiful than the original. Six lines and slow; four lines and fast. That is the new idiom for how to group things in Python. You must know how to do it.
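The counting and grouping idioms just described might look like the following sketch. The slides are not part of these captions, so the color list is an assumption and the name list is reconstructed from the output read aloud in the talk. The code runs on Python 3.

```python
from collections import Counter, defaultdict

# Illustrative data (assumed; not taken from the slides).
colors = ['red', 'green', 'red', 'blue', 'green', 'red']
names = ['raymond', 'rachel', 'matthew', 'roger',
         'melissa', 'judith', 'charlie']

# Counting with the core dictionary API: get() supplies 0 for a missing key.
counts = {}
for color in colors:
    counts[color] = counts.get(color, 0) + 1

# The same count with the collections tools.
counts_dd = defaultdict(int)      # int() with no arguments returns 0
for color in colors:
    counts_dd[color] += 1
counts_counter = Counter(colors)

# Grouping by length: setdefault() inserts the empty list when the key is missing
# and returns the stored list, so we can append to it directly.
groups = {}
for name in names:
    key = len(name)               # change only this line to group by something else
    groups.setdefault(key, []).append(name)

# The same grouping with defaultdict(list).
groups_dd = defaultdict(list)
for name in names:
    groups_dd[len(name)].append(name)

print(counts)
print(groups)
```

As the talk notes, a defaultdict is not a plain dict, so dict(counts_dd) or dict(groups_dd) converts it back where a regular dictionary is needed.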
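The dictionary construction and looping idioms from earlier in this section can be sketched the same way. The data is assumed for illustration. Note one difference from the Python 2 of the talk: in Python 3, d.keys() no longer returns a copied list and d.items() is already lazy (there is no iteritems), so the snapshot of the keys is taken explicitly with list(d).

```python
# Illustrative data (assumed; not taken from the slides).
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue']

# Build a dictionary from two parallel lists: dict() accepts any iterable of pairs.
d = dict(zip(names, colors))

# Loop over keys and values together; tuple unpacking avoids a lookup per key.
lines = ['%s --> %s' % (k, v) for k, v in d.items()]

# Deleting while looping: iterate over a snapshot of the keys, not the dict itself.
for k in list(d):
    if k.startswith('r'):
        del d[k]

print(lines)
print(d)
```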
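The for/else construct from the "multiple exit points" part of the talk is easiest to see in code. This is a sketch of the search example as it is described aloud (return -1 when the loop finishes without a break, otherwise return the index); the function name find is an assumption.

```python
def find(seq, target):
    """Return the index of target in seq, or -1 if it is absent."""
    for i, value in enumerate(seq):
        if value == target:
            break
    else:
        # Runs only when the loop finished without hitting break
        # (the "nobreak" reading suggested in the talk).
        return -1
    return i


print(find(['red', 'green', 'blue'], 'green'))
print(find(['red', 'green', 'blue'], 'pink'))
```

No flag variable is needed: the else branch itself distinguishes "searched everything" from "found it and left early".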
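The two-argument form of iter described earlier, combined with partial, can be sketched as below. An in-memory io.StringIO stream stands in for the file being read in the talk; that substitution, and the 70-character payload, are assumptions made so the example is self-contained.

```python
import io
from functools import partial

# An in-memory stream stands in for the file in the talk (assumption).
f = io.StringIO('x' * 70)

blocks = []
# iter(callable, sentinel) calls f.read(32) repeatedly until it returns ''.
# partial turns the one-argument f.read into the zero-argument callable iter needs.
for block in iter(partial(f.read, 32), ''):
    blocks.append(block)

# Join the pieces once at the end instead of concatenating with +.
data = ''.join(blocks)
print([len(b) for b in blocks])
```

Because iter(partial(f.read, 32), '') is an ordinary iterable, it could equally be passed to list, set, or any other tool that consumes iterables.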
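For the custom sort order discussed earlier, here is a sketch contrasting a comparison function with a key function. The talk mentions "a function that will convert back if necessary" without naming it; functools.cmp_to_key is assumed here to be the kind of adapter meant. The color list is also an assumption.

```python
from functools import cmp_to_key

# Illustrative data (assumed; not taken from the slides).
colors = ['red', 'green', 'blue', 'yellow']


def compare_length(c1, c2):
    """Old-style comparison function: returns -1, 1, or 0."""
    if len(c1) < len(c2):
        return -1
    if len(c1) > len(c2):
        return 1
    return 0


# Preferred: the key function is called once per item.
by_length = sorted(colors, key=len)

# A legacy comparison function can be adapted when it cannot easily be rewritten.
by_length_legacy = sorted(colors, key=cmp_to_key(compare_length))

print(by_length)
```

Python 3's sorted no longer accepts a comparison function directly, which is why the adapter is needed for the second call.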
Not only must you know how to do it; my presentation is intended as a checklist for you. When you go back, check out your code base and look everywhere you're doing this or this, and replace it with this. If you do all the replacements in my slide deck, it'll speed up your code quite a bit, and it'll make it more maintainable and more beautiful.

Oh, an interesting one is popitem. You might clap on this one or might not. I can't remember: I made either pop or popitem, but I'm not sure which one; it was a long time ago. It was my first or second contribution to Python. It must have been the second, because for the first contribution I volunteered to put docstrings in a whole bunch of modules. Do you guys use any docstrings? Yeah, I put about half of them in there; originally they were mostly empty. Is that a good way to join an open-source project? Yes. If you go through putting docstrings everywhere, one, people love you for it; two, you make the code more usable; and as a side effect, you actually learn what every module does. Or you can start another way: how about you take the most popular, important data structure, the dictionary, and mangle it and transform it in some radical way that changes its performance characteristics? Is this a good way to start? No. Someone recently did; I growled at them earlier. It's the only person I growled at at PyCon.

OK, so popitem, which I might or might not have put in, removes an arbitrary item. The interesting thing about it is that it is atomic, so you don't have to put locks around it, and it can be used between threads to atomically pull out a task. Who learned something new?

All right, linking dictionaries together. This kind of code is reasonably common. We have one dictionary which has some default values for some parameters. In addition, we use argparse to parse some command-line arguments that are optional, so on the command line a user can specify the user or color, or they can not specify it. And lastly, I have a third
dictionary which is not showing here. The third dictionary is os.environ, which is not really a dictionary but looks like one, and which gets environment variables. It is common to want to chain these together, and the traditional way to do it, one that I actually found in the standard library, was: you copy the dictionary full of defaults, then you do an update from the environment. That way you have some standard defaults, and if someone specifies an environment variable, it overrides them. Environment variables take precedence over the internal defaults, but a command-line argument should take precedence over the environment variables. This kind of code is very common. How many of you have written some code like this, ever? Is it the right way to do it? Well, it copies data like crazy. If you want your code to be fast, don't copy like crazy. So ChainMap has been introduced into Python 3, and it links them all together. It leaves the three dictionaries independent and just looks up in the first one, the command line; if it doesn't find it there, it looks in the environment; if it doesn't find it there, it falls back to the defaults. This way is fast and beautiful, and my ConfigParser is no longer slow. Thanks for the applause; that was me.

Improving clarity. I have so few minutes left, but I have to leave 5 minutes for Q&A. Do you guys want to blow off your Q&A and get more of these slides? Cool. No questions; you don't even have time for a yes. All right: wherever you have positional arguments and indices, that's nice, you can do that in any language. Keywords and names are better. The first way, where you're using indices, is convenient for the computer and fast in a language like C, but naming things is how humans think. So a way to improve your... oh, did we start with the answer on that one? There we go; we're out of order, then. So the top one is the kind of code that I see all over the place in customers' code bases. It calls a Twitter search with "Obama," False, 20, and True, raising the
question what is the 20 the False and the True you would have to have memorized the argument signature in order to check that a simple way to improve the readability of your code is to go find everywhere where you're making an obscure call like that and just replace it with keyword arguments it's an easy thing to do it slows down your code just a little bit but really what are you trying to save microseconds or hours of programmer time hours of programmer time those are the ones that cost you so this is a simple transformation in fact if you're a junior programmer just starting out you would like to improve your company's entire code base go through and do this everywhere make sure you don't do it in the middle of a tight loop but mostly everywhere and it will make the code better profoundly better and who will be the first consumer of this you because you're the new person to the company and don't know the code base you'll know it really well after you've done this it's an easy way to improve code quality named tuples it used to be if you call doctest.testmod it returns 0 4 as a tuple is that a good thing or a bad thing are you happy or sad when you get 0 4 you don't know that is what it returned for most of its existence now it returns TestResults failed equals 0 attempted equals 4 are you happy or sad which is a better output is the second output substitutable for the first sure named tuples are a subclass of tuples so they've they still work like a regular tuple they just tell you what they say and how the way you make the named tuple is simple you just say we're defining TestResults as having two fields failed and attempted easy enough this is a very easy way to improve your code base basically all over the place go put named tuples and now all of your output messages and error messages will be much more readable the person who benefits from this will be you there we go all right now is the named tuple guy unpacking sequences
Raymond Hettinger who's a young man in hex I can pull out the fields this way why would you do it this first way the answer is that's what you do in almost every other programming language that you know give yourself another five minutes for Q&A time there okay so people do this mainly because it works in all other programming languages and they do it out of habit the better way is the one listed here it uses tuple unpacking and it pulls out the four fields for you the second one's more readable and it's faster this is an easy change to make it's an easy thing to grep for everywhere you see bracket 0 brackets 1 brackets 2 brackets 3 you know what's going on replace it with unpacking your code is better and faster easy change so how do you do simultaneous data updates the traditional way to write the Fibonacci is to take a temporary variable for y add up your new y and then use your temporary variable I hate this code I've written code like it a lot of times because I started with a 1967 version of Dartmouth BASIC and it was all I had but there's a better way you can use tuple packing and unpacking don't overlook how important it is it's profoundly important the problem with that ok well first I'll show you the correct solution the correct solution the way you'll often see it written is with simultaneous variable updates the y and the x plus y use the old values of x and y to build the tuple then they get unpacked and stored in the variables the x and y are state the state should be updated all at once if you don't update the state all at once and put it on multiple lines then in between those lines the state is currently mismatched at one point y is the new y and x is the old x this is a very common source of problems and plus the ordering matters here if I mix up the order of these three it breaks the code and it's a hard error to see the last thing I don't like about it besides the risk of our order is it's too low level it's on the next slide I'll talk about taking an
atom and breaking it into subatomic particles this has been broken into subatomic particles what does this say this says take y and store it in t x plus y to y and t to x the second one says update these variables according to those equations and so you transform one to the other the second way is a higher-level way of thinking it doesn't risk getting the order wrong and it's fast in Python please transform code like that into this lest I let that go it's got a whole additional slide and a half okay two slides file for this this is such an important product don't underestimate the advantages for this if you break this out into pieces you risk ordering problems also you are making it too atomic you are losing the ability to chunk your thoughts and to think higher level thoughts for example a problem I give when I teach scientists is I give them the function influence that influence of one planet over another all they have to do is plot the orbit of the planet and you have some people get nice elliptical orbits and other people where it just kind of zig zags away the ones who get it wrong are the ones who wrote exactly this code except they didn't use the temporary variables if the first thing they wrote is x is equal to x plus dx times t they're toast at that point why they've updated the x and now this one gets computed with the new x rather than the old x it's a very common problem the other half of the people who write these temporary variables how do they know to do that the answer is they've all been burned by this problem before the correct answer is this do the calculations on the right with the old values of the variable the old x the old y the old dx old dy take their partials and then only then update all of the variables this is of profound importance not just for scientific computing I can give other examples where people are doing a simple mortgage calculation with the principal and interest and whatnot and the very first thing they do is
update the principal by principal minus equal payment they're toast because their interest payment's going to be wrong the interesting thing to me is not that they get it wrong when they program in any language the interesting thing is I can give them the same problem in Excel and they always get it right why is it that people get it right in Excel and wrong in programming languages the answer is in Excel you take all the state on each row on month one here's the current principal interest and whatnot on month two it's this and people naturally refer to the they take their formulas and they refer up to the previous row all Excel people do this essentially they are doing exactly this operation you could view this as what's on the right is referring to the previous row and what's on the left is the new row and that gets iterated in other words it's a very natural style of thinking please don't write code like that write code like that this is a big deal that we've got it in this language it will save you from a lot of trouble I've got one minute efficiency I'll do efficiency fast basically just don't move data around unnecessarily concatenating strings in my classes I tell an Aggie joke or two around this in order to hammer home this is quadratic behavior don't add your strings this way instead join them it sounds like most of you knew that already go check your code base and see if your code base knows it just saying updating sequences if you see a del zero a pop zero or insert zero you're doing it wrong I go into customer sites they say here's a million lines of code and it runs really slow can you make it go fast as if I'm going to be able to read a million lines of code but 15 minutes later I come back and say I've sped up your code what did I do I grepped for these three things pretty much everywhere they did a del names zero a pop zero or insert zero they were using the wrong data structure what's the correct data structure deque yeah I did that anyway a deque
will let you do uh you can del names zero efficiently a popleft efficiently or an appendleft efficiently decorators and context managers I have no seconds left but we're going into a break time of five minutes whoa we got we have time for decorators and context managers which completely rock we're out of novice territory and into really good stuff these are fantastic tools for refactoring your code but good naming is essential because it provides macro-like capability meaning you can hide all kinds of awful actions behind the macro or you can be very clear so remember the Spider-Man rule with great power comes great responsibility all right so I want to factor out some administrative logic the business logic here is opening a URL and returning a web page the administrative logic is I'm caching it in a dictionary that way if I go look at the same web page over and over again I just simply remember it you'll see code like this all over the place in Python where someone was trying to cache their lookups what I don't like about it is it mixes the admin logic with the business logic and it's not reusable simple fix @cache it's actually the lru_cache in Python 3 I have backported it for people who want to scan for the backport you can start using it today that said these things are pretty easy to write on your own so I really want to demonstrate the decorators here less than what I've written with this is reusable I can put @cache in front of any pure function a pure function being one that returns the same value every time you call it random.random is that a pure function nope because it gives a different value every time you call it pow is that a pure function yep same answers every time the business logic has been separated from the admin logic and I've gotten reusability the way I write it is with a simple caching decorator it only takes a few lines I would like for your utilities directories to be full of little tools like this so that
elsewhere in your code you put @cache in and the problem is solved factoring out temporary contexts for decimal so we get a we copied the context change the decimal precision to 50 do a calculation and restore the old context this is saving the old restoring the new that happens over and over again there's a better way with localcontext the context manager here makes a copy of the context puts it in place does the calculation restores it which is easier to get right the first or the second the second and it has reusable logic there pretty much anytime your setup logic and teardown logic get repeated in your code you want a context manager to improve it it can profoundly clean up your codebase especially if you're doing this sort of thing all over the place ok the traditional way to open and close files everybody knew they were supposed to do it they wouldn't do it you had to do the try finally CPython closes for you anyway but the simple way the new way is beautiful what did the with statement factor out for us it factored out the setup logic and teardown logic it set up the try finally for us most of you knew that one already does your code base know it go fix it how to use locks simple way to make a lock the old way is acquire the lock do a try finally do you have to use a try finally absolutely if you don't you don't release the lock under some situation where an error happens in here what happens if you don't release a lock do puppies die every time dead puppies so that's what you're supposed to do but you had to indent twice pill up finally put colons people knew they were supposed to do it and probably wouldn't the new way is with lock I've actually separated the administration logic of getting the lock separate from the print statement I don't know who came up with context managers I need to go find them and thank them because they are wonderful ow factoring out temporary context by the way most of you guys knew with lock already right
and you already knew this uh the way to close open and close files here's one you didn't know I would do an os.remove of the file and then I catch an error there's another way to do this of course you can check and see if the file exists before you do it is that the right way no because it has a race condition in it and so this is the correct way it's also irritating here's a better way with ignored oh you've never seen ignored it says do this code how come you haven't seen ignored aren't you all watching my check-ins I made this check in a few days ago you guys aren't working off the head of Python 3.4 yeah anyway if you want your own I put it on the next slide you put those handful of lines in your code hey it's like ten words of Python stick that in your utils directory and you too can ignore exceptions it gets rid of the idiom for try except pass who learned something new cool alright more new things factoring out temporary context did you know that help sends its help to standard output so you're going to have to cut and paste it to store it in a file that's irritating can't we just redirect it sure what you can do is open up a file redirect standard output temporarily assign it do a try finally on the help and then after you've done the help capturing the output to the file restore it is this any fun no better way with redirect_stdout to the file this time pretty much you can get back in the business of writing your functions with print statements and then wrap them with withs to send them to files send them to standard error send them somewhere else this restores the beauty of using our print everywhere and this is the context manager for if you don't have this and of course you don't because I haven't checked that one in yet that'll probably go in on Monday so Nick and I have to have a couple words about he's very happy with this part what we are unsure of is whether we should say in fact you guys can just vote the proposal is if file object is
equal to None how about we automatically create for you a StringIO object capture the string so that you can do with redirect_stdout no arguments as s and then do an s.getvalue and capture the string you want that all right Nick has decided okay Nick is the Nick is the man for contextlib so we negotiate everything there just like somebody wants something in collections they have to come to me they want something in itertools they come to me alright for Jack he came to me and said no he tried to alter one of my combinatorics I'm now wishing I had said yes though alright so concise expressive one-liners this is the very last one but it's an interesting thought when people first come to Python we teach don't put too much stuff on one line there's an infinite amount of vertical space available to you in your code take advantage of that on the other hand don't take single units of thought and break them into subatomic particles so it actually makes your code harder generally I understand every single line but yes do you understand the gestalt of that my rule for what goes on one line in one logical line do you remember earlier when we did the tuple unpacking with the planet positions that was one logical line even though I actually typed it on four so one logical line means one statement so my rule is what goes in one line is what you can express in a single English sentence give me the sum of the squares of all the numbers up to ten this is one way to do it you start with an empty list this is the way we used to do it in the olden times when I first came to Python I'd have you clap on sum but Alex put sum in there I'm the one who made sum go fast though so well I put all the optimizations in so it doesn't create a new object at every iteration it actually just keeps a running total in C there is a better way which is to use the square brackets and put this in one line why is that better well the first one tells you exactly what to do step by
step the second one says what you want it's more declarative it just says I want the sum of this I read it left-to-right I want the sum of the squares of i taken from one to ten the same way you would write it in mathematics I contend the second way is better because it's a single unit of thought and the first one is too busy telling you how to do it and not what it's doing fair enough oh is there a better way take out the brackets generator expressions I did those yeah so yeah my contribution to Python I came along and saw these square brackets and I took an eraser and I erased them and it made everything go a lot faster that creates a generator version of this instead of filling up memory making it go fast did you guys have a good time cool thank you all for coming to the presentation and do me an honor take these slides go back to work and have somebody if not yourself look at every one of these find them in your codebase and put them in it'll make your code faster better and more beautiful thank you very much
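The idioms covered in the last part of the talk can be sketched together in one short script. This is a minimal recap written for Python 3, not a copy of the slides: the data values, the `environment` dictionary (standing in for `os.environ`), the `slow_square` function, and the filename are illustrative assumptions. In current Python the `@cache` decorator from the talk corresponds to `functools.lru_cache`, and the `ignored` context manager shipped as `contextlib.suppress`.

```python
from collections import ChainMap, deque, namedtuple
from contextlib import redirect_stdout, suppress
from functools import lru_cache
from io import StringIO
import os

# Linking dictionaries: lookups try the command line first, then the
# environment, then the defaults, without copying any data.
defaults = {'color': 'red', 'user': 'guest'}
environment = {'color': 'blue'}            # illustrative stand-in for os.environ
command_line_args = {'user': 'raymond'}    # illustrative parsed arguments
config = ChainMap(command_line_args, environment, defaults)

# Named tuples: results that explain themselves when printed.
TestResults = namedtuple('TestResults', ['failed', 'attempted'])
results = TestResults(failed=0, attempted=4)

# Unpacking sequences instead of indexing with [0], [1], [2], [3].
person = 'Raymond', 'Hettinger', 0x30, 'python@example.com'
fname, lname, age, email = person

# Updating multiple state variables at once: the right-hand side is built
# from the old x and y before either name is rebound.
def fibonacci(n):
    """Return the first n Fibonacci numbers as a list."""
    values = []
    x, y = 0, 1
    for _ in range(n):
        values.append(x)
        x, y = y, x + y
    return values

# Updating sequences: a deque is efficient at the left end; a list is not.
names = deque(['raymond', 'rachel', 'matthew'])
names.popleft()
names.appendleft('mark')

# Concatenating strings: join them rather than adding in a loop.
joined = ', '.join(names)

# Separating admin logic (caching) from business logic with a decorator.
# call_count exists only to show that the cache is doing its job.
call_count = 0

@lru_cache(maxsize=None)
def slow_square(n):
    """Same input, same result, so caching the result is safe."""
    global call_count
    call_count += 1
    return n * n

slow_square(12)
slow_square(12)   # served from the cache; the body does not run again

# Ignoring an expected exception without try/except/pass.
with suppress(FileNotFoundError):
    os.remove('somefile.tmp')   # illustrative filename

# Redirecting print output into a string temporarily.
buffer = StringIO()
with redirect_stdout(buffer):
    print('captured')
captured = buffer.getvalue()

# One logical line per unit of thought: a generator expression inside sum().
total = sum(i ** 2 for i in range(10))
```

Each fragment is independent, so any one of them can be lifted into an existing code base on its own, which is how the talk suggests using the slides.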
Info
Channel: Next Day Video
Views: 944,087
Keywords: psf, pycon2013, talk, RaymondHettinger
Id: OSGv2VnC0go
Length: 48min 51sec (2931 seconds)
Published: Wed Mar 20 2013