Brandon Rhodes The Dictionary Even Mightier PyCon 2017

Ladies and gentlemen, as we know, the dictionary is a very powerful and useful tool in Python, and in Python 3.6 a lot of new features were added — it is very mighty, and even mightier than you thought. Let's welcome Brandon Rhodes to give his talk, "The Dictionary Even Mightier."

Well, thank you, everyone. This, as many of you will have guessed, is a follow-up to my PyCon talk of several years ago, "The Mighty Dictionary" (2010), in which I tried to explain the operation, benefits, and dangers of a hash table to a PyCon audience. A lot has happened since then, none of which tempted me to do a follow-up talk until Python 3.6 — which Raymond Hettinger has famously said is the first version (he specifically said 3.6.1) that is a better language than 2.7. So it's clearly time that we talk about where the dictionary has gone in the last several years.

I chose, it turns out in retrospect, an interesting moment to do the original Mighty Dictionary talk: February 2010, right at the beginning of the year, months before 2.7 went final and came out. That means my original talk looked purely at the dynamics of the hash table and didn't talk at all about the improvements that were just about to arrive. Python 2.7, the last hurrah of the 2 series, quickly pulled in several innovations from the Python 3 series to make them available at the last minute in the last version of 2. None of those were covered in my original talk, so they will be covered now: first we'll cover several things that were backported to 2.7, then we'll move ahead and look at things that you cannot take advantage of in 2.7, where you have to have a modern version of Python.

All right. Many of you will already have seen — I believe they were already being talked about at the conference — dictionary comprehensions, a change proposed in PEP 274, which was debated and eventually resolved. The list comprehension had been around since Python 2: it was an innovation when Python went from 1.6 (I think that was the last one) to 2.0, and it was a big deal. You'd always had to make an empty list and append, append, append to it before finally producing a completed list. With the list comprehension, borrowed from several other languages, you could instead put the for logic, the looping logic, inside of the square brackets and have the list created for you.

That was great for lists. It gave us a way to build dictionaries too, but the way was slightly roundabout: you had to build tuples of key and value, it didn't look very much like a dictionary, and of course it was slower, because you had to make a tuple per item that was then thrown away once it was inserted into the dictionary. Generator expressions came along in Python 2.4; at least you could take the square brackets off, but you were trading one inefficiency for another. With the generator you're not having to create an intermediate data structure — a list — but you're constantly having to thrash between running another little bit of bytecode or C code in the generator and then running another little bit of the dictionary's init logic to absorb, item by item, the tuples being generated. You're still creating and freeing tuples, and you have the disadvantage of swapping between two different call stacks, so it often wound up not really being any faster, though it was more conceptually convenient and saved a pair of square brackets.

You don't have either of those problems once you introduce the dictionary comprehension. It looks like a dictionary — it has a colon, as one would expect. It doesn't create needless intermediate data structures, lists or tuples. It's readable, it's smaller, it's faster. And what I think is its greatest feature, which rarely gets mentioned: it brings to Python a great symmetry. If you ever taught Python 2.6 — and for a while I taught Python professionally — when students learned the list comprehension, some of them had imagination and would suddenly think: wait a minute, if you can put the for inside of the square brackets, can you put it inside of the curly braces? And Python 2.6 would bop them on the head and say: no, lists are special. And as you know, everything in a language that's special and different is an obstacle to the learner. The old 2.6 syntax didn't allow this; it meant that Python wasn't a language, at least in this area, where you could learn by extrapolating.

I did this the other day — I was so proud of myself — I deduced the existence of an object that I had never thought of or seen before. It's springtime, the hedges are starting to grow, and I'm facing going and digging the big reel of power cord out of the garage, getting it plugged in, and then looping it around myself in such a way that I can trim the hedges without cutting the cord in two or electrocuting myself. And suddenly I knew: I owned a cordless drill; I owned a corded hedge trimmer; and I was suddenly absolutely certain that the universe I lived in included cordless hedge trimmers. You understand, I was absolutely certain that they existed — it was one of those moments where I had introspected the universe and deduced a fact about it through the power of my mind. It was just a QED to do the Amazon search and start reading reviews of the indeed-existent concept of a battery-operated hedge trimmer. It's wonderful when languages, when systems, when APIs are set up so that you can guess some behaviors from others — where, once you learn that square brackets let you get at the items in a list, the same square brackets let you get at the items in a dictionary. What I love most about dictionary comprehensions is that they allow our learners to extrapolate a corner of the language that they might suddenly guess. And indeed, after being argued against for several years, the dictionary comprehension was finally added to the language — added in 3.0, backported to 2.7 — a really beautiful construct.

And then, as sort of the last thing, right as Python 3 was being invented, came dictionary views. This again came right out of the gate in Python 3.0 and was later backported to 2.7.
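The progression just described — tuple-building, generator expression, dictionary comprehension — can be sketched in three lines; all three produce the same dictionary, but only the last looks like one:

```python
# The three historical ways of building a dict from a loop:
squares_tuples = dict([(n, n * n) for n in range(5)])   # list of (key, value) tuples
squares_genexp = dict((n, n * n) for n in range(5))     # generator expression (2.4+)
squares_comp   = {n: n * n for n in range(5)}           # dict comprehension (3.0, backported to 2.7)

assert squares_tuples == squares_genexp == squares_comp == {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```

The first two forms build and discard a throwaway tuple per item; the comprehension inserts each key and value directly.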
Previously, originally, we'd had keys(), values(), and items(), which built ancillary data structures — lists. iterkeys(), itervalues(), and iteritems() were added to the language when iterators were; they don't spin up a separate list, but let you get to all the keys or all of the values, in the dictionary's order, without having them copied into an intermediate structure. I honestly kind of wasn't paying a lot of attention to the Python 3 transition, since I still don't get to use it at work, and I kind of had thought, when I saw that keys() isn't a list anymore and values() isn't a list, that they had just taken the iter- versions and renamed them to keys(), values(), and items() — and then, if you want a list in Python 3, you ask for a list. Very Pythonic, because you're being explicit, asking for what you want.

It turns out that's not what they did. They asked an interesting question: what if you just want to ask whether something is present in the keys, or present in the values? The dictionary could answer "is this value in the values?" with a quick loop over the values, but there was no way to get at that information without creating a list of the values and looking through that — or a set, as appropriate. And so they thought: shouldn't there be an object that implements __contains__ for keys, values, and items, that can answer the question "is x in y?" by looking directly at the interior structure of the dictionary? In fact, if we had such an object, why stop at __contains__? We could have it do all of the basic set operations. For the keys you could ask questions like: do this dictionary and that dictionary have any keys in common? Are their keys disjoint? And while you're at it, you could of course add the ability to call __iter__, and so throw in the ability to loop over them, which is all we really do anyway. This was inspired by the Java collections framework. These are called views.

When you call keys(), values(), or items() these days, you get a tiny little object with no storage of its own; all it has is the address of its type and the address of the dictionary that it will use to go answer questions for you when you do set-like or membership operations on the view. So keys(), values(), and items() in Python 3 are view objects. That is what was backported to 2.7, under three methods I honestly had completely missed at the time and have never seen used: viewkeys(), viewvalues(), and viewitems(). On the one hand, they put iteration — which is typically what you want to do — one step further away: you can no longer say iteritems(); you now say .items(), you're given this tiny little object that is the view, and then when you for-loop over it, it calls __iter__ and creates a second object, the iterator, which knows how far through the dictionary it has gotten so far. So the one thing you ever really want to do is a step further away. But it's so conceptually clean, and it enables a number of operations that otherwise aren't possible. So this was how it was decided views would work: it's two-layer. First you get the view by saying keys() or values(); then, if you want to, you get an iterator by trying to iterate it. That was backported to 2.7, but under those different method names, so that you weren't opted into a different behavior until your code was ready.

I'll burn a slide on the OrderedDict. It was proposed and accepted in PEP 372 and not added until 3.1 — so we're moving a little further ahead in history here — then backported to 2.7. It preserves insertion order; it's bigger and slower. In fact, it was implemented as a linked list in Python, I believe, until someone rewrote it in C in 3.5 — just in time, almost, for it to be obsolete. And it's interesting: we had, since 3.1 and 2.7, the ordered dictionary if you needed it, but it never tempted us to close the two open PEPs involving dictionary order.
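The two-layer view behavior described above is easy to see from the interpreter; this sketch uses a throwaway dictionary of my own invention:

```python
d = {"name": "www", "port": 80}
keys = d.keys()          # a view: no storage of its own, just a pointer back to d

assert "port" in keys                           # __contains__, no list ever built
assert keys & {"port", "proto"} == {"port"}     # set operations work on views
assert d.keys().isdisjoint({"x", "y"})          # disjointness check

d["proto"] = "http"      # views are live: they see later changes to the dict
assert "proto" in keys
```

Iterating `keys` is the second layer: the for-loop calls `__iter__` on the view, producing a separate iterator object that tracks its position in the dictionary.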
One PEP said that keyword arguments should be delivered to the callee in the same order they were written (PEP 468); the other said that a class dictionary should have a natural order (PEP 520). Those two PEPs each require some kind of high-performance ordered dictionary, because you don't want to slow down keyword arguments — those get passed all the time — and you don't want to slow down the dicts that become the namespace of a class, or the __dict__ that sits behind an object's attributes. So we had order available in a dictionary, but even in the language itself we didn't use it for anything serious; we kept the arbitrarily ordered traditional dictionary instead, because we just needed the speed. The OrderedDict exists, but it's used for corner cases where people know that they care about order.

All right, we're now moving out beyond the things that were ever available in Python 2; we're heading into the middle years between the original talk and now. First we'll talk about the key-sharing dictionary, proposed in PEP 412 and added to the language in Python 3.3. This answers a question of space. Imagine a small class whose __init__ accepts a name, a port, and a protocol and saves them each as attributes, so the little __dict__ that sits behind each object has to store a value under each of those three keys. Here, from the Mighty Dictionary talk in 2010, is our friend the hash table: an array, indexed as all arrays are in RAM by an integer index. The problem is that we're being given keys that are not themselves integers — and if they are integers, maybe not successive integers — we're being given nonsense like the string "name". How do I find the place in memory that's named by the string "name"? Well, the hash table takes that value, whatever it is, and hashes it, which is kind of an act of violence where you clobber the value and just smash it into a bunch of bits: however big or small the value was, you get out 32 bits on a 32-bit platform and 64 on a 64-bit platform, and now, if you need an integer index, you have as many bits as you could want for forming one.

This simple dictionary — which is not at all visible on the screen; hmm, I might fix that in a minute if I feel like live coding — has, as you can see, eight slots ready to be populated, and for eight slots we need three bits in order to distinguish them. We choose the 100 slot as the one that will hold the key "name" and whatever its value is, and so that gets written into the dictionary. We now arrive at our second key: hashing the key "port", the end of its hash is 101, so it takes the slot that happens to be next door. And now — this was a big topic in the original talk — what happens if the third key happens to have a hash ending in bits that have already been used, 100 in this case? You'll remember that there's a technical term, collision, for what happens when we try to store "proto" and run smack into the fact that "name" is already taking that slot. So we do some math involving the other bits and choose an emergency backup slot that it can live in instead. So here it goes: we try to store "proto", it collides, and it has to go somewhere else. It's now a little more expensive to look up or to reset: the other values, "name" and "port", you'll find right where you look for them; "proto" you'll find only after running, every time you look for it, into the fact that it's not sitting at the slot you'd expect.

Imagine now that you have your second object of this type. You store "name", you store "port", then "proto" comes in, collides, and has to move to its separate slot. Someone named Mark Shannon looked at this picture and said: wow, over there on the left-hand side, the first column and the second column are absolutely identical. They look exactly the same: we're storing exactly the same five blank slots and three hashes, and exactly the same five blank slots and three keys, in both of these objects' __dict__s. I should mention that the strings are not really inside the dictionary — the integer 53 isn't in the dictionary. What the dictionary really stores, if you can
imagine that you're looking at the Matrix, where it all becomes code, is addresses: the address of the string object that says "name" or "port", or of the integer object 53. So whenever in my diagrams I put a little string or integer inside a dictionary, I really mean the eight bytes — always the same size on a 64-bit machine — that tell you where in memory to go find the object. But whether "proto" is the string itself or the address of it, it's the same top and bottom.

And Mark Shannon thought: this class has just been created — what if we wait for the first-ever call to the class's __init__? It will create an instance whose dictionary looks like this. What if we then go in and split apart, and store in separate areas of memory, the hashes and the keys, and keep them forever, while the object's __dict__ stores only the values? That first object created might go away sometime, but we keep, forever linked to the class, that frozen set of hashes and keys — so that as someone creates a second instance of the object, and then a third, you're only burning the space for storing the values. You're only burning a third of the space and getting two-thirds savings, because you're not, over and over again in every object, repeating the hash and repeating the key, when in so many classes they're exactly the same every time. Memory savings! Even on a tiny object that fits inside an eight-entry Python dictionary (five entries of which can be populated), you're saving 128 bytes for every object; if there are thousands or millions of them, that adds up to real, measurable savings. Now, you do burn a little space, because every dictionary ever created now needs 8 extra bytes so that, if it is stored in split fashion, it knows where to get to the values array, which in that case is separate from the keys. But when measurements were done, the extra 8 bytes was easily made up for by the savings on common objects with repeated attributes. Normal lists and dictionaries don't care about this — so if a program uses a lot of NumPy arrays or lists it doesn't benefit — but object-oriented programs with lots of objects often use 10 to 20 percent less memory because of this innovation: key sharing, Python 3.3.

What do you care about? Because of course you can't turn this on or off — it just happens, and only if you're on 3.3 or later. The takeaway for you is: in your objects, make sure that __init__ assigns every attribute you're ever going to use. Set them to None if you need to, or to the empty string, or whatever makes sense — but set them. Because if you don't, and during its lifetime one of your objects tries using a new attribute, then its values will have to be thrown away, a full traditional dictionary will have to be allocated for it, and all of its attributes copied over. If a single key — a single attribute — is added that's not in that original prototypical set of keys, then you lose the sharing. But this is a habit you should already have had: PyPy can't do a lot of optimizations if attributes come and go randomly, and they also recommend setting them all in __init__. And it's good documentation: I can read your __init__ function and know all of the attributes, and not be very surprised later when I see one appear in the middle of a method. So this best practice now carries an additional benefit in Python 3.3: key sharing.

All right, for the next few minutes we get to delve into the exciting world of computer security — of dealing with people who want to break your application. This next thing, randomly seeded hashes, was not a PEP; it only warranted an issue in the Python bug tracker. And it has to do with this: how fast does a dictionary normally fill? Let's say I have five things I need to put in, and let's watch the speed here as the keys go in one after another — one, two, a collision!, then "doe", then "eagle". Notice that sometimes a dictionary key takes a little longer to make it in because of a collision.
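Before moving on — the key-sharing habit just described, assign every attribute in __init__, looks like this in practice (the class name and defaults here are my own illustration, not from the talk):

```python
class Connection:
    def __init__(self, name, port, proto=None):
        # Assign every attribute the instance will ever use, up front,
        # so all instances keep sharing one frozen table of keys and hashes.
        self.name = name
        self.port = port
        self.proto = proto      # default to None rather than omitting it

a = Connection("www", 80, "http")
b = Connection("mail", 25)      # same attribute set: __dict__ can stay "split"

assert vars(a) == {"name": "www", "port": 80, "proto": "http"}
assert vars(b)["proto"] is None
# b.retries = 3   # a NEW attribute like this would force b onto its own
#                 # private, unshared dictionary, giving up the savings
```

The same habit also documents the class and helps PyPy, exactly as described above.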
But when collisions don't take place — which, on average, is most of the time — notice that the last two entries, "doe" and "eagle", made it in as fast as the first ones. New insertions, unless they collide, don't take any longer than the first one did. As we talked about in the original version of this talk, the thousandth or millionth item added to a dictionary should insert just as quickly as the first. That is the magic of a dictionary.

It turns out that if the person using your web application isn't your friend, they can take that benefit away. What if they were to come up with a series of words — and they have all the time in the world to do this before they hit your website — where the second value is guaranteed to collide with the first, the third collides with the first two, the fourth with the first three, and so on, maybe through thousands of potential keys? Then the snappy insertion rhythm we saw before turns into a slower and slower crawl. You know what this feels like at a keyboard if you're loading data into a database: instead of going first thousand, second thousand, third thousand, fourth thousand — I remember once I was loading a table and it went first thousand, second thousand, third thousand, and then two minutes passed before the fourth thousand were done, because I'd declared a column unique but not added an index to let the database, as more and more rows were added, quickly check all the previous rows for a duplicate. If somebody chooses a set of keys where the fifth one takes five times as long as the first insert, and the tenth one takes ten times as long, then I am suddenly in a situation called "accidentally quadratic", where the time I take varies with the square of the number of items entered — meaning a thousand items take about a million moments of work.
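You can reproduce that sinking feeling in miniature by forcing every key to hash identically — here with a deliberately terrible __hash__ of my own, which illustrates the effect rather than the real attack:

```python
class Collider:
    """A key whose hash is always the same, so every insert collides."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 42                       # every instance lands in the same slot chain
    def __eq__(self, other):
        return isinstance(other, Collider) and self.n == other.n

# Each new key must probe past every previously inserted key, so
# inserting N keys costs O(N^2) total work instead of the usual O(N).
d = {Collider(i): i for i in range(200)}
assert d[Collider(7)] == 7      # still correct -- just increasingly slow
assert len(d) == 200
```

An attacker can't subclass your keys, of course; the real attack is finding ordinary strings whose hashes collide, which is what the rest of this section is about.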
There's actually a blog about this that I highly recommend to you all: accidentallyquadratic.tumblr.com. It's an entertaining blog of one or two posts a month about all kinds of high-end, professionally designed software projects — browsers, servers — that turn out, in their corners, to have quadratic rather than linear or logarithmic behavior: things that, instead of going one-two-three-four-five through a set of items of work, give you that sinking feeling by going one, two, three, fooour, and so forth. How many of you have ever run something that had exactly that pattern? I always know what's gone wrong, typically, when I see something slowing down like that.

So we had to fix the hash function. How does a hash function work? Well, here's one that Python used for a long time. In the main loop we have the hash of the string so far, as we're partway through hashing it, and to absorb the next character of information we multiply the current hash by 1,000,003 — magic number, a million and three — and then we exclusive-or in the next character of the string, to slip into the lower byte whichever bits are set in the incoming character. So we're feeding characters in, each time multiplying by a million and three and then pulling in 8 more bits of information from the next character.

You'll understand a bit about how this fills the complete 32-bit or 64-bit hash with information if you ever played around with multiplication as a kid, by hand or on a calculator, where you might have discovered that by making a number like 1001 — where all of the digits are just zeros or ones — you could produce an output number that was kind of cool: little copies of your original number trapped inside of a much, much larger value. (For simplicity I will, as they taught me in elementary school, elide the rows that are just zeros.) And you might have found that you could make even bigger numbers — put some zeros and then another one — and you now
had another copy of your original number inside the digits of the large output of the multiplication. But you might have noticed that you can't put those ones too close together, or all of a sudden the pretty little copies of your number start looking more interesting: digits that aren't in the original begin to show up, and you can no longer predict what the next digit is, because of course the ones carry, and carrying moves information leftward. The asymmetry is interesting: the carries never bother the digits to the right — you can make all kinds of assertions about the digits to the right, odd or even, multiple of five or not, multiple of ten — but you can't do that facing leftward in the number, because the carrying pushes new digits, and chaos, up toward the higher-order digits when you add, and therefore when you multiply, which is simply repeated addition.

So when we sit here repeatedly multiplying a number by a million and three and then putting new bits in at the bottom: take that number in binary and look at how it uses its ones. Two are tucked in right at the bottom, making sure that two copies of the value you've accumulated so far, shifted by one bit, are forced to add with each other; then there's a gap in the middle, only sparsely populated with ones, bringing in a few more copies of that number; and then four ones up at the top, which make sure that four copies of the previous hash value are piled right next to each other, so that they add and create all sorts of interesting carries and chaos as the bits, in the process of multiplication, slowly move leftward. And that is of course why we XOR the incoming bytes into the bottom: because the multiplication is constantly shoving the entropy we introduce — by reading through the string in the lower bits — upward into the high bits.

That's the intuition, anyway. Cryptographers, once they learned we were interested in hash functions, and computer scientists alike have done far more formal analyses than this kind of hand-waving I'm doing to give you the feel of how a hash works. We now have a number of mathematical properties with which to rate hash functions, and we can take something like Python's hash function and ask which properties it has — how evenly it distributes bits. But this idea of visualizing the multiplication hopefully gives you a little bit of an instinct for how hash functions work and what the formalisms for improving their properties are about: whether the big crazy number you're using to shove information to the left does a good job of scattering around any patterns that might have been in the original values.

And this was a big problem with Python: it always used this algorithm, and it always started with the same value. So if I knew you were using Python for your application, I could sit at home and try to come up with strings that would have the same hash. This became famous in December of 2011, at the security conference 28C3: "Efficient Denial of Service Attacks on Web Application Platforms" — platforms plural, because Python was not their big target; Python had this problem, but so did almost every other language. In fact — this is kind of fun, it was 2011 — they made a list of the web application technologies they were about to discuss, and to gauge how important each of them was, they went and found a website that ranked programming languages by their importance to the web. You'll be happy to know that our vulnerability to deliberately crafted hashes only affected 0.2% of the web, according to the numbers they had at the time. The Python hash function, they said, could be broken. They didn't actually illustrate how to do it, but they said it could be broken by computationally searching for possibilities with what's called a meet-in-the-middle attack; they could only find reasonably sized attack strings for 32-bit systems and hadn't done it for 64. They did a web search to find out what our web framework was, and found Plone.
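The multiply-and-XOR loop just described can be written out in a few lines. This is a simplified model of CPython's old, pre-randomization string hash, with the running value masked to one 64-bit word; the exact initialization and finalization details varied by version, so treat it as a sketch rather than the real implementation:

```python
def old_string_hash(s: str) -> int:
    """Simplified sketch of the multiply-by-1000003, XOR-in-a-byte loop."""
    mask = (1 << 64) - 1                 # keep the running hash to one 64-bit word
    h = (ord(s[0]) << 7) if s else 0     # seed from the first character
    for ch in s:
        # shove accumulated entropy leftward, mix the new byte in at the bottom
        h = ((h * 1000003) ^ ord(ch)) & mask
    return h ^ len(s)

# Fully deterministic: the same string hashed the same way on every machine,
# every run -- which is exactly what made precomputed collision lists possible.
assert old_string_hash("name") == old_string_hash("name")
assert old_string_hash("name") != old_string_hash("port")
```

Note that 1000003 in binary is sparse — a pair of ones at the bottom, a thin middle, four ones at the top — matching the multiplication picture above.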
Plone, it turned out, had a maximum request size of one megabyte: if you POST form data bigger than that, it throws it out and doesn't even look at it. So they played around with one-megabyte payloads in which all of the form field names were going to collide, and they were able to make a Plone site spend seven minutes of CPU parsing a single one-megabyte request. Such a site, once you learned the trick, was a very easy site to bring down.

Just two months ago — if you're interested in the cryptography of hash functions — a really, really neat blog post appeared by someone who published his work after figuring out how to produce 64-bit hash collisions, not just the 32-bit collisions of the original research. Go look for it on medium.com if you want to see how: Robert Grosse, learning cryptography and wanting a target to practice on, took Python's old hash function, put it through the wringer, and figured out, in a moderate amount of computing time, how to get lists of at least hundreds of keys that even on a 64-bit system hash to the same value. Great article.

So the immediate response — we had to do something quickly — was to sprinkle a little randomness around this basic hash algorithm. (We also decided to dignify 1,000,003 with a #define.) By mixing in a few random bits at the beginning, every site — every time Python launched — would produce a different output for a given string, so you could no longer count on your list of precomputed collisions working. And then a bit of randomness, a suffix, was exclusive-ORed in at the end, which helped hide the prefix: if a site exposed its dictionary order in its JSON return values, it wasn't too easy to use that to guess the secret that particular server was using. The main thing you'll notice is that your web applications can't easily be DoSed. It's not a perfect solution, but it makes it much, much harder to create a situation where you just fall over doing seven minutes of CPU work. And your dictionary now comes out in a different order for exactly the same program and exactly the same input — suddenly the dictionary order goes random — in 3.3 and subsequent versions. Because this was a security issue, they also made bug-fix releases for 2.6, 2.7, 3.1, and 3.2; but there, because they couldn't break existing tests and code, you only get the protection against DoS attacks with the -R flag (I believe there's also an environment variable that will turn on the randomness).

The uselessness of that was pointed out the next year, at the next instance of the same security conference, 29C3. "Hash-flooding DoS reloaded" was presented by three new researchers, outlining how badly many languages had responded to the original finding. They were not actually able, I noticed, to remotely watch a Python app and figure out how to create collisions, but they were at least able to recover the random key by calling hash() in a Python program, which suggested that information was being leaked that maybe could have been exploited remotely. More importantly, these researchers solved the problem: they did the work and introduced SipHash, and made it free for any language that wanted to use it. It is random each time you start the runtime up — the set of hashes you see in one run you will never see again — and no attacker should be able to reasonably guess what the secret is and generate collisions for you. SipHash did earn a PEP, 456. When it was first proposed, everyone said no, it's too slow; python-dev said wait a minute — if you do this to the bits, if we combine this, if we process this many bytes at a time — and finally they got it fast enough that it went in, accepted as standards-track for the language. It's in 3.4, 3.5, and 3.6: turned on by default, all the time, hard cryptographic protection, at the language level, from those kinds of DoS attacks — so far as we know at the moment; such things always change.
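You can watch the per-run randomization directly by asking brand-new child interpreters for the hash of the same string; fixing PYTHONHASHSEED disables it. (This demo spawns subprocesses, so it assumes an ordinary CPython 3 installation on the PATH as `sys.executable`.)

```python
import os
import subprocess
import sys

def hash_in_fresh_interpreter(s, seed=None):
    """Return hash(s) as computed by a brand-new Python process."""
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)        # start from a clean slate
    if seed is not None:
        env["PYTHONHASHSEED"] = seed       # e.g. "0" disables randomization
    out = subprocess.check_output(
        [sys.executable, "-c", f"print(hash({s!r}))"], env=env)
    return int(out)

a = hash_in_fresh_interpreter("pycon")        # randomized: differs run to run
b = hash_in_fresh_interpreter("pycon")
c = hash_in_fresh_interpreter("pycon", "0")   # fixed seed: always the same
d = hash_in_fresh_interpreter("pycon", "0")

assert c == d
assert a != b    # astronomically unlikely to collide under SipHash
```

The same mechanism is why test suites that accidentally depended on dictionary order started failing intermittently once randomization arrived.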
The lighter-weight sprinkle of randomness is there in 3.3, and in those bug-fix releases of previous versions if you remember to turn it on with -R or the environment variable. So that is why your dictionary suddenly went out of order if you were using Python 3.3, 3.4, or 3.5 — we'll get to 3.6 in a minute.

Next, an internal change that I'll describe only briefly: PEP 509 added a private version number. They went in and burned 8 more bytes in every one of our dictionaries: every dictionary now has a version number, and elsewhere in memory there is a master version counter. When you change a dictionary, the master counter is incremented — from, say, a million to a million and one — and that value, a million and one, is written into the version number of that dictionary. What this means is that you can come back later and know whether it has been modified, without reading all of its maybe hundreds of keys and values; you just look and see whether the version has increased since the last time you were there. A lot of the pain of trying to optimize Python is that anything can change at any time: someone might decide to monkey-patch, so code that looks like it should be referencing a builtin you use all the time could instead be intercepted by someone having injected a function of that name at the module level, or they could edit the builtins module itself. Someone writing an optimized version of a function can now check: have the globals of the module changed? Look at the version number hidden in that dictionary and you'll know whether it has been touched — and if it hasn't, you can keep using the optimized version of a routine that has inlined the builtin. Has the builtins dictionary changed? Look at its version number. So all sorts of optimizations that Python's dynamic nature had made impossible — because you never knew when someone might replace a builtin with something else — are now possible, because the
dictionaries that hold the builtins, that hold class namespaces, and that hold module namespaces now, along with all of our other dictionaries, have a version number. It's internal; I haven't seen an interface for users to get to it, but it's an implementation detail of Python 3 now, there to accelerate optimization on that platform. PEP 509 is where you can get more details. I glossed over that quickly so I could talk about compact dictionaries, the big, big change in Python 3.6 that changes dictionaries forever. A dictionary, you'll recall, fills up one item at a time; there are collisions, hopefully not on most of the values, and at some point it's as full as it's going to get. If you add another key to this dictionary, that's more than two-thirds full, so a new dictionary hash table is allocated, of 16 slots, twice as big, and all the keys are reinserted before you're allowed to continue. Look back at what it looked like when it just had five keys in it. This is it; this is as full as we're going to let it get, and yet it still has noticeable blank space in it: three of its rows are holding nothing. With 8 bytes (on a 64-bit system) of hash, 8 bytes of address for the key, and 8 bytes of address for the value, 72 bytes are still empty when we declare it so full that we're throwing it out and starting over with a more sparse data structure. In 2012 Raymond Hettinger had an idea; he introduced it on python-dev and immediately some improvements to it were suggested. Every problem in computer science can be solved with an extra layer of indirection. What if we don't use those big 24-byte rows to remember which hash locations we've used? What if, instead, we were simply to use 8 bytes, and when it's time to start adding items to the dictionary, we simply remembered, in a little-bitty array of 8 bytes, the location in a bigger list of where we had put the hash, the key, and the value? Instead of a dictionary having to keep entire 24-byte slots free, as it now has reached 5 items and needs to go to 8 entries, we instead can
only have 3 bytes left free: of the eight bytes up at the top that we have available to remember the indexes of our keys, we're only using five and leaving three empty, so three bytes wind up empty, wasted, and unused right before the dictionary is ready to resize, rather than 72 as before. So the old dictionary had lots of extra space; the new one packs all of the keys and values into a short list that it grows, and if it doubles in size, it simply reallocates all of the keys in a contiguous area. In 2015 they added this to PyPy, which is allowed to do crazy things to be more efficient. It's interesting: a few things sped up significantly, four percent, eight percent gains on some benchmarks; some things slowed down, because we added a level of indirection, which is the problem with solving problems in computer science, it can make things slower. Some things got slightly slower, some slightly faster, with one or two big wins. But the big reason for doing this wasn't speed; it was actually exciting that the speed had barely changed on adding an extra level of indexes, because we got much better memory usage. Another year passed and normal Python people couldn't benefit, and then INADA Naoki, whom I don't think I've met, opened an issue with a patch that added it to CPython (at the moment, this was during 3.5-point-something). A whole month went by and no one addressed the issue; he kind of pinged, and they said, oh, this is big, it would have to be addressed on python-dev, which often means it's not going to happen. And then in September something magic happened: for the first time, the core devs got together, all in a room by themselves for a few days, and had a core-dev sprint, and at the end of that a message suddenly appeared on this moribund issue: "We discussed your compact dict change a lot, and we all want it in 3.6, but the code isn't ready yet; well, let's just push it into 3.6 before the deadline this weekend and we'll fix it up later." They wanted it that badly. And of course, predictably, immediately someone
said, "Isn't this premature?" (someone who wasn't in the group of core devs), but no, they'd spent like an entire day talking out how this affected the future, and it made it in, in Python 3.6. The new dictionary is much faster: iteration is faster, there aren't empty spaces to have to go past, and it remembers the order, because adding to that big table is a simple append. Can you now depend on the order? That is the big question, because they went ahead and, for free, got to accept PEPs asking that keyword args be ordered, that class namespaces be ordered, and that metaclass inputs be ordered; all those uses for dictionaries that had wanted order now had it. The core devs, I think, often think that they're just helping users who might notice that dictionaries are now ordered, but if you look on Stack Overflow, order isn't something that people are going to stumble across; it's something they already expect dictionaries to have. The human mind, by its nature, expects a dictionary to have order; if you've ever taught Python, this is always a stumbling block. Raymond thinks the guarantee might be almost inevitable, but Dave Beazley says this is permanent, and I agree with him. I say 3.6 has brought about dictionaries for humans. I don't think they're ever rolling this one back, because the normal programmer, who doesn't even know about hash tables, has always expected them to be ordered and will write code that depends on it. I end with the wish, slightly modified, that I ended my last talk with: may your hashes be unique, your keys rarely collide, and your dictionaries be forever ordered. Thank you very much.
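The hash-randomization behavior the talk describes is easy to observe from a short script. This is a minimal sketch (the helper name `hash_of_hello` is ours, for illustration): it launches child interpreters with the `PYTHONHASHSEED` environment variable, showing that string hashes vary per run unless the seed is pinned.

```python
import os
import subprocess
import sys

def hash_of_hello(seed):
    """Run a fresh interpreter and return hash("hello") as printed by it."""
    env = {**os.environ, "PYTHONHASHSEED": seed}
    result = subprocess.run(
        [sys.executable, "-c", 'print(hash("hello"))'],
        capture_output=True, text=True, env=env,
    )
    return result.stdout.strip()

# "random" is the 3.3+ default: each run gets a fresh SipHash key,
# so the same program prints a different hash (and dict order) every time.
print(hash_of_hello("random"), hash_of_hello("random"))  # almost surely differ

# Pinning the seed (0 disables randomization) makes runs reproducible.
print(hash_of_hello("0"), hash_of_hello("0"))  # identical
```

Note that small ints hash to themselves regardless of the seed; the randomization protects str and bytes keys, which is where the hash-flooding attacks lived.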
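The PEP 509 version tag isn't exposed to Python code, but the guard idea can be sketched in pure Python with a toy dict subclass. The class name `VersionedDict` is hypothetical; it assumes only what the talk describes: one master counter in memory, stamped into a dictionary on every mutation, so "has it changed?" is a single integer comparison instead of a scan of the keys.

```python
class VersionedDict(dict):
    """Toy model of PEP 509: stamp a global counter onto every mutation.

    Only __setitem__ and __delitem__ are guarded in this sketch; a real
    implementation would cover update(), pop(), clear(), and so on.
    """
    _master = 0  # the process-wide master version counter

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._bump()

    def _bump(self):
        VersionedDict._master += 1
        self.version = VersionedDict._master

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._bump()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._bump()

# An optimizer can cache work keyed on the version and revalidate cheaply:
ns = VersionedDict(a=1)
seen = ns.version
ns["b"] = 2
# ns.version > seen, so any cached assumption about ns must be rechecked.
```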
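The two-array compact-dict layout can also be sketched in pure Python. This is a simplified model, not CPython's implementation: the `CompactDict` class is hypothetical, uses linear probing, supports no deletion, and stores plain Python ints in the index array (CPython packs the indices into 1-, 2-, 4-, or 8-byte slots depending on table size). The dense `entries` list is also exactly why iteration comes out in insertion order.

```python
FREE = -1  # sentinel for an unused slot in the sparse index table

class CompactDict:
    def __init__(self):
        self.indices = [FREE] * 8   # sparse: small ints, not 24-byte rows
        self.entries = []           # dense: (hash, key, value) in insert order

    def _probe(self, key):
        """Return the slot in `indices` for key (its slot, or a free one)."""
        mask = len(self.indices) - 1
        i = hash(key) & mask
        while True:
            idx = self.indices[i]
            if idx == FREE or self.entries[idx][1] == key:
                return i
            i = (i + 1) & mask      # linear probing, for simplicity

    def __setitem__(self, key, value):
        # Grow when the index table would pass two-thirds full.
        if len(self.entries) * 3 >= len(self.indices) * 2:
            self._resize()
        i = self._probe(key)
        if self.indices[i] == FREE:
            self.indices[i] = len(self.entries)   # append: order preserved
            self.entries.append((hash(key), key, value))
        else:
            self.entries[self.indices[i]] = (hash(key), key, value)

    def __getitem__(self, key):
        idx = self.indices[self._probe(key)]
        if idx == FREE:
            raise KeyError(key)
        return self.entries[idx][2]

    def _resize(self):
        # Only the little index table is rebuilt; entries stay contiguous.
        self.indices = [FREE] * (len(self.indices) * 2)
        mask = len(self.indices) - 1
        for n, (h, _key, _value) in enumerate(self.entries):
            i = h & mask
            while self.indices[i] != FREE:
                i = (i + 1) & mask
            self.indices[i] = n

    def keys(self):
        return [key for _hash, key, _value in self.entries]
```

Right before a resize, the waste is a few small integers in `indices` rather than whole empty 24-byte rows, which is the memory win the talk describes.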
Info
Channel: PyCon 2017
Views: 24,055
Rating: 4.9543381 out of 5
Id: 66P5FMkWoVU
Length: 47min 22sec (2842 seconds)
Published: Sat May 20 2017