PyCon 2015 - Python's Infamous GIL by Larry Hastings

Captions
The purpose of this talk: I'm going to talk about the history of the GIL, what problem existed out there, how the GIL solves it, the ramifications of that solution, how it's still affecting us today, and how it may change in the future.

The story starts in 1992. Python was only two years old; it was still a very young language, still changing rapidly. It had been released on the internet, but it was still mainly used at CWI, where Guido was working at the time. Now, there was this big new technology on the horizon that people were talking about, something the operating systems were going to do for you, and it was called threads. Threads, if you don't know them, are kind of like processes: you could have a process running and doing work, and you could create a new process that would also be doing work, but the two wouldn't share any state. Threads are kind of like having multiple processes, except that they share state: they share the same process information, all the same global variables. In fact, SunOS already had support for them; it called them lightweight processes. The new operating system from Microsoft, still in beta in 1992, was called Windows NT, and it had support for threads. Linux was going to support threads eventually, but that actually took a couple of years from this point.

Now, back then in 1992, you didn't have computers with multiple CPUs; you only had one CPU. So although you could have multiple threads at the same time, they weren't actually running at the same time: one would run, then stop, and the next one would run. But it was clear that multi-core computers were going to happen eventually, and multi-threaded programs were exciting. So everyone wanted to use threads, people wanted to use Python, and people wanted to use Python with threads at the same time. But this wasn't going to simply work: you had to do some tinkering inside of Python in order to get threads to work properly.
For example, Python had a bunch of global variables. In C, a global variable is roughly equivalent to a Python module attribute: there's only one of it, and everybody can see it (unless you hide it, in which case it's local, but there's still only one of it). If you add multiple threads, and two of them try to change a single global at the same time, you might have a race, and something terrible might happen.

Python also used something called reference counting to manage the lifetime of objects, and believe me, I'm going to tell you all about it. So let's talk about what reference counting is, and then I'll show you how reference counting can fail to work properly under a multi-threaded environment.

Here is the structure at the top of every object inside the CPython runtime; every object that exists inside Python has to start with these two fields. This is a C struct, which is kind of like a Python class. The second field, ob_type, is the type of the object: a pointer to a type object, which has enormous amounts of information about the object. But the first field, ob_refcnt, is the reference count of the object. A reference count is essentially the number of people who are interested in this object right now. If there are three people all holding references to the object, then that reference count is going to be three. And you have to manage the reference count: you have to add a new reference if you're a new person who's interested in the object, and once you're bored with the object and you're releasing your interest in it, you have to drop the reference count. There are macros for this: Py_INCREF and Py_DECREF. Again, this is C code; a macro is kind of like a function call, except it's really more like a textual substitution. Py_INCREF increments the reference count for you, and Py_DECREF decrements the reference count for you. Py_DECREF is a little more complicated, because there's a little extra code in there that says: subtract one from the reference count, and if the reference count is now zero, call the deallocator, which is going to call your finalizer (__del__, if it's an object that has __del__) and then release the memory. That's how we reclaim memory, so that Python doesn't eat all of the memory in your computer, although Python is known for having bad habits in that way.

Now let's talk about what it looks like from the Python end. Here we have a very simple statement: a equals a new empty list. We have a variable a, and it holds a reference to this empty list floating out in the world somewhere, and the empty list has a reference count of 1. If we then say b = a, we're creating a new variable called b and assigning it a reference, and the reference count has to go up to 2. Now, there's actually a way you can look at reference counts inside CPython: a function called sys.getrefcount. If you pass in a in this circumstance, you're going to get a value of 3. Why is it 3? Because a has a reference, b has a reference, and you create a new reference when you pass the object into sys.getrefcount; that reference only lives as long as the function is examining the object, and then it gets dropped. So after sys.getrefcount exits, the reference count drops back to 2 again.

So that's how reference counting works, and that's fine; it's worked great in Python for decades. But if you want to add support for threads, you're going to have a problem. Let's take a look at what that problem might be. Say we have two threads; I've conveniently named them thread 1 and thread 2. They both hold a reference to an object, they're both going to run and do their work, and then they're both going to drop their references to this object. So both of them are going to call Py_DECREF.
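The refcount arithmetic just described can be watched directly from Python. This is a minimal sketch, not code from the talk; the exact numbers are a CPython implementation detail and could differ in other interpreters or future versions:

```python
import sys

a = []   # the new empty list has one reference: a
b = a    # a second reference: the refcount is now 2

# sys.getrefcount holds a temporary third reference to its argument
# while it examines the object, so it reports one higher: 3 in CPython.
print(sys.getrefcount(a))

del b    # drop b's reference
print(sys.getrefcount(a))   # back down to 2
```

When sys.getrefcount returns, its own temporary reference has already been dropped, which is why the object's "real" count is one lower than the reported value.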
To examine the problem, we need to go a little deeper. Let's turn that into the C code it becomes; actually, even that is not deep enough, so brace yourselves: we're going into assembly language land. It's not terribly hard assembly, so don't worry about it. Let's say the object has a reference count of 3, and both of these threads are now going to drop their references to the object. I'm going to go through this twice: the first time it's going to work, and the second time we're going to have a race condition and it's going to fail.

Thread 1 comes along and says: load the reference count into eax. This makes a copy of the value, pulling it out of memory and storing it in eax, which is kind of like a variable inside the CPU. So we have this variable in the CPU called eax, and we're setting it to 3, because we pulled the 3 out of the object. Now we decrement: eax is now 2. And now we store eax back into the object, so the object has a reference count of 2. Working great so far. Thread 2 comes along and does the same thing: load the reference count into eax, decrement it, store it back into the object; the object now has a reference count of 1. That worked: we had two threads, the object had a reference count of 3, both threads held references to the object, both dropped them, and the resulting reference count is 1. Worked fine.

Let's do it again, but this time it's going to fail, and all we need to do to make it fail is execute these same instructions in a slightly different order. This is what we call a race condition. The first thread comes along and says: let's load the reference count into eax; boom, eax is now 3. Thread 2 does it at almost the same time, and thread 2's eax is also set to 3. Thread 1 says: let's decrement; so does thread 2. Thread 1 says: let's store eax. So does thread 2.
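In slow motion, that losing interleaving can be re-enacted deterministically in plain Python. This is a simulation of the load/decrement/store steps with invented variable names; it is not CPython's actual machinery:

```python
refcount = 3              # the object's reference count, in "memory"

eax_thread1 = refcount    # thread 1: load the refcount into its register (3)
eax_thread2 = refcount    # thread 2: load too, before thread 1 stores back (3)

eax_thread1 -= 1          # thread 1: decrement (its register is now 2)
eax_thread2 -= 1          # thread 2: decrement (its register is also 2)

refcount = eax_thread1    # thread 1: store 2 back to memory
refcount = eax_thread2    # thread 2: store 2, overwriting; one decrement is lost

# Two references were dropped, but the count only fell from 3 to 2.
print(refcount)
```

The final store silently overwrites the first one, which is exactly the lost update the talk describes.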
Here's the problem: two threads came along, the object had a reference count of 3 at the start, both of them dropped their references, and the resulting reference count is 2. This is a bug. The reference count is now one too high, and it is permanently one too high, forever. This object will now live forever: once the last person drops their reference, the resulting reference count will be 1. It's never going to reach zero, its finalizer is never going to be called, its memory is never going to be freed. So this is now leaked memory, and that's bad. It's not as terrible, however, as the opposite problem. If you have a race condition around Py_INCREF, the resulting reference count can be too low, and if the reference count is too low, you have a problem when people eventually drop their references: one person drops theirs, and instead of the count dropping from two to one, it drops from one to zero, and now the object's finalizers are called and its memory is freed, but somebody else still has a reference to it. When they try to use the object, at best you crash immediately; at worst you compute erroneous results, and all sorts of terrible things can happen.

So we need to solve this race condition in Python if we're going to support threading. What do you do? Well, you add something called a lock. A lock is an object that only allows one person to hold it at a time (we talk about "holding" a lock), and you establish a rule: in order to talk to this thing, you have to be holding the lock. Two people try to grab it; one person grabs it successfully and is allowed to talk to the thing, and the other person has to wait until the first one releases the lock, and then they can talk to the thing. So you might add one lock per global variable and lock everything individually, or you might group things together a little bit. For the reference counts, you might have just one lock for all reference counts.
Or you might have multiple locks for the reference counts, splitting them up into bins. But all of these approaches can result in what's called a deadlock; let's talk about that. This, again, is a classic from multi-threaded programming. Say we have two threads, thread 1 and thread 2, and two locks, lock A and lock B. Both threads want to interact with both locks, but they happen to interact with them in the opposite order: thread 1 attempts to grab lock A and then lock B; thread 2 attempts to grab lock B and then lock A. And again, I'm just going to walk through the race condition here. Say thread 1 acquires lock A; nobody's holding lock A right now, so it grabs it; that's fine. Thread 2 attempts to acquire lock B and grabs it; nobody's holding it, so now it's holding it; that's fine. Then thread 1 comes along and says: let's acquire lock B. It can't, because it's already held by another thread. Thread 2 comes along and says: let's acquire lock A; it attempts to grab it, and it can't, because somebody is holding it already. These two threads are now deadlocked: they can never make forward progress. This is a very common problem in multi-threaded programming; it results when you have multiple locks and threads acquire them in opposite orders, and as soon as you have that, you're going to have deadlocks in your multi-threaded program.

So Guido did something that I would say is pretty clever: he added something he called the global interpreter lock, or the GIL. Almost two years to the day after the Python repo was created, the GIL was added; this is checkin number 923, from August 4th, 1992. And this is the really salient bit: this is the global interpreter lock. It's a type_lock, a type he had to add to make it cross-platform, and it's static, so there's only one of it, and it's kind of hidden from the world.
It's called interpreter_lock; it doesn't have the word "global" in its name, but it is the global interpreter lock. And if you want to see what it looks like today, that's it: this is the GIL as it exists in Python 2.7. It's a little different in 3; I'm going to talk about that in a minute. But we added a comment to say "this is the GIL, you found it," and we changed the type of it just a little bit, so it has a PyThread_ prefix, because we added a new thread library.

Now, the rule in Python is: there is one lock, it's called the GIL, and you have to be holding it in order to interact with the CPython interpreter in any way. If you want to run bytecode, if you want to allocate memory, if you want to make basically any C API call, you have to be holding the GIL. This is a really simple rule, really easy to get right.

So let's talk about the ramifications of this design decision. Again, it's very simple, which means it was easy to get right. It was very popular in the early days of Python to write extensions, because there were a lot of C libraries that did a lot of really neat things, and people wanted to call those C libraries from Python; the easy thing was to just write an extension module. The extension module rules were very clear and very easy to get right, and as a result people wrote loads of extension modules, which made Python more popular. I would say that the design decision of the GIL is one of the things that made Python as popular as it is today. So I don't agree with people who say the GIL is terrible and should never have been added; no, absolutely not. Part of the reason that Python is successful is that it had the GIL.

Now, because there's only one lock, obviously you can't have deadlocks. I mean, you could add a second lock in your own program and deadlock with that, yes; but in terms of the C API for Python, it's impossible to deadlock, because there is only one lock.
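The two-lock deadlock that the one-lock design sidesteps can be sketched with Python's threading module. This is a minimal sketch, not code from the talk; the timeouts and a barrier let the demo observe the deadlock deterministically instead of hanging forever:

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
barrier = threading.Barrier(2)
results = {}

def worker(name, first, second):
    first.acquire()
    barrier.wait()          # both threads now hold their first lock
    got_it = second.acquire(timeout=0.2)
    results[name] = got_it
    if got_it:
        second.release()
    barrier.wait()          # keep holding `first` until both attempts finish
    first.release()

# Thread 1 takes A then B; thread 2 takes B then A: the opposite order.
t1 = threading.Thread(target=worker, args=("thread 1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("thread 2", lock_b, lock_a))
t1.start(); t2.start(); t1.join(); t2.join()

# Both acquisition attempts fail: neither thread could make forward progress.
print(results)
```

Because each thread holds its first lock for the entire time the other is trying to acquire it, both attempts time out, every run.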
Now, this does have ramifications for your programs. The classic distinction here is I/O-bound versus CPU-bound programs, and I'm going to go into those in some detail. An I/O-bound program, in effect, is a program that spends most of its time waiting for I/O to happen: waiting to write to a socket, or write to a file, or read from a file, or read from a socket, or talk to the screen. It's waiting. If you have a program that spends most of its time waiting for I/O, then under the GIL your multi-threaded program will work just great, and I'll show you why in a minute. But if you have a CPU-bound program, which is to say you have multiple threads that all want to be computing something: well, if you want to compute something, you have to be holding the GIL, because you have to hold the GIL in order to run bytecode. So if you have three threads that are all CPU-bound, that all want to be computing something, only one of them can be using the Python interpreter at a time, and your program effectively becomes single-threaded. This is the sore point of the GIL; this is what everybody talks about.

So let's explore a little more how the GIL is shared around when you're running I/O-bound code. There are macros provided for C programmers who write extensions, called Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. Again, these are macros: they look like function calls, but they're hiding a whole bunch of code inside. The idea is that Py_BEGIN_ALLOW_THREADS allows other threads to run; it releases the GIL. Py_END_ALLOW_THREADS reacquires the GIL: it says, okay, somebody else may have it, I'm going to wait, and once it's available I'm going to grab it, and now I'm holding the GIL and I can proceed and start talking to the CPython interpreter again. In the middle, you're not allowed to talk to the CPython interpreter, but that's a good time to go off and do some computation that's going to take a long time, like computing some elaborate mathematical result.
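The effect of those macros is visible from pure Python: in CPython, time.sleep brackets its wait with exactly this release/reacquire pair, so sleeping threads overlap. A small timing sketch (the 0.2-second figure is arbitrary):

```python
import threading
import time

def wait_for_io():
    # Stands in for a slow read or write; sleep releases the GIL while waiting.
    time.sleep(0.2)

start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The four 0.2-second waits overlap: total is about 0.2 s, not 0.8 s.
print(f"elapsed: {elapsed:.2f}s")
```

If the GIL were held across the sleeps, the four waits would serialize and the total would be roughly four times longer.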
If you don't need to talk to any Python objects and you don't need to change any reference counts, it's a good idea to drop the GIL in the meantime. Some examples of places where you might drop the GIL: if you're sleeping for a couple of seconds, you don't need to hold the GIL, and you can allow other threads to run. If you're going to write to a file, or read from a file, that's a good time to drop the GIL. If you're going to read from or write to a socket, that's also a good time to drop the GIL. Python extensions and the Python standard library are very good about dropping the GIL when they don't need it. So this is what allows multi-threaded I/O-bound programs to run successfully. But sharing the GIL in CPU-bound code just doesn't work, because, again, only one thread can hold the GIL at a time, so only one thread can run bytecode at a time, and everybody else has to wait: your program has effectively degraded to a single-threaded program.

Now, that's how it was supposed to work, but actually it didn't even work all that well. There's a researcher and Python enthusiast named David Beazley, and in 2009 he did some research and published his results as a Chicago-area Python user group talk called "Understanding the Python GIL." He showed that CPU-bound multi-threaded code could run slower than if you had just made it single-threaded in the first place, and that it could get slower still if you had multiple cores. He also showed that CPU-bound threads could starve I/O-bound threads. These are all terrible things. So, for example, here's one of the diagrams from his talk. There's this knob inside Python, sys.setcheckinterval and sys.getcheckinterval. The check interval is a number that sets how fine-grained the swapping between threads is supposed to be when they're running Python bytecode.
If you have code that wants to run and compute things inside Python, you're really supposed to release the GIL now and then and give somebody else a chance to run, and we enforce this inside CPython with the check interval: the number of bytecode instructions you're supposed to run before you stop and give somebody else a chance. It's set to 100 in Python 2.7, so you're supposed to run 100 instructions and then release the GIL and let somebody else run; they can run 100 instructions, and then maybe you grab it again and run for 100 more. What this diagram is showing us is that the number is not 100. The top row is thread 1, the bottom is thread 2; red means "I want to run, but I can't, because somebody else has the GIL," green means "I'm holding the GIL and running right now," and the X axis is time. You can see we've identified this one interval as 66,700 ticks: 667 times longer than the thread should have been allowed to run. Something terrible is going on here.

But the terribleness doesn't end there. Let's talk about what happens if we have two threads, one I/O-bound and the other CPU-bound. This is a classic computer scheduling scenario. An I/O-bound task tends to run for just a tiny little bit, and then it waits on something really slow, like writing to the disk or writing to a socket; it's going to be sleeping for a long time. So if you have one I/O-bound thread and one CPU-bound thread, the CPU-bound thread is never happy: it always wants to run, and it will run for as long as you give it time. What you want to do in a scheduler is prefer the I/O-bound thread: you can keep it happy by giving it just a tiny little bit of CPU, and the CPU-bound one will never be happy anyway, so you might as well make at least one of them happy. So in scheduling, you prefer to run I/O-bound threads over CPU-bound threads. What was happening in Python was the opposite. In this diagram, thread 1 is I/O-bound: white means "I'm waiting on I/O, I don't care," red means "I am waiting to get the GIL," and you can't even see the green where it briefly holds the GIL, does its work, and drops it again. Thread 2 is CPU-bound, and it's just holding the GIL and running forever and ever and ever, and it never gives the GIL back; it never allows the red one to run. What's going on here?

This is what was going on, inside the Python interpreter. This is the code that was supposed to release the GIL and then get it back later, to let other threads have a chance to run. If interpreter_lock is set, meaning we're running in multi-threaded mode, then there's this call, PyThread_release_lock: this is the thing that releases the GIL, the idea being to give another thread a chance; we have that comment up at the top, "other threads may run now." And then we say PyThread_acquire_lock: we're going to reacquire the GIL and run again. The fundamental problem with this code is that there's no code in between. Nothing happens between those two calls: the release happens, and then mere nanoseconds later the reacquire happens. So the running thread would release the GIL and immediately reacquire it, release it and reacquire it, and the other threads never really got a chance. 66,700 ticks later, the other thread had been trying to grab the GIL the whole time, but we kept releasing it and instantly reacquiring it; only every so often would another thread happen to win the race and grab the GIL away from us, and that was the only thing that allowed other threads to run at all. Looking back on it, this was kind of a dumb design; it wasn't really working at all.

So in 3.2, a guy named Antoine Pitrou added something called the new GIL. This was done in 2009, for Python 3.2, and what it did was add a single new variable, plus some mechanisms around it: a flag called gil_drop_request.
This flag says: hey, somebody else is waiting; the next time you drop and reacquire the GIL, you have to make sure the other thread gets to run before you reacquire it. We added this, and most of the really bad behavior went away. David Beazley says he can still torture the GIL and get it to behave badly, but it's a lot harder to do now, so this is working pretty well. Fixing it any further than this, fixing the crazy behavior David Beazley can still provoke, would require even more elaborate machinery around it, and at that point we're talking about scheduling theory. Computer scientists have been working on schedulers for decades, and we still don't have a really good one; I think it's kind of a fool's errand to try to make it any more elaborate than it already is. I think the new GIL is good enough, and we should just leave it alone, which is what we've been doing.

Meanwhile, the world has kind of changed around us. As I said, Python was invented in 1990 and the GIL was added in 1992, and back then all of our computers were single-core, more or less. But something happened in 2005. Before then, you really didn't see multi-core computers: you had to buy two separate CPUs, and you had to get a special motherboard that supported both of them. I did that in the 90s, but I was kind of a weirdo; most people didn't have that until about 2005. 2005 is when a single CPU would actually have multiple cores on it: server CPUs, desktop CPUs, and game console CPUs all became multi-core in 2005. In 2007, laptops went multi-core. In 2011, the CPUs used in tablets and phones, and in embedded systems-on-a-chip, went multi-core. In 2013, eyeglasses went multi-core; in 2014, wristwatches went multi-core. Ladies and gentlemen, we live in a multi-core world now, and Python is kind of ill-prepared to take advantage of it, because, as I say, if you have multiple threads that are all CPU-bound, you're in a world of hurt.
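As an aside on the new GIL: the tick-based check interval described earlier was retired along with the old design; from Python 3.2 onward, threads switch on a time slice instead, which you can inspect and tune (a minimal sketch; the default value is a CPython implementation detail):

```python
import sys

# The new GIL's time slice: how long a thread may keep the GIL while
# another thread is waiting. The default is 5 milliseconds.
print(sys.getswitchinterval())

sys.setswitchinterval(0.001)   # ask for finer-grained switching
print(sys.getswitchinterval())
sys.setswitchinterval(0.005)   # restore the default
```

Tuning this is rarely a good idea, which matches the talk's advice to leave the new GIL alone.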
Now, it's not that Guido hasn't heard about this. In 2007, Guido wrote a post on his old Artima blog where he addressed getting rid of the GIL; people had been talking about it for, at this point, decades. What he said was: I'd welcome a set of patches into Py3k (this was back before it was officially called Python 3, so, Python 3000), but only if the performance for a single-threaded program, and for a multi-threaded but I/O-bound program, does not decrease. This is a very difficult bar to reach; in fact, that was more than eight years ago now, and nobody has been able to meet it. I don't know if we're ever going to meet this bar; I think maybe we're going to have to change our stance just a little bit.

There have been some attempts in the past to get rid of the GIL. The first one that I know about, anyway, was called the free threading patch; it was written against Python 1.4. What it did is it didn't require any changes to the C API, so it didn't technically break any of the C extensions, at least not at the API level. It took all those global variables and put them in a structure, which is kind of like taking all of your module-level attributes and putting them in a class: now you can have more than one of them, and you need a reference to the instance in order to talk to those variables. So you could have multiples, and you could have multiple threads talking to individual instances, and they wouldn't stomp on each other. That would get you a long way. The real problem was reference counting, and the patch's solution for that was a single lock which you had to acquire in order to change any reference count: you acquire it, change the reference count, and drop it, very quickly. But people were constantly grabbing and releasing this lock, so much so that the lock itself caused a really big performance degradation.
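That single-lock scheme is easy to picture in a toy form (invented names, a sketch only): correctness is preserved, but every single "refcount" change now pays for a lock round-trip.

```python
import threading

refcount_lock = threading.Lock()   # the patch's one lock for all refcounts
refcount = 0

def incref_many(n):
    global refcount
    for _ in range(n):
        with refcount_lock:        # serialize every single refcount change
            refcount += 1

threads = [threading.Thread(target=incref_many, args=(50_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No updates are lost, but we paid for 200,000 lock acquire/release pairs.
print(refcount)
```

The result is always exact, unlike the unlocked race shown earlier; the cost is all that lock traffic on a hot path.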
As a result, Python was between four and seven times slower. That's according to David Beazley, who revived this patch in 2011 to experiment and report on how well it worked, which is to say: not very well.

A couple of years ago, Antoine Pitrou did an experiment where he played with atomic incr and decr. Modern CPUs have these atomic instructions, atomic test-and-set: a way of incrementing and decrementing a number out in memory somewhere that is guaranteed to be an atomic operation. You can't have a race, because nobody else can see the changes until they've already happened, kind of like a database transaction. Atomic incr and decr are supported by basically all the CPUs that Python cares about these days. In Antoine's experiment, he didn't have to change the API in order for it to work, and he found that it was about 30% slower, which is not the zero percent slower that Guido demanded, so the approach was kind of abandoned; but it did work.

Now, I will point out to you that CPython is not the only Python interpreter out there in the world. There are four that are pretty famous and pretty well used: CPython, Jython, IronPython, and PyPy. Of those four interpreters, CPython is the only one to use reference counting; the others use what's called garbage collection. And CPython is really the only one that has a GIL. PyPy kind of has a GIL; my understanding is that it's used during garbage collection cycles. But Jython and IronPython don't have any GIL whatsoever. It's perfectly possible to have a working Python interpreter without any sort of GIL; it works fine, it's just technically very demanding. They're using what I would call pure garbage collection, which is another approach to managing the lifetime of objects.
Very quickly: instead of keeping a reference count, so that you know how many people are holding references to an object, every so often (this is the easy version) you stop the world. You stop all computation and examine all of your structures in memory, all of your objects, and all the ones that aren't live, all the ones that aren't connected to your running program, you say: okay, that's unused, and you collect it.

Pure garbage collection would work in Python, obviously, but it would require massive API changes: it would break every C extension out there, and I cannot overstate how important it is not to break all the C extensions. C extensions are a very important part of the lifeblood of CPython. If we break all the C extensions, it's kind of like starting over; I mean, Python 3 broke all the C extensions, and look how slow Python 3 adoption has been. Would it be slower or faster than CPython with reference counting? It's hard to say; we wouldn't really know until we did it. Conventional wisdom about garbage collection is that it's about the same speed as reference counting, so I'm hopeful that it would at least be fast enough. But I think breaking all the C extensions out there is something we can't really afford to do.

But something else to consider: again, CPython is the only one of the four that doesn't have pure garbage collection, but CPython is also the only one that has a C API. It's interesting to consider that all the modern languages, all the modern virtual machines, don't have C APIs; they don't allow you to poke at the running interpreter the way CPython does. If you want to write an extension, you instead write it from inside the language, using something like ctypes or CFFI. Why might you want that? Well, you might want to change how the garbage collector works. During PyPy's lifetime, it started out with simple garbage collection, then added incremental garbage collection, then generational garbage collection (or maybe I have those in the opposite order).
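The stop-the-world scheme described above can be sketched as a toy mark-and-sweep pass. All the names here are invented for illustration; real collectors are vastly more sophisticated:

```python
class Obj:
    """A toy heap object: a name plus outgoing references."""
    def __init__(self, name):
        self.name = name
        self.refs = []       # objects this object points at
        self.marked = False

def mark(obj):
    # Mark phase: recursively flag everything reachable from a root.
    if not obj.marked:
        obj.marked = True
        for child in obj.refs:
            mark(child)

def collect(heap, roots):
    # "Stop the world": clear all marks, trace from the roots, then sweep.
    for obj in heap:
        obj.marked = False
    for root in roots:
        mark(root)
    live = [o for o in heap if o.marked]
    garbage = [o for o in heap if not o.marked]   # unreachable: reclaim these
    return live, garbage

a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b)             # a -> b; c is unreachable from the roots
live, garbage = collect([a, b, c], roots=[a])
print([o.name for o in garbage])   # ['c']
```

Note that no object carries a reference count at all; liveness is decided entirely by reachability from the roots, which is why this approach needs no Py_INCREF/Py_DECREF-style bookkeeping in extension code.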
But my point is that they have changed their garbage collection. Well, if you change garbage collection, you're changing how object lifetime works, and if you change how object lifetime works, that means changing the C API, because managing object lifetimes is a very important part of talking to the internals of the interpreter like that. So if PyPy had had a C API, they would have broken it, twice: irrevocably broken it, for everybody. So I would suggest that if you have garbage collection, you kind of want to avoid having a C API anyway; in fact, you have to wonder about having a C API at all, precisely so that you don't have to expose your garbage collection, or your reference counting, to the world.

So, in my opinion, if we want to talk about getting rid of the GIL: exposing a C API to the world means exposing memory management to the world, and we've already done it. We have reference counting, and we can't change it without breaking every C extension, which means we can't afford to change it. So the only thing we can do is something that doesn't change the C API, and I think atomic incr and decr is probably the only thing that would work. Thank you.

Q: It sounds strange to me; has anyone tried to do a thread-local reference count? For example, just keep a counter per thread, and only when one gets back to zero, maybe reach out to find out whether other threads hold a different value for it.

A: You want to store the reference count of an object in thread-local storage?

Q: Yes.

A: So if you had an object that was shared between threads, each thread would have its own reference count for the object?

Q: Yes.

A: So then, if one thread's reference count reaches zero, we release the object's memory, and the other thread is still holding it...

Q: No: if you reach zero, maybe you have an external garbage collector find out whether the values from all threads are zero, and then you release it.
question: no, I don't think anybody has tried that experiment with CPython. There are various approaches people have proposed for moving the reference count out of the object and into some other storage somewhere. I think Mark Shannon (who skipped this because he's seen this talk before) has talked about a way of sort of taking reference count changes and just writing them down somewhere, kind of like a transaction log, and only computing them in little bursts now and then. So instead of having to hold the GIL constantly and changing the reference counts constantly, you just do it as a sort of batch processing step later, and you would free all the objects then. So there are ways to alter the reference counting approach that could get rid of the GIL and still preserve reference counting, doing something besides atomic incr and decr. I really have no idea how well they would work; I haven't read the research about this. Mark says, yeah, it could work, so I trust him. Thank you.

[Audience] Can I ask what's your opinion on software transactional memory?

[Larry] Yes, that's quite interesting. Very interesting, yes. So there's a research branch of PyPy called PyPy-STM. Armin Rigo, who, when Armin Rigo walks into a room, he's the smartest man in the room, is working on this. Software transactional memory is kind of like taking transactions, like database transactions, and using them for memory. What it effectively means is: if you have a thread that's running, and it says, I'm going to read this value from memory, and I'm going to write this value to memory, and I'm going to write this value to memory, and now I say, commit the transaction, then the rest of the world doesn't see any of those changes to memory until I have done the commit. So if you have two threads that are both doing work, they both might be attempting to write to memory, and then they say commit; the changes don't actually happen out in the world until I say commit. And if two threads make commits that collide, then you roll it back, and you start over again and do it over again, just like databases.

He's gotten this to kind of work. It's very complicated. So to answer your question, what do I think about it: I think it's very interesting, and I think it's very complicated. One of the things that's really great about CPython is that it's very easy to work on internally. There's nothing that's terribly complicated; the code is very easy to read; it's conceptually very easy to understand, because there's nothing terribly complicated going on inside. Compare that with PyPy, which is bafflingly complicated internally, and then STM makes that way, way worse. So I talked to Armin and asked, would it be worthwhile? His STM support has two pieces: there's a C library called stmgc-c7, or something like that, where the 7 means the seventh time he's started over from scratch, throwing away the old one and starting with a blank piece of paper (he's working on 8, because he's doing it again), and then the PyPy variant uses that library. I asked, would it be worth trying to port that library and use it from inside CPython? He said: don't do it. It is so complicated, so maddeningly difficult to get right, that he's using code generation to do it inside PyPy, and he can't get it right even from a code generation perspective, let alone expecting C programmers to. And I made the point that reference counting is simple, and people still get it wrong; expecting human programmers to get the PyPy STM interface correct is simply impossible. It's not going to happen. So CPython adopting STM, as long as STM is as complicated as it is, which I think is going to remain the case, is not something I think CPython should do.

[Audience] Just one more comment on that: apart from being complicated, it might well be slower than the CAS solution, the hardware atomics...

[Larry] Yeah, well, you know, we won't know until we try, but the theory about STM
is that it should have almost no overhead whatsoever. I mean, you need a lot of memory for the transaction log, and so that's adding a little bit of overhead for tracking the log, but the actual overhead is very small, and so you're getting almost linear scaling as you add more cores, as long as you have a reasonably parallelizable problem. So STM, the whole point of it is that it doesn't add very much overhead and it doesn't slow down your program very much, but it's conceptually very difficult and very hard to get right. Those are really the major critiques of STM as far as I'm concerned: it's not that it would make things slower; it would perform wonderfully. Any other hands raised? I think I'm keeping you from lunch. I am.

[Audience] You mentioned on the first slide, about the atomic increment and decrement, that it was 30% slower. Was that just because it was the first draft of the implementation, or do you think this is probably a good way to go forward?

[Larry] No, it's 30% slower because of the overhead of the atomicness itself, as opposed to just doing a plain incr and decr. Making it an atomic incr and decr instead of just an incr and decr, that's the thing that's 30% slower. I don't know that much about how modern CPUs work internally, but modern CPUs have these enormous pipelines where they're doing all this work on the theory that it may work out in the future, and my guess is that an atomic incr and decr basically causes everybody to spill their pipelines and start over from scratch, so you're throwing away prospective work that you were trying to do, and there's a little bit of synchronization internally. My guess is also that an atomic incr and decr maybe sort of stops the other cores on the CPU from running for just a split second, and again, that's going to be overhead that's going to slow you down. But again, this is all me making stuff up; I don't really know.

[Audience] Now that we know more about the GIL and how it works, what do you recommend for new and old Python developers alike, for the best threading experience, to get your parallelism right? What do you recommend, in your experience?

[Larry] What do I recommend for whom? I don't understand the question.

[Audience] Like you said, there are different implementations of Python. Going forward, if you were starting today...

[Larry] If you were going to write a new language today, what do I recommend you do? Oh, that's very clear. All of the new work that happens in dynamic languages is all using pure garbage collection, and I think that absolutely the way forward is: use pure garbage collection, don't have a C API, and force people to write extension modules, quote unquote, in your native language, using something like CFFI or ctypes, where you're calling out from the language into the C thing. That's what everybody's doing now; that's what Rust does, that's what Go does. Nobody has a C API, and they're all using garbage collection, and I'm not going to argue with those people, because they're really smart.
Info
Channel: AlphaVideoIreland
Views: 29,327
Rating: 4.9689522 out of 5
Keywords: PyCon (Conference Series), Programming Language (Software Genre), Radisson Blu Hotel Dublin, Conference Video, Python Programming Language, Larry Hastings
Id: KVKufdTphKs
Length: 36min 47sec (2207 seconds)
Published: Sat Jan 30 2016