Thinking about Concurrency, Raymond Hettinger, Python core developer

Reddit Comments

The man is made for giving talks

πŸ‘οΈŽ︎ 24 πŸ‘€οΈŽ︎ u/jeewantha πŸ“…οΈŽ︎ Jul 27 2016 πŸ—«︎ replies

Does anybody know where to find the sphinx pages Raymond used as slides for the presentation?

πŸ‘οΈŽ︎ 7 πŸ‘€οΈŽ︎ u/jollybobbyroger πŸ“…οΈŽ︎ Jul 27 2016 πŸ—«︎ replies

He also gave a pretty long (2.5-hour) workshop on a variety of topics (descriptors, context managers, etc.), but it's not available on YouTube. That's a shame. His talks and workshops are always worth attending. The guy is a legend.

PS: Snapped a picture w/ him after the workshop. It now helps me recharge my Python powers when I feel especially worthless.

πŸ‘οΈŽ︎ 4 πŸ‘€οΈŽ︎ u/massivewurstel πŸ“…οΈŽ︎ Jul 27 2016 πŸ—«︎ replies

Every multi-threaded program has a race condition. If it didn't, you didn't really need threading to begin with.

What?

πŸ‘οΈŽ︎ 6 πŸ‘€οΈŽ︎ u/jmoiron πŸ“…οΈŽ︎ Jul 27 2016 πŸ—«︎ replies

Latecomer here, I'm a bit confused. Why even multithread then? Asyncio solves the problem of blocking. After blocking, what's the point of multithreading if the GIL is going to lock it to a single thread?

Educate me.

πŸ‘οΈŽ︎ 2 πŸ‘€οΈŽ︎ u/zsxawerdu πŸ“…οΈŽ︎ Jul 27 2016 πŸ—«︎ replies

Good talk, but I don't understand not covering concurrent.futures or asyncio at all, at least at a high level.

πŸ‘οΈŽ︎ 1 πŸ‘€οΈŽ︎ u/loganekz πŸ“…οΈŽ︎ Jul 27 2016 πŸ—«︎ replies

I have started appreciating golang a lot after perusing the video.

πŸ‘οΈŽ︎ 1 πŸ‘€οΈŽ︎ u/ramrar πŸ“…οΈŽ︎ Dec 23 2016 πŸ—«︎ replies

just cool

πŸ‘οΈŽ︎ 1 πŸ‘€οΈŽ︎ u/mechezawad πŸ“…οΈŽ︎ Jan 01 2017 πŸ—«︎ replies
Captions
Thank you for coming, and thank you for inviting me and my family here. This is really a wonderful trip and a dream come true, so it's a real pleasure to speak to you all today about concurrency. I want to do two different talks today, one on threading and one on multiprocessing, and I'm going to do them concurrently: I'll just say them at the same time and there will be no confusion. All right, fair enough. I like to take advantage of opportunities to be on stage to conduct surveys, so please participate in my quick survey. How many of you have tried Python 3? How many of you like Python 3? Who wants the hard question now: how many of you are using Python 3 in production, or plan to in the next six months? Oh, thank goodness. A year ago, when I'd ask that question, we'd get only one or two hands, and we were a little worried, because we're coming up on Python 3.6. How many of you are using Python 3.6 right now? Oh, so it'll only be me. I just built a fresh one this morning for you guys, so we'll be using that today. How many of you follow me on Twitter? How many of you will follow me on Twitter if I zoom in on this part right here? I'll know if I've got 200 new followers by tomorrow. I teach Python through Twitter. I don't tweet when I travel to interesting places; I only tweet about how to use Python effectively and what's going on in the Python community, so please take advantage of that feed. How many of you use threading in Python? How many of you use concurrent.futures? How many of you use the multiprocessing module? How many of you use the subprocess module? My work is done here. All right, so I'm going to give you a couple of very simple examples, and because they're so simple, and because you're all experienced with these topics, I think we'll have no problems at all with them and we'll be able to go right through them. How many of you have ever used Sphinx as a tool to generate your slides for a talk?
It's fantastic. So what is our goal here? To roll up through a couple of examples of threading and multiprocessing, although we have to ask ourselves the question: why do you want concurrency to begin with? One good reason is that you want to improve the perceived user responsiveness of your system. Another: how many of you want speed out of concurrency? That's good, because concurrency can take away your speed as well as add to it. Sometimes you can get some advantage from additional cores; sometimes concurrency can make your code go slower. That's an unpleasant effect. Another reason to explore concurrency, though, and a theme of this talk, is that it's how the real world works. We think of it as a computer programming concept, but in fact the concept is very large and extends into project management and dealings with people. My wife is a computer programmer, and she programs the scheduling for the construction of satellites. In the construction of satellites there are over a hundred thousand discrete tasks, and those tasks are people being coordinated: this team is designing the main bus, this team is designing the solar power cells, and pretty much all the concurrency primitives that we use in a computer are also used in project management. These two teams each have to finish before the two parts can be assembled together; that's a simple thread join. Sometimes there's a room called the shaker room: you put the satellite in it and you shake it, to see if it falls apart before you launch it. You can't have two satellites in the shaker room at the same time. You need mutual exclusion; you need locks. People communicate with each other through atomic message queues, I mean email, and it works exactly the same way. So the analogues that we're talking about here actually go much deeper than Python; they go into the real world, and they transcend programming languages. Fair enough. Oh, how many of you have heard of
Alex Martelli? OK, so he is a rock star in the Python community, author of the Python Cookbook and Python in a Nutshell, and he communicated to me this really interesting idea about scalability, which reflects his Google proclivities. He said there are three kinds of programs. There are single-threaded, single-process programs that take advantage of one core. There are multi-threaded and multi-process programs that take advantage of the two to eight cores on your system. And then, when you need more than eight cores, you switch techniques to distributed processing. Here's his interesting observation: in our world, something is changing. The single core is getting more and more powerful; in the time that I've been computing, well over ten thousand times more powerful than what I started with. So the range of problems that can be solved in the first category is much, much larger than it has ever been before, and that makes the second category a little bit less relevant. But also encroaching into it is the third category. It used to be that our definitions of big data were very small; now our definitions of big data are very, very large. If you go to a big data conference, you'd better be saying the word "peta" or it's not really big data; if it fits on your machine, it's not really big data. Alex is suggesting that these problems are becoming more important and more prevalent, and that we're going to have to resort to distributed techniques sooner. A consequence of this, he suggests, is that for a lot of things a lot of people want to do, this middle section has become less and less relevant over time. Now, fighting this trend is that we're getting more and more cores on our systems, and it's really unpleasant to only get one-eighth of the power of your system. So we will talk about this middle section. How many of you have heard of the global interpreter lock? How many of you like the global interpreter lock? Really? Larry Hastings seeks to get rid of it. I wish him good luck. Many great men and women have tried before him and have determined that the global interpreter lock improves performance rather than hurts performance. We will see if he can make it disappear; we will make him a saint, and we'll build a church to him with onion domes on top. The likelihood is not particularly high. So what's the effect of the GIL? The effect of the GIL is that no more than one thread can run at a time. That means threading is really great for I/O-bound applications in Python, so for web servers and whatnot, multithreading mostly works fine. However, if you have CPU-bound applications, one of the most important things you can know is this: if you add threading to a CPU-bound application, will it go faster or slower? Slower. OK. An amazing number of teams decide they want speed and decide to add threading. How well does that work out for them? Not well at all. So I'm going to interlace my talk with lots of little tidbits, things that sound like small sound bites but could actually save you enormous amounts of time. Now, a note about my ex-girlfriend. You guys want to talk about her, right? She was in human resources, and what do human resources people do? They don't program computers; they program people, and they have hacks too. One of the things they do is the Jedi mind trick, and here was one of her Jedi mind tricks. She would say to me, "Raymond, your weakness is your strength, and your strength is your weakness." This would confuse me, and I'd say: this is not actionable. What do I do, get stronger? Get weaker? I don't get it. What is the strength of threads? It's shared state. It means that threads can communicate very, very quickly. What's the weakness of threads? It's on the screen; I just said the answer; you know it. What's the weakness of threads? Shared state, which means that you're going to have race conditions. In fact, it's a little bit of an overstatement, but every multi-threaded program has a race condition, because if it didn't, you didn't really need threading to begin with: you're not
taking advantage of its strengths. What about the strength of processes? It's that they are independent of each other; they don't have shared state, and that makes them a lot easier to work with. What's the weakness of processes? That they lack communication and shared state, hence the need for inter-process communication to move objects between them: you have to pickle, and there is other overhead. In the multiprocessing module we hide a lot of these details from you. Are they important anyway, even if they're hidden? Absolutely they're important. When you're using multiprocessing, you have to be aware that if you move a lot of data, you're pickling it through IPC, and there's a tremendous amount of overhead. If you're using a thread pool, you have to be aware that the good news is you've got shared state, and the bad news is you've got the potential for race conditions, and the GIL keeps you from using multiple cores. Fair enough. All right, let's start with some really complex code: two simple examples. Let's see if I can challenge your Python skills. How's this for some sophisticated Python code? Are you impressed? It takes a counter, it prints "starting up", it loops ten times, it increments the counter, prints the value of the counter, and prints "finishing up". Are you impressed? Yes, I can teach people that much Python. By the way, I teach Python for a living, so the answer to the question that was asked of you before is yes: education is the answer; people can be taught to code very well. In fact, I try to make my classes not at all about the syntax but about the craft of using the language well, and so people can write this code in their first few hours of using Python. Do you agree that this code is easy and doesn't take long to write? The whole thing is only seven lines of code. I know what you're thinking: "Raymond, how could we throw away everything that Nathaniel has just taught us and make this hard?" Why, yes, we can do exactly that.
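The slides themselves aren't reproduced in this transcript, but the seven-line script being described looks roughly like this (a reconstruction from the spoken description, not Raymond's actual slide):

```python
# A minimal sketch of the single-threaded counter script described above
# (reconstructed from the talk's description; not the actual slide).
counter = 0

print('Starting up')
for i in range(10):
    counter += 1
    print('The count is', counter)
print('Finishing up')
```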
We can expand it to sixty lines of code and dramatically improve, well, change it. OK, now here's another really powerful piece of code that I can teach people to write on the first day. It has a list of websites; we loop over the websites, open each website, read the web page, and print out the URL and the length of the page. Yes, I do know that you can look at the Content-Length header and not have to read the entire page, but that's not the point we're trying to demonstrate here. So this loops over the sites and tells you the size of the home pages, some of which are very surprising. Some are really compact, only 18K or 10K, and then there are other pages that are shockingly 500K, just enormous web pages, and apparently they don't care about response time. I know what you're thinking: can we throw away what Nathaniel taught us and make this code hard? After all, it's only four lines. Can we make it complex? The answer is yes. Would you like to proceed? All right, fair enough. Threading. I should actually start my timer, because I have a lot of things to say about threading and only a little time to say it. This is the scripting style that we just showed. What's great about it is that it's simple and clear, but it also corresponds to the way I teach people to write code. I teach them to write with global variables, running it top to bottom: type a few lines, run it; type a few lines, run it. That incremental style of development is very quick. It lets you concretely see your results, it lets you test as you go, and people can reliably knock out code even if they have very little programming background. Is this a pretty useful style? In fact it is. Now, once you've got this, you can always move things into functions, but let's take a look at the output. The obvious output is that it starts up, counts to ten, and says "finishing up". So, one important note for multithreading, the most important principle: get your app tested and debugged in single-threaded mode first, before you start threading.
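In that spirit, the four-line URL loop described earlier, in its single-threaded form, might look like the sketch below. The site list here is illustrative (the talk's actual list isn't shown in the transcript), and the fetch function is injectable so the logic can be exercised without network access:

```python
from urllib.request import urlopen

# Illustrative site list -- the talk's actual list isn't in the transcript.
sites = [
    'https://www.python.org/',
    'https://www.example.com/',
]

def page_sizes(urls, fetch=None):
    """Return (url, page_length) pairs for each URL.

    fetch is injectable so the loop can be tested without a network.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    return [(url, len(fetch(url))) for url in urls]

# Uncomment to run against the live sites (requires network access):
# for url, size in page_sizes(sites):
#     print(url, size)
```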
Threading and concurrency never make problems easier; as soon as you add them, you've added a whole new layer of complexity. Plus, wasn't it nice to run our code and get it all tested? Doesn't testing make you feel good? We ran the code and produced the desired output. Oh, you like testing? I see how it is. Suckers. OK, here we go. The next step in the evolution is to move into functions. I teach people, after they've written that code, to factor out the reusable components. In this case we've got a reusable component, a worker, a unit of thought that says: this worker's job is to increment the counter. And then, to drive it (to keep the slide simple I didn't put this in a main section), once you have your reusable components, you have your testing components at the bottom that use them. This code produces the same answer. I know, because I ran it and saw exactly the same answer. I've made a little refactoring, and I've tested as I go. You like my methodology? OK, I'm not going to lead you into any bad places if you trust me. All right. So: do test your app before getting into multithreading. Multithreading is amazingly easy to add to code. All we have to do is import threading, and instead of calling our worker function directly, we launch it in a thread. So now we have my main thread running, and we start up the worker thread. The total amount of change to this code is one line for the import and one line to launch the thread. Easy-peasy. Any questions? OK, shouldn't we test our code? Right, so we'll test the code to prove it's correct. You are all using Python 3.6, correct? OK. So I ran Python 3.6 on the threaded version; in fact, let's just go run it live. There you go: tested. The code is beautiful, factored, and tested. Ready to check in? No? Why, what's wrong with it? It's broken. It's broken, but it passes the test. Can you spot the race conditions in the code? OK, tell me what race condition you see.
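For reference, the threaded version under discussion might look like this sketch (a reconstruction, not the actual slide). Note that with a single worker thread the output is still deterministic, which is exactly why the test passes even though the code is wrong:

```python
import threading

counter = 0

def worker():
    """Increment the shared counter -- the read/modify/write is unprotected."""
    global counter
    for i in range(10):
        counter += 1
        print('The count is', counter)

print('Starting up')
t = threading.Thread(target=worker)   # one line to launch the thread
t.start()
t.join()
print('Finishing up')
```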
The global counter? Did you know that everybody sees that? Even on the first day of Python, everyone sees it: one thread could look up the value of the counter, another thread could look up the same value of the counter, both of them increment the same base value, both of them write out the same incremented value, and two workers have run but the counter is only incremented one time. Everybody sees that, which is interesting, because it almost never shows up in real code. Even though there is a race condition here, this happens so fast that you could possibly run trillions of tests and never have this problem show up. In fact, you could ship this as production code and run it lots of times, and the problem might never show up in your lifetime. So even though you spotted a problem, you spotted a problem that is probably not that important to you. I say fix it anyway, but it is in fact a problem. What most people don't see is that printing is a resource as well, and in fact there is a race condition there too. But we tested the code, and it worked just fine. Should you trust testing? Never trust testing when it comes to multi-threaded code. It's useful, but many interesting race conditions never manifest themselves in test environments. They manifest themselves under load, under abnormal conditions, and in ways that are really hard to reproduce, creating heisenbugs. Who's heard of a heisenbug before? You've heard of the Heisenberg uncertainty principle; the idea is that if you're looking at the bug, the act of looking at it causes it to change its behavior toward no longer being buggy. In particular, if you run code like this through a debugger, you will never see the race condition, because the debugger interferes with all of the race cases. So there's a technique that can be used to amplify the race condition, and it's called fuzzing. It's an easy thing to add: I put in a sleep for a random amount of time. This is not a perfect technique, but it is a decent technique.
In between each step and each operation, I add a little bit of fuzz: getting the old value of the counter, fuzz; the increment, fuzz; the print. Interestingly, the print has two separate steps, the print of the string that you wanted and the print of the newline, so I'm separating those two. And so I just added a little fuzz; otherwise this is exactly the same code. Here's the output of the first run; in fact, I'll just do it live. Oh, are you a little displeased now? This is exactly the same code, and this type of output could have happened in one of your production runs, under load, when interesting other things are going on in the processor, in a non-reproducible environment. Are you convinced that fuzzing is good, to help you leverage testing to find errors? I'm convinced that it is helpful. On the other hand, we've also learned not to rely on testing, and it suggests that this problem is harder than it looks. If you've been convinced that the problem is harder than it looks, I've already done a large part of my job. When I teach multithreading and multiprocessing, the first thing to teach is fear and respect. It is a solvable problem, a winnable problem, but it is a problem: it needs to be respected, and it needs to be feared. Are any of you pilots? I can fly an airplane. It's pretty easy: you get in the airplane, turn on the engine, aim it in the direction that you want to go. You look out the window for other airplanes, and if you see them, you don't aim for them; you try not to hit them. A simple thing. I know what you're thinking: what can we do to make this more complicated? Fly in a cloud. Now, once you're in the cloud and you can't see outside, the other airplanes have to tell you where they think they are, and you have to tell them where you think you are. Imagine, when you're driving to work one day, you paint the windows of your car black, you get on a radio and say, "I'm about to go through this intersection, don't be there when I get there," and somebody says, "Well, I don't think I'm in that intersection, go ahead
and go for it." Do you feel safe? In fact, this is a dangerous thing. Every pilot who attempted it at the outset flew into a cloud, and their life expectancy was about 30 seconds, and I'm not kidding: 30 seconds is the life expectancy of a pilot not trained for instrument flight conditions who hits a cloud. It's very hard to keep the airplane upright; your physical sensations will lie to you. You will be going down when you think you're going up, and so you fall out of the sky and die. Like multithreading. In fact, though, you can be trained to fly in the clouds. The first person to do this was Jimmy Doolittle, and Doolittle surprised a bunch of generals by flying into an airfield when it was completely foggy. He knew it was going to be completely foggy, and he was demonstrating the first instrument flight. It turns out you can be trained to fly successfully in clouds. Can you be trained to successfully fly in multi-threaded conditions? Yes, you can. I'm about to teach you. I don't have time to teach you in depth, but I will give you all of the examples and the bullet points and go through them fairly quickly. But remember, there are two kinds of pilots: those who've been trained to go methodically, with discipline and a careful approach, into the clouds, and those who have a life expectancy of 30 seconds. When you get your flight rating, you're called a visual flight rules pilot, VFR, for the first rating, and an IFR pilot for the second rating, and VFR pilots can only fly on clear days. Are you a VFR multi-threaded Java programmer? This picture was taken at Mozilla. This is a fellow named David Baron at his office in San Francisco. He has a stand-up desk there, and he's actually a quite tall fellow, not quite two meters tall, but up there, about right here, see, a big guy; yes, he's very tall. And up above him is this little sign. I think every office should have that sign. It says: you can write multi-threaded code; you just need to be a little taller than you
are now. So let's stretch you out. This will be more careful threading. There are a couple of approaches: one is you can use locks, and another is you can use atomic message queues. I'm going to show you the one that I favor first. Which one do I like? Atomic message queues. Don't use locks. Locks are great if you're writing an operating system. How many of you write operating systems? I didn't think so, because you wouldn't be using Python. So locks are great for implementing OSes, but for anything higher-level than that, for real applications, people don't think in terms of locks. They're amazingly difficult to reason about, and so we want a higher-level primitive; we'd like to have something that we can relate to. Atomic message queues. How many of you have an email account? How many of you have an atomic message queue? There's a hundred-percent overlap; I would think of it in that regard. I'm saying "atomic message queue" because in fact we have a Queue object in Python, but you can use RabbitMQ, ZeroMQ, ActiveMQ, email accounts, or databases with locks: almost anything that lets you communicate atomically will work. How many of you have heard of RFCs before? OK, how many of you have heard of RRs? Those are Raymond rules. OK, so Raymond rule 1000: all resources shall be run in exactly one thread, and all communication with that thread shall be done through an atomic message queue. What kind of resources need this technique? Pretty much every resource that is shared: global variables, input from the user, output, files, sockets. That said, there are some tools that already have locks built inside them: in particular the logging module; the decimal module has thread-local variables; and databases tend to have reader and writer locks. Email is also an atomic message queue. Pretty much everything else should be presumed to be non-atomic in nature and needs to be wrapped in its own thread.
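Raymond rule 1000 in miniature might look like the following sketch: a shared resource (here, print) is owned by exactly one thread, and everyone else talks to it only through a queue.Queue, Python's built-in atomic message queue. The task_done/join pairing used to wait for the work, rather than the thread, is the technique the talk comes back to shortly. (This is an illustration, not code from the slides.)

```python
import queue
import threading

print_queue = queue.Queue()       # the 'email account' for the print resource

def printer():
    """Sole owner of print(); sleeps on a blocking get until mail arrives."""
    while True:
        line = print_queue.get()
        print(line)
        print_queue.task_done()   # mark this message as handled

t = threading.Thread(target=printer, daemon=True)  # infinite loop => daemon
t.start()

for i in range(3):
    print_queue.put(f'message {i}')   # communicate only through the queue

print_queue.join()                # wait on the queue, not the daemon thread
```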
The next thing is sequencing problems: how do you make sure that, in a multi-threaded environment, step A is followed by step B, is followed by step C? It's a simple thing. If I have many people in this room acting concurrently, I can find Nathaniel and give him two tasks: do A and then do B. If I give them to him, and they're in one thread, they're guaranteed to be performed sequentially. But if I give them to two different people, then I have to form some type of message queue or locking in between them. What is the easiest way to make something sequential? Put it in one thread. OK, what is a barrier? A barrier is a concept where you wait for parallel threads to complete. So, in my wife's example, somebody is making the main bus and someone is making the power assembly, and both of those have to finish before you can join the two together. With threading we do that with a join. Join says: I take another thread and wait for it to finish; once it's finished, I know its work is done and I can proceed, presuming that work is done. You all already know what join does, fair enough: it says wait for somebody to finish. When is a terrible time to wait for somebody to finish? When they're never going to finish. Let's say a thread has an infinite loop. What do we call a thread that has an infinite loop, that never finishes? A daemon thread. If you daemonize a thread, all that does is mark the thread to say: this thread is never going to finish, don't wait on it. So you can't wait on daemon threads to complete, which raises the question: how do you wait for them to finish their work? The answer is, you use your email account. You send an email to a worker and say, "I'd like you to do some work." Now, should you wait until they've read all their emails? You can check and see whether they've read all their emails, but if they've read all their emails, does that mean they've done all of their work? I'll send Nathaniel an email: "I'd like you to write a book on Python." Oh hey, he raised his thumb; he opened and read his email. Is he done writing his book? No. So it is insufficient to wait for the message queue to be empty. Instead, I need to send Nathaniel an email that says:
"Write a book on Python, and then, when you're done, send me a note back saying that you're done." Easy enough. So that is the traditional way to do it, with two message queues. We've built into Python a technique that I invented, and it's become very popular: built into the message queue itself is a method called task_done. You retrieve a message, you do the work, and then you mark the task as being done, so that someone can join, not the thread, but the email queue itself. So when you have non-daemon threads, you join the thread; when you have a daemon thread, what do you join? You join the email queue for talking to it. Fair enough; that's Raymond rule 103. Global variables: are they good or bad? Ah, they're very, very bad. Are they all over the place? Yes. Are they sometimes really convenient? Sometimes they are. Sometimes you want a few functions to share some state. Nathaniel would probably argue, and I would agree with him, that the shared state should probably be in a class. There are cases where we can't do that, though. The design of the decimal module requires that it have global state for the decimal context: how many decimal places of precision there are. And the specification for decimal says that this context has to be independent of all of the decimal objects; in other words, we can't put it in the class and still implement the spec. Do you think that creates problems in a multi-threaded environment? Yeah. So here's the worst of these; the worst of them is locale. How many of you have used locale before? So locale is a disaster, because it was designed back in a time when we had big computers that didn't move, and the people who worked on the computers all lived in the same country and never moved. Do we move around now? Yes: I just brought this in from the United States, and I'll be taking it back shortly. We do move around, and so the problem is that locale is global state, not just in your program but across your entire computer. So you get a request from France, and you change the
global locale to French. You start to do some processing on that request, but now you get a request from Germany, and you switch over to the German locale; the first thread, or even the first process, is affected by that state change. That's why locale is a disaster, and you can't use it in anything that is concurrent. Global state: good idea or bad idea? It's a really bad idea; I agree. So, to help us with that, we have thread-local variables, which let you have something that appears global within your program but is unique to each thread. Each thread has its own globals, so each one can set its own equivalent of the locale, or its own decimal context. Also, this is an interesting point I get all the time from experienced multi-threaded programmers. They look through Python's threading API and they say, "Hey, how do you kill a thread in Python?" I say, why would you ask me such a crazy thing? "Well, you know, you can kill a thread in Java." How many of you knew you could kill threads in Java? You used to be able to kill a thread in Java, but they deprecated the method to do it. Why? Because it's a terrible idea. Don't do it. It's a conceptually flawed idea; it's not an implementation problem, it's a concept problem, because if you try to kill a thread externally, you never know if that thread is holding a lock while you're killing it. If you kill it while it's holding a lock, your program will deadlock. And we get bug reports on this all the time: "Raymond, we tried to kill a thread in Python." Can you kill a thread in Python? Can you externally kill a thread in Python? Interestingly, there are a couple of ways to do it. One is you can actually call the operating system and do a kill on the thread; another way is to use the ctypes module to reach into the thread and kill it. But we haven't provided a direct way, because it's a terrible idea. When people get themselves in trouble using those other techniques, what they're telling us is: "Raymond, I've intentionally pulled out a gun that is labeled toe shooter,
aimed it at my toe, and shot off my toe, and now I've got a problem: my toe is missing. Python is broken." We get bug reports like this all the time, just because there are recipes published for it. But, as Nathaniel said, just because the language lets you do something doesn't mean you should do it. So, here we are applying all of the rules. The counter has been isolated in its own thread; it has an infinite loop. Is it a daemon thread or a non-daemon thread? Yes, it's a daemon thread, so we mark it as a daemon here. Should you ever join this thread? Of course not; that's a terrible idea, it never returns. What are you going to join? You're going to join the message queue. So we have an instance, the counter queue: that's the email account for talking to this thread. And what this thread does is sleep until somebody sends it an email saying "increment". Once it gets one, it increments the counter, and then it sends an email to the printer saying "print a message". Now, the printer might run at its own speed, but it also has an atomic message queue that sequences all of the actions coming in. Once it's done its task, it calls this method, task_done. Who invented the task_done method? Oh, that was me. OK, so this marks the task as being done, so that later you can wait on the queue itself to see if all the work is done. Separately, we've isolated the print resource: this thread has exclusive rights to print. Once again, it has an infinite loop. Daemon thread or non-daemon? It's a daemon thread. Does it eat clock cycles like crazy, or does it sleep until it gets an email? It sleeps; this is a blocking call. Is there a race condition here, in looping over the lines of the job and printing them? In fact there is a race condition. Why is it not a problem? The secret to winning a race, if you're slow, is to be the only one in the race. Because this thread has exclusive rights to print, and it's the only thread that uses print, it always wins the race. So if you need
something to be sequential, put it in one thread. Easy enough. OK, the workers: their job just becomes to send a message to the counter queue, because remember, we can't increment directly; the rule is that we only communicate with it through the queue. The interesting parts are here: after we start the threads, we then join all the threads. Which threads can we join, daemon or non-daemon? Non-daemon. The worker is non-daemon: it doesn't have an infinite loop, it returns. So at this point, after this for loop, we are guaranteed that all of the workers are finished. Does that mean that all of the increments are done and all the counts are done? No, because the workers' job was just to send email. Our guarantee at this point is that ten emails have been sent to the counter queue; we don't know that it has counted yet, or is even awake yet. We just know ten emails were sent. Now we need to know: is the counter itself done? Do you wait on the counter queue to be empty? Of course not, because, like Nathaniel with his book: are you finished with your book yet? He hasn't sent me an email telling me he's finished. So in this case we're not waiting on the thread; we're waiting on the queue. We're asking: for every email we sent in, was there a task_done? Right after this line, we know what has happened ten times: we know that it has incremented ten times. Has it printed ten times? No: it has sent ten emails to the printer queue. So who do we need to wait on next? The printer. So we go ahead and send another message saying "finishing up". Have we guaranteed that it has printed all of the other messages by now? No, but we've guaranteed that it has gotten eleven emails: one "starting up" and then ten consecutive print jobs. It's called a queue for a reason: it's FIFO, so we're guaranteed that even though it's not finished yet, this will be the last thing to print. Once we've sent the jobs to the printer, do we know that the printer is done printing? No.
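Putting the rules together, here is a condensed sketch of the architecture being walked through: a daemon counter thread and a daemon printer thread, each owning its resource and fed by its own queue; workers that only send messages; and a main thread that joins the non-daemon worker threads first, then joins each queue in turn. This is a reconstruction from the talk's description under those assumptions, not the actual sixty-line slide.

```python
import queue
import threading

counter = 0
counter_queue = queue.Queue()   # email account for the counter resource
printer_queue = queue.Queue()   # email account for the print resource

def counter_manager():
    """Sole owner of the counter; daemon thread, never returns."""
    global counter
    while True:
        counter_queue.get()                       # sleep until asked to increment
        counter += 1
        printer_queue.put(f'The count is {counter}')
        counter_queue.task_done()

def printer_manager():
    """Sole owner of print(); daemon thread, never returns."""
    while True:
        line = printer_queue.get()
        print(line)
        printer_queue.task_done()

def worker():
    """Non-daemon: just sends one email and returns."""
    counter_queue.put('increment')

threading.Thread(target=counter_manager, daemon=True).start()
threading.Thread(target=printer_manager, daemon=True).start()

printer_queue.put('Starting up')
workers = [threading.Thread(target=worker) for i in range(10)]
for t in workers:
    t.start()
for t in workers:
    t.join()              # all ten emails have now been *sent*
counter_queue.join()      # all ten increments have now been *done*
printer_queue.put('Finishing up')   # FIFO: guaranteed to print last
printer_queue.join()      # all printing is done
```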
need to wait for the printer to mark that it's done and that's the printer - you join this is a correct solution to the problem if you do anything less than this your code is incorrect keep in mind I started with a trivially simple problem do you have a little fear and respect for a multi-threading now do you have a methodology that will set of rules that will work for you in fact if you apply these rules you can systematically work through this and guarantee logically that your code is are correct I've still got it fuzz so I will go run it for you muffy three and you will see it a slowly run slow because of the fuzzing but despite the fuzzing it will still get the correct answer every time who learned something new for production should you leave your fuzzing in there's some question about this how a book that I love programming pearls has a rule called leaves the scaffolding in and so one way to achieve the leaving the scaffolding in is to have your fuzzer have a true-false value here I can set fuzz equal to false it will skip the random sleep and run full speed ahead and so that's one way to turn it off to get it to run full speed but to leave it in for debugging purposes are you can just clean it up and take the code without fuzzing so this is the code without fuzzing nothing terribly interesting here other than it's the correct solution to the problem the thing I find the most interesting is our seven lines of code is now about sixty lines does it take a little effort to get multi-threading are correct in fact it does and so that's part of the respect for it if you were a manager don't expect people to layer in multi-threading as fast as they created the underlying application it is far more complicated and so having completed that test I can now run it at full speed multi four and it runs just fine who learn something new is there any technique other than using message queues that you know about locks or locks a good idea or a bad idea they're a great idea if 
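Before moving on to locks, here is a minimal sketch of the queue-based solution just described. This is a reconstruction, not the speaker's exact slides; the names (counter_queue, printer_queue, counter_manager) are illustrative, and the FUZZ flag follows the "leave the scaffolding in" advice.

```python
import queue
import random
import threading
import time

FUZZ = False            # scaffolding left in: set True to re-enable fuzzing

def fuzz():
    if FUZZ:
        time.sleep(random.random())

counter = 0
counter_queue = queue.Queue()    # the counter thread's "e-mail account"
printer_queue = queue.Queue()    # the printer thread's "e-mail account"

def counter_manager():
    'Only this daemon thread is allowed to touch the counter.'
    global counter
    while True:
        counter_queue.get()                       # sleep until a message arrives
        fuzz()
        counter += 1
        printer_queue.put(f'The count is {counter}\n---------------')
        counter_queue.task_done()                 # mark this message as handled

def printer_manager():
    'Only this daemon thread is allowed to call print().'
    while True:
        job = printer_queue.get()                 # blocking get: no busy waiting
        for line in job.split('\n'):
            fuzz()
            print(line)
        printer_queue.task_done()

def worker():
    'Workers never touch the counter directly; they just send a message.'
    counter_queue.put('increment')

threading.Thread(target=counter_manager, daemon=True).start()
threading.Thread(target=printer_manager, daemon=True).start()

printer_queue.put('Starting up')
worker_threads = [threading.Thread(target=worker) for _ in range(10)]
for t in worker_threads:
    t.start()
for t in worker_threads:
    t.join()                    # all ten messages have now been *sent* ...
counter_queue.join()            # ... and now all ten increments have *happened*
printer_queue.put('Finishing up')    # FIFO guarantees this prints last
printer_queue.join()            # and now all the printing is done
```

Note that we never join the daemon threads themselves; we join the worker threads (which return) and the two queues (via task_done), exactly in the order the talk walks through.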
Locks are a great idea if you're writing an operating system. Here is the same solution, or what looks to be the same solution, with locks. I know it's not just a contrivance, because in class I sometimes give people an assignment to write this code with locks, and typically I give it to the people who said, "oh, this problem would be easier using locks." Interestingly, it is a little easier to do with locks — which surprised me, because they're a lower-level primitive — and this is typically the solution people come up with. I need all of this to print atomically, so I do it with the printer lock. I need the counter to update atomically, and I also need to know the counter value, so all of this needs to be atomic; that's done with the counter lock, and the with statement makes it clear and beautiful. How many of you like the with statement? It's kind of awesome. This runs perfectly well, and if I take the fuzzing out, you can see it's a little bit shorter than the other solution.

Do you like the locking method? It's shorter, it's easier to do, and it was the most obvious approach to most of the students in my class; it's the tool they tended to reach for first. That won't happen to you, will it? That's a little bit of a tease: this code is perfect, it's beautiful, and it's simpler than using the queues. Now, some notes on locks. The first thing is that locks don't lock anything. Is the print function locked in my code? We have a "with printer_lock" here, but somewhere else in the code someone can add a print and not check for the lock. So locks don't really lock anything, and you can't assume that, because you wrote correct multi-threaded code, it will survive maintenance. Lots of multi-threaded code starts out correct, but because your locks don't actually lock anything, someone is free to access your global variable anywhere; they are free to access print anywhere. Did the locks lock anything? If you know that, you will stop trusting them as much as most people do. You've also been taught that they're a low-level primitive.
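The lock-based version that students tend to write first might look like the following sketch (again a reconstruction, not the speaker's exact slides; the lock names are illustrative):

```python
import threading

counter = 0
counter_lock = threading.Lock()
printer_lock = threading.Lock()

def worker():
    'Increment the counter and report the new value.'
    global counter
    with counter_lock:            # read-increment-report must be atomic
        counter += 1
        with printer_lock:        # exclusive access to print
            print(f'The count is {counter}')
            print('---------------')

with printer_lock:
    print('Starting up')

worker_threads = [threading.Thread(target=worker) for _ in range(10)]
for t in worker_threads:
    t.start()
for t in worker_threads:
    t.join()

with printer_lock:
    print('Finishing up')
```

Nothing stops later maintenance code from calling print() without taking printer_lock — which is exactly the "locks don't lock anything" point made above.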
But I haven't told you the real problem with this code yet. What is it? The good news is, it's correct. The bad news is, it's slower than the original: it takes no advantage of concurrency at all, and in fact it is fully sequential. One of the interesting things about locks is that if you take any sufficiently complex app and put enough locks in it, eventually it becomes fully sequential. We've actually undone all of the effects of multi-threading: this code logically does exactly the same as our first seven-line version. Interestingly, I've had teams — I go out and do consulting, I see thousands of locks in their code, and I start tracing through, working out the logic, and wow, I really don't want to be the one to deliver the bad news: "I've got good news: your code has no race conditions. It is in fact fully deterministic; it runs the same way every time, regardless of the thread scheduler." What does that tell you? This code is correct, and it was also a complete waste of time. How many of you are liking locks now? To achieve the fluidity of the previous version, how would you do it? You would essentially reinvent what I already did for you. I'm one of the authors and maintainers of the queue module; the queue module has locks in it. I wrote that module so that you don't have to reason with the low-level tool, and so that you don't have to mess with all of the synchronization. Fair enough? Who learned something new?

Lastly, there's the dining philosophers problem. All of the techniques I gave you above work great for directed acyclic graphs; however, when the control flow is circular, the problem is much harder. It is still solvable, but so much harder that you really do want to resort to formal techniques. Fair enough? Am I completely out of time, or do I still have five minutes? I will do multiprocessing very quickly. You've seen the scripting style; that's the code we looked at before. Function style: we just factor the work out into a function, sitesize.
This, I believe, is an important step. Remember, I suggested you should get your code correct in single-threaded, single-process mode first, and one of the easiest ways to do that is to use map, which is sequential. So we test the code and make sure that it works. The great thing about switching to the map form is that it makes it easy to transition to a multiprocessing map: in the multiprocessing version, the only thing we need to change is to use the pool's imap_unordered. I'd like to show you a couple of little tricks along the way. One of them is that this function is designed to return its input as well as its output, which is kind of weird: the caller knows what the input is, so why do you need to return it? Because if you design your functions that way, it lets you use imap_unordered, and you don't have to care about the order of the results. This can greatly improve the responsiveness of your program. Who learned something new?

Okay, a couple of thoughts on multiprocessing. A lot of the thought process in multiprocessing is figuring out what's parallelizable and what's not. This looks like a simple bit of code: open a URL, read it, and get its length. I'd like to analyze which parts are non-parallelizable. We have to do a DNS request over UDP, and we have to get a response, to know what IP address the URL resolves to. We need to acquire a socket from the operating system. We need to do the three-way handshake for a TCP connection — SYN, SYN-ACK, ACK — which takes a round trip on the net, too. Then we need to send an HTTP request; then we need to wait for the response, which arrives as many packets that get joined together; and finally we count the characters on the web page. These are all sequential actions. However, there is a little bit that's parallelizable: the DNS lookup can be done in parallel with getting the socket. Is that worth it?
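The map-to-pool progression just described might look like the following sketch. Here sitesize() is a stand-in for the talk's version (which did urlopen(url).read() and returned the page length), and the URLs are placeholders; the point is the shape of the code, and that returning the input along with the output makes imap_unordered results self-identifying.

```python
from multiprocessing import Pool

sites = ['https://example.com/a',
         'https://example.com/bb',
         'https://example.com/ccc']

def sitesize(url):
    # Stand-in for:  page = urlopen(url).read();  return url, len(page)
    return url, len(url)

if __name__ == '__main__':
    # Step 1: get the code correct with the sequential builtin map()
    for result in map(sitesize, sites):
        print(result)

    # Step 2: the same call shape, now parallel, with results
    # arriving in completion order rather than submission order
    with Pool(processes=3) as pool:
        for url, length in pool.imap_unordered(sitesize, sites):
            print(url, length)
```

Because each result carries its own URL, the program can report each site the moment it finishes, instead of waiting for results to arrive in order.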
It is not, because the DNS lookup is expensive and is measured in milliseconds, while getting a socket is cheap and is measured in microseconds; there's really no value in parallelizing those two against each other. This one is interesting: the HTTP request. When you're running from home, you've only got one connection out to the Internet, but typically when you're running from work you've got a bundle of fiber, and there's really no reason you can't get many sockets and send out many HTTP range requests — unless you're using HTTP/2, in which case all kinds of good things happen for you automatically. This is great, because you can send out a hundred parallel requests, get the data to come back in parallel, and then reassemble it. This actually has a great deal of potential to speed up your code; it's called channel bonding, and I've provided a link to an example of how to do it. And we don't have to wait until all the packets are in to count the characters; we can count them one packet at a time. In other words, these steps are parallelizable. It's also probably not worth your time unless there is an enormous amount of data; if there's a lot of data, these steps are worth it.

So what's the better way? Treat all of this as one step, and do many different URLs in parallel. What I see when I go out and do consulting is people skipping this part — skipping the parallelization across URLs — and instead focusing on the internals. Why? Because they wrote the internals first, they knew they wanted them to be fast, and so they will write a thousand lines of code for this when in fact they could have come up a level and written one line later. When you do multiprocessing, do it at the highest level possible, and then you get the greatest payoff.

Oh, a quick note on channel communication. I have a keynote that I'm giving right now, so I flew over here — it took, I think, 18 hours of travel time — and I'm about to fly back home; then tonight I've got a tutorial that I'm giving, and then I'll fly right back. What's wrong with my plan?
I give a 45-minute talk, I fly 18 hours home, 18 hours back, and then give a two-hour tutorial. What's wrong with my plan? Too many trips back and forth. Related to that is not doing enough work on each trip: you fly to Paris and have lunch, then fly to Rome and have dinner, and then you fly to Moscow for a night on the town. You're doing too little work as you go. What you should do is fly to Moscow and do all of those things, then fly to Paris and do all of those things, and then Frankfurt. And lastly, I didn't know what to wear when I was here. I had heard from so many books I've read that Russia is really cold, but the Internet told me it was going to be really hot. So my plan was to bring all of my stuff: pack up everything in my house into several big containers and ship it over, so that I'd have everything I might need to wear. What do you think of my plan? What's wrong with it? Taking too much stuff with you.

So with multiprocessing, remember: you don't have shared data, so any time you move data back and forth — send it into a process or get it back — you want to send only a little bit. Send in summary queries and get back summary results. These comments sound obvious, but in fact, most of the time when people get poor results with multiprocessing, they're violating one of these three rules. "Oh, I'm calling the len function a million times through the multiprocessing module." Wow — you're pickling the data over and pickling it back, making too many trips. "Oh, the multiprocessing module fetched a lot of data, then handed it back to me, and I processed it locally." They're taking too much stuff with them as they go along. "I went over and just did one little thing and called the len function." You want to do a lot of work per trip. These are the three ways to screw up multiprocessing.

I've given you some SQL examples here, because most people understand this in terms of SQL. I've got an employee database. The first version fetches the entire database and then computes the summary locally: it moves too much data.
The second version loops over every department and runs a query for each department: this makes too many trips. And this is the right way: you send one query that does a lot of work on the other side and returns your summary results. With databases it's obvious that if you try either of the first two, your performance will be terrible. But while it's obvious with databases, people make exactly the same mistakes with multiprocessing, and here are the exact same three mistakes; I see them over and over again in code. In this case we're returning the entire web page rather than its length: too much data. In this case we're returning one line at a time: too many trips back and forth. This one is lots of range requests — the good news is it gives us summary data, but we're making far too many trips there and back. I see all of these mistakes all the time in multiprocessing code; I hope it will never happen to you.

The very last thing, which I will do in one minute: combining threading and forking. It's simple: if you combine threading and forking, you're living in a state of sin, and you deserve whatever happens to you. This code was submitted yesterday, or the day before — somebody said Python is broken because they combined threading and forking. It's a simple example that deadlocks every time. Their conclusion was that Python was broken; my conclusion was that their mind was broken. So here's the rule, if you must: thread after you fork, not before. And here's the reason why: if you thread first, you create some locks; as soon as you fork, those locks become shared across the different processes; you kill one of the processes while it holds a lock, and all of the other processes hang as well. Fair enough? If you're going to combine them — and you shouldn't — fork first, then thread, and everything will work fine for you, and you will not embarrass yourself by submitting this bug report to Python, which, interestingly, is still being argued: "oh, this should be documented."
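"Fork first, then thread" can be sketched like this: the worker processes are created before any threads or locks exist, and each process spins up its own threads privately. The names here are illustrative, not from the talk's slides.

```python
import threading
from multiprocessing import Pool

def work_in_threads(n):
    'Runs inside a child process, which may now safely create its own threads.'
    results = []
    lock = threading.Lock()       # this lock exists only inside this process

    def add_square(i):
        with lock:
            results.append(i * i)

    threads = [threading.Thread(target=add_square, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

if __name__ == '__main__':
    # The pool processes are created before any threads are made,
    # so no lock can be captured mid-acquisition by a fork.
    with Pool(processes=2) as pool:
        print(pool.map(work_in_threads, [3, 4]))    # prints [5, 14]
```

Because every lock is created after the fork, no child process can inherit a lock that some other thread was holding at fork time — which is precisely the deadlock the bug report stumbled into.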
It's documented everywhere: don't do this. Or just follow me on Twitter; I'll tell you: don't do this. Thank you very much for inviting me.

[Host speaks in Russian.]

Thank you for the talk. Maybe I missed it, but when we are using atomic queues, is the GIL still involved? Is it efficient for CPU-bound applications?

The question was whether the atomic message queues decrease your efficiency. If you write a correct, well-designed program using locks that is not actually sequential, you're doing something almost isomorphic to using the atomic message queues. In fact, the queue module is simply a thin layer around a list or a deque — a very thin layer, with the locks put on it for you, the same locks you would have written anyway. So I believe there's no net efficiency cost. That said, whether you use atomic message queues or not, it doesn't take away the fact that we have a GIL, and that threading does nothing for CPU-bound work in CPython. You would want to switch over to the multiprocessing version to take advantage of the cores; it has all the same concepts, including atomic message queues. Excellent question — five rubles! Five hundred rubles!

All right, another one. [Audience remark, partly in Russian, referring to the workshop.] The old Python regular-expression joke: a colleague says, "I have a problem, and I want to use regular expressions to solve it." What do you say? "Now you have two problems." That is an old joke that started in the Python community, and it has mutated: a colleague says, "I have a problem, and I want to use concurrency to solve it." What do you say? "Now you have two problems."
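To make the answer about the queue module concrete, here is a toy illustration of the idea that queue.Queue is a thin layer of locking around a deque. This is a simplification, not CPython's actual implementation, which also handles maxsize, timeouts, and task_done/join:

```python
import threading
from collections import deque

class TinyQueue:
    'A toy FIFO queue: a deque plus the locks you would otherwise write yourself.'

    def __init__(self):
        self._items = deque()
        self._mutex = threading.Lock()
        self._not_empty = threading.Condition(self._mutex)

    def put(self, item):
        with self._mutex:
            self._items.append(item)
            self._not_empty.notify()      # wake one waiting consumer

    def get(self):
        with self._not_empty:
            while not self._items:        # sleep until something arrives
                self._not_empty.wait()
            return self._items.popleft()

q = TinyQueue()
q.put('first')
q.put('second')
print(q.get())    # prints: first  (FIFO order)
```

The synchronization lives entirely inside put and get, which is why callers of the real queue module never have to touch a lock themselves.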
Info
Channel: Π’ΠΈΠ΄Π΅ΠΎ с ΠΊΠΎΠ½Ρ„Π΅Ρ€Π΅Π½Ρ†ΠΈΠΉ IT-People
Views: 71,010
Rating: 4.9450803 out of 5
Id: Bv25Dwe84g0
Length: 52min 1sec (3121 seconds)
Published: Wed Jul 13 2016