Multithreading Code - Computerphile

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
If you've got multiple processors either that's multiple cores virtual cores in a hypothetically a multi-core processor you want to run your algorithm or on Multiple CPU instances so you need to split it up and the operating system needs to find some way of controlling all those different instances actually you can still be beneficial on a single CPU system because it might be times where part of your program is waiting for say network input so the operators will then schedule it in another thread of execution To do something So let's start by thinking about what the operating system sees when it's running a program and this is what's called a process and that's got a series of instructions and it's going to be Executing this is the program this is actually not part of the operating system is it's the program but it needs to know we're inside The program that is so that instruction say But there's, also lots of other information that the operating system needs to keep track for a process a things like list of open files Configuration of memory and so on if you've got virtual memory involved or Paging there's various things that the operating system needs to keep track each of them has got a unique ID and you can then run multiple processes at the same time so one way you could break your program up into multiple chunks is to actually have multiple processes running in for a while people did this they sort of Will use the command a function in UNIX for example called fork you would split the program into 1/2 which would run off as it was the other would have a new process identifier and the operating system would copy that block and Create a new list of what memory uses a new list of the files they had open now these are point back to the same files the same blocks memory allocations and things so that if you change one memory location it updated on the other side and things if You changed a file seat to a different part of a file for example it would update in the others but They would be separate processes and they'd have to have all the storage required to do that and that worked but if you're breaking your program up so you can run your task on multiple CPUs most of them are going to have the same list of files the same memory configuration the Everything else that the operating system needs to keep track of the only thing that's going to be different is Where it is in the program what bit of code is running from what part and so the idea that came along was ok let's Have something lighter weight than a process and they called it a thread because you've got your thread of execution through the instructions And what we have is that each process will have at least one thread of execution but could have many So you have multiple threads of execution in your program so your programs running one thread Which it starts off with that will then create another thread Which does something and then another thread pratik and so on can do different parts they made last for the whole length of the program they may only last for a short part while that task being completed and so on You may have multiple ones that get jobs from different things and so on how you structure your program Doesn't matter but from the operating system point of view it's now Scheduling different threads onto the different cause whether that's one to four however many you've got in your system so the idea behind Multi-threading is a sort of lightweight way of saying to the operating system these are the different things I want running on my computer Whether that's always part of one program or multiple threads through multiple programs and the operating systems choosing which one of them actually gets CPU times on any of the CPU calls physical or virtual that they have available Now that it produces if you were actually one program that you've split into multiple threads that introduces some interesting issues Because you can get to the point where your programs can no longer do the task that you expect them to do a program which Was just one single stream of instructions executing one after the other and so he knew the order they'd execute in two to multiple strengths of instruction that are executing Sometimes in parallel sometimes one after the other but you no longer have any Control over the order that would happen because the operating system might put two of their threads on at the same time Or I might let one thread run and then put the other thread on after the other and then go back and you can all Control, where it's switch from one to the other I can show this with an interesting program on my laptop So what I've got here is I'm going to write a program that adds up all the numbers from 1 to 1 million but I'm going to do that in two Threads so I've written two with the Reds here creating threads the first one is this routine here Which is called the hopefully thread function? We've got a variable a which we're going to initialize to be 0 and I've declared this as a volatile Variable just to tell the C compile and it will change While it's running because otherwise you'd have even more problems so split in two so the first one is adding up the first 500,000 the second one exactly the same so it's adding at the second 500,000 so what it's doing is it's adding up from 500,000 to 1 million and they're adding that's the same variable so go instruction he's going to fetch one Adder on the next one and write it back and then the other one's going to fetch it Add on the value and write it back and so on so then we then run our program When we create both threads and then we wait for them to complete which is what this P Throw joint is doing and then we print out the number that We've got so if you just like to add up all the numbers between 1 and a million while I run the program Then we'll see what result we get 500 times you'd billion so let's just compile this up Not making mistakes the problems would compile this up and let's just run this program and it's giving me the answer to eight six six eight nine seven four eight two two six zero if I run it again I now get the answer three three three nine six three eight two six seven two six I'll run it again I get the answer three two six eight seven, oh four oh nine five Q one I run it again I get two for 307 501 five 490 which sounds like a phone number Well this all this all converge on the right answer all leverage out or is this just one of those completely Well if you think about it Should always be the same you've got you're adding up all the numbers between one and a million you should always get the same answer It shouldn't be converging on something you should always be producing the same answer so something very very odd is going on on my computer Something very very bizarre is going on we could write the program less in fact let's just write the program normally let's comment out all this and we'll just write a very quick program that adds up all the values from 1 to a million so what I is less than or equal to 1 I'm sorry so it may be a single-threaded program now it's just going to run on one thread And if we do that we compile it We now get the same value each time 500 and whatever those so now we've done single-threaded we're getting around certain we're getting the same answer each time so what's going on here well you need to think about what's happening in This program, so let's just put it back to where it was if we look at this instruction here a equals a plus I then what this was telling this program to do is to fetch the value of a whatever that is then add on the value of I to it and Then store the result back in a now if I just put that to one side and find my paper Which is over here with my pens if we? Were to write that in machine code I'll use the arm ship to do that because it just makes things slightly easy to see what's going on We would have this is thread 1 so we'll call this t1 we're going to say that this is going to be an LDR instruction To load a value from memory so they're loading to be registered let's use our 0 the value of a We're then going to add onto Our 0 The value of r1 Which is a register containing our loop counter that's the same as I on there and then we're going to store the value of R 0 Back you know so that one Caesars have become 3 machine code instructions so the way that the program is executing in one thread is it's loading the value from memory and It's making a local copy of it and then it adds a value onto it updates its local copy and stores it back into the actual location and that thread keeps doing that into the loop so it loads it I Had something onto it and stores it back loads it add something else to it Stores it back now if we've got a multi-threaded system, which is what we had there, we? also have another Steve hi Steve hi Steve and in that case I'm going to be here loading the value I think value onto it storing it back And I'm gonna be here loading a value adding something onto it and storing it back so I'm loading value adding something on to it I'm loading value loading a value adding something on to it adding something on to it storing it back storing it back and what's just happened Because other Steve had added something else on to it, while I'd load it on when I storm my back his update doesn't get changed gets lost Well that wasn't nice So if we look at what's happening on the papal it's loading the value in it's adding the value onto it and then it's doing It back but of course, we've got multiple systems and so at some point this pro Was going to start excusing the same code so let's cross this instruction out for now we'll put it back in a bit later on And assume that at this point we load the value into the other thread into our zero in there And then we do the add on in the same way Dot and then we do the store back here of our zero into a and we've got one a Up here there an Singh and then this store actually happens after this store here So what happens here is that these both end up getting the same value of a And this has its value on in its local register this adds its value one in its local register this stores it back Then this stores it back and of course because this is just story back here It's overwritten the value that was stored there so the value that was added on here gets lost so what we need to do is find a way to synchronize these two things so that as soon as this is loaded the value in And it's going to add something on to it it stops This one from doing it and so there's a couple of ways, we can do that The easiest way is that we have something that both divide both Steve's or both CPUs have to access and so we could have a token let's use a floppy disk Microsoft Office Professional 1995 and what we're going to say is that I can only access an update variable a if I have the token so let's bring back Friendly Steve so we've now got a token and we? What we're going to say is it unless I've got the token I cannot access a variable a so I'm going to execute those three Instructions whenever I've got the token so I'll get a load of runny win add something onto it and store it back and gonna hand the token to Steve now I've got the token so I can load a value in add the value from I've merged her into it and store it back and hand the target and now I've got the token again I can load something into It into my register add something onto it Back and pass the token on and I've got it so I can load the value in add the value from a register Story back and pass the token on And because I've got the token again I can do the same again but because only one of us can hold the token at any one time it means that only a can get updated When we've got the token and so we never lose that right because we're saying that unless you've got the token You can't access that very that's great so if we turn back to the laptop we can see How we can implement that in our program we're gonna use the same idea, we're gonna have something that we need to have before We can run that code in this case we're going to have what's called a mutex and mute an area mutual exclusion that we need to lock Before we can run, the code after it and then when we finished we unlock that and so if it's Locked and we try and acquire it our program will pause until it is able to lock it So I'm going to do is I've created the variable for it here and I've Also initialized it down here so it's all set up Ready to go so before I add onto a that the value of the instructions I'm going to lock that mutex like so and then afterwards if I could type I'm going to unlock that mutex Like so and I need to do the same in the other thread as well so I'm going to lock the mutex I'm going to unlock it afterwards So if I can pile this version up we still got two threads we're still getting them to add up the things as we go through it and then at the end We're going to add the result and it should this time give us the right answer so let's run it and It gives us the right answer we now split the problem into it's running on two CPUs and because we're locking that variable so We can only access it by one thread or the other So, we did with the two Steve's then we can get the right result and it produces the same result every time now the downside to this is that locking the thread each time slows the running of the program down But we can be clever if we think about addition if we add six numbers together That's the same as adding Three numbers together then adding three numbers together and having the results of both together and we can do the same thing here so, we can take this mutex and Move it outside of our main loop and instead of rounding on the value of I each time we have a local variable we'll call it local a which we add everything on to so we say local a equals local a plus I So inside our thread we're just going around adding all the values on as we did before which didn't work because we're putting in a Local variable that's local to that thread and this function It's not gonna be a problem because only this function can access it well do the same in the other thread and then we'll do The same thing outside it and the advantage of doing this is that we now only have to acquire that lock Once in each thread rather having to acquire it for every single addition, whether you have to acquire it once if we time the program Beforehand and I should say at this point the program may be running so fast that we won't better get an accurate time and it Wouldn't try anyway so use the time coming to time it We see it takes about nine point eight two seconds let's just run it again nor point seven one Nor point seven one We seem to have standardized around that sort of things about naught point seven so let's compile up our faster version now Not made any mistakes and we'll time it again, well let's see produces the right value will time it as well And it's taking around the same time It's dropping a bit and it doesn't really make any difference in time I think that's just because of how fast the computers are That we're dealing with but with a bigger program we're doing more calculations Reducing the number of times we require that a lot would actually make the system run Faster so multi-threading is the idea breaking our program up into multiple threads of execution that run for different lengths of time but We have to be careful that we do things in the right order and we avoid having multiple threads accessing the same variable at the same time Particularly if they're writing if they're just reading from it it doesn't matter if one's writing and several the reading from it and it doesn't matter if they perhaps get an. Older value at a time then it works fine and so on That same A1 so that he can decrypt the message and read it right so Alice maybe wants to send another one so she's going to tick This KDF function again she's going to produce a new key and a2 she's going to send that to Bob he's going to take this receiving function a2 Now Bob wants to send a message so he's going to tick
Info
Channel: Computerphile
Views: 287,403
Rating: 4.955771 out of 5
Keywords: computers, computerphile, computer, science, Multithreading, Multi Threading, Multiple Threads, Multi Thread Code, Code, Coding, token, mutex
Id: 7ENFeb-J75k
Channel Id: undefined
Length: 15min 54sec (954 seconds)
Published: Fri Dec 14 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.