Safety and Speed Issues with Threads (pthreads, mutex, locks)

Captions
Hey everybody, it's 2019. I want to talk more about threads, and specifically about how threads can be a little bit scary and a little bit difficult to work with. New programmers often treat threads like bacon: you can add them to anything, and that's going to make it faster and better and more responsive and just so delicious. The reality is a lot more complicated, so today I want to talk about concurrency, parallelism, and thread safety.

Let's start with a little example. I've got an example program here; let's say I want to do some simple math, like counting up to some big number, say a billion. Pretty simple. Now let's say I want to do it twice, so here I'm counting to a billion twice. I know this is a simple example, but it's going to be useful. Let's compile it, time our code, and see how long it takes to run. Okay, it takes about four seconds.

Now let's see what happens when we do the counting in two different threads. I've talked about the pthreads library in two previous videos, and if you haven't already watched them, you may want to go back and watch them now; they may make this video make a little more sense, and I'll put a link to those videos in the description. Anyway, one thread is going to do half of the counting and the other thread is going to do the other half. When I ask students in class to predict what's going to happen here, almost everybody seems to think that things are going to move faster. So let's see if they're right. We compile it, we run it, and okay, that's not cool. We've got two issues: the first is that it got slower, and the second is that it got the wrong answer.

Let's start with the wrong answer, because faster or slower doesn't make any difference if you're getting the wrong answer. So what's going on? Well, one clue is that if I run the program many times, I get a different answer each time, so whatever's going on isn't consistent; it depends on the timing of the program. The issue is that the increment operator actually does multiple things. It's not just one operation; it reads the variable, adds one to it, and then writes the new value back into the original location in memory. So it's basically doing three separate operations. Now let's say I'm doing this over and over again in two different threads, and the threads line up just right (or just wrong): each thread can read the same value, increment it, and then write the same value back, and instead of incrementing the counter by two, we've incremented the counter by one. We've lost some of our operations, and that's why we're getting a different count each time: sometimes we get unlucky and things overlap in a way that causes us to compute the wrong result. As is, this code is not safe to run in multiple threads, or we might say that this code is not thread safe.

This specific type of software bug is called a race condition, and we call it that because the two threads are basically racing to see who gets to write first and who gets to write last, and the outcome of the program depends on which thread gets there sooner. So let's fix the race condition first, because correctness is more important than speed, and then we'll come back to the speed issue.
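(The code itself is only shown on screen in the video, not in these captions. Below is a minimal sketch of that kind of racy two-thread counter, assuming the pthreads API as described; the names like do_count and the one-billion-per-thread split are illustrative, not the video's exact source.)

    /* Racy two-thread counter sketch. Each thread increments a shared counter a
       billion times with no synchronization, so some increments get lost:
       count++ is really a load, an add, and a store, and the two threads can
       interleave between those steps.
       Compile with: gcc race.c -o race -pthread */
    #include <stdio.h>
    #include <pthread.h>

    #define PER_THREAD 1000000000LL   /* each thread does half of the counting */

    long long count = 0;              /* shared between both threads */

    void *do_count(void *arg)
    {
        for (long long i = 0; i < PER_THREAD; i++) {
            count++;                  /* read count, add 1, write it back: not atomic */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, do_count, NULL);
        pthread_create(&t2, NULL, do_count, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("count = %lld (expected %lld)\n", count, 2 * PER_THREAD);
        return 0;
    }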
The easiest way to fix a race condition like this is to try to make this increment operation, which has multiple parts, atomic. Atomic just means that we want it all to happen as though it were one unit, so it can't be interrupted: if one thread is doing an increment, another thread can't start doing an increment on the same value until this one is done. That's what it means to be atomic. Most processors have built-in atomic operations, and different compilers support them in different ways, but I don't recommend you use them, because they make your code fragile: any time you switch from one processor to another, or even from one compiler to another, your code is likely going to have to change, and we really don't want that.

So instead we're going to use locks, sometimes called mutex locks (mutex is short for mutual exclusion). A mutex lock is a computing abstraction that allows one thread to basically exclude other threads and say, "Hey, I have the floor. I have the right to work in this space; everybody else has to wait." We're using the pthreads library, so we're going to use the functions pthread_mutex_lock and pthread_mutex_unlock to grab the lock and then release it when we're done. Now we just need each thread to grab the lock before starting an increment and release it each time it's done incrementing. Remember that only one thread can have the lock at a time: if one thread calls pthread_mutex_lock and gets the lock, and another thread then calls pthread_mutex_lock, the second one is going to wait. pthread_mutex_lock won't return until the first thread calls pthread_mutex_unlock and releases the lock, and only then will the second thread be allowed to proceed.

So now if we compile our code and run it, the good news is that we get the right answer. The bad news is that our code is even slower than it was before, like way slower: four seconds has become almost 14 minutes. So now would be a really good time to talk about the speed issue, but first, some terminology: concurrency and parallelism.

If two processes or threads are working in parallel, that means they are actually doing work at exactly the same time. Parallelism typically requires some kind of hardware support, like multiple cores or maybe a coprocessor, something like that. My machine does have multiple cores, so it's possible that my threads are running in parallel. The concept of concurrency is a little bit looser. Imagine my machine only had one core. If it only has one core, it can really only be running one thread at a time, but we may have many threads in the system at any given moment, so what it's going to do is run one thread for a short amount of time, then switch to another and run that one for a short amount of time, switching back and forth. If it switches quickly enough, then as the user I may not actually notice the switching; it just looks like things are making progress, only more slowly. This is definitely not parallel, but we'd still call it concurrent, because as far as the user is concerned, things appear to be making progress at the same time; I'm not having to wait for one to finish before starting the other.

The problem is that there are a lot of things that can prevent you from getting parallelism in your threads. One of those is memory sharing. In our example we have a variable that's shared between the two threads, and they're each accessing it a billion times, every time through the loop, so the machine is trying to keep that memory coherent and it keeps shuttling memory back and forth. The OS might actually just stick both threads on the same core, just to avoid all of this memory-sharing contention; but the point is that the sharing is going to prevent you from getting really great parallelism.
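(As with the previous sketch, this is an illustrative version of the locked counter described above, not the video's source. Each thread now takes the mutex around every single increment, which is exactly the per-iteration overhead the captions describe next.)

    /* Locked two-thread counter sketch. A pthread mutex guards every increment,
       which fixes the race but pays lock/unlock overhead on every one of the
       billions of iterations: that turn-taking is what makes this version slow.
       Compile with: gcc locked.c -o locked -pthread */
    #include <stdio.h>
    #include <pthread.h>

    #define PER_THREAD 1000000000LL

    long long count = 0;
    pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

    void *do_count(void *arg)
    {
        for (long long i = 0; i < PER_THREAD; i++) {
            pthread_mutex_lock(&count_lock);    /* blocks until the lock is free */
            count++;                            /* only one thread in here at a time */
            pthread_mutex_unlock(&count_lock);  /* let the other thread take a turn */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, do_count, NULL);
        pthread_create(&t2, NULL, do_count, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("count = %lld (expected %lld)\n", count, 2 * PER_THREAD);
        return 0;
    }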
And now I've added locks, so I've got these lock and unlock calls and I'm calling them billions of times, literally billions of times, and the overhead is definitely considerable. That's why we're ending up with minutes of runtime instead of just seconds: we have a lot of overhead, and we're not actually able to do things in parallel. Each thread has to wait for the other one so it can get the lock and move forward, so they're taking turns, and all this turn-taking has a lot of overhead.

About now you're probably wondering what the point is. Threads add all these safety issues and they make things slower, so what's the upside? Why would I ever use a thread? The fact is, there are times when it is appropriate to use threads, and to give you an idea of what those look like, let's look at another example. In this example, one thread still counts numbers, but the other thread is going to do some I/O. Specifically, it's going to go out on the network, use a socket, and download a web page: Google's home page. This is very similar to my sockets client example that I shared a while back; if you haven't already, maybe watch that video for a more detailed explanation. In this case the delays aren't all caused by the processor, and the threads have very little memory sharing, so these tasks can be handled in parallel, not just concurrently. So let's run it without threads; now let's run it with threads; and you'll notice that things are noticeably faster in this case.

Threads are also useful when building user interfaces, because user interfaces usually aren't about getting a ton of processing done, but you do want them to be responsive. You want your code to respond very quickly when the user does something, and if your program has to do a bunch of disk I/O, send something over the network, or wait for some kind of feedback, you don't want mouse-click or keystroke responses to get slowed down by all that delay. In that case it makes sense to have a thread that handles those user interactions while you do the heavy lifting in another thread.

But I guess the message is: be careful. If your program doesn't need threads, it's almost always better to leave them out. You'll have fewer bugs, debugging will be easier, and things will just be simpler, and as a programmer, simple is definitely your friend.

That's all the time I have for today. I hope this helps you use threads more effectively in your programs, and I hope it helps you see threads and locks as tools that have a place and that are important, but not as some magical computing bacon that you can just sprinkle into your programs to make everything more awesome, because they're not. With that, I'll say happy coding, and I'll see you soon. If you're enjoying these videos, please subscribe, and click the bell if you want to make sure you don't miss any future videos. I definitely appreciate all the feedback I've been getting from you, and for those of you at Clemson who are going to be in my classes this semester, I look forward to seeing you in the coming week. I'll see you later.
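(The counting-plus-download program isn't shown in these captions either. Here is a rough sketch of that kind of program, a hypothetical illustration rather than the video's source: one thread is CPU-bound while the other spends most of its time blocked on the network, so the two overlap well even though both are running at once.)

    /* One thread counts while another fetches http://www.google.com/ over a
       plain socket. The download thread mostly waits on the network, so the
       two threads barely contend with each other.
       Compile with: gcc overlap.c -o overlap -pthread */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <pthread.h>
    #include <netdb.h>
    #include <sys/socket.h>

    void *count_up(void *arg)
    {
        volatile long long n = 0;            /* volatile so the loop isn't optimized away */
        for (long long i = 0; i < 1000000000LL; i++)
            n++;
        return NULL;
    }

    void *fetch_page(void *arg)
    {
        struct addrinfo hints = {0}, *res;
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;
        if (getaddrinfo("www.google.com", "80", &hints, &res) != 0)
            return NULL;

        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd >= 0 && connect(fd, res->ai_addr, res->ai_addrlen) == 0) {
            const char *req = "GET / HTTP/1.0\r\nHost: www.google.com\r\n\r\n";
            send(fd, req, strlen(req), 0);

            char buf[4096];
            ssize_t n, total = 0;
            while ((n = recv(fd, buf, sizeof(buf), 0)) > 0)   /* thread mostly waits here */
                total += n;
            printf("downloaded %zd bytes\n", total);
        }
        if (fd >= 0) close(fd);
        freeaddrinfo(res);
        return NULL;
    }

    int main(void)
    {
        pthread_t counter, downloader;
        pthread_create(&counter, NULL, count_up, NULL);
        pthread_create(&downloader, NULL, fetch_page, NULL);
        pthread_join(counter, NULL);
        pthread_join(downloader, NULL);
        return 0;
    }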
Info
Channel: Jacob Sorber
Views: 35,539
Rating: 4.9907994 out of 5
Keywords: multiple threads, c language, c programming, multithreading in c++, mutex, mutex lock, lock, semaphore, pthread_mutex_lock, pthreads, thread tutorial, thread safety, thread safe, mutual exclusion, c (programming language), race condition, race condition explained, pthread mutex in c, pthreads mutex lock, pthread mutex, pthread_create in c example, pthread_mutex_lock tutorial
Id: 9axu8CUvOKY
Length: 9min 12sec (552 seconds)
Published: Tue Jan 08 2019