What is a semaphore? How do they work? (Example in C)

Captions
Hey everybody, today's topic is semaphores: what they are, how they work, and of course an example.

Welcome back, everyone. For those of you that are new to the channel, all source code is available through Patreon (information in the description), and a big thanks to everyone that supports this channel and helps me make these videos.

So I recently made a video about shared memory, and in that video I mentioned that sometimes when you're using shared memory you need semaphores. Today I want to dig into that topic a little bit and see if I can provide some clarity and help you understand a few things about semaphores.

Semaphores fall into this big category of things that we call synchronization primitives, which are the things that help us coordinate activity between multiple concurrently running threads or processes. Concurrent just means that they're working at the same time, and we're particularly interested in threads or processes that have to share information, memory, or data with each other; they're working together to solve a common problem or toward a common goal. Other primitives include mutex locks, condition variables, monitors, and barriers, none of which are the topic for this video, but of course let me know if there are any of these topics that you'd like to hear more about. I have talked about mutex locks and condition variables in other videos; I'll link to those in the description. I've also talked a lot about threads and processes in other videos, and I'll put links to all of those in the description too, because you might want to check those out if you're watching this video and you get to a point where you're feeling lost.

But our topic for today is semaphores, and semaphores are fun. They were developed (invented, created, whatever) by Dijkstra back in the days when dinosaurs ruled the earth, long before the Microsoft Zune became a thing for, what, a month or two. But unlike the Zune, we still occasionally use semaphores, so we're talking about them today rather than
the Zune.

So what is a semaphore? A semaphore is basically an unsigned integer with some quirks. One of those quirks is that changes to the integer value are atomic, meaning that if one thread or process increments the integer and another wants to decrement the integer, those increment and decrement operations cannot interrupt each other. Another quirk is how we increment and decrement: we can only interact with semaphores using two operations, wait and post. (Post, by the way, is sometimes called signal.) So we don't just access a semaphore's value directly; if you want to do anything with it, you either call wait or post.

Now what do they do? Both are pretty simple. Wait tries to decrement the value of the semaphore. If the value is greater than 0, then it succeeds: it decrements the value and it returns. If the value is equal to 0, it waits (hence its name, wait), and it waits until the semaphore's value becomes positive again; once the value is positive, it's able to decrement it, and then it returns. Post, on the other hand, just increments the value of the semaphore and returns. So that's all they do. Now remember, these operations are atomic. That's really important; none of this is going to work if they're not atomic.

So this is how it works. Let's say that I create a semaphore and give it a value of one, and let's say I have three threads or processes sharing that semaphore. If thread A calls wait, then the value drops from one to zero. Now say thread B calls wait. Now the value is zero, so thread B can't decrement it, and it just waits. It called wait, and wait is just going to sit there until another thread, say thread C, comes along and calls post. When post is called, the value increases, and thread B is then allowed to decrement the value and return from wait. So did you follow all that?

Now you may be asking, what happens if thread A and B call wait at the same time? Like I said, wait is atomic, so one of them is going to go first and the other is going to go second.
We don't know which one, but you're never going to get a situation where both see a positive value, both think they can decrement the counter, and neither of them waits. If the value is one and two threads or processes call wait, one of them is going to decrement successfully and the other is going to wait until there's a post.

But okay, fine, what are semaphores good for? Sometimes they're used like mutex locks to protect some critical shared resource. These semaphores are called binary semaphores since they're only allowed to have the values one and zero. You can initialize the semaphore to one, and then whenever a thread or process wants to access the shared resource or critical section of code, that thread will first call wait, which is like grabbing a lock. When the thread is finished with the shared resource, it can then just call post, which is like releasing the lock. And of course, if another thread calls wait between my wait and post calls here, then they have to wait; once I call post down here, they can proceed.

Now just because you can use a semaphore like a mutex lock doesn't mean you should. There are some important differences. Mutex locks have this notion of ownership: there's this idea that I hold the lock, and whichever thread or process grabbed the lock is the one that needs to release it. This notion is nowhere to be found with semaphores. With semaphores, any thread or process can call wait, and they can all call post at any time, under any circumstances. So semaphores are more flexible than mutex locks, and that flexibility allows programming students to design a wide range of elaborate coordination schemes and tie themselves in glorious knots. So in practice, my personal preference is that I don't use semaphores if I can use a mutex lock in their place. But I do want to show you an example today of how semaphores can actually be used and where they make sense, so let's get into some code.

Now I'm going to use
the code from my recent shared memory video as a starting place. You'll remember that I set up a block of shared memory, and I had programs that would write text into that shared memory and read text from the shared memory, and then there was one that destroyed the block of memory. In that example we didn't need a lot of coordination, because I was just running my programs independently, not at the same time, and so there wasn't any contention for that shared memory. It was happening at different times, so there was no risk. But in this example I want to change things up a bit so that my reader program and my writer program have to coordinate.

Most of the time when I find myself using semaphores, it's the situation where one process is producing some data and another process is consuming that data, and that's what we're going to do here. The writer is going to write a bunch of messages into the memory; it's going to be the producer. The reader is going to consume those messages. Really it's just going to print them out, but it could be doing something more meaningful with them, which is what would happen in a real program that you were writing.

Now let's start with the code that reads the shared memory and prints the message. Let's put that in a while loop, and we'll have it loop forever until someone puts "quit" into the memory; that'll be our signal that it is time to be done. Each time through the loop we're also going to reset the memory so we don't consume the same message twice, because we don't want to get double prints. Then let's jump over to the writer. This is our producer, and just to illustrate why we need semaphores, let's put the writing in a loop as well. So now we're going to put messages into the shared memory over and over again, and the idea is that we want the consumer to get each of these messages and print them out. This of course is a very simplified example.

Now we can jump over to the terminal. I'm using two windows here to
make it easier for you to see what's going on. We can compile our code; that seems fine. Now let's run the reader program, and you notice that now it just sits there waiting for something to happen. Okay, now let's make something happen by running our writer program over in the other window. And it works, sort of-ish. Not really. We tried to send 10 messages and we only got one, so we got something, but it's not what we were looking for. But the program's still waiting; it didn't crash. So let's try something else. If I do something like this and run a bunch of different writers concurrently, all trying to write stuff into the buffer, well, then we get more messages received, but you notice we're still missing a lot of messages. So still not exactly what we had in mind.

Now the problem here is simple: there's just no coordination. The reader doesn't know when a message is ready, and the writer doesn't know when the reader is finished with the last message, so it just keeps overwriting the old message. This is where we're going to bring in our semaphores.

Now, as with so many things in computing, there are a bunch of different ways to create semaphores. We have named semaphores and unnamed semaphores. We have POSIX semaphore functions and System V semaphore functions. They all work fine, but some functions only work with other functions, so you get clusters of functions. For this example we're going to use named semaphores, and we're going to use sem_open, sem_wait, sem_post, and sem_close.

When using named semaphores we need to give them names, obviously. They're named similarly to how we gave our shared memory blocks names. One difference is that the names here don't have to map to an actual file on the disk; otherwise it's very, very similar. In this example I'm going to use two semaphores: one for the producer to signal when it's done producing, and one for the consumer to signal when it's done consuming, meaning that someone is free to add another message into the block
of shared memory.

We can go into our reader program and set up our two semaphores. The first thing I'm going to do, just to be careful, is to remove any semaphores that have these same names. Say that we ran the program and it crashed and left an old semaphore around in the system: we don't know what state it's in, so I'm just going to remove it to be safe.

Now we create our semaphores by calling sem_open. We give it the name we want it to use, we tell it to create the semaphore if it doesn't exist, we specify its access privileges, which are just like file privileges, and the last argument is the initial value we want our semaphore to start with. In this case we'll start it out at zero. Remember what starting out at zero means: calling wait on this semaphore right out of the chute is going to wait. It's already at zero, so it can't decrement it; it's just going to wait until we get our first post call. And of course we need to check to see if the function fails; if it did, we just print out an error message and exit, because this is just a demo program.

Then we're going to do the same thing with the consumer semaphore; we're just changing the names. One thing to notice, though, is that we did start this semaphore off at a value of one. That means that at the beginning, the first process to call wait on this semaphore will not wait; it will successfully decrement and return. That's going to be important later on when we actually use the semaphores.

Now down here in our loop, each time through the loop we're going to call sem_wait on the producer semaphore. This means that it's going to wait until a producer produces something. Once we're done, we call sem_post on the consumer semaphore, and that's going to let exactly one of the producers add data to the block of shared memory. Then, just to be tidy, let's close our semaphores when the program is done. And that's it for the reader. Now let's look at the writer.
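The reader-side setup described above might look roughly like the following. This is only a sketch, not the code from the video: the semaphore names, the function names, the 0660 permission value, and the "Reading" output format are all assumptions made for illustration, and `block` stands in for the already-attached shared memory block.

```c
#include <fcntl.h>      /* O_CREAT */
#include <semaphore.h>  /* sem_open, sem_wait, sem_post, sem_close, sem_unlink */
#include <stdio.h>      /* printf, perror */

/* Hypothetical names, chosen for this sketch only. */
#define SEM_PRODUCER_NAME "/demo_producer"
#define SEM_CONSUMER_NAME "/demo_consumer"

/* Creates both semaphores the way the reader is described as doing.
 * Returns 0 on success, -1 on failure. */
int reader_create_semaphores(sem_t **producer, sem_t **consumer)
{
    /* Remove leftovers from a crashed run; it is fine if these fail. */
    sem_unlink(SEM_PRODUCER_NAME);
    sem_unlink(SEM_CONSUMER_NAME);

    /* Starts at 0: the reader's first sem_wait blocks until a post. */
    *producer = sem_open(SEM_PRODUCER_NAME, O_CREAT, 0660, 0);
    if (*producer == SEM_FAILED) {
        perror("sem_open/producer");
        return -1;
    }

    /* Starts at 1: the first writer to call sem_wait gets through. */
    *consumer = sem_open(SEM_CONSUMER_NAME, O_CREAT, 0660, 1);
    if (*consumer == SEM_FAILED) {
        perror("sem_open/consumer");
        sem_close(*producer);
        return -1;
    }
    return 0;
}

/* One pass of the reader loop: wait for a message, use it, free the slot. */
void reader_consume_one(sem_t *producer, sem_t *consumer, char *block)
{
    sem_wait(producer);               /* blocks until a writer posts */
    printf("Reading: \"%s\"\n", block);
    block[0] = '\0';                  /* reset so it is not consumed twice */
    sem_post(consumer);               /* lets exactly one writer proceed */
}
```

A real reader would call `reader_consume_one` in a loop until it sees the quit message, then call `sem_close` on both semaphores. On some systems the program has to be linked with `-pthread`.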
Okay, what we're going to do over here is going to be pretty similar, so let's copy what we have here and bring it with us. Now I'm assuming that the reader will always start first, so these calls to sem_open won't create new semaphores; they're just opening them, basically requesting access so they can use them. And they will fail if the semaphores don't exist, meaning that the reader isn't running yet, so it's important that the reader be running when the writer starts.

Then down in our loop we're going to do something similar to what we did in the reader, but we're going to flip the use of the semaphores. Here we're going to call sem_wait on the consumer semaphore. This time we're just waiting until we get a signal from the consumer saying, hey, it's your turn to put something into the buffer. Then once we copy our text into the buffer, we call sem_post on the producer semaphore, signaling the reader that we've put something in the memory and it's ready for consumption. I also need to remove these lines here, because we don't want our producers to be destroying our semaphores; that won't end well.

And now back to the terminal. We compile it; okay, that looks fine. We run our reader, and it waits, just like it's supposed to. The one thing I didn't mention is that before I added my semaphores, our loop was spinning pretty hard; it was just checking the shared memory over and over and over again. Now with the semaphores, my laptop's fan is a lot quieter. You probably can't hear the difference, but it's a nice change as well. This is very similar to what we did with condition variables in my multi-threaded server video series, in case you're interested; you can check that out. We could have used semaphores over in that video as well.

But now if I run my producer, you see that I get all the hello messages, and if I run a bunch of producers, you can see I get all of those messages. And note that if I run this over and over again, they're
not always in the same order, and that of course is going to depend on how the scheduler chose to run those processes. But the point is we're not losing any of the messages, which was our goal.

Now there are other ways we could have done this without using shared memory and semaphores: we could have used pipes, named pipes, sockets, or message passing. There are also a lot of other things we can do with semaphores. Just be careful; like I mentioned before, getting fancy with semaphores is a great way to end up with software that doesn't work.

Please let me know in the comments if this was helpful to you, if you'd like to see more videos like this, or if you'd like to see me dive under the hood with things like pipes or some other communication mechanism. Be sure to check out my other videos like these, subscribe to the channel so you don't miss the next one, and until then, stay safe and happy coding.
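For readers who want to try the two-semaphore handshake end to end, here is a minimal single-file sketch. It is not the code from the video. To stay self-contained it forks a child to play the writer and uses an anonymous `mmap` region in place of the shared memory block from the earlier video. The semaphore names, the function name `run_handshake_demo`, and the message text are all hypothetical.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define BLOCK_SIZE 128

/* Sends `count` messages from a child (writer) to the parent (reader).
 * Returns how many messages the reader received, or -1 on setup failure. */
int run_handshake_demo(int count)
{
    const char *prod_name = "/demo_hs_producer";  /* hypothetical names */
    const char *cons_name = "/demo_hs_consumer";

    /* Stand-in for the shared memory block used in the video. */
    char *block = mmap(NULL, BLOCK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (block == MAP_FAILED) return -1;

    sem_unlink(prod_name);                 /* clear any stale semaphores */
    sem_unlink(cons_name);
    sem_t *prod = sem_open(prod_name, O_CREAT, 0660, 0); /* nothing ready */
    sem_t *cons = sem_open(cons_name, O_CREAT, 0660, 1); /* slot is free  */
    if (prod == SEM_FAILED || cons == SEM_FAILED) return -1;

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {                        /* child: writer (producer) */
        for (int i = 0; i < count; i++) {
            sem_wait(cons);                /* wait until the slot is free */
            snprintf(block, BLOCK_SIZE, "hello %d", i);
            sem_post(prod);                /* tell the reader it is ready */
        }
        sem_wait(cons);
        strcpy(block, "quit");             /* ask the reader to stop */
        sem_post(prod);
        _exit(0);
    }

    int received = 0;                      /* parent: reader (consumer) */
    for (;;) {
        sem_wait(prod);                    /* sleeps instead of spinning */
        if (strcmp(block, "quit") == 0) break;
        printf("Reading: \"%s\"\n", block);
        received++;
        sem_post(cons);                    /* free the slot for the writer */
    }

    waitpid(pid, NULL, 0);
    sem_close(prod);
    sem_close(cons);
    sem_unlink(prod_name);
    sem_unlink(cons_name);
    munmap(block, BLOCK_SIZE);
    return received;
}
```

Calling `run_handshake_demo(10)` from `main` should deliver all ten messages, because the writer cannot overwrite the block until the reader posts the consumer semaphore. Removing the `sem_wait(cons)` and `sem_post(cons)` calls is one way to see messages being overwritten and lost.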
Info
Channel: Jacob Sorber
Views: 90,795
Rating: 4.9583912 out of 5
Keywords: semaphore, semaphores, semaphore example, semaphore in C, c tutorial, semaphore tutorial, ipc, synchronization, synchronization primitives, programming with semaphores, thread, processes, threads, coordinating threads, race conditions, signaling, post, wait, sem_wait, sem_open, sem_post, sem_close, sem_unlink, posix semaphores
Id: ukM_zzrIeXs
Length: 13min 26sec (806 seconds)
Published: Tue Aug 25 2020