Lecture 8: Semaphores and Monitors

Captions
all right, so let's pick up where we left off. Remember we were talking about synchronization. We were primarily talking about locks, which allow us to get mutual exclusion and have critical sections where only one process or one thread is executing at a time. And remember we talked about how we need hardware support for these kinds of things, and we talked about two specific ways to implement locks: disabling interrupts, and using atomic test-and-set instructions. And then we talked a little bit about the idea of busy waiting and how you can avoid that. So today we're going to talk about semaphores and monitors. We'll start with semaphores. Basically, semaphores are a way that we're going to generalize the idea of locks. It's another synchronization primitive, and we can use it the way we use a lock, but we can also use it to do things that you can't easily do with locks. Essentially, a semaphore boils down to just an integer: a semaphore is essentially just an integer with some special operations that you use to update its value, but ultimately it's just a number. Now, there are basically two types of semaphores. The first type is what we call a binary semaphore, and this is basically the exact same thing as a lock: we're going to use it to guarantee mutual exclusion on some shared resource. So this is again an integer, except it's only going to have two values, 0 and 1, and of course we'll initialize it to free. This should look very familiar from last class; it's essentially the same idea as a lock: the lock is either free or it's taken by some process. But the second type, which is more interesting, is what we call a counting semaphore. This is going to allow us to have multiple shared resources that we manage using one synchronization primitive, one semaphore. Because remember, when
we use a lock, one lock manages one resource; with a counting semaphore we're going to be able to manage multiple resources. So, as I said, a semaphore is a number, but we have two main operations, which should look fairly similar to what we saw with locks: wait and signal. And then we have some critical section that goes in between those two operations. If you want to use this just like a lock, you have some semaphore s, you call s.wait(), and that waits until the semaphore is available; then you execute the critical section, where you know that only one thread is executing at a time; and then you call s.signal(), which basically tells other processes that want to use the semaphore that it's now free. So that's using it just like a lock. And like a lock, each semaphore has a list of processes that are waiting on it, and calling signal tells the next process waiting in line that it's free. The way in which this differs from a lock is that a semaphore can have a value greater than one. A value greater than one basically means that multiple processes can hold the semaphore at the same time: with a lock, as soon as one process takes the lock, everyone else blocks; with a semaphore you can have values higher than 0 or 1, which says, for example, that three different processes can be using the semaphore at the same time. We'll get into exactly how that works, but first let's look at the really simple case: the binary semaphore, which is essentially the same as a lock. If we look at the too-much-milk example, previously with locks we were talking about acquire and release: thread A calls lock.acquire, and now it's in the critical section and knows that no other thread is
executing there; it checks whether to buy milk, buys it if needed, and then releases the lock. Using a binary semaphore, it's exactly the same thing, only you're calling wait and signal rather than acquire and release. So let's see how this actually works. Here's how we can implement a semaphore, and this should look pretty similar to a lock: we have wait and signal, we have some value — which, in the case of a lock, was whether the lock was free or not; here it's just some integer — and then we have a queue of processes that are waiting. So what's the one difference in how we define the semaphore versus a lock? Can anyone say? How about this constructor: when you create a semaphore, we're passing in an integer. That integer is going to be the initial value of the semaphore, which might be 0, or 1, or some higher value, and it's essentially the number of resource slots that the semaphore has. In the case of the lock, we assumed you have one slot: one process can take the lock at once. With the semaphore, you can have multiple. So when we call wait, we take that value and decrement it: value = value - 1. If the value is then less than zero, that essentially means there are no slots available, so the caller is going to block: we add that process to the waiting queue and block it. And when we call signal, we do the opposite: we take the value and increment it by one, and then, if the value is less than or equal to zero, we remove a process from the queue and wake it up. So first of all, why are we doing this check of whether value is less
than or equal to zero? What does it mean if value is less than or equal to zero? Yep — if there are no processes using it? Not necessarily. If the value is less than or equal to zero, what does that mean about how processes have been calling wait? Let's assume we created the semaphore with an initial value of one. The first process that wants to use the semaphore calls wait — so then what is value going to be? Right, zero. The first process comes in, calls wait, and now the value is zero. Now a second process calls wait: it's going to set value to negative one, if the first process is still running. So basically, a negative value means there's a process waiting on the semaphore that isn't able to use it right now. Whereas if value is positive, what can we say about threads waiting on the semaphore? Right — there can't be any. If the value is positive, there can't be anything waiting, because positive means you have open slots. So whenever we increment value and it's still at or below zero, we take the next process off the waiting queue and let it proceed past its wait; if value is positive, no one is actually waiting on the semaphore. So let's think of a few simple examples of why we might want something like this. When you create the semaphore, you give it some value, and that says how many processes can use the semaphore at once. Can anyone think of an example where you might want to use something like this? What's a scenario where this is better than a lock, where you essentially have one resource, versus here, where you may have, say, five of the same resource that you can give access to?
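To make the wait/signal bookkeeping concrete, here is a rough Python sketch of the semaphore just described: the value, the queue of waiters, the decrement in wait, and the increment in signal. It's an illustration, not the lecture's actual code: a real implementation makes wait and signal atomic by disabling interrupts or using test-and-set, while here a `threading.Condition` stands in for the block/wakeup machinery.

```python
import threading
from collections import deque

class Semaphore:
    """Counting semaphore following the lecture's wait/signal pseudocode.
    A Condition stands in for the kernel's block/wakeup mechanism."""
    def __init__(self, initial_value):
        self.value = initial_value          # number of free resource slots
        self._cond = threading.Condition()  # makes wait/signal atomic here
        self.waiting = deque()              # queue of blocked threads

    def wait(self):
        with self._cond:
            self.value -= 1
            if self.value < 0:              # no slot free: block this thread
                me = threading.current_thread()
                self.waiting.append(me)
                while me in self.waiting:   # sleep until signal() removes us
                    self._cond.wait()

    def signal(self):
        with self._cond:
            self.value += 1
            if self.value <= 0:             # someone was waiting: wake next
                self.waiting.popleft()
                self._cond.notify_all()
```

With an initial value of 2, two waits succeed immediately (value goes 2, 1, 0), and a third wait would block until someone signals.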
Any ideas? Yeah — sure, that's actually a good example: suppose you have a lot of processes that need to do something, and your computer has four cores, so you want to execute four of them at the same time, but not more than four, because you only have four cores. Every time you take a unit of work and start running it, it'll run on some core; you don't care which core it runs on, you just want it to run, and you want four at once. So there you could use a semaphore, give it a value equal to your number of cores, and decrement the semaphore whenever you dispatch a unit of work. What that does is: if you have a core that's not being used, you send it the work and it starts running on one of your cores; and if all of your cores are busy, then you block until one of them has finished the previous piece of work it was doing. A concrete example of how we might use this — I think I mentioned this last class — is a web server. A common way a web server is written is that you have some thread that is listening for page requests, and whenever you get a page request, you have basically a pool of worker threads, and you dispatch the request to one of those threads to actually read the page and send it out. So when you have some fixed number of worker threads, you can use a semaphore to keep track of handing the web requests out to your worker threads. So that's another example. And, not pictured up here, but remember we have the same requirements in terms of executing things atomically that we had with a lock: we need to ensure that wait and signal execute atomically, so even though it's not pictured, we have the same issue of either needing to disable interrupts or use test-and-set to make sure that wait and signal are atomic.
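The cores example can be sketched with Python's built-in `threading.Semaphore` (which spells the operations `acquire`/`release` instead of wait/signal). The four-core count, the `run_job` name, and the squaring "work" are all made-up stand-ins for real units of work; the `peak` counter is only there to observe that concurrency never exceeds the semaphore's initial value.

```python
import threading

NUM_CORES = 4                           # assumed core count for this sketch
slots = threading.Semaphore(NUM_CORES)  # one slot per core
active = 0                              # jobs running right now
peak = 0                                # highest concurrency observed
stats_lock = threading.Lock()
results = []

def run_job(n):
    global active, peak
    slots.acquire()                     # wait: blocks once 4 jobs are running
    with stats_lock:
        active += 1
        peak = max(peak, active)
    results.append(n * n)               # stand-in for the real unit of work
    with stats_lock:
        active -= 1
    slots.release()                     # signal: frees a slot for the next job

threads = [threading.Thread(target=run_job, args=(n,)) for n in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All twenty jobs complete, but at most four are ever inside the acquire/release section at once.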
Because otherwise, if two threads are executing in that code at the same time, the semaphore is not necessarily going to work like we want it to. Okay, so let's walk through a really simple example of using a semaphore. Let's say we have process P1, which is going to call wait twice and then signal twice, and process P2, which is going to call wait once and signal once. And remember, the order in which P1 and P2 execute is determined by the CPU scheduler; we don't explicitly control it. So here's just one possible interleaving. Let's say the initial value of the semaphore is two — that's the initialization value we give it — and of course initially the queue is empty, nothing is waiting on the semaphore, and both P1 and P2 are executing, or at least ready to execute, on the CPU. Let's say P1 first calls s.wait(). After calling that, what is value going to be equal to? Right, one: P1 calls wait, so we decrement the value of the semaphore, which was previously two, and now it's one. So does P1 block, or does P1 keep executing? Right, P1 keeps executing, because the value is still non-negative, and of course P2 is still executing too. So now the value of the semaphore is one, nothing is waiting, and both processes are still running. Now let's say P2 calls s.wait(). What's the new value going to be? Zero, right — we decrement it again, and same as before, the value is still non-negative, so the queue is still empty and P1 and P2 are both still executing. Now let's say P1 calls wait again. Remember, this is a little bit different from how we were using a lock,
because with a lock you either have the lock or you don't. A semaphore is fundamentally just a number, so you can call wait multiple times and you can call signal multiple times; all you're doing is manipulating this integer. So let's say P1 calls wait again. What's value going to be now? What does wait do to the integer value of the semaphore? Right, it decrements it by one, so it's going to be negative one. So now that the value of the semaphore is less than zero, what's going to happen to P1? Right, P1 is going to block and be put on the waiting queue. The semaphore had two resource slots to hand out; it released one each time wait was called, and when wait is called a third time, it blocks the thread that called it. So now the value is negative one, P1 is on the queue and blocked, and P2 is still executing. Now let's say P2 calls signal. What's going to happen to value? Right, we add one, so it becomes zero. And what happens to P1? Right, P1 unblocks: it's taken off the queue and starts executing again. And then the last two times P1 calls signal just increment value from 0 to 1 and then from 1 to 2. Is the basic idea clear? Every time you call wait, you decrement the value by one, and if you go below zero the thread blocks; whenever you call signal, all you're doing is incrementing the value by one, and if someone is waiting on the semaphore, you take the next waiter off the queue and dispatch it. Yeah? Um, potentially, if you were waiting on multiple different events to happen — but we'll get to that; we'll get to a few examples of how you'd actually want to use semaphores. Yeah?
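We can replay the counting part of this walkthrough with Python's built-in `threading.Semaphore`. One caveat: Python's semaphore never stores a negative value — a wait that would drive the value below zero simply blocks — so this sketch uses non-blocking acquires to show exactly where P1's third wait would block.

```python
import threading

s = threading.Semaphore(2)         # initial value 2, as in the walkthrough
took1 = s.acquire(blocking=False)  # P1's first wait: 2 -> 1, succeeds
took2 = s.acquire(blocking=False)  # P2's wait: 1 -> 0, succeeds
took3 = s.acquire(blocking=False)  # P1's second wait: would go below zero,
                                   # so this is where P1 would block
s.release()                        # P2's signal: a slot frees up
took4 = s.acquire(blocking=False)  # the blocked wait can now proceed
```

The first two acquires succeed, the third fails (it would have blocked), and after one release the waiting acquire goes through.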
Right — so when you're using a semaphore with a value of one, that's where you're making it a lock. If you make a semaphore and give it an initial value of one, it operates just like a lock and only lets one process in at once, so you can make a critical section that way — that was the too-much-milk example we looked at, where we used wait and signal with a critical section in between. But when you have a value greater than one, you are going to have multiple processes executing that section at once. Other questions? Basic idea clear? Okay. So now let's consider how we might actually want to use this. The first example I just gave: if you want mutual exclusion using semaphores, all we have to do is make a semaphore with an initial value of one — a binary semaphore — and it works exactly like a lock. If you want a critical section, you call wait on it before the critical section, which sets the value of the semaphore to zero; then if anyone else calls wait, the value goes to negative one and they block. Once you're exiting the critical section, you call signal, and the value either goes back to one, if no one is waiting on the semaphore, or, if someone is blocked, it goes back to zero and the next process starts executing again. But now, another thing we can do with semaphores that we can't do so easily with locks is scheduling constraints. What I mean by that is that we often have pieces of code that we need to execute in a certain order, and remember, in general the CPU scheduler is interleaving things, so we don't have control over it — but we're going to be able to use semaphores to impose some ordering constraints. Let's look at a simple example of how to do that. So let's say you have two threads,
thread 1 and thread 2. Let's say that in thread 2 we're going to execute a function, take exam, and in thread 1 we're going to do the studying. Obviously we want one of these things to execute before the other: in general, thread 1 or thread 2 might execute in either order, we don't know, but we want to force the studying to execute before take exam. The way we're going to do this is with a semaphore s, and we're going to set its initial value to 0. So this is a little different — it's not like a lock, but we're also not using a big number; we're just setting it to 0. Anyone see how we can use a semaphore with initial value 0 to force the ordering here? Yeah — right: what we don't want is for take exam to execute before the studying, so if thread 2 is executing first, we want it to block. So in thread 2 we can say s.wait() before take exam, and over in thread 1 we say s.signal() after the studying. Now these can execute in either order. If thread 2 is ready first, it calls wait, the semaphore decrements from zero to negative one, and thread 2 blocks; then, when we get to thread 1, maybe much later, it calls signal, the semaphore goes back up from negative one to zero, and thread 2 starts executing. If the CPU scheduler decides to run thread 1 first, then signal gets called first, the semaphore goes from zero to one, and now that the value of the semaphore is one, when thread 2 calls wait it doesn't block at all; it just keeps going. So this is what we mean by a scheduling constraint: we ensure that some piece of code happens before some other piece of code, regardless of how the CPU scheduler decides to interleave things. Based on the use of the semaphore, we know that thread 1's code here is going to execute first.
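Here is a minimal sketch of this ordering constraint using Python's `threading.Semaphore` (acquire/release play the role of wait/signal). The `study` and `take_exam` names are illustrative, and the `order` list is added only so we can observe the enforced ordering.

```python
import threading

order = []                      # records execution order for the demo
ready = threading.Semaphore(0)  # initial value 0: take_exam must wait

def study():                    # thread 1: must run first
    order.append("study")
    ready.release()             # signal: studying is done

def take_exam():                # thread 2: must run second
    ready.acquire()             # wait: blocks until study() has signaled
    order.append("take exam")

t2 = threading.Thread(target=take_exam)
t1 = threading.Thread(target=study)
t2.start()                      # start the exam thread first on purpose
t1.start()
t1.join()
t2.join()
```

Even though the exam thread is started first, the semaphore forces the studying to happen before the exam.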
Make sense? Questions on how this is working? So we can also do something similar here: let's say you have a third thread, thread 3, and thread 3 is going to, let's say, get some sleep. We want both thread 3 and thread 1 to execute before thread 2. How can we do that using the same basic idea — what's the one little modification we have to make? Yeah — wait twice in thread 2? Yes, that's one approach: you can have s.wait() and then s.wait() again, and of course the same thing here — thread 3 is also going to call s.signal(). So then, once the first preliminary task is done, you get past the first wait, and then the second, and you ensure that when you hit take exam, both thread 1 and thread 3 have run. What's another way you could actually do this? Yep — right, you can take the initial value and make it not 0 but negative one; then you only need the one wait, and thread 2 is not going to proceed until the semaphore has been incremented twice. People see how that works? Right, because then it's not going to be positive again until both of the others call signal. Yep — and if you were ordering more things, what you'd probably do is use multiple semaphores: we saw how you can order two things with one semaphore, so you can always order more things by using more than one semaphore. Other questions before we continue? Okay. So let's say that you and your roommate, inspired by the past few days of working on too much milk, have divided up responsibilities: you're not both going to go to the store. One of you is going to buy milk, and your roommate is not going to buy milk — your roommate is going to buy cookies. And then, once you have milk and cookies, you're both going to eat.
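The wait-twice variant just described can be sketched the same way: the exam thread acquires a zero-initialized semaphore once per prerequisite. The prerequisite names are made up for the illustration.

```python
import threading

done = threading.Semaphore(0)  # initial value 0: nothing finished yet
order = []

def prereq(name):
    order.append(name)         # the preliminary task (study, sleep, ...)
    done.release()             # signal: one prerequisite finished

def take_exam():
    done.acquire()             # wait for the first prerequisite
    done.acquire()             # ...and for the second; proceeds only
    order.append("exam")       # after both have signaled

threads = [threading.Thread(target=take_exam),
           threading.Thread(target=prereq, args=("study",)),
           threading.Thread(target=prereq, args=("sleep",))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever order the scheduler picks for the two prerequisite threads, "exam" always comes last.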
So what's the scheduling constraint we want to impose here? Right: you want to make sure that both buy milk and buy cookies have executed before either of you executes eat. Any ideas how we might go about doing this with semaphores? Yeah — right, we're going to have semaphores doing something before we call eat, because that's where we want to potentially block threads. Okay, yeah — yes, perfect: we are going to use two semaphores, s1 and s2. And what should their initial values be — negative one? Well, the idea is that in one of these threads we're going to call s1.signal() — leave a bit of space here — and then s2.wait(), and in the other one we do the opposite: s2.signal() and then s1.wait(). So what do we want the initial values to be? Remember, when we call s1.signal we're incrementing s1. Let's say thread 1 executes first: thread 1 buys milk and then calls s1.signal, so s1 gets incremented by one. Now let's say thread 1 immediately executes s2.wait. Since thread 2 has not run yet, what do we want to happen? Right: thread 2 has not yet run, and we're down here about to call eat, so we want this to block. If this is the only thing that has executed and we want it to block, what should the value of s2 be? It will block on negative one, right — but if the initial value were negative one, then when thread 2 comes over here and calls s2.signal, the value would still only be zero, and we'd have a problem. S2's wait should cause thread 1 to block, and we want to make sure it unblocks as soon as the other thread calls signal — so these are both going to have initial values of 0.
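A Python sketch of this two-semaphore rendezvous, with both initial values 0 as just discussed. The function names are illustrative, and the `log` list is added only so the ordering is visible.

```python
import threading

s1 = threading.Semaphore(0)   # signaled once the milk is bought
s2 = threading.Semaphore(0)   # signaled once the cookies are bought
log = []

def you():
    log.append("buy milk")
    s1.release()              # signal: tell roommate the milk is there
    s2.acquire()              # wait until the cookies are there
    log.append("you eat")

def roommate():
    log.append("buy cookies")
    s2.release()              # signal: tell you the cookies are there
    s1.acquire()              # wait until the milk is there
    log.append("roommate eats")

a = threading.Thread(target=you)
b = threading.Thread(target=roommate)
a.start()
b.start()
a.join()
b.join()
```

In every interleaving, both purchases appear in the log before either eat.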
So whichever thread — thread 1 or thread 2 — runs first, it's going to increment one of the semaphores and then block on the other one, because the other one is presumably still zero. Then, once the second thread runs, it calls signal, which releases the other thread — the thread that was previously blocked — and its own wait doesn't block at all, because the semaphore it waits on was already incremented by the thread that ran first. So you've essentially enforced that whichever thread runs first blocks before calling eat, and once the second thread is running, it doesn't block at all, the thread that ran first unblocks, and then they both proceed. Makes sense? So this is how we can essentially keep two threads in lockstep using semaphores: we can ensure that they both get to the same point before either of them is able to continue. Any questions on how this is working? So let's consider a slightly more complicated example. Let's return to the producer-consumer example we talked about a few classes ago. Remember, the idea is you have a shared buffer, which you can just think of as a shared array of data items; you have producer threads that are making items and putting them into the shared buffer, and you have consumer threads that are consuming items off of the shared buffer. So we have two types of threads: producers and consumers. And also remember we had two versions of the producer-consumer problem: one where you bound the size of the buffer, and one where you don't. So let's look at this code for doing producer-consumer using semaphores: we have separate code for the producer and the consumer, we have the shared buffer of items, and we have some semaphores. So first of all, based on this constructor, is this
a bounded or an unbounded buffer — are we limiting the number of items that can be stored? Right: we're creating this bounded buffer with a parameter n, and we're creating the buffer with n slots. So first, without considering exactly how this code works, what are the two scheduling constraints we have when we're doing producer-consumer? What can we say about the order in which things can or can't run? Let's consider the consumer: when is the consumer allowed to run? Right — the scheduling constraint is that the consumer can only run once something has been produced and is in the buffer. That's the first scheduling constraint. What's the second one? Right — because this is a bounded buffer, the producer can only produce when there's at least one empty slot. So those are the two scheduling constraints, and we're going to enforce them using two different semaphores. We have the semaphore empty, which has an initial value of n. Each time the producer runs, it uses up one of those empty slots, so the value of empty gets decremented every time: the producer calls empty.wait(), which reduces the value of empty. So what exactly is this enforcing? How many times could producers run before the next producer gets blocked? Suppose you have 20 producers and 20 consumers, and the CPU scheduler could run them in any order — what can we say about the maximum number of producers that can run back-to-back? Remember, each time we come here we decrement the value of empty — so when is that going to block? When is the value of empty going to cause a block,
or rather, when is empty.wait going to block? Yep, right: when the value of empty is zero, which means all of the slots of the buffer are full. Because each time we consume something, we call signal on empty — each time the consumer runs, the value of empty goes up by one, and each time we call produce, the value of empty goes down by one. So if empty gets all the way to zero, that means every single slot in the buffer is full, and if another producer comes in and tries to execute, it's going to call empty.wait and it's going to block — because that's one of the constraints: we need at least one empty slot to allow a producer to run. By setting the initial value of empty to n, we're enforcing that there is at least one empty slot whenever a producer runs, and if there isn't an empty slot, the producer blocks. And we're doing basically the same thing on the consumer side, except rather than enforcing that there's at least one empty slot, we're enforcing that there's at least one item to consume. Initially, how many items are there to consume? Zero, right. So initially, if the first thing that happens is a consumer trying to run, we want it to block, because there's nothing in the buffer for the consumer to consume. So the initial value of the second semaphore, full, is going to be zero, and the consumer calls full.wait() — if there's nothing to consume, the consumer will wait. And here, the producer is calling full.signal() to actually increment that value. So if a consumer comes in, calls consume, and there's nothing in the buffer, it blocks; then once a producer runs, it calls signal on full, increments the value of full, and the consumer can run again. Make sense? Is it clear how we're using those two
semaphores to enforce the two scheduling constraints, one for the producer and one for the consumer? Questions on that? Okay. And then we also have this third semaphore, mutex — you can see that both the producer and the consumer call mutex.wait() and then mutex.signal() after they're done. So what is that doing — what is the function of that third semaphore? Right: the third semaphore is just a lock. This is the binary semaphore — we took the initial value and set it to one — and we're just using it to enforce basic critical sections around the actual item produce or consume. Remember, both threads are accessing the same shared buffer, so we can't let them modify it at the same time. We acquire the lock before we modify the buffer and release the lock after we modify the buffer, and both the producer and the consumer do the same thing. So the mutex semaphore is just being used as a regular lock here, like we already saw last class. Is that clear? The previous two semaphores we're actually using as counting semaphores; the third one we're just using as a plain lock, and that's a binary semaphore. Questions before we move on? Yep — good question: what would happen if we called wait on the mutex first, before empty or full? What might happen? Anyone? Right: basically, you can run into a deadlock, which is where no one can proceed. When you call wait on the mutex, you're essentially shutting out both other consumers and other producers from running until you release the lock. So if I'm a consumer and I call mutex.wait and then full.wait, and let's say there was nothing in the buffer — the consumer is going to block, but remember, the consumer now essentially still holds the lock. So what's going to happen if a
producer tries to run now? Remember, the consumer is waiting for a producer to run — but if a producer runs now, what happens, given that the consumer is still holding the lock? Right: now both consumers and producers are blocked, and no one can actually proceed. So we only take the lock around the point where we're modifying the buffer, and we ensure that you can't get blocked while you still hold the lock — if you didn't enforce that, you could run into a deadlock where no one is able to proceed. Makes sense? Any other questions? Okay. So this is basically what we already covered, but just as an illustration of how the whole thing works: we have the mutex that initially has a value of one, the empty semaphore that initially has a value of n, and the full semaphore that initially has a value of zero. Producers call wait on empty and signal on full, and vice versa for consumers, and that's how we're enforcing all of this. So, just to summarize semaphores: we talked about locks, and semaphores are essentially a generalization of locks — we can use semaphores to do locking and more. We talked about three different uses of semaphores. One, we can use them directly as a replacement for a lock: use a binary semaphore, set the initial value to one, and it's exactly like a lock. Two, we can use them to manage a shared pool of resources: if we have n identical resources — say ten worker threads, or ten CPU cores — we can use a counting semaphore where the initial value is n, some value greater than one. Or three, we can use them to make a thread wait for a specific action from another thread, and there we use a semaphore with an initial value of zero — that's how we constructed the milk-and-cookies example, where we're enforcing that one thing happens before another.
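Putting the whole bounded buffer together — a Python sketch of the scheme just summarized (mutex with initial value 1, empty with initial value n, full with initial value 0), using Python's built-in `threading.Semaphore`. The class shape is my own illustration rather than the slide's exact code.

```python
import threading
from collections import deque

class BoundedBuffer:
    """Bounded buffer guarded by the three semaphores from lecture."""
    def __init__(self, n):
        self.buffer = deque()
        self.mutex = threading.Semaphore(1)  # binary: lock around the buffer
        self.empty = threading.Semaphore(n)  # counting: free slots remaining
        self.full = threading.Semaphore(0)   # counting: items available

    def produce(self, item):
        self.empty.acquire()      # wait for a free slot (blocks if buffer full)
        self.mutex.acquire()      # lock the buffer; taken only around the edit,
        self.buffer.append(item)  # so we never block while holding it
        self.mutex.release()
        self.full.release()       # signal: one more item to consume

    def consume(self):
        self.full.acquire()       # wait for an item (blocks if buffer empty)
        self.mutex.acquire()
        item = self.buffer.popleft()
        self.mutex.release()
        self.empty.release()      # signal: one more free slot
        return item

buf = BoundedBuffer(2)            # only 2 slots, so producers also block
out = []
consumers = [threading.Thread(target=lambda: out.append(buf.consume()))
             for _ in range(5)]
producers = [threading.Thread(target=buf.produce, args=(i,))
             for i in range(5)]
for t in consumers:               # start consumers first: they all block on full
    t.start()
for t in producers:
    t.start()
for t in consumers + producers:
    t.join()
```

Note the ordering inside produce/consume: empty or full is waited on before the mutex is taken, which is exactly the deadlock-avoidance point from the discussion above.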
where we're enforcing that one thing happens before another. So those are the three use cases that semaphores set up. Next, let's talk about monitors and condition variables. Monitors are essentially a more sophisticated synchronization primitive than semaphores: we started with locks, we went to semaphores, and now we're going to go to monitors. But first, let's consider what was sort of not so nice about semaphores. What are some problems people might have with using semaphores? Any ideas? Yeah: you could end up using a lot of them. And just from looking at some code that uses semaphores, when we call wait and signal, is it really clear what that is actually doing? If I give you a lock and you see lock.acquire or lock.release, that's easy to understand. But with semaphore code, what the signals and waits are actually doing is not always clear, because how a semaphore behaves depends a lot on what its initial value was; we just talked about a couple of very different ways you can use a semaphore. So one problem is that semaphores are not very understandable; they're complex, because you can use them in all these different ways. Another problem is that a semaphore is essentially a global integer that multiple threads are modifying, and that's not ideal. There's also essentially no direct connection between the semaphore itself and whatever data the semaphore is being used to protect. It feels like we're inserting the signals and waits all over the place, but it's not really clear what exactly the semaphore is related to, except in the specific places where we inserted the calls.
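Going back to the third use case for a moment, the one-thing-happens-before-another pattern with a zero-initialized semaphore might look like this (a toy example of mine, not from the lecture slides):

```java
import java.util.concurrent.Semaphore;

// A semaphore initialized to 0 enforces "A happens before B": the
// second thread blocks on acquire() until the first step signals.
class HappensBefore {
    private final Semaphore done = new Semaphore(0);
    private final StringBuilder log = new StringBuilder();

    String run() throws InterruptedException {
        Thread second = new Thread(() -> {
            try {
                done.acquire();          // wait for the first step
                synchronized (log) { log.append("second"); }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        second.start();
        synchronized (log) { log.append("first,"); }
        done.release();                  // signal: first step finished
        second.join();
        return log.toString();
    }
}
```

Because the semaphore starts at zero, the second thread always blocks until the signal, no matter how the threads are scheduled.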
Essentially, a lot of this boils down to the fact that there's no structure: it's very easy to make mistakes with semaphores and not end up with the behavior you're trying to get. So instead of semaphores, we're going to consider a better primitive, called monitors. At the highest level, a monitor is basically a class. You all know what classes are from Java or C++: a class is some data and some methods. A monitor is essentially a class that has synchronization operations built into it, a class that provides the synchronization for you. So rather than you having to handle all the synchronization yourself, you can just use a monitor, and it's going to do a lot of the work for you. A monitor differs from an ordinary class in two ways. First, it gives you mutual exclusion: if you have a method in a monitor class, you're guaranteed that only one thread is executing inside the monitor at a time. Second, we require all data to be private, and the reason is that if you had public data, anyone could modify it, and clearly there could be synchronization issues if some other thread is directly modifying your data. Those are the two differences from a standard class when I'm talking about a monitor. So let's look at a slightly more formal definition. A monitor has two additional pieces beyond a regular class, which might have whatever data and methods and so forth. First, a monitor has a single lock; that's the one lock associated with the monitor. And then we have zero or more of what we call condition variables, which give us some additional abilities. I won't
go into the details of those right now, but of course we're going to use that lock to get mutual exclusion: the lock ensures that only one thread is executing a method in the monitor at once. And the condition variables are going to let us put threads to sleep inside of critical sections; I'll get to why that actually matters in a little bit. But first let's look at how we can actually use monitors in a language like Java. Java makes it really easy to build monitors. Say you have a class and you want to make it a monitor; remember, that essentially means you have mutual exclusion on all of your methods. In Java, all you really need to do is add the keyword synchronized. If you make your methods all synchronized, that gives you mutual exclusion: when you mark methods synchronized in Java, only one thread can execute them at once. So we could do something like this, where we have a queue; this is essentially producer/consumer again. You can call add to put an item on the queue, or you can call remove, and remove will take an item off the queue and return it. And since we put the keyword synchronized here, we know that add and remove are not happening at the same time, you're not adding multiple things at the same time, and so on. Again, the reason we need the data to be private is so that while we're executing in one of these methods, some other thread executing other code can't come in and modify public data of the class. So this is safe in the sense that we have mutual exclusion from the monitor. But what's actually missing from this? There's one little piece that's missing from this code. Anyone see what it is?
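The kind of synchronized queue on the slide might look roughly like this (a sketch with names of my choosing; the empty-queue case is deliberately left unhandled, which is the missing piece in question):

```java
import java.util.LinkedList;

// A Java monitor: all data private, all methods synchronized, so the
// object's built-in lock guarantees only one thread runs add or remove
// at a time. Note that remove() does not yet block on an empty queue.
class SyncQueue {
    private final LinkedList<Integer> items = new LinkedList<>();

    public synchronized void add(int item) {
        items.addLast(item);
    }

    public synchronized Integer remove() {
        if (!items.isEmpty()) {
            return items.removeFirst();
        }
        return null; // empty queue: nothing sensible to return yet
    }
}
```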
Well, I haven't actually gotten to what exactly is going on underneath; I basically just told you this is what a monitor ensures. The monitor has a lock, but we don't actually have to manage it; the monitor is doing it for us. That's just a nice feature: we don't have to manage the lock ourselves, in Java at least. But what about this remove method? It says: if the queue is not empty, remove an item and return it. So what's the case that's not covered there? Right: what happens if the queue is empty? The idea is that this is again producer/consumer. We want to consume an item off the queue, so we want to remove an item and return it, but what if there is no item? Of course, you could just return nothing, but let's say we actually want the behavior to be that when we call remove, it blocks until something is available to be removed. That's the idea in producer/consumer: if you're calling consume and nothing has been produced, you want to block until there is something there. So this is essentially another scheduling constraint: if the queue is empty and you call remove, you want to put the thread to sleep. So why can't we just put the thread to sleep right there, and then wake it up somewhere in add, for example? Why can't we just insert a call to sleep there? Right: because remember, the monitor is using a lock, and that lock is enforcing mutual exclusion over all the methods. When you're executing inside this code, you're holding the lock. So if you're here and you call sleep, and your thread goes to sleep, you're still holding the lock; and if someone else later comes and calls add, they're going to block, because you're still holding the lock. So that's the problem we need to deal with here: how can we have the thread
wait, but not continue holding the lock and preventing anyone else from doing anything? To deal with this we're going to introduce the idea of a condition variable, which is basically a way to have a thread wait on some condition and give up the lock while it waits. So what we want to do is change remove so that it actually waits until something is on the queue. As I said, logically that means we want the thread to sleep until something is available, but we can't just take the lock and hold it, because then nothing will ever be able to be added. So we're going to use these condition variables, and the idea is that when you use a condition variable, you wait until a certain condition has changed, and while you're waiting for that condition, you atomically release the lock. A condition variable is basically just a queue of threads that are waiting for some condition to change, and importantly, this happens inside a critical section, because that's exactly the problem we're solving: how can you sleep inside a critical section without holding everything else up? You have three operations on a condition variable: wait, signal, and broadcast. Wait is one atomic operation: you release the lock and go to sleep, and you sleep until the condition has changed. Once the condition has changed, some other thread calls signal, and that wakes up a thread that previously called wait. So to summarize: you call wait while you're still inside the critical section, so you're giving up the lock and putting the thread to sleep, and that happens atomically; then sometime later someone calls signal, and that wakes you back up. And broadcast
is just a variant of signal: you could have multiple threads waiting on the same condition variable, and when you call broadcast, they all wake up. We also have the rule that you can only call these methods when you actually hold the lock: when you call wait you're guaranteed to hold the lock (and you give it up immediately), and in order to call signal or broadcast you must hold the lock as well. So the operations of the condition variable itself also run under mutual exclusion. Let's look at how to do this in Java. In Java the names are a little different: instead of signal and broadcast, it's called notify and notifyAll, but it's essentially the same thing. We have wait to give up the lock, notify to indicate that the condition is satisfied so that something waiting for the condition can go, and notifyAll to wake up everything. In Java, every object can essentially be used as a condition variable, so if you want multiple condition variables you need to define multiple objects, but for this example we'll just assume one condition variable. So here's what we're doing: we modified remove a little, and now when we call remove, we say: while the queue is empty, wait. Remember, we're in a synchronized method, so we're holding the lock, and calling wait atomically puts the thread to sleep and releases the lock, so the thread is still inside the critical section but no longer holds the lock. Then sometime later somebody can come in and call add, and because the thread that was in remove no longer holds the lock, add can still execute: it puts something on the queue and calls notify.
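Putting that together, the modified queue might look like this (again a sketch with my own names; note the `while` loop around `wait`, whose purpose is discussed next):

```java
import java.util.LinkedList;

// The same monitor, now blocking: remove() waits inside the critical
// section. wait() atomically releases the object's lock and sleeps;
// notify() wakes one waiting thread, which must reacquire the lock
// before it continues.
class BlockingQueueMonitor {
    private final LinkedList<Integer> items = new LinkedList<>();

    public synchronized void add(int item) {
        items.addLast(item);
        notify();              // wake a thread waiting in remove()
    }

    public synchronized int remove() throws InterruptedException {
        while (items.isEmpty()) {
            wait();            // give up the lock and go to sleep
        }
        return items.removeFirst();
    }
}
```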
If something is waiting for the queue to have something in it, then that thread will wake up, and once the thread actually wakes up, it's going to have to reacquire the lock, because it's still inside the critical section. Whenever you get woken up after waiting on a condition variable, you have to take the lock again before you can continue, because you're still inside the critical section. So again, the condition here is essentially that the queue is not empty, and we're just calling wait and notify. These are actually methods of any class in Java: as you probably all know, every class in Java is a descendant of the Object class. You all know inheritance, right? When you say class A extends class B, you're inheriting all the methods of class B, and everything in Java inherits from the Object class. The Object class is where wait, notify, and notifyAll are defined, so you can call notify and wait in any class, and you're essentially using the object itself as the condition variable. If you wanted to define multiple condition variables, you could add another private object, say conditionVariable2, and call conditionVariable2.notify and conditionVariable2.wait, and so on. But in this example we're just using the object itself as the one condition variable, which is enforcing that there is actually an item. Now notice that we wrote: while the queue is empty, wait. Why did we not just use an if instead? Any ideas? So remember, when you call wait, you give up the lock and go to sleep, and sometime later notify is going to be called, and that wakes up the thread that was blocked.
Waking a thread puts it back into the ready queue, so the thread is ready to run again, but what I didn't actually say was whether we actually hand the lock back to that thread or not. You could have multiple threads calling remove, and when something calls add, you don't necessarily know which thread is going to get the lock. So we actually have two different types of monitors, Mesa-style and Hoare-style monitors, and the difference is what actually happens when you call signal. Remember, when you call signal you still hold the monitor's lock; we said that was a rule, that when you're calling wait or signal you must hold the lock. In Mesa-style monitors, which is what Java uses, when you call signal you actually keep the lock, and the thread that was woken up still has to wait for the lock. So returning to this example for a second: what might happen after the thread is woken up here, before it actually runs again? Right, exactly. After we call notify, this thread is ready to go again, but it still does not have the lock; it still has to reacquire the lock when it starts running. So in the meantime, some other thread might execute and call remove: it takes the lock, takes the item, and once it's done taking the item, it releases the lock again. And the thread that we woke up might now take the lock and see that the queue is still empty, because the item was taken by some other thread that got scheduled in the meantime. See how that works? So in Mesa style, which is what Java and most operating systems use, when you are woken up you're not explicitly given the lock, so someone else might actually take the lock before you get control of it again. That's why, rather than just an if, you always have to use a while here: it's still possible that the next time you execute the check, the condition still might not be true, because you
were not given the lock immediately as soon as there was another item. The other approach, Hoare style, is not really used in any actual system, but it's often presented in textbooks: when you call signal, you explicitly hand the lock back to the thread that you woke up. If you were using that style, then this would not have to be a while; it could be an if statement, because you're guaranteed that once you are signaled, you have the lock again. And if you have the lock again, what happens if some other thread gets scheduled and calls remove? Say notify is called here, and we wake up the thread and give it the lock; we still don't know anything about how threads get scheduled, so suppose some other thread somewhere else now gets scheduled and calls remove. What happens to that thread? Right: it's going to block, because we already gave the lock away to the thread that we woke up. So in that case an if would be safe here, because you know that regardless of how other threads get scheduled, anything else that tries to take the item is going to block until you get scheduled and can actually remove the item. Make sense? Any other questions on that? Okay, so that's basically the only difference between these two styles of monitors: in one style you may have to wait again, so you always have to use a while loop; in the other, an if statement is sufficient. So we looked at Java, which is pretty nice in that it makes monitors and condition variables easy. It's a little bit more complicated in C++, because you don't actually have that synchronized keyword. So what is the synchronized keyword actually managing for you? What component of the monitor is it essentially doing automatically for you? Right: synchronized is inserting the calls to take the lock when you go into the
method and release the lock when you leave the method. In C++ you have to do that explicitly yourself. So when we're doing this in C++, for each of our methods we don't have the synchronized keyword; instead we have to call lock.acquire at the beginning of the method and lock.release at the end, and that actually ensures the critical sections. Then we have a separate condition variable on which we call wait and signal, much as before. So essentially you have to manage the mutual exclusion yourself, but otherwise the concepts are all the same. So now let's look at the bounded-buffer problem using condition variables. Remember, earlier we solved this using three semaphores: two semaphores for the scheduling constraints on the producer and the consumer, and a third semaphore that was just the lock to ensure the critical sections. So let's look at how we can do this here. We again have the bounded buffer, and now we have two condition variables, because previously, with the semaphores, it wasn't exactly clear what each semaphore was actually protecting. Here we can use more expressive condition variables, basically saying: wait until the buffer is not full, so that I can produce something; or wait until the buffer is not empty, so that I can actually consume something. And this makes more sense when you read the code: when you're calling add, if the buffer is full, you're simply going to wait on not-full until something is removed and then continue; and when you're removing, if the buffer is empty, you're just going to wait on not-empty until there's something to consume.
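One way to sketch this two-condition bounded buffer, with the explicit lock management that the C++ version requires, is with Java's `ReentrantLock` and `Condition` (class and condition names here are my own):

```java
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Bounded buffer where the lock is acquired and released explicitly,
// with one condition variable per scheduling constraint.
class CondBoundedBuffer {
    private final Queue<Integer> buffer = new LinkedList<>();
    private final int capacity;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();

    CondBoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    void add(int item) throws InterruptedException {
        lock.lock();                       // enter the critical section
        try {
            while (buffer.size() == capacity) {
                notFull.await();           // give up the lock and sleep
            }
            buffer.add(item);
            notEmpty.signal();             // a consumer can proceed
        } finally {
            lock.unlock();                 // leave the critical section
        }
    }

    int remove() throws InterruptedException {
        lock.lock();
        try {
            while (buffer.isEmpty()) {
                notEmpty.await();
            }
            int item = buffer.remove();
            notFull.signal();              // a producer can proceed
            return item;
        } finally {
            lock.unlock();
        }
    }
}
```

Since `ReentrantLock` conditions are Mesa-style, the waits here use `while` loops rather than `if` statements.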
So this is essentially the same thing we just went over, just using condition variables rather than semaphores. And notice that we're using if statements here: these are written for Hoare-style monitors. If you're using Mesa-style monitors, it would be exactly the same, except you have to use while statements for both of the condition variables, because it might be the case that another thread could come in and take the lock in the meantime. Now let's consider the differences between semaphores and monitors. Both semaphores and monitors are kind of similar: both have calls named wait and signal. And just so it's clear, remember that condition variables are associated with the lock of a monitor; condition variables and monitors are sort of two halves of the same thing, or rather, a condition variable is associated with a specific monitor and that monitor's lock. So let's ask the question: what exactly is the difference between using a condition variable and calling condition.wait, versus a semaphore where you're calling semaphore.wait? Remember that with a semaphore, when you call wait, you decrement the integer, and you block if it goes below zero; with a condition variable, when you call wait, there's no integer involved, you're just waiting for someone else to call signal (or notify, rather) and wake up the thread. So any ideas what the difference between those two is? Right. Suppose we're using a semaphore, and we have one thread that's calling s.wait and another thread that's calling s.signal. Again, suppose we're using semaphores: if
you call wait first, what's going to happen? Maybe I'm not doing a great job here, so basically the idea is: a semaphore has that integer, and that integer is state that it remembers. When you call the methods of a semaphore, you're modifying that integer, and the state of that integer persists between calls. Now, when you call signal on a monitor, what happens if nothing is waiting on the condition? Right: nothing will get notified; nothing is going to happen. If you call signal on a condition variable and nothing is waiting on that condition variable, nothing happens at all. Whereas if you call signal on a semaphore, regardless of whether anything is blocked or not, you still modify the semaphore's integer. So essentially the semaphore is remembering the waits and signals that were called on it, whereas a condition variable wakes up a thread when you call signal, but if you call signal and nothing is waiting on it, nothing happens. So to summarize the difference: condition variables do not have any history, but semaphores do. When you call signal on a condition variable and no one is waiting, you're not doing anything; when you signal a semaphore and no one is waiting, you're still incrementing the value of that semaphore. So let's say you have a condition variable, and one thread calls signal on the condition variable, and then sometime later a different thread calls wait on the condition variable. What's going to happen? Remember, condition variables don't have any state; they have no memory of what was previously called. If you call wait on a condition variable, then by
definition you're always going to sleep; that's how we defined condition variables. When you call wait, you give up the lock and go to sleep; that always happens, and you only start executing again once someone calls signal on that condition variable. But in the case of a semaphore, if some thread calls signal first and then later some thread calls wait, is that thread going to block, or is it going to keep running? Right: in the semaphore case it keeps running, because the semaphore remembers that someone previously called signal, so the next thread that calls wait keeps executing. Versus in the condition-variable case, there's no memory: as soon as you call wait, the thread always sleeps until a subsequent signal is called. So basically what this means is that with a semaphore, the order in which you call wait and signal doesn't really matter; the end result is going to be the same. Suppose the initial value of the semaphore is zero, and you have two threads that call wait and one thread that calls signal, in some order. How many of the two threads that call wait are actually going to block? If the initial value is zero, then the first thread that calls wait is always going to block, and then if signal is called, that thread gets released again. But the basic point here is that you always end up with the same number of threads that block or continue, regardless of the order of execution of the semaphore operations.
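That asymmetry, a semaphore remembers signals while a condition variable does not, is easy to demonstrate with Java's `Semaphore` (a small demo of mine):

```java
import java.util.concurrent.Semaphore;

// A semaphore remembers signals: releasing first means a later acquire
// does not block, because the stored count was incremented. A condition
// variable has no such memory: a notify with no waiter is simply lost,
// and a later wait() always goes to sleep.
class SemaphoreMemoryDemo {
    static boolean waitSucceedsImmediately() {
        Semaphore s = new Semaphore(0);
        s.release();           // "signal" with no one waiting: count -> 1
        // tryAcquire() returns immediately; true means the earlier
        // release was remembered, so this "wait" did not need to block.
        return s.tryAcquire();
    }
}
```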
Whereas, as we just saw with a really simple example, if you're using a monitor, the order of wait and signal does actually matter, because if no one is waiting, that changes the behavior: signal doesn't do anything at all if no one is waiting. Now, you can actually implement monitors with semaphores. I'm not going to go into the details here; all the code is on the slides if you want to review it later. But basically the key point is that in order to implement monitors using just semaphores as the building block, you need to enforce the two main differences we just described. First, when you call wait in a monitor you always go to sleep, whereas with a semaphore you're not always going to sleep; it depends on the value. And second, when you call signal on a semaphore you're changing state that is carried over, whereas in monitors there is no state: if no one is waiting, nothing actually happens. So I won't go over the details, but essentially the slides show how you can implement that logic using just semaphores as the basic building block. And so, to summarize: a monitor combines a single lock, which provides critical sections for all of the individual methods, with this idea of condition variables. The key idea of the condition variable is that when you're waiting for some condition, you give up the lock but still remain in the critical section; if you didn't have that, you'd essentially be forced into going to sleep while holding the lock, and that can cause a deadlock. That's what condition variables are doing for us. Java has this idea of monitors really built in with the synchronized keyword, which manages all the locking for you. We don't have that in C++, but you can do the same thing by just
making sure that you do the locking yourself in all of the methods. And it is possible to implement monitors using semaphores, keeping in mind those two differences: semaphores keep state, and monitors do not.
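As a rough sketch of that last idea (this is a simplified textbook-style construction of my own, not the code from the slides), a Mesa-style condition variable built out of semaphores might look like:

```java
import java.util.concurrent.Semaphore;

// Building a condition variable from semaphores. It must enforce the
// two differences above: await() always sleeps, and a signal with no
// waiter does nothing (no memory). The waiter count is only touched
// while holding the monitor's lock, so it needs no extra protection.
class SemCondition {
    private final Semaphore mutex;                    // the monitor's lock
    private final Semaphore waitSem = new Semaphore(0);
    private int waiters = 0;

    SemCondition(Semaphore monitorLock) {
        this.mutex = monitorLock;
    }

    // Caller must hold the monitor lock (mutex).
    void await() throws InterruptedException {
        waiters++;
        mutex.release();        // give up the monitor lock...
        waitSem.acquire();      // ...and always go to sleep
        mutex.acquire();        // reacquire the lock before returning
    }

    // Caller must hold the monitor lock. Returns true if a waiter
    // was actually woken up (false means the signal was a no-op).
    boolean signal() {
        if (waiters > 0) {
            waiters--;
            waitSem.release();
            return true;
        }
        return false;           // no waiter: nothing happens
    }
}
```

Note the trick: even though releasing the mutex and sleeping are two separate steps here, a signal that arrives in between is not lost, precisely because the internal semaphore remembers it.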
Info
Channel: UMass OS
Views: 36,494
Rating: 4.9350648 out of 5
Id: o5cXttjjBVs
Length: 70min 23sec (4223 seconds)
Published: Sun Feb 23 2014