Producer - Consumer Problem in Multi-Threading

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
in today's video i want to take a look at the producer consumer problem and uh what it actually is and then how to solve it uh first things first let's uh get a little bit theoretical because this problem is kind of abstract and well hard to understand at first the idea of the producer consumer problem is that we have a shared buffer so a buffer that is in shared memory between multiple threads let's say like this this is our little buffer that has a few elements it's let's say it's a simple array um and in it there are certain producers that add things to the buffer so here we're gonna have uh an arrow that denotes the producer and we're gonna have consumers that take things from the buffer right and these are named producers and consumers because the producer actually produces some valuable data that then is passed to that buffer and the consumer takes in the data and also processes it some way or another you can think of there are many examples in real life that you're going to be able to find but a really good one i think is uh if you think about the requests as elements in that buffer so requests to a server just http request you know when you access a website you send a request to the server um you can say that your browser is the producer of that request and that request is passed into the to this buffer and then the consumers of those requests are the servers themselves where it could be one or more okay a good example would be a a printer queue so if you have one printer in in an office of like let's say 10 people and all 10 people want to use that printer you're going to have to have some mechanism that manages the order in which each print occurs right so in that case the the employees that use that printer are the producers for those for that information that needs to uh create the paper itself and that information is taken from the from the buffer by the printer which is our consumer in that case now there are at least three problems that we have to solve in the process consumer problem one is that we are using shared memory so manage shared memory access this is what what i mean by that is uh that the buffer is in share memory and it's going to be accessed by multiple threads possibly multiple producers possibly multiple uh consumers but at least one of each so we're gonna have to make sure that we don't have any race conditions in that probably using a mutex as the first problem the second problem is what if what if you have let's say two producers and only one consumer right and let's see all of them actually produce at the same rate right well if you do that then that means that you're gonna have many more items in the buffer than the consumer can take from it right so in one second you're producing uh two items the consumer can only take one at one point the buffer is gonna be full what's gonna happen if the buffer is full what are the producers gonna do because they have to do something you cannot just uh have a simple while loom that keeps on checking whether or not the buffer is uh has an empty slot you're gonna have to handle this one way or another so this is another product we have to handle that is checking for if buffer is full and the third one is the opposite of this if you have for example one producer and two consumers in that case uh then what happens if the consumer actually tries to take items from an empty buffer what what happens exactly do you actually have a while loop again that just checks that's not that efficient so you can solve these problems in multiple ways uh depending on exactly what the problem is but this is the problem whenever you want to take things from the buffer you have to make sure that there are actually something to take from so checking checking for if buffer is empty and that's all we're going to tackle in this video there are sure many other problems and of course there are going to be better ways to implement than what i do here but this is a very uh simple example and i hope you understand it through this now let's start by creating the actual buffer right that's the first steps i'm going to have i'm going to have the program simply have uh producers that produce random numbers and consumers that consume the numbers and they just print it on the screen that's all but really these two operations you should think of them as something that should take a bit of computational power right so in that case it doesn't but this is just for to exemplify so now let's read the buffer we're going to have here an buffer let's say of 10 numbers and then we're also going to need a count and that count should be set to zero because we don't have anything in the buffer at the beginning so that's fine now i'm gonna create one producer and one consumer using of course two threads and it's the same code that you've seen uh before so i'm just gonna fast forward to it and just like so we have created here threadnam threads which is right now defined just to be two and uh half of the threads using this if are gonna be producers and half of them are gonna be consumers so right now we're gonna have just one producer and one consumer and of course we're joining them here at the end nothing fancy let's create those uh these routines in here so void pointer producer and we don't take any arguments but i'm gonna have to add those uh there for the signature and the producer what it does really is well first it generates a random number now to make sure that uh our random numbers are always random i'm gonna do a uh s trend of time of no here in the main function and then in the producer we're gonna say let's say in x equals rand and say percent 100 so we're going to get any number between 0 and 99 now that's perfectly fine this is the this is sort of the produce step right so the produce stuff is sort of separate from the the synchronization step the step where we're actually gonna be adding it into the buffer here we're gonna add things into buffer so add to the buffer well to do so all we have to do is uh just say buffer of count equals x right so count is gonna be if it's zero then buffer of zero let's find the first one if it's one then that's the second element and so on and so forth so that's perfect and then we increment the count once we're done so that's just adding to the buffer amazing that's technically all you need to do now for the consumer let's see it's kind of the reverse right we have first uh remove from the buffer that's the step that we have to execute first so i'm gonna have to get the element from the buffer how do we do that well we can get the last element from buffer which is count minus one and i'm gonna use uh the buffer as a stack so it's first in last out but it could be uh implemented as a first in first out the same way it's exactly the same thing except i don't want to move all the elements to the right or to the left i just want to take the last element that has been added that's fine uh so here we take it from the buffer and then of course we're gonna decrement count just like that and then lastly we're gonna actually uh consume and by consuming here we just mean printing it to the screen that's all so we're gonna say got a number that is y okay so we have two threads running this if we try to launch this we should see something that works okay so we've got 69 networks and now really the only issue with this is that the producers and the consumers are something that are usually processes that are long running so usually you're you continue on producing things so right like your browser if you change site still produce uh requests to the servers and servers are always there to actually accept those requests right so to simulate that we're going to create here a while loop so i'm just going to say a while of one but it's never gonna finish i'm gonna have to finish to actually kill the process in order to stop these threads but that's fine that's perfectly fine especially if we don't have any dynamically dynamically allocated memory here now if we try to launch this let's see what happens oh we have a segmentation fault why is that well that's really because we are not taking into consideration the space that we've allocated for the buffer so we only have 10 uh slots and 10 integers that we can store in this buffer and if it goes past that it doesn't really care about it and really this personal consumer method revolves around a bounded buffer so about a buffer that cannot grow in size this is what it usually comes down to but there's also implementations where you have an unbounded buffer but that's a bit more complex so here we do have to check if the count is a certain number but before we do that remember what i said uh when i first presented the issue the first one was actually shared memory access because we are accessing the same buffer from two threads we should stop the race conditions so we shouldn't allow if one thread is trying to increment count the other side shouldn't at the same time decrement it right it should wait its time so to do that we're gonna have to use a mutex here first things first just like that initializing and destroying the mutex pretty simple and then what we have to do is just call ptread p thread mutex lock retex buffer and unlock whenever we are we are done modifying the actual buffer right similarly here in the consumer we're gonna have to add a lock and an unlock on this and let me move this comment up here like that so that they are looking like the same thing okay so now if we try to launch this we are still going to get a segmentation fault like we expect so a bit past after 10 elements we hit the place in memory where we didn't have access to so it just kind of uh broke how do we fix that issue well a naive attempt to fixing this issue is to add an if statement so whenever our count is actually less than 10 then we can actually add to the buffer because we have space in that buffer otherwise we don't add it i guess um and similarly here if the count is higher than zero only then take something from the uh from the buffer and actually what i'm gonna do here is also instantiate here a y and set it to negative one so that i can print it outside of the if statement so it's gonna print negative one if nothing was taken from the buffer all right interesting now if we try to run this we're going to get a lot of messages printed on the screen so this definitely works but as you might have noticed we got actually a lot of negative ones here they are right and this tells me that there's a lot of cpu uh cycles wasted just basically doing nothing with getting nothing and trying to get nothing from the uh buffer and so on and so forth right basically busy work that does that accomplishes nothing because negative one means we didn't get anything from the buffer but we still sort of consumed that sure you could check it but you cannot stop you have to somehow not um check so often that there is stuff in the buffer and a sort of weight whenever there are things in a buffer so that's a problem and another issue that's even bigger than this is that some cases the producer ignores certain numbers remember when i said that if the buffer is full then we just keep the number but what if that number was actually let's say a request a request to your bank account right that you made for a payment that was kind of important to do should it be just ignored not not a good solution for this right so uh we have to find a better solution than this and you can see here if i put an else statement let's say printf skipped for sandy backstage and i'm gonna say here um x and if i try to launch this you're going to see a couple skipped messages here and they are actually quite a lot yep probably the buffer is very full and the consumer doesn't consume them fast enough yep okay but we know that that still works just not optimally right now we are uh we are throwing away some requests some numbers in this case and we are also wasting a lot of time just taking nothing from the buffer now you could think of many solutions to this one of them would be to just sleep for like one second and try again whenever there is stuff so if the count is less than 10 try to sleep and then try again to check if it's less than 10 and so on and so forth that could be an idea and with uh condition variables you could actually achieve this but another idea is to use semaphores because with this um semaphores are quite valuable in that they have that value stored in it and we can have two semaphores one that stores the number of empty slots in the buffer and the other one the stores the number of full slots in that buffer so let's try to implement them here we're going to define two semaphores so same t same let's say same empty and the same pool don't forget to include semaphore dot h here up top it's very important and then let's also initialize them in our main function so you're going to say same init of same empty and the second one is zero because we're not sharing it between processes we only have one process two threads that's fine uh the value well for same empty it's kind of interesting because we have how many slots do we have in the buffer we have 10 slots and we have no elements in the beginning so this should be a 10 right this is the number of slots that we have available at the beginning then we can initialize the other one same full and same thing zero as the second part and the third parameter well we have no elements in the buffer so no uh fully occupied slots therefore we have zero as uh as a default as an initial value to the semaphore and of course we're going to destroy them once we're done using them let's say same destroy of uh same empty and same destroy of them full now if you understood what the semaphores actually represent it's actually pretty logical the way we're going to have to use this so we can use either same weight to decrement the semaphore and weight if it's 0 or same post to increment it which which goes which though well in the producer we are going to have to decrement before actually putting something in the buffer so decrement and wait until there is an empty slot right so we're gonna same weight on same empty if some empty zero that means there are no slots on the semaphore therefore we should wait here there's no point for us going forward because there's actually no slots left for us okay so that's fine if there is if this passes that means that we are slot so we can just add it and once we're done we should post on the same full because we have added an element to the uh buffer so we should increment the semaphore that represents the number of elements so i'm going to say here same full and well basically the same thing but in reverse for the consumer we're gonna have to same weight on same pool because we want to wait until there is at least one element in the buffer right sample if sample is zero then this same weight call is gonna wait until it's more than zero so we're just gonna uh wait until there is actually an element in the buffer that has been posted by this producer and lastly of course we should actually post on the same empty semaphore say telling uh everybody that's waiting on the semaphore that oh now there's an empty slot in the semaphore because we have taken something from it so you can say here same post on sam empty and now if we actually try to run this we should get a result that's more it's much better in the sense that we no longer get skipped elements as you might notice but we also don't get negative ones right so both of these issues have been solved simply because of these semaphores because these semaphores are gonna keep track of however many elements are in the buffer and this same weight is gonna wait until at least one is either in in the state of being full right at least there is one when taken or at least there is one when one slot empty when trying to produce and put one into the buffer right so therefore that means that we don't longer need this if statement because if if we're here right same empty necessarily needs to be at least one if some empty is one that means that count is at least uh nine right so either nine or lower that means that this one slot is free that means that we don't need to check here we don't need this else at all we just need to execute this and similarly here because if we have some weighted on stem full sample really represents the number the count here so because of that same full is actually going to represent the count and it's it it is going to be higher than zero necessarily otherwise same weight would have blocked here so this is always true so we don't have to any longer check this in fact we don't even need to assign it a negative one here but yeah we can remove it and it's probably gonna be fine so again if we launch this this is probably fine so now we can play around a bit with this since it should be able to work with as many consumers and as many producers we give it right so first things first let's limit this uh this output because it's quite really fast and we don't know if we have four times as many threads it wouldn't make a difference so let's say we sleep one second after every so let's say the consumption itself takes one second and then let's say that the producing or closing an element x in one second as well if we try to launch this we're going to get one element a second so that's kind of expected that's fine and the buffer is probably not even getting filled right now the interesting part is this should work if even if the sleep one is only on one side so if i remove the sleep one from the producer let the producer create as fast as possible elements it's still gonna print once a second and it shouldn't crash simply because of those semaphores so at one point the buffer got filled and it started waiting on this a lot right so probably now it's actually filled up and it's just some wait until the consumer takes in elements right and of course if we have multiple threads so let's say we up this number to eight we should get uh instead of just one a second we should get four elements a second which works nicely and this should happen even if we take away the sleep time on our consumer because even though the consumer could read as fast as possible the producer would still just uh produce once a second so it will be the same result as you can see just that these consumers are going to wait a lot more on the same weight so this is a lot more efficient than actually just keep on checking every i don't know how many milliseconds for this count to be whatever number especially if you're dealing with mutexes at that point it's going to be quite costly in terms of cpu time and it should also work the same even if we have just one consumer so let's say that we have eight threads and just the first one is a consumer the rest are producers so to do this i'm going to change it here in main in the create part of it so here we have in the if being the producer so i'm going to say here if i is higher than zero then create a producer and if it's zero then of course create a consumer so it's gonna only create one consumer the rest are gonna be producers and in this case we're gonna get seven numbers a second as you can see here simply because the consumer can consume uh the elements from the buffer as fast as possible but the producer is limited so this is sort of the problems that the producer consumer problem tries to solve right whenever we have more uh process than consumers whenever we have more consumers than producers whenever we have the producers producing more elements that the consumer can consume and vice versa and so on and so forth all these issues uh need to be solved in this problem and this is not the be all and all solution that i have presented to you here there are better ones out there there are uh more fit ones for the actual problem that you're trying to solve but in our case generally this is good enough but if you for example i said the producer like the produce stage and the consume stage if those uh don't take time at all right then that's a bit of a problem because then would it mean that this part is what will take the most right adding it to the actual buffer right and because of that it might be that uh due to the fact that taking it takes so much to add to the buffer since it's a critical section again it can only be executed by one thread we might have to optimize this so that multiple threads could actually write to this buffer at the same time okay that's about it for this video i hope you got something out of it i hope you understood what this problem is about i know it's a bit abstract but trust me if you look around if you know about this problem and you look around a bit you will find it at every corner if you do have any questions leave in the comments below or on our discord server again the source code is going to be found down in the description below on our website take care bye you
Info
Channel: CodeVault
Views: 18,945
Rating: 4.9822745 out of 5
Keywords: codevault, c (programming language), producer, consumer, problem, threads
Id: l6zkaJFjUbM
Channel Id: undefined
Length: 25min 18sec (1518 seconds)
Published: Wed Dec 30 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.