Improve Go Concurrency Performance With This Pattern

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
is writing concurrent code difficult some might say myself included that learning how to write concurrent code is probably one of the more difficult Endeavors that you'll encounter in your programming journey and this is partly because there are many ways to go wrong with writing concurrent code and there are only a few ways to really do it right and in order to facilitate doing it right and avoiding the stress and everything else that comes with writing concurrent code the wrong way I think that one of the best things you can do as a programmer is try to adhere to the patterns that have developed over time by more experienced programmers through lots of trial and error as opposed to trying to reinvent the wheel so in this video we're going to go over an additional concurrency pattern and this pattern is going to help you to write more robust and safe proof and performant concurrent code code and for the sake of Simplicity we'll just call this pattern confinement but it's really a way to make operations safe code when dealing with multiple concurrent processes and what I mean by that is this if you imagine that we have multiple concurrent go routines and all of these go routines need to access the same segment of code and that segment of code accesses shared resources for example if we have a value here that is subtracted from by multiple concurrent go routines this segment of code is called a critical section and it's critical because if multiple concurrent go routines are trying to access this same resource and we have no synchronization mechanism in place to ensure that the shared resource is accessed by one go routine at a time then we open the door for data inconsistency and or race conditions which means that if two go routines access this amount at the same exact time they both see the same value for the amount and they both subtract one from that same value and now the new value for the amount becomes whatever it was before minus one well that's wrong right because two go routines we're trying to subtract from the amount and that means that the current value should actually be whatever the value was initially minus two but the problem is since both go routines access to this value at the exact same time we had what is called a race condition and this resulted in data inconsistency so what we need to do to solve this is we need to synchronize go routine access to this value now a simple way of doing this is to guard the critical section by way of mutual exclusion or in short a mutex and you can just think of a mutex as a lock and before go routine can access a critical section it needs to obtain the lock and lock off the critical section from other go routines now there's only one lock so if it is currently being used to lock off the critical section other go routines will have to wait until the section is no longer locked off once the section is unlocked and the lock is released another go routine can pick up the lock and lock off the critical section now this solves the problem of race conditions and data inconsistency but it actually comes at a price to Performance and that's because a critical section is really a bottleneck to your program because remember go routines are needing to wait for the go routine that currently has the critical section blocked off before they can continue with their work so you can imagine if the critical section is a large segment of code that takes time to execute that can add up to a lot of idle go routines waiting around for the lock to be released okay so before continuing with this video I need to mention that I'm going into this expecting you to have an understanding of how pointers work and go as well as the basics of concurrency and go and don't worry I have video is explaining both of these things I have a short video that will get you up to speed with pointers and go and I also have a video on go concurrency patterns which is essentially the precursor to this video and both of these videos are linked in the description so yeah without further Ado let's continue with this video so let's have a look at another example of some code that might need to synchronize go routine access to a shared resource so this is just a simple go program it's just a main.go file and of course the go mod file so if you're following along with this tutorial just go ahead and create a main.go file and we're just going to go ahead and create package Main and we're going to create our main function and the first thing that we're going to do is we're going to declare a weight group so we're going to use the sync package and we're going to just use weight group and if you don't know what a weight group is it quite simple it's really just a way for us to wait for a collection of go routines to finish and I'll show you more about what I mean by that in just a second so we're going to declare this weight group here and we're basically going to create sort of an input stream and it's just going to be a slice of integers one two three four five and then we'll create a result slice as well and this is also going to be a slice of integers but this one we're going to initialize as empty and what we want to do is we want to process this input stream and put the results into this result slice here and for the sake of Simplicity and for the sake of simplifying the understanding of this explanation the only thing that we're going to do to process this slice here is we're just going to multiply each value by 2 and we'll just put the result value of each of these elements in this slice in inside of the result slice now the shared resource is actually going to be the result slice and you're going to see what I mean by that in a second so we'll go ahead and do a for Loop and for every element and input we're going to first do weight group dot add and we're going to add one and we're going to need to add one to this weight group for every go routine that we spin up and that's how the weight group's going to know how many go routines it needs to wait for so we're going to spin up a go routine here and we're going to call this function here called process data which we haven't created yet so we're going to create this function and we're going to pass into this function the weight group and that's because within the function we're going to need to let the weight group know that this particular go routine is finished because if the weight group's waiting for the completion of all the go routines that we spin up each grow routine is going to have to be able to communicate with the weight group and communicate that it's complete so that the weight group knows that it doesn't need to wait for that particular go routine anymore and again you're going to see what I mean by this part in a second as well so we're going to pass in a pointer to the result now remember the result is the shared resource here so we're passing in a pointer to the actual result that we have here so within this process data function we're going to be mutating the value of the result here so we're going to be actually mutating this slice so that's why we need to pass a pointer here we need we need to actually access this slice that's outside of the scope of this process data function and lastly we're going to pass in the data which is just going to be whatever value that we're at in the for Loop here because remember we're iterating through the input and the input is are these values right so for each call to process data we're going to process one value value in this slice so there's going to be a go routine for each value in this slice and each go routine is going to be processing its own value and after this for Loop we need to do weight group dot weight and this is where we're actually waiting for the completion of the go routines that we've spun up because if we don't wait for them we'll get through this for Loop and these go routines will be running and we'll still just if we don't do this we'll still just in the program like the main function we're just in without waiting for the completion of the go routines and again if you're not really familiar with how go routines work I suggest you watch my video on concurrency patterns in go prior to watching this video so we'll just add weight group dot weight here and lastly we're just going to print the result so by the time the main function completes the result should contain every element from input multiplied by two so it should contain two four six eight and ten so now now let's go ahead and create this process data function so we'll just go up to the top here and we'll do Funk process data now this process data function is going to be the critical section of our code because in this function we're going to be accessing shared resources from multiple go routines so remember we're passing in the weight group which is a pointer to sync.weight group we're passing in the result which is a pointer to a int slice and we're passing in the data which is an INT now this input we're just reading from it right all we're doing is iterating through this input here and then we're just basically giving this process data function a copy of the element from the input so we're not mutating this input in any way this isn't considered a shared resource in this case so remember the shared resource that we're dealing with here is this result and the reason this is the shared resource is because this so let's go ahead and write out the rest of this function here so before we do anything we're just going to defer weight group dot done and all this is going to do is when this process data function completes this done method is going to get called which is going to tell the weight group that this particular go routine that is currently running this process data function is finished and defer just means that we're deferring it to the end of the function so it won't run until the actual function completes and all we're going to do is we're going to set processed data equal to data times two and then we're going to take the result we're going to dereference the result since it's a pointer and we're going to append to the result and what are we going to append we're going to print append the process data now here we're going to have multiple go routines trying to append their process data to this result and what you're going to see happen is is you're going to see that that we're going to get race conditions and it's going to result in the results being inconsistent so you'll see what I mean I'll show you what it looks like in just a second so let's go ahead and run this and you're going to see when we print the results that there's going to be some weird Behavior so let's go ahead and open our terminal and we do go run main.go now as you can see here we get 6 4 and 8 and that's only three elements right but there's actually five elements that should have been processed and appended to this result slice here but we only got three so that's that's not the correct Behavior right so let's go ahead and actually just run this a bunch of times and kind of get a better idea of how many times we see inconsistencies with this data so if I just type in Forever here and then go run main.go it's basically going to infinitely run this gorun main.go command so it'll just keep running here and you'll be able to see how many times we're getting these inconsistencies so as you can see here the lengths of the actual result slice vary and at sometimes we're able to get the full result and sometimes we're not but it's very inconsistent it's not something that you could have in actual production code this is very buggy so this is the type of thing that we need to avoid we need to avoid race conditions when accessing these shared resources so if we just go ahead and stop that we're going to go over the first way that we can actually resolve these race conditions and as mentioned in the explanation at the beginning of this video the naive way to resolve it in this particular case is to actually synchronize go routine access to the shared resource using a mutex but as I mentioned before remember using mutexes are going to bottleneck your code but it will resolve the issue of the race conditions so let's go ahead and we're just going to do VAR lock and we're going to do sync.mutex and then let's go ahead and remove these spaces we'll do lock Dot Lock and that's going to lock off the critical section then we'll do lock dot unlock and that's going to unlock the critical section so basically we'll encapsulate the critical section in these uh lock and unlock calls so that's going to lock off the critical section the same way that I explained in the explanation at the beginning of this video so now if we go ahead and save this and we go back to the terminal and we'll do the same thing again we'll just infinitely run this command you can see that although not always in the same order we're getting the process data all of the process data appended to the results array and of course it's not in the same order because I mean the go routines are running concurrently so whichever one gets the lock is the one that is a able to append first and the order might vary depending on which go routine gets the lock and stuff like that but the point is access to the shared resource is now protected by the lock when it's being mutated so if we go ahead and in this you'll see that we're always getting the values from the input multiplied by two so it's two four six eight ten regardless of the order we're getting the correct data here but again this is not a performant way to handle this because let's imagine that process data is actually doing something time consuming like right now process data takes constant time because we're just multiplying data by two but what if process data actually did something time consuming so let's let's go ahead and and see what that would look like we'll do Funk process and we'll just say data which is int and we're just going to do time dot sleep time dot second times two so we're gonna process data is going to take two seconds and then we'll just return data times two now here we're just going to call process and pass in data and I forgot to put the return here so it returns an inch and now let's actually put a timer on the main function here so we're going to do start equals time dot now and at the end of the function right before we print the result we'll format dot print line time dot since and pass in start so this is going to print how long the function takes more specifically how long the main function takes and let's first comment out the lock so we'll comment out the lock because I want you to see the lock being the bottleneck like I want you I want you to see why I'm saying this lock is a bottleneck so we're going to comment out the lock we know we're going to get inconsistent data but let's just see how long it takes to run the function without the lock so let's open the terminal and we'll just do go run main.go and you can see that without the lock it takes two seconds right and of course the data is not there's only four items in this results slice and the data is not consistent so we're just going to ignore that for now I just want you to see the bottleneck here so let's now reintroduce the lock save the file and we'll go to the terminal again and do go run main.go now as you can see here it took 10 seconds now and if you do the math you remember we spent we're spinning up five go routines we're spinning up a go routine for each one of these elements in the input slice and each go routine is taking two seconds and two times five is ten right so it's going to take a total of 10 seconds with with the lock and that's because the go routines are needing to wait for the release of the lock so one go routine is going to get the lock and it's gonna lock off this critical section so all the other go routines they're not going to actually run concurrently they're actually gonna just wait for the lock and the go routine that has the lock for that two seconds once it releases the lock another go routine picks it up but it's only one other go routine that can pick it up in the other go routines still have to wait for the lock so we're actually needing to process the data synchronous synchronously so in this case using this synchronization method with the mutex is basically the same as just synchronously iterating through the data and processing it one at a time like we're not even necessarily using concurrency here although we're using go routines so that's just an example of how this is bottlenecking our actual program but now finally we get to confinement like confinement can actually improve the performance of this program by eliminating the need to use this mutex for the synchronization of access to the shared resource so essentially what's going to happen is we're actually going to be able to process the data concurrently meaning that instead of this program taking 10 seconds to run because each girl routine needs to wait for the lock and it takes two seconds per go routine it's going to just take two seconds for all the routines to run concurrently and let's actually explore what I mean by that so we'll go ahead and refactor this code to you use confinement and after that I'll fill in any gaps in understanding that might come up when I'm explaining confinement while actually refactoring this code and actually I forgot to mention one thing that's important to understand when we're actually talking about using mutexes to synchronize go routine access to a shared resource so right now even though we're using uh a mutex here we're not necessarily using it in the ideal way because if you look at this code here you can see that process data is the longest running part of this code right but we don't necessarily need to have this result array locked off for Access while we're processing the data so right now when the go routines are waiting for the lock they're needing to wait for process data and that's why they're take that's why they're needing to wait for such a long time but actually the critical section doesn't necessarily include process data because there's no shared resource being accessed in process data so this isn't necessarily part of the critical section critical section is actually this this is the critical section so we could actually move the lock here and this would mean that the lock isn't being held while this long running process data is happening and that would also improve the performance of this code with the mutex and although this video isn't necessarily about mutexes in the sync package I feel like it's important to mention that like when you're actually needing to lock off resources you should try to do it in the most optimal way possible like if you have the lock being held up by things that don't necessarily need to be a part of what's locked off then you probably need to refactor your code at that point so actually we'll go ahead and run this too just might as well so if we open a terminal now it's not going to take 10 seconds anymore because the go routines aren't going to need to wait for the process data they're just going to need to wait for when the lock is held and this actual shared resources being updated which should be relatively quick so we'll do go run main.go and you see now that it took two seconds but that just brings me to another point that is necessary to understand like there will be times where the code that's encapsulated in the lock is going to take a decent amount of time like enough time to actually degradate the performance of your application to actually run so this is why I'm trying to help you to understand ways that you can improve the performance of concurrent code and go and also ways that you can avoid using locks and mutexes completely because at the end of the day there's a good chance that they could potentially bottleneck your code so let's get into confinement so we're going to refactor this code to make use of confinement so we'll go ahead and remove the lock here because of course we're not going to use that anymore and we're actually going to need to change a few things in this code so for example or for starters the main function this results slice we're going to want to give it a specific size because we need the indexes and you'll see what I mean by that in a second so we're actually going to change this to result equals make so we're going to use this make function here and we're going to make a slice of int and we're going to actually set the size of the slice to the length of the input here so basically you can use make to create a slice with a specific size like here when we're click when we're creating this slice here the size is pretty much undetermined from our perspective but if we use make we can create the slice with a specific size which we're going to set to the length of the input so we can go ahead and delete that one and now we have a result slice that's the same length as the input slice and now we want to actually use the index here so before we weren't using the index because we were just putting this underscore here here this underscore is a placeholder for the actual index but we can actually change this to I which would be a variable containing the index of the input that we're currently at the index that the for Loop is currently at and now here's the main part that we need to change like a very quick explanation of what confinement is is basically we want to confine the go routine to a specific part of the shared resource essentially so for example like we need to be able to concurrently append or write to this result slice here right so each individual go routine needs to be confined to an index in the slice and it might be difficult me just explaining this right now while writing the code but don't worry I'm going to go over this in more detail after we print after we finish refactoring this so for now we're gonna set this to a pointer to the value at index I of result which in simple terms this is the memory address we're going to pass in the memory address of the box that will hold the element at index I and I know that sounds really confusing but don't worry I'll I'll go over this further soon so for now we're going to change this to this right so that means we need to go to process data and we actually just need to change this to a pointer to an INT again I'll explain further soon and now we need to actually change process data as well so of course we're going to leave the deferral of the weight group dot done method and process data is still going to do the same thing it's just going to call this process data which takes two seconds or whatever and then for results this is where we're actually going to need to change and actually we're going to name this result destination because it's the destination of where we want to put the process data so we'll do result dist and then that's just means result destination and we're no longer using a pin here because now process data can't see the actual full slice before process data wasn't confined to seeing only one index of the slice before process data was was able to see the entire slice so there was no combinement before but now we're confining process data to seeing only the index of the slice that it has access to from its parameters So within this function's lexical scope it can't access this result here because we're only passing in the actual location in the slice where we can actually put the process data so we'll go here and we'll just delete that because now we're just literally setting this particular location to the process data and I know it sounds really tricky right now like that that's confinement but I'll ex I'll explain more don't worry so we're going to go ahead ahead and just run this make sure we save first and we'll do go run main.go now as you can see it's taking two seconds for us to process all of the data and we get two four six eight ten so if we run this again it there's gonna be there will be no inconsistencies it's gonna it's gonna be correct every time and actually we can we can actually check and see if the code has race conditions by doing go run race and we'll do main.go and as you can see here there there are no race conditions if there were race conditions uh there would actually be some some information printed to the screen here letting us know that there are race conditions in the code so we can leave that so that's confinement but actually now it's time for me to actually explain to you what this actually means like like why is this confinement so let's get into that okay so you'll recall that before refactoring the code to make use of confinement we had this process data function here and basically we were using the go keyword to Fork off go routines to run this process data function concurrently now before the introduction of confinement into our code we were giving every go routine access to this entire result slice so with multiple go routines they were all able to see the entirety of this result slice and they were all trying to append to the same result slice concurrently so if you can imagine that all of these are running concurrently prior to the introduction of the lock we were getting race conditions because we had no synchronization mechanism in place to safely mutate this result slice so in this case every go routine can see the entirety of the result slice and within this code we can actually mutate the result slice there's basically no restriction to what an individual go routine can do to this shared result slice now confinement is basically the idea of ensuring information is only ever available from one concurrent go routine for example in our case each individual go routine really only needs to be able to access a particular index of result because we're only storing the one value in the result slice so if we can coordinate the indexes that each go routine can access we can easily coordinate writing to this result slice concurrently so let's explore this a bit further so let's go ahead and draw out our results slide now remember in the new code we're giving our result slice an actual length this time so now it's not just an empty slice it's an empty slice with indexes that we can actually use as locations to actually store the processed data so right now this result slice looks like this so the index is all right in yellow here we have 0 1 2 3 and 4. and these are the indexes and in go these are all going to be initialized with zero values so this is what this result slice looks like when we first initialize it and for the sake of comparison we'll put our input slice here and the input slice has the same number of indexes 0 1 2 3 and 4. and the values in the input slice are one two three four and five now the reason we added the length to this result slice is because we want to have have specific indexes that we can assign the process data to so when we actually get to this for Loop here we're taking the index here and we're using it to determine which place in this result array that this data here is going to be stored so for example this data here at the first iteration here this data is going to be one and this index is going to be zero so at the first iteration this is the data in the index right so in our results array we want to store the process data in the corresponding index in the results array so the process data at index 0 here will be stored at index 0 in the results array so we're passing in one here and we're giving the memory location of this particular index of the result slice and if I make the mistake of saying array instead of slice just ignore that we're dealing with slices here anyways so this here is a pointer to the index I at result so for the first iteration for example this points here to index 0 of the result array and since it's a pointer this is the memory location of this part of the results array so maybe the memory address of this index in the result slice looks like this so when we're actually calling process data here and we're spinning up this go routine we're giving this go routine not the entire result slice but just the memory address of a specific index of the slice so when we get here this go routine is going to be confined to the lexical scope of this function right and within this scope we no longer have access to the entire result slice we only have access to the memory address of one index of the result slice so we're only able to alter this one index and here in this for loop we're controlling which Index this data needs to be assigned to at East at each iteration so each go routine is going to be confined to their particular index and that's what I mean by confinement basically it's impossible for this go routine to access a different index in this slice right so there can't be a race condition because the other go routines it's going to be impossible for them to access this index of this slice they're all going to be assigned their own individual index of the slice and they're going to be confined to writing to that index only so this makes this actual code here concurrency safe so we don't need to implement anything in regard to like mutexes so since this is a pointer here that we're receiving which is just an address in memory and again if you don't understand pointers it's critical that you understand them for this video so please go watch my pointer video but yeah since we're receiving the pointer here and then we're de-referencing it here and then assigning a value to that in the case of the first iteration here we're assigning the value to this index here so it would be 1 times 2 which would be 2. and we're doing that for all of the indexes and since the index of input and result are synced we just need to iterate through input get the index for input and use that to assign the end the index for the results slice so remember confinement is the idea of ensuring information is only ever available from one concurrent go routine so in our case for the first iteration here this index is only ever available from the go routine that we spin up at this first iteration so the go routines that we spin up at subsequent iterations here they will never be able to access this and that's confinement each go routine is confined mind to its specific index of the result slice and it can't access any other index therefore we don't need to use any synchronization mechanism to ensure that we don't get race conditions and we don't need to worry about the code not being safe for concurrency and I hope that makes sense okay so congratulations on making it this far I know this is a relatively cumbersome topic and it's quite difficult to get through videos like these so if you were able to make it to the end and you were able to learn something then you're ahead of the pack in my opinion but anyways if this video was in any way helpful to you or if you learned something valuable please don't forget to leave a like leaving likes really helps out small channels like mine and I put a lot of time and effort into these videos so yeah I'd appreciate it if you leave a like and if you're interested in more content like this feel free to subscribe as well if you haven't already ready and with that I will see you in the next one
Info
Channel: Kantan Coding
Views: 11,351
Rating: undefined out of 5
Keywords: golang, go programming, go concurrency, concurrency patterns, confinement
Id: Bk1c30avsuU
Channel Id: undefined
Length: 34min 16sec (2056 seconds)
Published: Sun Sep 24 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.