GopherCon 2017: Kavya Joshi - Understanding Channels

Captions
Okay, good morning everybody. Is everybody sufficiently caffeinated? Cool. My name is Kavya, and I'm here today to talk about channels — specifically, to open them up and show you how they work. Now, I assume everybody here is familiar with Go's concurrency primitives, goroutines and channels. In fact, I bet a number of you came to Go, and chose Go, for its concurrency. Is that actually true — how many people here primarily use Go for its concurrency? So, that's a fair number of you. So you don't need me to stand here and tell you that these are important, that channels are important. But there's another thing: channels are inherently interesting. What do I mean? Here's an example.

Say we have this simple task-processing function. All it does is get a list of tasks and, for each task, call the process function. The process function itself — you can assume it's some long-running function that makes a network request, or what have you. Now, if we had to take this program and scale it up, we know how to do that: we bust out our favorite concurrency tools, goroutines and channels, and implement a simple task-queue model. So we have a buffered channel to store the tasks. We then run a fixed number of worker goroutines to process those tasks. And finally, we arrange for the main function to get the tasks and send them over the channel to the workers. Each worker receives a task, processes it, and repeats. Simple.

But the point here is that the single channel in this single program already illustrates some very interesting properties of this data type. First of all, channels are goroutine-safe. Secondly, the buffered channel can store values and pass them from one goroutine to another, and it does so in first-in-first-out (FIFO) order. And, as most of you know, a channel can also cause goroutines to block and unblock. Now, that's a very unusual list of properties for a single, seemingly innocuous data type.
So what do you do when you see something like this? It's like when somebody shows you a magic trick: you take a second to appreciate it, and then you try to figure out how it works. So that's what we're going to do today. We're going to delve into the internals of the Go runtime to try to understand how channels have these properties and how they work. To do that, I'll first talk about what happens when you make a channel. We'll then look at the mechanics of sends and receives. And finally, we'll take a step back and look at the design considerations.

Okay then. In order to use a channel you have to first create it, and we do that using the built-in make function. Now, you can create buffered channels, which have a non-zero capacity, or you can create unbuffered, or synchronous, channels. Today I'll primarily be talking about buffered channels, but in any case, making a channel gives us this sort of black box — this channel struct — with those properties. We'll come back to the last property later, since it has to do with runtime behavior in some sense. But looking at those first two properties: if I told you I wanted a goroutine-safe, first-in-first-out struct, what would you say? You'd probably say, "Kavya, just use a queue with a lock." And it turns out that's exactly what channels do. Making a channel allocates an hchan struct. It has some number of fields that implement a queue, it has some other fields that we'll uncover later, and it has a good old mutex.

Now, the implementation of the queue itself is pretty straightforward: it's a circular buffer, or ring buffer — one that conceptually wraps around on itself — and the send and receive positions in the buffer are tracked using two indices, sendx and recvx. A quick example: say you make a channel of capacity 3. You have this buffer with three slots, and initially it's empty; both sendx and recvx are 0. After your first enqueue, sendx is incremented. After two more enqueues, the channel is full.
Note that at this point sendx and recvx again have the same value — that is, very informally, the definition of a circular buffer. And finally, here's a dequeue; this time the receive index, recvx, is incremented. Now, this hchan struct is allocated on the heap, and make returns a pointer to it. So when you make a channel, this fancy channel type you get is really just a pointer under the hood — a pointer to the hchan. And this is why we can simply pass channels from one function to another despite Go's pass-by-value semantics, and both functions end up enqueueing and dequeueing from the same underlying buffer. We don't need to pass pointers to channels for those semantics, because the channel is a pointer under the hood.

Okay, so at this point we have a channel; now let's go ahead and use it. This is our program with all the non-channel-related code removed for simplicity, and we're interested in how that send and receive operate. Now, I'm going to assume that there's a single worker — so we have a single sender and a single receiver — for the sake of simplicity, but everything I say translates just as well to the case of multiple senders and multiple receivers. I'm also going to call the goroutine running main G1, and the goroutine executing the worker function G2.

So, at the beginning of the world, the sender comes along first: G1 comes along and sends task0. At this point what needs to happen is pretty obvious. It acquires the lock, because it's going to modify the hchan struct; it then performs an enqueue, like we saw before; and finally it releases the lock and goes on its merry way. The thing to note here is that the actual enqueueing is a memory copy: it copies task0 into that slot of the buffer. Okay, now G2 comes along and receives from the channel, and it mirrors those operations: it acquires the lock; it dequeues — again, this is a memory copy, so it copies whatever is in that buffer slot into the memory corresponding to the variable t — and it releases the lock and goes on its merry way.
Pretty simple. So the thing to note here is that this copying into and out of the buffer is what gives us memory safety when we use channels. The only memory both goroutines access — the only memory they share — is the hchan struct, and that is protected by the mutex; everything else is just a copy. This is also why our favorite Go concurrency adage makes sense — you've all seen this before: "Do not communicate by sharing memory; instead, share memory by communicating."

Okay, so back to our program. G1 has sent, G2 has received, so we have an empty channel again. But say the task G2 is processing is taking a really long time — it's just sitting there processing — while G1 keeps sending. G1 sends another task, and another task, and another task. And now it can't actually send task4, right? The channel is full. So what happens here? Well, G1's execution is paused, and it's resumed once there's space in the channel — so it's resumed after a receive. Now, we knew that. What we're interested in today is how this pausing and resuming of goroutines works, and the answer is: by calling into the runtime scheduler.

Now, don't let the name scare you — the concept is quite straightforward. Goroutines are simply user-space threads, so they are created and managed by the Go runtime, not the operating system. User-space threads are generally preferred to OS threads because they tend to be less expensive with respect to resource consumption and scheduling overhead, so Go chooses user-space threads, and the runtime is responsible for implementing them. Now, user-space threads have to actually run on OS threads, and the part of the Go runtime responsible for that is the scheduler, which uses an M:N scheduling model. The idea here is that you have some number of goroutines and a few operating-system threads, and the scheduler multiplexes those goroutines onto the few OS threads. So in this example we have two OS threads and up to six goroutines, because the scheduler takes care of swapping them in and out.
At a high level, Go's M:N scheduling is described using three structures: M represents an OS thread, G represents a goroutine, and P holds the context for scheduling. What that means is there are a fixed number of Ps, and these Ps hold the lists of goroutines that are runnable — that are ready to run — and these lists are called run queues. So the Ps hold the run queues, and any time a goroutine needs to run on an OS thread, that OS thread must hold one of these Ps, because that's where it gets its work from, in some sense. Everybody with me? Cool.

So why do we care about this? The reason we care, and where the power of this M:N scheduling model really comes across, is when a goroutine needs to be paused — like when it performs a blocking channel send. What happens here is that the channel send on a full channel calls into the runtime scheduler; the call it makes is gopark. So it calls into the scheduler, and at this point, when we enter the scheduler, the currently running goroutine is G1, because G1 made the call. What the scheduler does is change G1's state from running to waiting, and then it removes the association between the operating-system thread and the goroutine, effectively freeing the OS thread to run a different goroutine. And then the scheduler goes ahead and does just that: it pops a goroutine off the run queue and schedules it to run on that OS thread.

So what we basically did here is walk through a context switch. When we entered gopark, G1 was the running goroutine, but when this function returns, a different goroutine is running. This is cool — cool because it's good for performance. What we've done is block our goroutine — we've blocked G1 — but we haven't blocked the underlying OS thread, which is expensive.
The Go runtime strives not to spawn and manage many OS threads, and it does this with this clever switching-out of goroutines, freeing the OS thread to run other goroutines. Cool. So this was great: we've successfully paused a goroutine. But we still have a problem: we have to resume it. Once a channel receive happens and there's space in the channel, we want G1 to run again at that point. So how do we do that? The answer is: G1 sets up state for resumption before it calls into the scheduler.

Okay, so it turns out the hchan struct stores the waiting senders and receivers, and it stores them using a struct called sudog. It's simply another struct, and a sudog contains information about the waiting goroutine. What is this information? It has a pointer to the waiting goroutine, and it also has a pointer to the element it's waiting on — the element it's waiting to send or receive. So in our program, the goroutine G1 creates this sudog — g here is set to G1, and the element it's waiting to send is task4. It creates one of these structs and puts it on the channel's sendq, effectively setting up the state for a receiver in the future to use that information to resume G1. And finally it calls into the scheduler, and the scheduler parks it. Cool.

So, where's our hero — or heroine? Where's G2? Our receiver finally comes along, and it's going to perform a receive on the channel. This is the state of the channel at this point: there's a full buffer, and there's a waiting sender. What G2 does at this point is, first, it dequeues an element from the buffer — it receives task1. Then it pops off the waiting sender — it pops off the sudog — and it enqueues task4 into the buffer. (Okay, this is interesting; we'll talk about that.) But it enqueues it, and then finally it has to go and resume G1 — it has to set G1 to runnable.
Now, the reason G2 itself enqueues the element is also an optimization: it's so that when G1 finally runs, G1 doesn't have to mess with the channel — G1 doesn't have to acquire the lock. So, setting G1 runnable: again, because we're messing with goroutine state, this means calling into the runtime scheduler. The call here is goready. This is G2 telling the scheduler to make G1 runnable. So G2 calls into the scheduler, and this time the scheduler sets G1 to runnable — the goroutine was in the waiting state, so it switches it back to runnable — and then it puts it on a run queue, and then it returns to G2, and G2, the receiver, continues executing. Since G1 is on a run queue, it will eventually be scheduled, and it will run.

So this is pretty cool: at this point we've basically seen how both sides of channel sends and receives work. But you know what's even cooler? What happens when the receiver comes first — when the receiver comes along and finds an empty channel? Well, we conceptually know what happens here: the receiver's execution is paused, and it will be resumed after a send. How does that happen? Well, we now know how to do that as well. This time, G2 creates the sudog and sets up the state for resumption, and then calls into the scheduler, so it's paused. At the end of that sequence of operations, this is what the channel looks like: we have an empty buffer and a waiting receiver. The waiting receiver is G2, and it's waiting to receive into the variable t.

And now the sender comes along, and the sender sends a task. What might happen here? Well, just reasoning through it: G1 could put the task in the buffer and then call into the scheduler to resume G2. Or we could be smarter. The key insight here is that we know the memory location the receiver is waiting on, so we could just have the sender send there directly — we could just have G1 write to t directly. That seems reasonable.
But this is incredible. This is incredible because goroutines have their own stacks — they have separate stacks — and a goroutine never writes to, or reads from, another goroutine's stack. Never — except in the case of these channel operations. Now, the reason to do this, again, is that it's a performance optimization. By doing this, when G2 finally runs, it doesn't have to mess with the channel — it doesn't have to acquire the lock. In this case there's also one fewer memory copy: instead of the sender writing into the buffer and then the receiver copying out of the buffer into t, we have the sender copy directly to the receiver's stack. This is actually very cool.

So at this point, we finally, fully, sort of understand how channels operate, end to end. The mutex, the buffer, the sudog queues, the calls into the runtime scheduler, this cross-goroutine stack manipulation — those make up the secret sauce. That's the big reveal of the magic trick. Now, at this point we could talk about how these core concepts apply to other channel operations — what happens in the case of unbuffered channels, or selects. They're pretty much what you'd expect them to be, but in the interest of time, I'm not going to do that. Go is written in Go, and it's open source; it's all very clean, very good Go, and you now have the fundamental knowledge needed to go look at the source code of channels. So if you're interested, I highly encourage it — it's pretty fun.

What I'd like to do now instead is take a step back from the how and ask why: why are channels implemented the way they are? Two themes emerge from the implementation: simplicity and performance. A simple queue with a lock is preferred to a more involved implementation, like a lock-free one. The latter may be more performant, but not enough to justify the increase in implementation and code complexity. On the other hand, things like calling into the runtime scheduler and the cross-goroutine stack manipulation are motivated by performance — by performance gains — but they come at a cost, and the cost is complexity of implementation: the implementation now has to carefully account for things like memory management, garbage collection, and stack shrinking. But in these cases the performance wins are persuasive enough to justify performance over simplicity. And so in the implementation of channels we see this astute trade-off between performance and simplicity — and maybe there's a lesson there for all of us and the systems we build.

That is all. Thank you all for coming out today. I hope you leave with an understanding of how your favorite — or my favorite, at least — synchronization primitive works, and an appreciation for the sophisticated machinery behind the deceptively simple API of channels. Thank you.
Info
Channel: Gopher Academy
Views: 73,302
Rating: 4.9640288 out of 5
Keywords: gophercon, golang, software development, programming
Id: KBZlN0izeiY
Length: 21min 45sec (1305 seconds)
Published: Mon Jul 24 2017