Golang UK Conference 2017 | Jack Lindamood - How to correctly use package context

Captions
Hello. All right, a little bit about me. I've been writing Go for about four years, and I'm currently a software engineer at Twitch. If you're unfamiliar with Twitch, we live stream video games. Most of our backend at Twitch is written in Go. There are probably hundreds of small Git repositories that we use for our various microservices, and I like to think that because of the vendor experiment, where you have to copy your code into your repository, we grow lines of code at big-O of N squared. So we may have more lines of code than even Google, just because you have to copy all the code into every repository. Monorepos can't keep up with their big-O of N asymptotic growth.

I'm going to break this talk into three sections. The first section explains context: what it does and how it's built. Context itself is mainly two different things. One is request cancellation, and the other is this added part called context.Value. I'm going to talk about each of those separately and give you my best advice on how to use it, when to use it, the right way to use it and the wrong way to use it, and hopefully you'll learn a lot.

So why does context exist? Let's step back a bit; this would be the experience report for context. Ideally, every long request that your service makes should have some way to time out or end early. Say context doesn't exist yet, and just think about the problem. Your service is making microservice calls, and you want to time out those requests or cancel them early, so there needs to be some piece of information that holds that timeout. Maybe a request comes in to your microservice or your HTTP server and you say, all right, I don't want to take more than three seconds. Then, further down your call stack, how much longer do you have left for this second call that you're making, 10 or 20 RPC calls down the
stack? You need to store that somewhere; you need some place to communicate it. The problem expands if your service is making multiple RPC calls in parallel.

What I have here is a simple diagram. A request comes in, it makes one RPC call to get some information, and from that it spawns three concurrent RPC calls, aggregates that information, and spits out a result. An example might be someone requesting something from your site: you go to the user database, you get the user information, then you populate various parts of the user from different microservices and combine that into one big response. Everything's working great, this request takes 40 milliseconds, everything's perfect. You still don't need context.

What if RPC 2 fails? Usually, when your service is making microservice calls and one part fails, you can't really assemble the response the user wanted, so you have to fail the request the user gave you. The problem here is that RPC 2 fails at 30 milliseconds, but you're waiting until RPC 3 finishes, so your request takes 40 milliseconds even though you probably should have failed it at 30.

So you say, okay, I'm just going to end my request right when RPC 2 ends. RPC 2 ends, you say, oh, I can't serve this request because one of my microservices has failed, so I'm going to fail the whole thing. The problem with this is that RPC 3 and 4 are still doing computations, still out there consuming resources, when you can't do anything with that information, because you can't give the user a response once RPC 2 has failed.

The ideal situation would be: if RPC 2 fails, I can communicate that to my parallel requests and say, you should stop whatever you're doing, because I don't need your response anymore, and then return an error to the user.
This is the bigger problem that led to the context solution. If you were to solve it in Go, you'd probably have some object to signal when a request is done, you'd probably need hints on when the request is going to end, and of course, being Go, you have to throw channels into the API, because why else would you write Go? That's a joke. And then someone said, well, we've got this thing, and we don't really want to add thread-local information, so I know, let's just shove variables in there too. That happened, so I'll talk a little bit about that. You could argue that request cancellation is a subset of the variable problem, because the cancellation itself could be a thread-local variable.

Context implementation details. It is a tree, and a tree is a graph without cycles: a tree of immutable context nodes. Cancellation triggers down the graph, and value lookup goes up the graph. I'll give you an example of one being created. This is just an example of a context chain. There's a background context, which is the root of all your context chains. You say, all right, I'm going to make a context that can cancel, then from that make a context that times out in five seconds, then from that make two other contexts, and then from that attach user data. It's not too important to read the code; I'm just going through the way it works.

After three seconds, the three-second node says it's done. At five seconds the five-second node times out, but, which may be less intuitive, the six-second node also times out, because it is a child of the five-second node. And the value node also times out, because it's a child of the six-second node.

All right, so the API that was created has two areas. One of them is three functions that basically tell you when to end your sub-request. Done and Err are the two main ones that I honestly use
myself. The Deadline one may or may not be there, but it's basically there if you make a timeout context and you want to time out an I/O operation. You say, oh, this context is going to die in two seconds, so set the timeout on my I/O operations to two seconds. And there's this extra thing about request-scoped variables, which I'll talk a lot about later.

So when should you use context? Stepping back from context a bit, I think we can all agree that any RPC call you make should be able to time itself out. If you're writing a library and it's part of an RPC call, it just makes sense to give your user the ability to stop that call. And not just time out: remember the RPC 1, 2, 3, 4 example I showed before. You don't want to just time out an RPC call, you also want to be able to cancel it early if something weird happens and you don't need that information anymore. So any API that you design should have that ability, and context is the Go standard way to communicate that information.

So how do you create a context? That's the first thing you do. I would call context.Background at the beginning of every RPC call. One way I've seen people do this, which I would not consider the right way, is to call context.Background in your main function and then attach that context to a server that creates derived contexts from it. That's kind of weird. Ideally, an RPC call comes in, you call context.Background, and you thread that context through.

How do you integrate it into your API? There are two common ways to put context into your API. One of them is if you just have a plain function call. Both of the examples I have there are in the standard library. The first one is the dial function, so you're dialing an address, right, to
connect, and the context there is the first parameter of a function call that might block. The other way is to attach it to something called a request struct. The request struct pattern is often used when the parameters to a function would be so numerous that you want to make a struct out of those parameters and then execute things on that struct. You could imagine that if Go had keyword arguments, maybe you wouldn't need a request struct; you would just pass them all as keywords. But with HTTP there are dozens of things you can set, so it wouldn't make sense to have an HTTP function with dozens of parameters, and you want to make a request struct out of that. I believe that's the reasoning there. You can attach the context to your request struct, and you should probably name your context ctx, just because that's what everyone does, and you want to be like everyone else, right?

Where to put the context? The mental image I would give for the context variable is flowing water in a river. Water in a river doesn't stagnate anywhere; that would be "don't drink that water". It just flows through. Your context should live with your call stack. I don't mean call stack versus heap necessarily. I mean an RPC call comes in, you've got your context, your call stack goes up and down, up and down, and when your call stack unwinds, ideally the context goes away with it. It's not referenced by anything; it's not sticking around.

The one exception to keeping it on the call stack is when you need some kind of request struct, and you're maybe passing that request struct to a channel, or it's something like http.Request. In that case it's okay to attach it to a struct instead of having it as a variable that's passed around, but only in that very specific use case of attaching it to a request struct, and your context dies when your
request dies. They're married together until death do they part: the request dies, the context dies too.

One big caveat about the context package: I would get in the habit of closing any context you create. Contexts don't actually have a close function; when they can be closed or cancelled, they return a cancel callback. Get in the habit of calling that cancel callback even when you're simply done with your RPC call. One reason is that when you create a context with a timeout, internally it uses something called time.AfterFunc, and this AfterFunc won't be garbage collected until the timer fires. You don't want this thing sticking around for a full second, or a full year if you have a misconfiguration, when your request is done in 10 milliseconds. The cancel context doesn't have this issue, but I would still get in the habit of closing your contexts when you're done with them. So always close them. I had a mental note there on the third bullet point: if your context gets garbage collected, and it was cancelable, and you never cancelled it, that's a chain of events that should not happen.

Request cancellation, so let's go into it. When do you cancel something early? I'm going to deep dive into the errgroup package. It's a really awesome package that puts a very easy-to-use abstraction around context, and most code at Twitch uses it when it makes parallel or concurrent requests. So let's deep dive into it.

An errgroup is a struct with a bunch of private variables in it, so you're probably not going to create the struct directly. You're going to make it with this thing called WithContext, which creates a cancelable context and gives you that context back. The idea is that you now do operations with that context, and you tie your parallel or concurrent requests to that errgroup. So rather than use the go keyword you would
call the Go method on the errgroup, which internally uses the go keyword with some wait group wrapping. The important part is that it executes your passed-in function, and if your passed-in function returns an error, which in this API means something weird happened and you probably don't need the other results, it cancels the context that was stored in the errgroup. It does that inside a sync.Once, so it cancels that context only once, and it stores the first error you got back as the thing it will later return when you call Wait.

So the idea with errgroup is: you make an errgroup, it gives you a context, and then you use that context for as many parallel calls as you want. Then you call Wait, and Wait blocks until the wait group is finished, and your wait group should finish on the first error if all of your parallel requests respect the context.

Oh my goodness, what is this? I'm not going to go fully into this. I'll leave it up there a bit to absorb, and maybe you can look at it later, but this code is doing a lot of very complex stuff in a few short lines. What I'm doing here is executing two requests at the same time. One of them is a fast request, obviously, by the name. This is obviously a great way to optimize your API: you just rename it to fast request. No, then they'll file a bug report. No, you rename it to slow request, so when it's slow you just say, well, obviously. Fixed.

So it's executing two requests, a fast request and a slow request, and it executes both of them at the same time. If either of those requests returns an error or a 500 error code, or I guess anything above 500, it returns an error, and that should cancel the other request if it's still taking a long time. You could imagine a similar pattern where you're executing many, many requests at the same time and you're
thinking, oh, how do I time these requests out, I don't really get it. Just use the errgroup abstraction, and it makes it pretty easy to time out and return the first error you get from all of them. This is how I do almost all of my concurrent requests.

Okie-dokie: request-scoped values. Oh boy, the API duct tape. Is duct tape a universal meme, or is it just a Southern thing? I'm not really sure. If you're unfamiliar with duct tape, it is this great thing that fixes everything. Story time: one of my first memories involves my dad, duct tape, and the brakes on his truck. I was too young to remember the how, but I just remember my dad, the truck, duct tape, and brake pads. So I don't know, it really can't fix everything. I mean that sarcastically, but a lot of people use context.Value as duct tape.

This is the implementation of context.Value. It's pretty simple. It has an embedded Context as a member of the struct, so it inherits all the context functions, and it just changes the Value function to check its own value and, if it's not there, pass the lookup along the chain to its parent. Using this embedded context is how most of the internal contexts are actually implemented.

Advice number one is to scope your key space. With context.Value you don't want to pass in just a string or an integer as the key, because another package might for some reason use the exact same string or the exact same integer. To scope it you do four things. You create a private type; the private type makes sure that no one else creates something that looks like your private variable. So create your private type, then create a private variable of that private type, and then you expose getters and not setters. I called them With rather than Set, and the reason is that context is supposed to be immutable. If you use the word Set it implies it's not immutable, but if
you use the word With, it implies that it's making a context with this thing rather than setting something on a context. So there's an example: you have a request ID, and you can either get it from the context or create a context that has that request ID in it.

context.Value should be immutable. This is another important part. The context package is designed to be immutable; all of it is immutable. When you create a context that has a five-second timeout, you can't change that to three seconds. It's five seconds. If you want a three-second one, you make another one that maybe inherits from it. You should keep that philosophy with context.Value. I would not store a value that you expect people to mutate directly; that goes against the immutability idea that context has. And you don't want to store a value that can't be accessed in a thread-safe or concurrent way, because that's not going to work: the whole point of context is to be usable from lots of different goroutines.

What do you put in context values? You put request-scoped data. So what is a request-scoped value? A request-scoped value is something derived from information in the request, and it goes away when the request is done. That's a pretty broad definition, so let's give some examples.

What is clearly not request scoped? Anything you can make in func main is clearly not request scoped, because you made it before you even got a request. A database connection or some kind of logger that you set up in func main is clearly not request scoped, so don't put it in context.Value. But then you might ask the follow-up question: what if you take that logger and attach a user ID to a sub-logger? Can you now attach that sub-logger to a context? Or if you take a database connection and attach a request ID or a user ID to it, which is tied to a request, can you now attach that? And I'll get
into answers for that later. Stay tuned.

So what's the problem with context.Value? Everything is a request-scoped value, which is somewhat ambiguous when you try to figure out how to use it. You can imagine an Add function that just takes a context, because really, why do you need parameters? If they don't want to give us keyword arguments but they want to give us context, I'll do it, and we'll just make our own. This is obviously a really bad idea. And if you think back to it, what is request-scoped data and what is not? Almost all the variables in your service are request scoped: they come from the request and you probably don't need them afterwards. So it's a pretty broad scope of what you can put in there.

This is why I dislike context.Value. This is a summary of some code I've actually seen, where someone had a user ID in a context and wanted to know if the requesting user was an admin. So they got this authorization singleton, they got the user ID from the context, and they returned a boolean from it. The problem is that it's really hard to tell from the function signature what's going on, because when you're reading code you're not going into every single function you read; you're skimming through the code. If the function signature had instead been IsAdminUser, taking a context, an explicit user ID, and an explicit authenticator, it's really clear what that function is doing. It's really clear that the function probably blocks, that it's probably cancelable, and that it does something with a user and something with an authentication service. So if I want to stub it out or mock it or test it, it's really clear what I need to change. With the first one you have no idea. You've got to actually read the code, and then you might refactor it and things get messed up. So that's a big reason not to use context.Value.

All right, so I get it, it's really
bad to use context.Value. What can you put in there? It should inform, not control. That's a phrase I made up, which feels like it fits the philosophy of context.Value. It should never be required input for documented results. If your function or API won't behave correctly, for whatever you think correct is, based on what is or isn't in a context value, then you're probably obscuring your API too heavily.

All right, so what things don't control? Let me give some concrete examples. A request ID is a great example of something that would easily fit in context.Value. If a request ID is in there, it doesn't matter if it's request 5 or 10 or 20; you're not doing if statements on it. And if it's not there, so what? It doesn't have a request ID. What are you going to do with it, actually do logic on a request ID? So that's probably okay.

Logging. Logging is usually not documented in your API. It's this extra thing: you say, oh, there's logging, but I'm not going to document when I log and where I log. Usually you don't do that, so it fits. What I would recommend is, rather than attaching the logger to the context, attach metadata to the context and have your logger extract that metadata from the context. The reason is that you separate out the immutable part, and you can use that request-scoped data for other things. The logger isn't really the request-scoped thing; the request-scoped thing is the user ID or the request ID that you're logging.

Things that clearly do control. I said inform, not control. Database connections, authentication services: these are things that are very clearly important to your API. They control your logic. So even if your database connection has a user ID or a request ID on it, I would advise against putting it into your context.Value, because your code probably won't work if your database
isn't there. I don't know, unless you're really psychic.

Creative debugging with context.Value. There are some creative ways to use context.Value; there's this blog post on httptrace and context debug patterns that I guess you can now clap for on Medium. httptrace is built into the standard library. It's an optional struct that you can attach to your context, and it has function callbacks on it. I only listed three because I only have slide space for three, but there are about a dozen of them: first byte, first header, request done, got the connection from a pool. You can attach information to those requests. If you look inside net/http, way deep inside the request struct, there's this private write function where it extracts a trace from your request context, and if there's a trace there and you defined a WroteHeaders callback, it calls the WroteHeaders callback.

This is a really great example of inform, not control. The idea of this struct is not to control your logic based on when the headers were written, and the HTTP library is not going to do special things based on when the headers were written. The idea is to inform the API user that the headers were written, so that they can log things or trace things or debug why things are failing.

This is another, really weird example of a request-scoped value. Go's oauth2 package uses context.Value for dependency injection during testing, I think. I've never actually seen this before, so I would advise against doing it. It seems like it's there mostly for testing, but the idea is that you can attach a client to the contexts that you call the OAuth functions with, and the OAuth code will look into that context, extract the client, and use it. It's very indirect. I would much rather the optional client be on some kind of struct that is part of the request itself. So that's really weird.

All right: reasons people abuse context.Value.
I'm very sympathetic to really large code bases with lots of middle layers and lots of middleware abstractions, where you think, oh, I don't know how to take this user ID that I calculate way over here and get it down way over there. How do you rearrange your APIs to do that? And you just give up: I'll throw it in the context and assume it's available everywhere else. So I'm very sympathetic to that. And it feels like your API is cleaner, right? It feels like I don't have to put this user ID in my API; I can just throw it in the context. But if it's required logic, then it's still part of your API. You just don't have it in a function signature; it's this hidden thing inside the context. So I would argue that putting things in context.Value because you don't want them in your API actually obscures the logic of your API. There are ways around this with lambda-style functions or callbacks, things like that.

Summary of context.Value: it's really great for debugging information. Required context.Value parts obscure your API, so I advise against those. I believe it is better to be explicit in your API, so make explicit parameters and try not to use it. It does have a purpose, so I would go back to inform, not control. If you're doing inform-style operations, that's a really great use of context.Value, but if you're doing control-style operations...

Summary of all this; there's a lot of information here. Any operation you do that is long or blocking or I/O should take a context. It should have some way to stop itself, and context is the Go standard way to do that, so it should probably take a context. errgroup is a really great abstraction on top of context, so I would strongly recommend using it if you're making parallel calls. It makes it really easy to cancel them all at once and time them all out. Always cancel your
contexts when you're done with them, because you might end up changing one to a timeout context and putting in the wrong value, and now you've got a memory leak. Really bad. So always cancel your contexts. context.Value should inform, not control, so don't make it required logic. Scope your context value keys; that's the part where I had the private variable and type. Context itself should be immutable and thread safe, and that should also be true for any values you put in it. And a context should die with the request. They are married together: when the request is done, the context should be done too. That is it. Thank you very much. I have a little time for questions if you have any.

[Laughter]

Thank you. I think there's a mic back there.

Do you use context in the context of transactions with databases and so on, to cancel them as well, or to keep track of them?

Yes. We use a lot of different databases at Twitch, and I do attach my context to those requests, but you could argue that it's ambiguous what actually happens if you cancel a database transaction. I normally don't cancel the write ones unless it's totally okay to have weird data there, but cancelling a read operation on a database should be okay, unless you have a really weird database. So I definitely attach context to all of that. That's jet lag talking.

Okay, cool. Thank you.
Info
Channel: GopherCon UK
Views: 9,417
Rating: 4.8105264 out of 5
Keywords: golang, uk, london, computer science, google
Id: -_B5uQ4UGi0
Length: 32min 21sec (1941 seconds)
Published: Mon Sep 18 2017