GopherCon 2019: Marwan Sulaiman - Handling Go Errors

Captions

Hi everyone, my name is Marwan. I am a software engineer at The New York Times, and the title of my talk is Handling Go Errors.

A little bit of background about myself: I've been a back-end engineer for about two years, doing primarily Go. Before that I was a front-end engineer doing a lot of JavaScript and React at a Brooklyn startup, and before that I learned modern web development using Ruby on Rails. I have no computer science background.

The reason I'm saying this is that I'd like to make an assumption, and this assumption has become a little less true over time because Go has become very popular. The assumption is that Go is not your first programming language. A lot of the people that I've met have come to Go from a different programming language.

Why does that matter? The reason is that if you're coming from a different programming language, you tend to focus on the syntax. You already know a lot of computer science concepts, you already know programming, you've done it before. So you tend to assume that the concepts are similar, and you just want to learn how to do something.

For example, a syntactical question could be: how do I parse a JSON string in Go? If you know how to do that in JavaScript, you already know what that means; you just want to learn how to do it in Go. But imagine you're a beginner to programming in general. Does that question even make sense? Someone who's a beginner to programming might not even know what a string is, let alone JSON. So the conceptual question you should be asking yourself is: what is data serialization? That's a better question because it looks at the concept before you look at how to do it syntactically in Go. And even more importantly: how does Go approach data serialization?

Another question would be: how do I import a library in Go? If you use any
other language, you probably know how to import other code, but that might not make sense if you're new to programming. The better question to ask yourself is: what are dependencies? Or maybe even more broadly: what is code reuse?

And finally, the same thing with errors. A lot of times, if you come from a different language, you ask: how do I catch an error in Go? Because you already catch a lot of errors in many different languages. The question you should be asking yourself is: what is error handling? What does it mean for something to go wrong, and how does Go define error handling? This way you get to learn the concept a little bit more.

So the conclusion is that I encourage everyone who is new to a language, regardless of whether it's Go or something else, to learn like a beginner, so that you can focus on the concepts, because there can be subtle conceptual differences between languages.

So let's look at the concepts of errors in Go. We've already looked at a little bit of that earlier today, which is going to be quite helpful. The first concept, and a Go proverb, is that errors are values. What that means is that to the Go compiler and the type checker, an error is just as important, or just as unimportant, as a string or an integer or anything else. If you have a variable declaration that is an error and a variable declaration that is a string, Go doesn't care about the difference.

The pro is that you get to define the importance of an error. Because it's just a variable declaration, because it's just an interface implementation, you get to say what kind of information this error has and what kind of information it doesn't. On the flip side, you get to define the importance of an error, and that could be really bad. If you're feeling really good one day you could write a really good error, but if you're feeling lazy one day you might even completely ignore that error. Much of the discussion
and the proposals in the Go community that Russ touched upon today are about what the Go language itself can do to encourage you to do a better job, or to remind you to handle your errors correctly.

So I want to narrow the scope a little bit. Okay, errors are values, but values can be anything. Can I define an error in a narrower scope that might not apply to all programming, but maybe to most use cases? The way I like to think about errors is that they're kind of like I/O: errors are usually read from somewhere, and they're usually written to somewhere. But where they're read from and where they're written to really varies. It's still broad, and context matters.

So the things you should be thinking about are: is your program a CLI tool? Is your program a library that other people are going to use? Or is your program a long-running system where errors should not terminate the program? Who's consuming your program? Is it another program, is it people, and how? All these questions should make you think about errors and then handle them differently. There shouldn't be one true way to handle all errors in all of programming, ever.

So let's look at a couple of examples through a simple program. Every time I learn a new language, I like to make a virtual sandwich. Let's look at how we can handle errors in Go from the ground up, and then we're also going to look into more complex systems.

The sandwich we're going to make is the one at the bottom left. It has three ingredients, and let's imagine the ingredients come from different markets. The eggs come from Trader Joe's, the avocados come from Whole Foods, and finally the bread comes from ShopRite, or Shoppers, I forget. So let's imagine that each of these markets is a library, and each import path will give you the ingredient that you want. So for example you can call buy
avocados, buy eggs, and buy bread. For the sake of brevity, let's imagine they all have the same type signature: they all return an ingredient, but they could possibly go wrong.

The way we would write a getIngredients function, so that we can make our sandwich, is that we just call out to each library: wholefoods.BuyAvocados, traderjoes.BuyEggs, shoppers.BuyBread. Then we put all the ingredients in a slice, no pun intended, and return them to the caller.

The question is: how do I handle errors in Go? That's the fundamental question, right? Because they're just values, this is how Go defines the error interface, which we all know by now. This is the minimal building block. This is the least that you can do to satisfy the error interface, but it certainly shouldn't be the most. For now, since we're beginners, we'll accept it. So we'll do the famous "if err != nil" and just return it up to the caller. If you think about errors as I/O, you can think of it as buffered I/O: you're just passing it back.

Let's try to run this. Sorry, before we run the program, let's look at the main function. It calls getIngredients, and if that goes wrong it means we really have a missing ingredient, and we don't want the sandwich unless it has all the ingredients. So we can safely panic here, because we want to exit the program, and panicking is okay in this particular instance.

Now, if you run the program and something actually did go wrong, imagine we get this error: "ingredient not available". The question you can ask yourself is: well, which ingredient? Imagine you go to a supermarket and you say, "Can I have some eggs?" The person who answers assumes you know what you just asked for. They don't have to be robotic and say "eggs are not available"; they could just say, "Well, we don't have it." So there is a question and an answer, but in this instance we
forgot the question. We only have the answer.

Maybe we have stack traces. If you come from a different programming language, stack traces tend to be the go-to thing to see where something went wrong, to get that first question back. However, our stack trace does not show the getIngredients function. That's because only panics show stack traces. If you remember, errors are just values. They don't contain stack traces; they're just values. We only panicked in the main function, so we have no stack trace for where the error came from.

So the second proverb is: don't just check errors, handle them gracefully. Meaning, add a little bit more information to the error itself. Make sure you know the context and you don't lose the question you asked.

This brings us to decorating errors in Go, which is a very common convention. How do we decorate errors? How do we add more context to the error itself? In Go you can use fmt.Errorf, and in 1.13 you can add the %w verb so that you preserve the original error. But the idea is the same: you add your unique message and then you append the actual error. This way, when something goes wrong, you can look at your unique message, and that's the where, and then you see the answer, which is the why.

For a while now, the pkg/errors package has also been pretty popular, as Russ mentioned today. There are a lot of different packages, but let's assume this is the one we're going to use, where we can call errors.Wrap. It works very similarly: you pass the error, you pass a unique message, and you get an error back.

Now what we have to do is put in unique messages. On every error block, you'll see that we're calling errors.Wrap, but each string is unique. "Could not buy avocados" is not mentioned twice; it's only mentioned once. So if we run the program again, we finally get our full answer: could not buy eggs, because the ingredient is not available. So this is the where, or the what, and the why. So that's great, and more
importantly, we kind of don't care about stack traces anymore. We've got everything we need, and we can just ignore the stack trace.

So the third thing I thought about when I was learning this new concept is that stack traces tend to be for disasters. In a lot of Go programs, you print a stack trace if you really don't know what happened. Stack traces in general are hard to read. They're pretty long and ugly. They're hard to parse, so they're difficult to query unless there are special programs that parse stack traces for particular languages. And at best they say where an error happened, not why. So stack traces are also not really efficient.

And what if we want to act on an error? What if we know why something went wrong, but we don't want our program to completely fail? Maybe if the eggs are not available at Trader Joe's, we're okay to pay a little bit more and buy them at Whole Foods. Errors, just like any other interface value, are comparable in Go. So we can compare: if the error is the Trader Joe's "not available" error, we're just going to get our eggs from, say, Whole Foods.

So the takeaways are that we can handle errors gracefully, we don't have to panic everywhere, we can trace the error back to the code, and we can act upon an error.

The question you want to ask yourself is: is this enough? For a lot of simpler programs, it is. Especially if you're writing a CLI or a simple tool, this is usually more than enough. But what about complex systems? What if you're writing a program that has to be maintained for a while? There, I believe the answer is no, and here are further things you can do in your Go programs.

You can categorize errors by severity. Just because an error happened doesn't mean it's an error that you should pay attention to. The severity could be an info
level. If you think about structured logging, this is the severity I'm thinking about: the log level could be info, debug, warning, or error. You can categorize all your errors by that severity, and you can embed it in the error itself.

You can categorize errors by type. Think about an unexpected error, or a user error such as user not found, incorrect password, a bad request, or an unauthorized request.

You can add application-specific data. It shouldn't stop at just the typical things. If you're writing a food delivery app, you should probably include things like the restaurant ID, the zip code, or where things are.

Most importantly, in my opinion, from what I've learned, you can query all of the above. Just because you created the right error doesn't mean the job is done. The goal is that you can later inspect errors, query them, ask really specific questions, and get those answers back given the data you put in. So part of it is the data you put in, and the other part is the reading and the querying that you do later.

So fast forward one year. While I was learning all these Go concepts, I started working at The New York Times, and my first team was the Accounts API. If you're a New York Times subscriber, which I hope you are, and you want to change your password, change your email, maybe suspend your home-delivery paper, or whatever it is that has to do with your account, you go to your profile. The profile is a UI written in JavaScript and React, and it talks to one API.

Now, like a typical microservice-architecture company, we have one gateway that the UI talks to, but this gateway talks to a lot of underlying APIs, because each one is separated by concern. We have a paper delivery API that will tell you something like: when is your paper supposed to get delivered today? Is it at your doorstep? Is it missing? Is the weather really
bad, and is your paper going to arrive late? We have a typical login and register API so that you can sign in or sign up. We have a subscriptions API: we have different products, so the subscriptions API tells you whether you're a subscriber or not. And maybe an emails API: let's say you forgot your password and we want to send you an email. The Accounts API talks to all of them under the hood.

If you think about this for a second, this is really not that different from making a sandwich. It's a more complex system, but at the end of the day it's basically the same. You have the sandwich, which talks to that getIngredients function, which talks to a bunch of different libraries under the hood. The only difference is that it's a long-running system, so instead of panicking, because we don't want to halt the program, we want to log and monitor.

So you can think of the getUser function as the equivalent of the getIngredients function. It takes a user ID and we call three different APIs: the login service to validate that the user ID is correct, the subscription service to say what kind of subscription you have, and then maybe the delivery service to say what time you're supposed to get your paper today. Then we return those, just like the ingredients: the subscription, the delivery time, and a possible error, because something can always go wrong.

Now, instead of panicking in the main function, we have a particular handler, which is a typical HTTP handler. If something is wrong, we just want to log it. And then we want to monitor our logging, because this is not a program that's written once and run once. It's going to keep happening over time, and errors could come from a lot of different places. This is the way we monitor it: we use Google Cloud, and we look at our monitors to see what's wrong.

However, everything is at the same level. If you
focus on these errors, one error could be "account is not active", another could be "user entered the wrong password", another something like "user abandoned request", and more and more and more. What I want to do is separate errors that I'm expecting from errors that I'm not expecting. I want to inspect whether my system is working properly. I want to inspect what's going on in my logs, and everything is like one big log file, and that's really hard to filter through.

So I set out some refactoring goals. First, I'd like to filter unexpected errors: anything that could be wrong that I haven't accounted for, I want to see all of it with one click of a button. Second, I want to group by error type. If there's bad weather in New York, I want to say, "Show me all the errors that are happening there." They're expected: we know that sometimes the paper is not delivered, or maybe it's late, but I want to see them, so I want to categorize by error type. And third, I want to be able to answer specific questions. If your product manager came and asked you why something went wrong on September 30th in this particular zip code with these particular subscribers, you should be able to go through the logs, write a query, find all of that, and get the answer quickly.

So I started looking for inspiration, and if there is one piece of advice I give you today, it's that you should read these articles. They're awesome. The first one is called "Error handling in Upspin" by Andrew Gerrand and Rob Pike, and the second one is called "Failure is your Domain", which is actually inspired by the first one. They both give you a similar idea about how you should structure your errors.

The idea is that you have your domain, and your domain is your business logic. That business logic usually talks to a lot of third-party services. It could be a database client, it could be maybe two internal services and a
third-party library, and they all return their own special errors. What you're supposed to do is convert specific library errors into your own domain error, so that your domain, think of your handlers, can talk to errors in the same way. A "not found" error doesn't have to say whether it's service A not found, service B not found, or SQL not found.

To give you a bit of an example, imagine you have a database interface where you can get by ID, and it returns a database record and a potential error. But let's say your program is complex and you use multiple implementations: a SQL implementation, a file system implementation, a MongoDB implementation. What your handlers have to struggle with is that whenever they call db.Get, they don't know which implementation you're using, so they have to check against every "not found" error, and each one is a different type. For example, sql has ErrNoRows if something is not found, os has the os.IsNotExist function, which returns a boolean, and the MongoDB client has ErrNoDocuments. Imagine doing this everywhere, all over your code. What the article suggests is that you should abstract that away, so that your handler can just ask "is this a not-found error?" instead of worrying about what the implementation is.

So how do we do that? Think about creating your own package for your errors within your codebase. Whether this is shared with other libraries or not is, at the end of the day, up to you. The whole idea is that you need to do what makes sense for you.

The second thing we want to do is create an error type. As we learned earlier, it could just be a string, or an integer, or any type, but a struct allows you to add more information over time, and it allows you to not break anything if you just want to add
extra information.

What do we put in the error struct? The first thing you might want to put is an operation. An operation is that unique message we were talking about: it's basically where in your code something happened. The second part you might want to add is a kind, which is the category of the error. Is this a not-found error? An unexpected error? An authorization error? And the last part: you want to embed the original error so that users can retrieve it again. You don't want to lose that contextual information. That's the why; that's the wrapped error. Those are the basics, but perhaps most importantly, what you might want to add is application-specific data, and that's where things get interesting. That's where you put all your metadata inside your error struct.

So how do we construct an error type? Now that we've defined it, we want to learn how to construct it. In the Ben Johnson article, he suggests that you just instantiate it like any other struct: if the error is not nil, return the error struct and define all the fields that you want to define.

In the Upspin case there is a helper function called E, and E takes a variadic argument of anything. We iterate through all of these arguments and check the type of each, and based on the type we fill in the error struct. If the type is not recognized, that is a bug within our codebase and we can panic right away. This means that when you do an "if err != nil", you can just call errors.E and pass all the values that you want, and it doesn't have to be four or five lines. You can pass the kind, the embedded error, the operation, and even your application-specific data.

So let's look at an operation. What is an operation? It's a unique string describing a method or a function. Multiple operations can construct a friendly stack trace. Let's take a look at how that's written. Instead of writing the unique message on every return statement,
you can define the operation once at the top of your function. So for example in getUser we have something like "accounts.getUser", so it might even describe the package where something happens. We invert the uniqueness of the message: the wrapping message is not the function we're calling, it's the function we're calling from. But if we trust that whoever we're calling is doing the same thing, for example the validation function also has its own operation, then we're not going to lose that context. We're going to see a kind of stack trace of which operation called which operation, as long as you're wrapping all the way back to the caller. And then if the error is not nil, you can see that we just call errors.E and re-wrap everything.

Now, in the errors package we might want to have a helper function, and this is where I get to the idea that you can really do whatever you want. You don't have to write this function, but for me it made sense: if I give it an error struct, I want to get a list of all the operations. The way to do it is that you start with the top-level operation and you do a type assertion. If the internal error is also wrapped, if it's also the same type, then I can call myself again recursively and pass the underlying error, and I keep populating the slice until I have a full stack of wrapped errors.

So errors.Ops returns something like that, which is just a slice, and it's also easily marshaled to JSON. What you get is, you can think of it as, a JSON array of strings, and this array of strings literally is your stack trace. Your stack trace looks something like this now, something that you've written yourself, and a traditional stack trace looks like that, and in my opinion it's a little bit uglier. Again, if I look at the previous one, it's really easy to read, and it has only your domain logic, which is an even better part. If you think about the error
stack trace, it has a lot of stuff that's happening internally within Go. Even the panic itself, which you can see on the second line, is runtime/panic. We don't care about that. We just want to see where something went wrong in our domain, not the entire stack trace.

So the benefits of errors.Ops are that you have a custom stack trace of your code only. It's a lot easier to read, but what's even better is that you can easily parse it and query it. You don't need any special program to query over a JSON array of strings. You can write a little script that parses it and then query anything you want. And because of this kind of querying, you can look at the impact an error has in your application.

To give you an example of what you can do with this kind of slice: you could query your stack. This is kind of like pseudocode, but imagine you have some sort of database of all your logs. You can do something like "select * from all of my logs where the operations include this field", which is login.Validate, and what it returns is a list of logs. So imagine your validation function had a bug, and your product manager or engineering manager asks, "What's the impact?" It's a function in the middle of a big codebase. Who is calling this function? Which handlers? And not just in general in the codebase, but, say, in the last 30 days, were there any requests that reached that code? Just because it's being called from 10 places doesn't mean there were 10 errors. Maybe only one user wanted to validate who they are. So you can do this kind of complex querying.

One final thought on the operation field is that you can make your stack even simpler by removing helper functions from the stack itself. If you look at this helper function, it takes an operation as an argument and it returns a potential error. This way, when it's wrapping something, it's not wrapping the operation of
the function itself; it's wrapping the operation of whoever passed it this operation. This way you can get reduced stack traces. If you have a helper function that's used in a million places, it's always going to show up on the stack, and you can just easily remove it. So you're in control of your own stack, and that's kind of the point.

Now let's look at the error kind. The error kind groups all errors into smaller categories. They can be predefined codes like HTTP status codes, which are just integers, they could be error codes like gRPC's, or you can define your own. Let's say we pick HTTP. We can say a kind "not found" is the HTTP status "not found", which equals 404, a kind "unauthorized" is whatever, and "unexpected" is whatever. The idea is that when you talk to your errors API, you can ask "is this a kind not-found?" Even though we could ask "is this error kind 404?", it's a little easier to ask whether this is something that is not found. And if we later want to change what these codes are under the hood, we don't have to change our code everywhere. So we're abstracting the actual underlying implementation; we're talking to it as if it's an interface.

To extract an error kind, we do a type assertion on the error to see if it's of our type. If you're given an error that has never been wrapped and has never been defined within our domain, that's probably an unexpected kind. If the kind is not zero, because we know it's an integer under the hood in this implementation, we can return it. Otherwise we can pass on the underlying error, meaning we recursively go down until the first wrapped error that had a particular kind set.

One thing you can think about is that if your kind is an HTTP status, then you can propagate the error all the way back to the client, and now you're sending that error category across the wire. So just by defining an error struct, you're actually also defining your response error.
So the getUser handler has something like this. Instead of just saying "this is a 400" or "this is a 500", we can call http.Error, which takes a string, and we don't have to say exactly what went wrong, because this is a client and maybe we don't want to expose internal system error information. But we can pass the kind: you can see the third argument is errors.Kind, which returns an integer, and that's exactly the status code we return all the way back to the client, whether it's a UI or anybody else.

But if you look at the logging, it still calls logger.Error, and that's where we come to the severity. That's additional information you can add to your struct. If you're using a logger like logrus, which is used by a lot of open source tools, you can embed the level right away. You can say: this error is pretty expected, this is the operation, this is the kind, and here's also the log level.

So now, in getUser, let's imagine that any validation error is something we're expecting, meaning the user did not pass the right cookie or the right ID. This is always something we expect, because users maybe put in the wrong password or something. You can see that in our errors.E call at the top, we pass logrus.InfoLevel directly into the error struct constructor, and that's all we care about. We don't care what people do with it; we're just saying, "Hey, this error is expected, it's info."

The handler can now do something a little more complex. It can inspect the error itself, look at what the level is, and log based on that. You probably want a helper for that, something I usually call a system error: something that the system understands. So you call logger.SystemErr, which you define yourself, and it will look at the severity. But beyond the severity, you can add the error's
information to the log.

So here's an example of that. The system error helper, like the other functions we've seen so far, does a type assertion on the error to our type. If the type assertion fails, that's an error we don't know anything about, so we know the level is the logrus error level. Before we figure out the level of the error, we can add more fields, because we know an error has a lot of helpful information. We already defined our struct: it has the operations, the kind, and even more application-specific data, so you can automatically add all of that to your log entry before you log anything.

Then finally we can do a switch statement on the error's level. Imagine we have another helper called errors.Level, which does a recursive call to find the first level it can find. This way, if it's a warning level we call entry.Warn, if it's info we call entry.Info, and if there's nothing, maybe we default to error. So we have the operation, we know where something went wrong, and we have the entire information of the error. Again, whatever type of app you have, if you're a music app you'll have the song ID and the album ID right in the error itself, and in the log, actually, not just in the error. And then you also have the right level.

So let's think about some potential application-specific data. If you're a music app, you can add the artist and the song. If you're a delivery app, you can add the zip code you're delivering to. If you're an email application, you can add where a message is coming from and where it's going to. If you're a food delivery app, you could record what time the order happened and what the restaurant ID is. At The New York Times, I've added a lot of information for things like: where did a delivery go wrong, what kind of user was it, what's the user ID, where is
the zip code. All of these things we can just add and then query on my own time.

So these are some of the questions you can answer. You can say "show me all the delivery errors in zip code 22434", or "show me all the food delivery errors by seafood restaurants", if you include the restaurant category in your errors. "Show me all the errors that happened while trying to stream the latest Beyoncé album." You can just ask these questions without doing anything extra.

So the takeaways are that the error interface is intentionally simple, and I encourage everyone to design an error package that makes sense for your application and no one else's, by adding your own stack and your own information to it. You don't have to copy and paste it everywhere, because it only makes sense for your application.

And the quote that I use, from the Go Proverbs talk by Rob Pike, is that a big part of all programming, for real, is how you handle errors. So I hope you're always thinking about it. Thank you.

Info

Channel: Gopher Academy

Views: 18,339

Keywords: gophercon, golang, software development, programming

Id: 4WIhhzTTd0Y

Length: 31min 54sec (1914 seconds)

Published: Tue Aug 27 2019