How I build APIs capable of gigantic scale in Go – Mat Ryer

Captions
I think there are some great reasons to use it, and there are some times when you can't use it. The big one for me is if you're using Google App Engine standard environment, which does not support gRPC. So I'm going to talk about how we can avoid some of the ugly code that you see when you have to do marshaling and unmarshaling all the time, and all that repetition. So this is me. I do computer things, and I like to contribute to open source where I can. If anyone is interested in joining any of the projects, let me know; we're always looking for more maintainers and contributors. Show of hands: has anybody created a gopher on Gopherize.me? Great. If you haven't, check it out. Even though I made that sarcastic comment about being an adult, you can build a cartoon version of yourself, a little gopher, so you should probably do that. And please follow me on Twitter. I'm in a battle with my dad because he doesn't think I can get more followers, and frankly I just want to prove him wrong. This is how I got here. No, but I'm pleased to be here, actually. Also, thank you so much for being able to speak English; honestly, that's really helpful for me. So this is how I build gigantic-scale APIs and things. First of all, you don't really have to do much. It's really a mindset, and once you get used to this approach and some of the limitations that you have to work around, you can build things that will scale pretty big, and it's not too difficult. As we know, Go runs extremely quickly, and Kubernetes is a great example of a technology that will actually manage your scale for you as well. Google App Engine standard environment, which I'm really going to be focusing on today, does a similar kind of thing, although it's a bit
different and a bit special, and you really have to understand why that is to make the most of it. So I'm going to do a quick intro to Go, just in case there's anybody who hasn't used it or hasn't seen it. It's Go: we write it "Golang", but we don't say Golang, we say Go. It's a modern programming language, and I'll explain a little more why that's important. It's a strongly typed language, so it's not like JavaScript, where you can just do anything you like. I once said at a talk, "You can't do that many things with Go." It was meant to be a Go talk, and it got a laugh because people thought I was having a go at it or mocking it, but it's actually a good point. Go deliberately has this cut-down, simplified feature set. You can do complicated things with it, but it helps you avoid complicated things, and I think that is the point. So why do we care about how modern something is? It could just be that it's cool, so we like it, it's fun, and we use it. And that's not insane: if you enjoy writing code, you're going to do a better job. But really the key thing is this: when C was designed, we were using computers in quite a different way. We basically had single-core CPUs, and not everybody was running them the way we do now, which is at scale. When Go came along, its designers thought about that and built into the language things that allow you to really make the most of the cores you have on the machine. The networking code in the standard library is also pretty solid, and that helps when it comes to building things that can scale horizontally.
One good example of how good the standard library is: when HTTP/2 came out, Go got it early and we got it for free. You didn't really have to do anything, and suddenly things just got better. Go deliberately leaves out a load of language features that we're used to, and in the end that's really a nice thing. You may feel like you're limited, and you kind of are, but it means everything is very clear. There's no magic. There are sometimes different ways to do things, maybe too many, but you can't build extremely complicated type hierarchies and all that stuff we got used to in other languages. You don't do it in Go, and you avoid the problem. Of course you can still write terrible code, but we try not to. For anyone who hasn't seen it, you can probably already read Go. If you'd never seen Go before and you saw that code, even non-programmers would probably be able to guess what's happening there. It's a C-based language, so it's familiar. These are structs, and you can embed fields. Here I've got this Greeter struct with a format string inside it, and this is how you do methods: it's a function, and the receiver type at the beginning puts that function on that type. That's how we encapsulate data. Go also has my favorite implementation of interfaces in the world, and this is still my favorite interface. It has a kind of duck-typing approach, so you don't have to explicitly implement an interface: as long as a type has the right methods, you can use it wherever an interface of that type is asked for. That's also nice because you can flip it around and write interfaces after the fact as well, for
the code that other people have written. If they didn't include an interface, that doesn't stop you from having an interface that matches it. Another thing is that we format all Go code so that it all looks the same. The goal is that when you join a new project, you should be able to become productive pretty quickly. OK, so if anyone wants to learn more about Go, if you're new, I'm happy to answer any questions you have on Twitter (again, remember my dad, because he's having a go at me), or look at the tour as well; there's loads of stuff online. But ping me and I'll get into it. So I'm going to be talking about Google App Engine standard environment and building APIs that you can be confident are going to be ready to scale, and various bits and pieces inside that. The first thing you have to think about is the context your code is going to be running in. There's a really simple trick for getting this right: all you have to do is imagine that your code is going to be running behind a load balancer, so you can't guarantee that subsequent requests are going to hit the same instance of your code. If you assume that, it changes the way you write things. You can't use maps inside the process to store data that you wish to share, because it's only in one instance. Even if you think you're only going to run one instance, that assumption helps with design anyway, so that's the thing to do. And this interface is your friend. This slide is in every one of my talks. In fact, even if I show my family holiday photos, I've still got this in it. I leave 20 minutes on it. Granddad's not bothered, but, you know, shut up, Grandpa.
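For readers following without the slides: the interface on screen at this point is most likely the standard library's http.Handler, together with the http.HandlerFunc adapter that the talk describes next. The sketch below is not the speaker's slide. It only shows, under that assumption, how an ordinary function becomes an http.Handler; the names greet, serve and demo are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// greet is an ordinary function that happens to have the handler signature.
func greet(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "hello from %s", r.URL.Path)
}

// serve sends one in-memory request through any http.Handler and
// returns the status code and body that the handler produced.
func serve(h http.Handler, method, path string) (int, string) {
	req := httptest.NewRequest(method, path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

// demo converts greet with http.HandlerFunc. HandlerFunc is a function
// type whose ServeHTTP method calls the function itself, so the
// converted value satisfies the single-method http.Handler interface.
func demo() (int, string) {
	var h http.Handler = http.HandlerFunc(greet)
	return serve(h, http.MethodGet, "/talks")
}

func main() {
	fmt.Println(demo())
}
```

Because serve accepts the interface rather than a concrete type, the same helper works for a single handler, a router, or a whole server value.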
So, you've probably seen this, but for anyone who hasn't: the top thing, the Handler, describes just a single method, ServeHTTP. Then we have this HandlerFunc type here, which is another type, but it's based on a function. That function type implements the method, since the signature matches, and it just calls itself. So it's kind of like Inception. So this is how I lay out my projects, basically all of them. Even if I'm building a tool that's going to run as a server, I build it as a package. I have server.go, and I usually have all my routes in one place. In Go you group by responsibility, so you might have a users.go that's got a user type, maybe some helper methods around users, maybe some errors, and it all goes in one file. That's how we organize things in Go. I break that rule when it comes to the routing, because whenever we need to look at something in the code of an API, we think in URLs, so it's great to have just one place to go to see all the routes and take it from there. Then you can build a tool that imports the package and runs it, and I'll show some examples of that. So I make a server struct, and this has all the shared dependencies for the server. You might have loggers, database connections, the router of course (that's how we decide what code gets run when a certain endpoint is hit), and other dependencies that you might have. I try to avoid constructors, functions that just create things, because you can be hiding some magic in there. We don't know if the constructor is going to be kicking off goroutines, and we don't really know what's being allocated, so I like to just expose the type, and then it's very clear when you use it. But I do break this
rule often when it comes to this, because if these dependencies are required then a constructor just makes it really easy. It's really just about communicating with your team, or with yourself in the future. Not through time, telling your future self things; there'd be no point in that anyway, just write it down. OK, so this is what the routes look like, so it's obvious what you're going to handle and what code is going to be called. In Tom's talk on gRPC and Kubernetes (which, by the way, if you're watching this on YouTube in the future: hello, I hope the future's going well, and check out Tom's talk, because it's awesome), he shows a lot of code for the really obvious stuff that you have to do every time: the encoding, the decoding, the marshaling and so on. You can avoid that by popping in a function that does it for you. This is the signature. You can imagine what the body is, and if not, there are lots of examples of this kind of thing online. But notice it's a method on the server, because of course it might want to use some of those dependencies. Same thing for decoding. If you are building a JSON HTTP API, then maybe this is the whole body: you just create the decoder from the request body and decode it, and that's it. But this little abstraction future-proofs you a little bit: maybe in the future you change this without touching any of your handler code. Then there are the handlers themselves, which are the functions that get called in response to a particular kind of request, usually one with a particular URL and method. They should also be methods on the server; that's how they can access the dependencies. You don't fiddle around with the signature, so it stays true to the HTTP handler signature, if you care about that, which some people do. This is what I always do, actually: I make it a function that returns the handler, and
that allows you to do some per-handler setup. In this case I'm parsing some template files there, and then, because of closures, I can just use that template in the handler that gets returned. This is nice because everything's in one place: it's one function that builds the dependencies it needs. That could be doing anything, even expensive things, because this only gets called once, when the instance starts up. You can also take arguments here if you have it as a function. You can see here that handleTemplate takes a name, and it returns a handler that's going to handle that. You can pass in other things too: there may be other dependencies that are specific to that particular handler, or to a few handlers, and you don't want them on the server. Really this is about storytelling. You could just stick all the dependencies on the server, but if you do that it's not clear where they're going to be used, whereas this makes it very obvious and very clear, and of course it's strongly typed. I think everyone knows by now that we don't use context to pass dependencies around, so that's important too. We get type checks: for example, we can't call handleTemplate without giving a name; it's just a compile-time error. And it's very obvious what this is then going to do. Put your hands up if you write unit tests when you write code. OK, now keep your hands up if you write the test code first. Yeah. Honestly, if you haven't tried it, do try it. Sometimes you can't do it, because you don't really have it in your head what it's going to be. But starting to write your tests first helps with that process, because you really start thinking: what am I going to build? What do I need? As a user, what would I want it to be like? You get to have that conversation quite early.
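The layout described above (a server struct holding the shared dependencies, all routes in one place, and handlers returned from functions so they can do per-handler setup in a closure) might look roughly like this. It is a sketch under stated assumptions, not the code from the slides: the names server, newServer, routes, handleGreeting and logf are hypothetical, http.ServeMux stands in for whichever router the speaker used, and the template is parsed from a string rather than from files so that the example is self-contained.

```go
package main

import (
	"fmt"
	"html/template"
	"net/http"
	"net/http/httptest"
)

// server holds the shared dependencies. The fields are hypothetical;
// the talk mentions loggers, database connections and a router.
type server struct {
	router *http.ServeMux
	logf   func(format string, args ...interface{})
}

// newServer takes the required dependencies and registers the routes.
func newServer(logf func(format string, args ...interface{})) *server {
	s := &server{router: http.NewServeMux(), logf: logf}
	s.routes()
	return s
}

// routes keeps every route in one place.
func (s *server) routes() {
	s.router.HandleFunc("/hello", s.handleGreeting("Hello, {{.}}!"))
	s.router.HandleFunc("/bye", s.handleGreeting("Goodbye, {{.}}!"))
}

// ServeHTTP makes the server an http.Handler by delegating to the router.
func (s *server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	s.router.ServeHTTP(w, r)
}

// handleGreeting returns the handler instead of being one. The setup
// (parsing the template) runs once, when routes are registered, and the
// returned closure reuses the result on every request.
func (s *server) handleGreeting(format string) http.HandlerFunc {
	tpl, err := template.New("greeting").Parse(format)
	return func(w http.ResponseWriter, r *http.Request) {
		if err != nil {
			// Setup failed: report it whenever this endpoint is hit.
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		name := r.URL.Query().Get("name")
		if execErr := tpl.Execute(w, name); execErr != nil {
			s.logf("greeting: %v", execErr)
		}
	}
}

// demo sends a real request through the whole server, routing included.
func demo() (int, string) {
	srv := newServer(func(string, ...interface{}) {})
	req := httptest.NewRequest(http.MethodGet, "/hello?name=Gophers", nil)
	rec := httptest.NewRecorder()
	srv.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(demo())
}
```

Calling s.handleGreeting() with no argument would not compile, which is the type-checking benefit the talk points out for passing per-handler inputs as function arguments.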
And there's help in the standard library for testing servers. In this code I create a mock dependency and use the newServer helper. Some of the dependencies I'm not going to use in this test, so I can just set them to nil. That's quite nice too, because if you did then use one unexpectedly you'd get a panic, so you'll see: oh yeah, I need to either add the dependency here, or maybe I've made a mistake. Then you make a real request using http.NewRequest, and you can set headers and bodies and all sorts of things there. Then, using the httptest package, you make a new recorder, and I just call ServeHTTP on the server, passing in the recorder and the request. Then I can make assertions about what happened when that request was being processed. This is very nice because you're really testing your whole stack. You're testing the routing, because you're making an actual request with a real path. You're testing that the method's correct. You're also testing any middleware that you've got going on. You're testing everything that you do, so it's quite a nice balance, and it remains a unit test. You mock out the dependencies, so it's not actually going to send any emails; they've been mocked out. For those who don't want to write mocking code, there's a little tool called moq, which is on my GitHub, which just generates the mocks for you, so you can use that if you like. Yes, so there are some more things about testing. Oh, and there's this "is" package as well, which is something I'm trying out. It's a kind of light version of Testify. Testify is quite big and, I think, quite daunting now; it does a lot. This is a really simple, cut-down, limited version, but it makes writing tests cleaner. You can see here I just say is.Equal. You create it and pass in the t, so if it's going to fail, it fails properly. And there are other
cool little things: if you put some comments here, like the http.NewRequest line there has a comment, then if that fails it goes and grabs the comment and shows it to you in the failure, so you can communicate with yourself as well. I also have an error handler. Since the http.Handler interface is so easy and simple to implement, I tend to do it a lot, which is the value of single-method interfaces. I go on about that all the time in other talks, if you're interested, but single-method interfaces are brilliant for lots of different reasons, and you should look up why, because they're kind of cool. What this one does is take an error and a status and just report that error. In this case I'm just using http.Error, but it could be responding with a JSON error payload or something. This is useful when you have those handlers where you do a lot of the setup in the body of the function before you return the handler. If something goes wrong there, there isn't really a mechanism to report that, and to keep it simple, so that you can keep wrapping methods, you don't really want to return an error. So this is a nice way to do it: if that handler was unable to get set up, then when you hit that endpoint you just get the error. Which of course your tests are going to be checking, right? So you're not just deferring it to runtime; you would actually check that. And of course the server type itself, that struct, should implement http.Handler. Often it just defers to something else, maybe a router, but it becomes a handler type that you can use wherever HTTP handlers are used, like when you say, "I want to handle this path with this handler." You can also do this, and I used to be against this,
but it turns out it can be quite useful, particularly when it comes to App Engine: you can roll your own handler interface and change what that ServeHTTP signature is. Everything else stays the same, but you get to solve really common problems in one place. For example, in App Engine, until very recently, you couldn't use the context that was added to the request; you had to explicitly pass a context around, so you can do that there. I also like to return an error and have just one place that handles the critical StatusInternalServerError errors. In your handler code it's nice to be able to just do "if err is not nil, return err", or wrap it and return it, something like that. The downside is that you can't then use these types wherever HTTP handlers are used, and if other packages you're using are built on http.Handler they won't work, because you've changed the type. You have to build some glue to stick them together, and this is what the glue looks like: just a function that you pass your handler into, and it does the common stuff and returns an http.Handler. Here, this "return http.HandlerFunc" thing is making use of the fact that http.Handler is a single-method interface and has that HandlerFunc counterpart, which you can use in this way. You can also make a HandlerFunc for your own type as well, and then you get to use that in your own world, and it looks something like this. So this is middleware. This has been talked about quite a bit in Go, but it's a nice little pattern and it's worth knowing about. Essentially you have a function that you pass a handler into, and it returns a new handler that wraps the original one. So you can do things before the handler, then you ask the handler to serve the request, and then you can also do things after. I kind of love this. And then to use it, you just chain them like this.
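The two patterns just described, a custom error-returning handler type with a piece of glue back to http.Handler, and middleware that wraps one handler in another, can be sketched as follows. This is an illustration only: errHandler, adapt, withHeader and withCount are hypothetical names, and the error handling is deliberately minimal.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errHandler is a hypothetical custom handler signature that returns an error.
type errHandler func(w http.ResponseWriter, r *http.Request) error

// adapt is the glue: it turns an errHandler into a standard http.Handler
// and deals with returned errors in one place.
func adapt(h errHandler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if err := h(w, r); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})
}

// withHeader is middleware: it takes a handler and returns a new handler
// that does something before calling the original one.
func withHeader(key, value string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set(key, value) // before the wrapped handler
		next.ServeHTTP(w, r)
	})
}

// withCount is middleware that uses defer to run code after the wrapped
// handler has returned.
func withCount(n *int, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() { *n++ }() // after the wrapped handler
		next.ServeHTTP(w, r)
	})
}

// demo chains both middlewares around a handler that always fails.
func demo() (status int, header string, served int) {
	failing := func(w http.ResponseWriter, r *http.Request) error {
		return errors.New("boom")
	}
	h := withCount(&served, withHeader("X-Talk", "devfest", adapt(failing)))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code, rec.Header().Get("X-Talk"), served
}

func main() {
	fmt.Println(demo())
}
```

Because every layer takes and returns http.Handler, the same middleware can wrap a single handler or the whole server, and each layer can be exercised in a test with httptest.NewRecorder as shown in demo.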
So you can write different things. Because it's http.Handler, you can pass the whole server into these middlewares too, or pass in just individual handlers, and you can test these quite nicely as well. And, best of all, you can use defer in there. I love defer; use defer. You can also do some setup in the middleware, because again it's just a function and we're going to return a new thing, so you've got some time to do stuff. Maybe there's something expensive that you need to get ready, and you can do it early. Once you have your server, you need to be able to actually run it. Lots of people, especially when they're new to Go, say they feel like they can contribute to a project but they can't start a project; starting something is the difficult piece. I understand that, but this is how you can get started with writing a tool. Now, these do get a bit more complicated. For example, you'd want to take a flag so that you could control the address, and there are other things you might want to initialize in here, but you can save that for later. There's no need to solve every problem. If you just want to get something up and running, then just do this. All we're doing is using ListenAndServe with the new server that we've made, since it supports http.Handler. On App Engine it's a little bit different, because you don't really build a tool: you push a package, and App Engine's magic runs it for you. So on App Engine it's not package main. The tool before, as you'll have seen, was package main, because it's then going to be turned into a binary that can be run. In this case it's any package, and you use the init function instead of the main function. All you do is call http.Handle, and in this case I want to handle every possible request that reaches this instance with my
server. On App Engine that really is enough; I tend not to do much more than that at all. And remember, the router inside the server, with the routes, is going to take it from there, so we don't have to list out all our handlers here. We've already done that, and it's already tested in our other package. So there are lots of options for hosting. Once we've built our awesome test-driven API, we want to put it out in the wild, and there are loads of different options now. I'm going to focus on Google App Engine standard edition because I kind of love it. It's a bit weird, but I can very quickly build a service, push it there, and it's there, and it'll just run forever. I've got a few little apps: I'll have an idea for a little startup, build it quickly, usually a bit rough and ready, put it on App Engine, and it's still there today. I've got one that's been running for years. And basically it doesn't cost anything, because App Engine doesn't run it until requests start coming in. So it's cool: I could go to the website now, and you wouldn't notice that it had to spin up instances. Standard edition: what does it do? Well, it abstracts away the infrastructure. This is why you can't use gRPC yet on App Engine standard edition: it doesn't really give you a server that you can have persistent connections to. It's abstracted. You just give it the code, and App Engine is going to make sure it can run that code for you when the requests come in. So we don't really have as much control. If you need control, if you want to use gRPC and things like that, then App Engine standard edition is probably not the right choice. But you can go from zero to planet scale for free. Well, not for free, you have to pay, but you don't have to do any work for it. It provides
all kinds of things. It handles actually managing and running the instances. It gives you data and file storage, task queues, cron jobs, memcache, monitoring, logging and debugging, a few things like that. Essentially, this is how you do it: you write your service, you write that little connector piece, you add an app.yaml config file, and then, after you've downloaded the SDK, you use the gcloud tool to deploy it. It's really simple, and you get things like this: a dashboard in the Cloud Console. This is Gopherize.me; Gopherize.me lives on there. If everybody in the room went on Gopherize.me right now, I guarantee... well, I'm not going to guarantee it, but probably you'd all just hit the site and it would be there. But if no one's on it at all, there'll be nothing running for Gopherize.me, and that's kind of cool, I think. You also get logging, which actually turns out to be quite good once you dig in. It takes a bit of time to learn, and it's a bit messy, but it's pretty cool. You can write logs out here, you can see the requests that were made, and any errors that happen get reported. It's cool, and it's pretty affordable. You can see this is an example site here: there are 2.6 million read operations on this, and that's 1.36 euros. The instance running time is what costs the most. This is after ten days, and it's four to fourteen euros, so it's kind of cool, and that's a site that's actually doing stuff. Because of the nature of Go, it's very quick. There's no separate runtime, you don't have a JVM or anything; it's just a single static binary that has everything it needs inside it, so it also starts up very quickly. So here's the thing: on App Engine, as I said, when no one's been on the site, no instances are running. When the first request comes in, if you use Java or [inaudible], there's
a noticeable delay waiting for the first request, and then it happens. You can ask it to always keep one instance going to try to mitigate that. In Go, you genuinely cannot notice the delay. I don't notice it; I have to look in the logs to see whether a request started a new instance (it tells you in there). So it's kind of cool. This is an example app.yaml file. You say you're going to use the Go runtime, and the service is going to be default. You can have many different services, which is cool because you can scale them out independently. So maybe you've got a users service that gets hammered all the time for everything, but there's a forgot-password service that hardly gets used, because no one forgets their password anymore, because, you know, computers. That's not right, is it? You can also do handlers at the bottom. Actually, I didn't finish explaining that: the API version is go1, and then the handlers section at the bottom is how you route things to the Go app. That's all you need. But you can also do this: it's very common to have a static folder as well, for things that you just want to serve out. You can actually build free websites hosted using the free quota, and if you use Cloudflare to aggressively cache the website, you'll never have to pay anything for it. I wrote about it in a blog post. One of the other cool services available if you use App Engine is the datastore. It's based on Google's Bigtable, and it's got some strange things about it, but it's extremely fast. It's proper Google scale. You get amazing speed, but there are some trade-offs to that, and I'll talk a little bit about that now. So, first of all, you represent data as a struct. Maybe obvious, but this is what we do. So I'm modeling here the
conference, a conference. You need a kind, a string that's kind of like a table name, an entity name, so the kind is "conference". Then I have the fields for the conference. The only weird thing here is that I put the key inside this entity, because if I'm going to marshal it to JSON or persist it some other way, I want it to have an ID, and that's how you can do that. This little tag at the end tells it not to put this field in the datastore; you just use a hyphen to say "don't". Keys are a bit weird. It's not like Postgres, where you'd have a table in which one of the columns is an ID and that's the primary key. You don't have that: keys and entities are completely separate. If you get things and put things using the key, it's strongly consistent, which means that if I update my email address, put it, and then get it immediately, it's going to have the updated email address. That's not the case for querying, which is eventually consistent, because it has to go and update the indexes when you make a change. So that's the trade-off: it's optimized for really rapid reads, but you sacrifice a little bit of accuracy because of that. It's just physics, I think. So you have three options when deciding what key to use. You can just let the datastore come up with one for you if you don't care, and that's fine; you should probably do that unless you've got a good reason not to. If you do have a good reason not to, you can use an explicit int or a string. You might want to do that if you want the keys to be predictable based on something, like, probably not this, but let's say a hash of the email. If, when you create a user, you use a hash of their email as the key, then when they're trying to sign in you just have to hash the email and you can get the user extremely quickly, rather than having to
do a query to try to find a user with that email. And by the way, if they've recently changed their email address, you might have to wait before that query would find them. So there are some good reasons why you might want to explicitly control the key. We don't store the key in the entity body: even though it's in the struct representing the entity, it doesn't go into the body, as I explained earlier. So this is how you put stuff into the datastore. You create the object; this is just standard Go. Then you use the datastore package from the Cloud SDK, which comes from Google and is free, and you create the key with NewIncompleteKey. We pass in a context, and we pass in the kind (we use that constant everywhere). You can have a parent for the key as well, and that's quite significant. I'm not going to go into it now, but it's worth looking at what ancestor keys are; they come with some guarantees around consistency that are worth knowing. Then I just call datastore.Put, passing the key and a pointer to the object that I want to put in, and that's it. It's going to give me back another key (if it was an incomplete key, it gives me back the complete key) or an error, because things do go wrong sometimes. After I've put it, I set the key immediately on the object, because usually I'm then going to want to return it or use it somewhere. Since there's this disconnect between keys and entities, managing keys in this way is something you have to do. Getting stuff is just as easy; it's actually a bit easier. First of all I decode a key from an ID. A key is actually a struct in datastore, but you can call a method to encode it as a URL-friendly string, so you do have to take that string and turn it back into a key that you can use. You use DecodeKey for that, and it'll fail if the key's corrupt or something. So if someone's trying to hack by guessing IDs, which is obviously a waste of
time, then you'll sort of catch that. It's not really a security thing; it just can't turn it into a key, so you understand what's going on. Then you create a variable to hold your object and call datastore.Get, passing the key and, again, a pointer to the object, and this can be nil because it'll create it for you. Then at the end, after you've got it, remember it doesn't have the key, so I set the key again. OK, the key thing is a little bit odd, but in your unit tests it's worth making sure the IDs are set, and you're probably going to use the ID as part of a few different operations, so you do tend to see when that's not working, which is kind of useful. Querying is also a little bit interesting. Here's some example code that will do it. Essentially you say datastore.NewQuery, again passing the kind, and then you can add filters, change the order, and do things like that. It's not like SQL; it's quite limited, but extremely fast. You have to be aware of some of the weird things about it, which I'll talk about. Essentially, how it works is that when you put an entity, it also goes and creates an index of that entity ordered by every field. If you filter by year and then order by name, you actually need an index to power that; that's how it works. So it's very greedy on storage, but that means it can be extremely fast when retrieving things. This code will fail if you don't explicitly add an index for year and name. And if I was to order by name the other way, which you can do with "-name", then that's another index again. That just explains that, I think. Oh, now some weird stuff. This is the urlfetch package. You can't just make requests, because it's a controlled, sandboxed environment. You have to use urlfetch to
create a special client that you can then use, and you get billed for the use of it and things like that, so that's another thing that's worth knowing. It's also a good reason why, whenever you write packages that make HTTP requests, you should always let users specify the client, because if you don't, you wouldn't be able to use the package on App Engine. So always let the user give you a client if they want to, and otherwise default to http.DefaultClient, or maybe something sensible with limited request timeouts. If you need help on anything like this, tweet me, or check out standard library io. That's some friends of mine in London doing training and workshops and things. I'm not really trying to promote it, but they want people to be able to do this and be successful with it. So that's it. I didn't put a thank-you slide on, but thank you, I'm done. [Applause]
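The closing advice, letting callers supply the HTTP client and falling back to a default, could be sketched like this. Fetcher, roundTripFunc and demo are hypothetical names used only for illustration. On App Engine standard the injected client would be the one created by the urlfetch package mentioned in the talk; that call is not shown here because it needs the App Engine SDK, so a stub transport plays the role of the caller-supplied client instead.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// Fetcher is a hypothetical package type that makes HTTP requests.
type Fetcher struct {
	// Client is optional. When nil, http.DefaultClient is used.
	Client *http.Client
}

// client returns the caller-supplied client, or the default one.
func (f *Fetcher) client() *http.Client {
	if f.Client != nil {
		return f.Client
	}
	return http.DefaultClient
}

// Get fetches url and returns the response body as a string.
func (f *Fetcher) Get(url string) (string, error) {
	resp, err := f.client().Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// roundTripFunc adapts a function to the single-method
// http.RoundTripper interface, in the same spirit as http.HandlerFunc.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) {
	return f(r)
}

// demo injects a client whose transport answers locally, so no network
// access is needed.
func demo() (string, error) {
	stub := &http.Client{Transport: roundTripFunc(func(r *http.Request) (*http.Response, error) {
		return &http.Response{
			StatusCode: http.StatusOK,
			Status:     "200 OK",
			Header:     make(http.Header),
			Body:       io.NopCloser(strings.NewReader("pong from " + r.URL.Host)),
			Request:    r,
		}, nil
	})}
	f := &Fetcher{Client: stub}
	return f.Get("http://example.invalid/ping")
}

func main() {
	fmt.Println(demo())
}
```

The same seam that makes the package usable in a sandboxed environment also makes it easy to test, since a test can pass in a stub client exactly as demo does.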
Info
Channel: GDG Lviv
Views: 21,536
Keywords: DevFest Ukraine 2017, GDG, DFUA, DevFest, Ukraine, Google, Cloud, Golang
Id: FkPqqakDeRY
Length: 41min 18sec (2478 seconds)
Published: Wed Oct 25 2017