Battle-Hardened API Patterns from Two Years in Production (Kiyan Azarbar)

Okay, thanks for coming, everyone. My name is Kiyan Azarbar and I work on GraphQL APIs at Shopify. The talk itself is called "Battle-Hardened API Patterns from Two-Plus Years in Production", but really that's just a fancy way of introducing what we've learned and distilled into a set of best practices for designing GraphQL schemas, or GraphQL APIs; you can use the terms interchangeably. But before diving into the weeds, I wanted to take some time to ease into the idea and explain why you should even care what I have to say today.

So I'm actually on a team at Shopify named API Patterns. We're responsible for a lot of GraphQL-related things, as you can see, or barely see with the font size, but none of that stuff is relevant. The only thing that's relevant is what's in blue, number one, which is establishing patterns and best practices. Now, you might be wondering what the word "pattern" even means here, but before I explain, here's a really, really quick, condensed history and context on GraphQL at Shopify. This is an internal talk I gave that was 20 minutes long, compressed into about 45 seconds.

So this is what working on GraphQL looked like about two, maybe two and a half years ago, mid-2016, at Shopify: just a few pioneers exploring new technology, building something which they hoped would power a radically improved mobile experience. We were working on a new native mobile app, on both iOS and Android, and we had chosen GraphQL as the technology to power that app. I should point out this was a highly focused, mobile-centric version or subset of Shopify, so it didn't have to encompass the full functionality, but it needed to cover what was necessary for mobile. So for the team that was working on the GraphQL back end for that, the patterns and the principles and the unified vision that they put out were really just a happy byproduct of the team structure. So when you've
got a small number of contributors and a culture of collaboration and peer review, this sort of thing just happens; it falls out by default. So that was 2016; you see they're having a really good time there.

Flash forward to today: this is what being a GraphQL contributor is like today at Shopify. Shopify is a really large, mature platform. It was then, and it's even bigger now, but at the time they were only working on a subset to support mobile. Our REST API has also been around for a really long time; it's been built up over time and has a very large surface area. Each aspect of REST was built and maintained by a team of experts in the respective domains, people who really understand the problem space and their API consumers. So for our public GraphQL API to have the same depth and breadth, this should hold true again. So in the past year at least, the majority of GraphQL contributions at Shopify have come directly from the specialist teams themselves.

One consequence of this is that the rate of PRs affecting our GraphQL schemas is already pretty staggering. On a slow day we might get 12 or 13 PRs, and on a peak day we could get into the 20s, even pushing 30, and this continues to rise. So this growth had to be managed somehow, so that Shopify as a whole continues to have a unified vision for its GraphQL API, and also so that the individual teams could contribute to this vision and even enhance it. The stewardship of this vision is one of the main responsibilities of my team. It sounds really fancy to say "stewardship", but it's actually a word we use. I wish we had scepters, but we don't.

In support of this goal we created a GraphQL API design guide, or design tutorial. This was internal-only at first; it was really so that we don't have to repeat ourselves all the time when working with the teams that were writing GraphQL back-end code. But eventually it was generalized, and it was made public in May. So this
is the link. This talk is based on our design tutorial, but it only covers a small portion of it, so if you're interested in learning more, I really encourage you to visit that link.

But first, before fully diving in, I want to pull up to a bit of a higher altitude and discuss in a more general sense why this even matters. In my opinion, APIs are fundamental to modern apps, but API design is fundamentally opposite in its approach. Maybe "opposite" isn't the right word; maybe a better word would be "complementary". But no matter how you slice it, these two things are very different. So this is my take: modern app development is characterized by two strong driving forces. One is the pace of development, and the other is the frequency of direction changes. I tried to capture that in that slide; it's a MotoGP rider doing like a 68-degree turn at high speed. App development is also exploratory, and this is where the racetrack analogy kind of breaks down, but I'm going to ask you to break into your imagination a bit and imagine a track that never repeats itself, with lots of little forks in the road, where visibility is poor and conditions are unknown. That's basically app development. But the best apps don't shy away from change, and they aren't afraid to try new things out, measure, and assess. So app development is really subject to rapid course corrections, and what app developers need is an API that can support these shifts.

On the other hand, API design (this is the big reveal for my analogy) is kind of like the truck that helps get these superbikes to the racetrack and supports them in their efforts. I think a good API design ideally allows the app to pivot back and forth as often as needed. But it's important to note that whether our design choices reward us or end up haunting us, they do stick around for a long time, so teams need to commit to supporting these design choices in their API. It's much easier to add a missing piece than to remove or
change one that's not quite right, and I'll touch on that later.

But I think good API design really boils down to striking the magic balance. This is an interesting slide because it's expressed as a pair of negatives, things that you shouldn't do, but it's maybe more useful because I've found these two tendencies come quite naturally to developers, and because they're in opposition, they balance each other out. One of those is the tendency to over-engineer: to try and plan for everything in the future and make an API do everything you could possibly want. The other is the tendency to oversimplify: to build something really quickly that makes a lot of sense, get it out the door, and not worry about complications that may never arise. Obviously neither of these tendencies is necessarily correct; the answer is some middle ground. Stated positively, the mantra of the group on the left there might be "don't paint yourself into a corner": don't make things so radically simple that an extra feature request will require you to re-architect. And the ones on the right side, the oversimplifiers, might say "build only what's needed today". So what we're really trying to do is strike that balance.

All this is just to say that getting it right the first time is really our goal. It's what we strive for when we're building APIs. With app UX and functionality changing so rapidly, it's clear that sometimes the API will need to change and evolve to support this. Now, adding something comes for free in GraphQL, but if you want to change something, that's more difficult. The great thing about GraphQL is that it gives us some really great tools to evolve our API, compared with something like REST, in which you have to hand-roll your own deprecation solution. But even though we have GraphQL deprecations, it's not a magic
bullet at all, so getting it right the first time is the design goal here.

So this talk is really about some techniques that we've developed at Shopify to help us get it right more often than not, more quickly and consistently than if we were to consider all angles from first principles every single time. It was built up over almost three years of experience: first to support our mobile app, then in building public APIs (we actually have two public APIs, so if you're interested, please come to our booth and talk to us about it), and now also with rapidly proliferating internal use within Shopify on top of that. So Shopify, you know, has gone all-in on GraphQL, and these are the patterns that we've established.

So hopefully this is how you'll feel after this talk, but it's not going to even come close to covering all the patterns and principles that we've assembled in this tutorial. The main purpose of the talk is to showcase a few high-value and hard-won insights and hopefully whet your appetite for finding out more, so hopefully you'll be encouraged to explore these further and visit that link.

I'm going to start out by introducing some of our rules for designing schemas, but I realize how off-putting it is for me to stand up here on a stage and rain rules down upon a group of forward-thinking technologists, so let's dispense with that word. These aren't rules. So what are they? More like guidelines, suggestions, or, what I'm going to go with, a term that I think is even more accurate to the spirit of what I'm going to present, and that's "patterns", and not just because my team's name is API Patterns. So why patterns? Pattern recognition is the process of extracting some meaningful signal from a sea of noise, and working on our API for several years, we've identified some recurring themes: design choices which we got right, ones which appeared often and performed better, in contrast to other choices that were more rare. So over time we've kind of
recognized the patterns of design choices that succeed, and we did what all developers love to do: we enumerated them. And I mean we literally enumerated them; we didn't even give them names, they're just numbers. We believe these design patterns work in most cases, but they may not all work for you. They don't even work for us 100% of the time, to be honest. There are always exceptions; that's what software is all about. So pick and choose which ones resonate with you and make the most sense for you.

Before we start, I'm going to quickly introduce a feature that we'll be modeling in our schema. All of the examples will be about collections. Really, TL;DR, collections are just groups of products. Here's what the create screen looks like in Shopify, to give a more visual example. Collections have a title, a description, and a type. The type can be either manual or automated; let's call it automatic. In manual collections you select the products by hand; that's what "manual" means. In automatic collections you define rules to dynamically match products to be included. So that's what you see here: in an automatic collection you've got conditions, and you select whether all the conditions have to match or whether it's good enough for any condition to match. The conditions are like dropdowns, for example "product tag", "is equal to", and then you enter a string.

At a very high level, collections might be implemented in your application like this. We have two classes, one for manual collections and one for automatic collections, and either through inheritance or composition or some other super computer-science-y term, they use a common base or abstract collection. So this is what the GraphQL schema definition language, or SDL, would look like for it. For anyone not familiar, SDL is just a precise and concise way of capturing the state of a schema: what's there and what isn't, what's allowed and what isn't. So I won't be
explaining the spec, but you'll be seeing enough of this in the examples to get a good sense of it.

So this represents the most literal way of designing this feature in GraphQL. We've got the generic collection interface and two types that implement the interface. Implementing an interface is basically just a contract that you have to have the same fields with the same types. So you can see there that manual collection implements collection by simply having a title that's a string. Automatic collection also satisfies that contract, but it has an extra field: the rules field is an example of a field that varies between the two concrete collection types, since manual doesn't have it.

But this schema is just a direct representation of how we implemented this in our application code. There may be good reasons for our back end to be designed this way, but that doesn't mean our API should mirror this design. After all, the core concept that we're trying to model here is the collection itself, and that's part of our business domain. So the question is: would our API consumers actually care about two different types of collections and an interface? I don't think they would.

So here's the simplified solution. We've collapsed the two different types, and we even got rid of the interface; we just have a collection now, with a title and rules. But you may have noticed the downside. You may think to yourself: well, rules only apply to automatic collections, and before, they were only on that type; now manual collections will have this extraneous field, which might be confusing or misleading. But we really want to design our types around the most important concept. Rules are really of secondary importance to this concept, so they shouldn't drive our design, and if we had two separate types, it was really because they were driving our design. And one thing I've learned in trying to convince developers of a certain
way of designing APIs is that sometimes all you need to do is reframe the situation. You can actually say that manual collections are just a special case of an automatic collection that has an empty rule set. So that works, and that's what we've done here. It actually makes our design more future-proof and flexible, since maybe collections in the future could end up being a hybrid: maybe we want to have collections where you add products manually but which also have rules to automatically add extra products.

So this is now distilled into our first pattern, which is: never expose implementation details in your API design. It doesn't matter how you accomplish something; what you're really trying to do is represent the concept in the API. Design your API around the business domain. Don't design it around the implementation, but also don't design it around the user interface, and definitely don't design it around legacy APIs.

We're going to go over one more example of this, because this is really our most fundamental pattern or guideline. Again, I described collections as groups of products. This can end up being a standard many-to-many relationship in our database, and our database might be designed something like this, with a collections table, a products table, and then a join table, which we'll call collection memberships. A collection membership represents a product being in a collection. If we want to model this in GraphQL, once again we start with a really direct representation of this in our schema. We've got a collection, and it has a memberships field, which is a list field of the collection membership type. Same with products; they both have that. Now, the first pattern, which was to avoid exposing implementation details, is violated here, because exposing them is exactly what we've done. So to improve this design, we should ask the same question
we asked before, which is: what is the main concept that we care about? I think this one's even easier to see. The main concept is really a collection as a group of products, and products can be in multiple collections. No one really cares about join tables; no one wants to traverse through join tables. So ideally, for the API, we can design that as directly as possible. The new version on the right does that, and I've grayed out the fields that don't matter to this example. We've got a collection that has a products field, which is the list of products that it contains, and product has a collections field with a list of collections that contain that product. So this is much simpler: we've completely eliminated a type, and an extraneous concept, from the API.

So here's where we're at right now. The schema is pretty minimal, and I've excluded a lot of fields that we have in our real production schema. I did this to keep it simple, but one other tip here is that before you start adding fields and types, you should really ask yourself whether each one is really needed at this time. Just because a database column or model property or REST attribute exists doesn't mean it automatically needs to be added to GraphQL from the start. So this is pattern number two: it's easier to add elements to a schema than to remove them or to change them. Usually, removing or changing something in the schema is a breaking change and can be very difficult to deal with. And if you're wondering what the snake is, this is what happened when I typed in "high-res image of an adder".

Another thing I wanted to point out is that GraphQL is an opportunity to really clean house. You can get rid of things that don't make sense to expose, you can rename fields, you can shift things around. It's an opportunity to get things right, and you should be really comfortable with the idea of not being slavishly devoted to mimicking what exists in the REST API or existing
implementation.

So here's our collection type with some new fields added. We've got two rules fields now, so I'm going to highlight those. We've got rules, which you already know, and then we have this field called rules applied disjunctively. The second one is a new boolean property. These fields both have to do with collection rules, and they're obviously related. This is true semantically, but it's also hinted at by the fact that we chose names that have a shared prefix. So is there a way to indicate this relationship in the schema somehow? It's kind of a leading question, because obviously, you know, there is. In REST there are reasons to shy away from this, because we want to reduce round trips and we don't want to over-inflate the response for people who don't care about those fields. But in GraphQL we can actually group those two into a type. So on the right there, we have a collection which has a rule set, which has a list of rules and then a boolean that explains how those rules are applied, whether disjunctively or conjunctively; disjunctively generally just means "or", or "any". This might seem like overkill to you because it's just two fields, but we've really found that it's not. You could argue the case that if there are only two, then maybe it should be flat, but collapsing them into a type gives you a nice place to contain future fields that are related to the same concept, and definitely once you get to three related things, we really recommend that people use a type. So pattern number three is: group closely related fields together into their own type.

It's very common for new schemas to start with a one-to-one mapping between GraphQL types and models, but one other thing I'd like to point out is that you shouldn't be afraid to create types that don't correspond to a model in your system or a table in your database. Don't be afraid to create types which only exist in GraphQL if they help to represent your business domain properly. So
that's where we are now. On the right we've applied some of these patterns, but let's focus on this products field. This is kind of an important one; it's not very sexy, but it's important. Right now products is returning a list of products under collection. At Shopify, collections can easily have thousands to tens of thousands of products, so trying to return every product in a single array would be unwise. So whenever you implement a plural field that returns multiple objects, always ask yourself whether this field should be paginated or not. How many of this object can there be? Is there any theoretical limit in the data model? If there is, and this limit is really small, then maybe there's a case for keeping it as a list field, and you can make the call as to whether pagination is worth the extra complexity, because make no mistake, pagination is complex, although we do have tools and clients to help us deal with that.

In our case the field needs to be paginated. I'm not actually going to go over the details of pagination. We use connections from the Relay spec, which is a very common pattern; you just need to pick one and stick to it. But here's the pattern; we like to call it "look forward to the future": can you envision a time when this list field might need to be paginated? A lot of people really want to start out saying, "Oh, it's no problem, it's not going to be a long list, it's fine, it's just a bunch of known values." But then those get added to and added to over time, and eventually they get to the point where they're really unwieldy to deal with. The important thing to point out here is not the details of connections versus list fields. It's that if you choose a list field and then want to go to a connection in the future, you can't; it totally breaks the client's expectations of the return type. So this is kind of like deciding whether you need 16 gigs of RAM in your MacBook Pro: you'd better be
sure that you're getting the right size, because you can't change it later. Or you can with GraphQL, but you have to go through the whole deprecation cycle, and it's a real pain.

So now we come to the image ID field. I'm going to summarize this really, really quickly. There's a tendency for people who are translating REST APIs to provide ID fields, but this is actually an anti-pattern in GraphQL, and the better solution is to provide the object itself. So this is what I've done here: a collection contains the image, which is an image type, which right now is just a placeholder for the ID, but it gives you a place to add other fields on the image in the future. So that's pattern number five, which is to use object references instead of ID fields. This really allows you to traverse relations and do it all in one query, whereas in the past with REST you had to make multiple requests.

We've got a field here which sounds kind of weird, called body HTML. This is kind of for legacy reasons, but what this field is actually doing is describing the collection, so this field name can easily be confusing to someone using the API. It's really just a description, so why not just rename it to that, which is a much clearer way to do it? So our pattern number six is: always choose field names based on what makes sense, not based on the implementation or on what the field was called in legacy APIs. This is very important.

I'm going to skip over scalars, and I'm coming to one of my favorite rules here. If you look over here, I've been ignoring the collection rule type for a while, but let's take a look at it now. As a reminder, products can be in the collection when they're matched by a rule. Each rule consists of a field to match on, or a database column, a type of relation, and an actual value to use. So here's an example there in white: we could say the field is product title, the relation is "contains", and the value is "shirt". What's important to
note here is that if you look at those dropdowns, the left side, the field, is a dropdown, and the right side, the relation, is also a dropdown. So these are things with a known set of values, and often they're represented in interfaces as dropdowns. Because of this, we can actually convert these two fields to enums, and this is something that we do a lot at Shopify. So now we have two new enum types, collection rule field and collection rule relation, and they're a well-defined set of fields that you can match on and relations. And this is really one of my favorites, which is: use enums for fields which can only take a specific set of values. If you're not one of those values, you can't get in.

This is actually something really important to emphasize, because if you're writing your application in a dynamic language that doesn't have strong typing, it's definitely not a tendency to think this way. And even at the database design level, we often have tables that contain lists of known values, but they're represented as varchars, so we're used to using magic strings. GraphQL is this really great way to add a little bit of typing, a little bit of guarantees, to your contract. It's actually really helpful in writing your queries: people who are working in GraphiQL, or anything with linting, can actually see all the possible values, there's no chance of misspelling them, et cetera. It's really useful for the consumer.

So here's something else. Here's some pseudo-JavaScript code. I'm not really the greatest at JavaScript, so I assume that it works, but this is to try and determine whether a product is contained in the collection. You'd actually have to get the full list, and you'd have to iterate, traverse, and make comparisons. The problem is that when that happens, every client has to do this, and every client has to implement the same or similar logic, and it leads to duplication and potentially to differences and bugs. So what is the solution here? This is something
that's really great about GraphQL: because our fields don't have to directly correspond to database columns or things to do with our data store, we can have derived fields that are really just driven by functions. So this field takes the ID of a product and returns a boolean based on the server determining whether a product is in the collection or not. So we can have a field, has product, passing an ID, and then we will get true or false, and this is really, really useful. So pattern number nine is: the API should provide business logic, not just data. Complex calculations really should be done on the server, in one place, not on the client, in many places.

I'm going to talk a little bit about mutations. There's a lot more on mutations in our design tutorial, which I encourage you all to read, and I didn't really have time to get into that here, but I do want to talk about one small aspect of it. So here we have a mutation that updates a collection, and the return type right now is a collection. But this doesn't give us much flexibility: we can either return a collection or null, but we can't return anything else. One solution is to create a payload type specific to each mutation, which we will use as the return type. Right now this is just a wrapper for the collection itself; as you can see there, it just contains the collection, but we'll see how this becomes useful. Notice the return type's nullability: we can actually make it non-null, which means it can never be null, so we can guarantee that we will always return the payload, and this has some benefits. So pattern number twelve is to use a payload return type.

So back to the payload type. Mutations can succeed or fail, and you want to provide feedback to your API consumers on those failures. Now, GraphQL has support for query-level errors built in, but these are not ideal for business-level mutation failures. They are more about validation against the schema, whether your query is malformed or not; these are top-level errors. But at Shopify, on all our payload types we include a user errors field, which is really feedback from the mutation. So if something went wrong, where the query was well-formed but the problem had to do with logic on the business side, like you didn't provide a correct A when you're trying to update B, you can actually be informed of that with the user errors. A successful mutation would return an empty list of user errors, and it would return the collection object that was updated. An unsuccessful mutation would return null for the collection object (you'll notice there that in the update payload, collection doesn't have an exclamation mark beside it, so it's nullable), and instead of an empty user errors list, you would have a list of user errors which would explain what the issue was.

So pattern number 13 is that you can make mistakes and recover from them: mutations should provide user- or business-level errors via a user errors field, or whatever you want to call it, on the mutation payload. Having a payload gives you the flexibility, because you guarantee the payload is there, and then the actual return object could be null but the user errors field could contain information. And pattern number 14 there is kind of a corollary to that, which is: most payload fields for a mutation, the fields that are inside the return type, should be nullable, which allows us to return null if something goes wrong.

So I've got like 40 seconds left, which I'm really impressed with, because when I tried this at home I went 10 minutes over. These are all the guidelines I could realistically cover in a short talk like this. I wish I could stand up here and talk for an hour. Actually, no, I don't, but I could do it. So here's the link again to the full design tutorial that this talk was based on. There is a lot more in this design tutorial, a lot of technical nuances about mutations and principles and approaches, and I
really urge you, if you're interested in APIs, to take a look at what we put together. It's not necessarily all correct, and other people might have come up with better solutions, but these are things that have really fallen out from almost three years of use, and we're always interested in feedback as well. So that's the link over there, and I hope these guidelines give you a head start in designing your own GraphQL schema. Thanks for listening. [Applause] [Music] [Applause]
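Several of the patterns described in the talk (enums for known values, grouping related fields into a type, business logic on the server, and mutation payloads with nullable fields plus user errors) can be sketched together in TypeScript. This is only an illustrative sketch, not Shopify's actual schema or code: every type, field, and function name below is a hypothetical stand-in for what the slides showed, and `hasProduct` takes a product object rather than an ID to keep the example self-contained.

```typescript
// Hypothetical sketch: all names are illustrative, not Shopify's real schema.

// Enums instead of magic strings: only a known set of values is accepted.
enum CollectionRuleField { TITLE = "TITLE", TAG = "TAG" }
enum CollectionRuleRelation { EQUALS = "EQUALS", CONTAINS = "CONTAINS" }

interface Product { id: string; title: string; tags: string[] }

interface CollectionRule {
  field: CollectionRuleField;
  relation: CollectionRuleRelation;
  value: string;
}

// Closely related fields grouped into their own type.
interface CollectionRuleSet {
  rules: CollectionRule[];
  appliedDisjunctively: boolean; // true: any rule may match; false: all must match
}

// One collection type; a "manual" collection simply has an empty rule list.
interface Collection {
  id: string;
  title: string;
  description: string;
  ruleSet: CollectionRuleSet;
  manualProductIds: string[];
}

function ruleMatches(rule: CollectionRule, product: Product): boolean {
  const candidates =
    rule.field === CollectionRuleField.TITLE ? [product.title] : product.tags;
  return candidates.some((candidate) =>
    rule.relation === CollectionRuleRelation.EQUALS
      ? candidate === rule.value
      : candidate.indexOf(rule.value) !== -1
  );
}

// Business logic lives on the server: this is the kind of function that could
// back a derived "has product" field, so clients never reimplement it.
function hasProduct(collection: Collection, product: Product): boolean {
  if (collection.manualProductIds.indexOf(product.id) !== -1) return true;
  const { rules, appliedDisjunctively } = collection.ruleSet;
  if (rules.length === 0) return false;
  return appliedDisjunctively
    ? rules.some((rule) => ruleMatches(rule, product))
    : rules.every((rule) => ruleMatches(rule, product));
}

interface UserError { field: string; message: string }

// The payload itself is always returned; its fields are nullable so that a
// failed mutation can return null plus a list of user errors.
interface CollectionUpdatePayload {
  collection: Collection | null;
  userErrors: UserError[];
}

function collectionUpdate(
  existing: Collection,
  input: { title?: string }
): CollectionUpdatePayload {
  if (input.title !== undefined && input.title.trim() === "") {
    return {
      collection: null,
      userErrors: [{ field: "title", message: "Title can't be blank" }],
    };
  }
  return {
    collection: { ...existing, title: input.title ?? existing.title },
    userErrors: [],
  };
}
```

In this sketch a client always receives a payload and checks `userErrors` first; an empty list means `collection` holds the updated object, while a non-empty list explains why it is null.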
Info
Channel: Apollo GraphQL
Views: 4,341
Rating: 4.9565215 out of 5
Keywords: GraphQL Summit 2018, GraphQL
Id: mqaz6vmAGis
Length: 30min 32sec (1832 seconds)
Published: Tue Dec 04 2018