The Case Against GraphQL - Robert Zhu

Captions
Good morning, everyone. I'm Robert from AWS. I used to work on the GraphQL team at Facebook until I joined AWS last year. It's an honor and a privilege to be speaking with you all today. Thank you very much for having me, GraphQL Asia.

It's also my first trip to India, and it's been wonderful so far. The experiences started before I even arrived. When I boarded the airplane, I was seated next to this adorable old lady. She must have been no more than four feet tall, and she had these thick glasses that made her eyes look huge. Before we took off she was struggling with her seatbelt, and she elbows me and just points to the seatbelt, so I fixed the seatbelt for her. Later in the flight she had trouble getting her tray table back in the upright position, so she elbows me and I help her fix that. The stewardess comes by and asks, "What drinks would you like?" She elbows me. I say, "Well, I think she'll have a water." Wrong answer. It took us about a minute to figure out that she wanted a specific amount of tomato juice: exactly half a cup. And at the very end of the trip she elbows me and says, "Thank you."

It's so great to be here. You have no idea how excited I am. And when I discovered that I was going to go after Lee, I decided to build a slide deck that contradicts everything he says. So I'm going to present the case against GraphQL.

Why am I doing this? Why would I crash the party? I spent years of my life working on GraphQL. I use it every day, and I believe soon all of you will as well. You who are gathered here today are the early adopters, the vanguard, the evangelists of GraphQL. The success of GraphQL in the future depends on your ability to convince others to use it. As a result, arguments against using GraphQL should not surprise you, offend you, or catch you off guard. So I'm going to give you my best arguments against using GraphQL.

That quote I showed earlier, there's more. I want to share the complete version: He who knows only his own side of the case knows
little of that. His reasons may be good, and no one may have been able to refute them. But if he is equally unable to refute the reasons on the opposite side, if he does not so much as know what they are, he has no ground for preferring either opinion. Nor is it enough that he should hear the opinions of adversaries from his own teachers, presented as they state them and accompanied by what they offer as refutations. He must be able to hear them from persons who actually believe them. He must know them in their most plausible and persuasive form.

Before I present the case, I want to endorse an operating principle for us as the GraphQL community. It's called the principle of charity. The principle of charity requires that you maximize the coherence and rationality of your counterparty's argument. Daniel Dennett provides us a four-step process for exercising this principle. After you listen to your opponent carefully: first, state your opponent's position so persuasively and clearly that they say, "Thanks, I wish I'd thought of putting it that way." Second, list anything that the two of you agree on. Third, mention anything that you learned. Only then are you allowed so much as a word of rebuttal.

By the way, I share this with you as a fellow student. I'm terrible at this, but I want to get better. It's not easy, but I think it's worth it, because there are many benefits. Among them: we reduce miscommunication, we learn, we empathize. But perhaps most importantly, this is the most productive way to disagree. When we disagree, especially over a belief that is part of your identity, the disagreement can feel personal. It can feel like it's me versus you. And even if you're right, proving someone wrong runs the danger of creating resentment. With the principle of charity, we have an opportunity to turn this me-versus-you emotion into us versus the problem.

As I go through the cases against GraphQL, let's run an introspection query on our emotional state. Are we reflexively
building a counterargument? Are we probing for weaknesses instead of paying attention? Are we finding ways to attribute ulterior motives to the speaker? Are we labeling the argument so that it can be dismissed? These are all natural and common reactions that we all have. The principle of charity is an exercise in suppressing those reactions. It won't be easy, but let's give it a shot.

I have organized my case against GraphQL into three broad categories: novelty, new problems, and broken promises. I'm not suggesting that these are unsolvable problems or that there are no workarounds. I'm only going to present the case against, and throughout the rest of the conference you'll hear amazing speakers address many of these problems in their talks.

First up: novelty. GraphQL is new, and anything that's new needs to be learned. It helps that GraphQL is intuitive, especially to people who have used JSON. But writing GraphQL queries is just the tip of the iceberg. As I'll discuss later, adopting GraphQL has a number of related consequences for the rest of the stack that the whole team needs to understand. And I mean the whole team. You need to train the entire team: your front-end engineers, your back-end engineers, your DBAs, your security personnel, your PMs, DevOps. Everyone has to be on board. And if you're building third-party APIs, you also need to train your users.

We human beings cling to what we know. It's just our nature. It's not that we dislike learning; it's that learning has uncertain payoffs, and anything we do that takes time has an opportunity cost. For example, let's say you're building an application in Visual Basic, and you know Visual Basic really, really well. You keep hearing about this incredible new thing called Flutter, using this amazing new language called Dart. Should you switch? Will it help you build better applications faster? Well, that depends on a variety of factors. And that's the exact situation that a newcomer to GraphQL faces.

Novel
technology also needs time to mature, and often the early adoption of these kinds of technologies can take on an almost cult-like behavior. I'll talk about these two points in more detail.

GraphQL was released in 2015, as Lee just told you, though it was used internally for longer than that. For perspective, that makes GraphQL about as old as the iPhone 6s. It's safe to say that the most battle-hardened implementation of GraphQL is the PHP server that we used at Facebook, but that implementation is not available either as open source software or commercially. That leaves the graphql-js implementation. But if you're not using JS, what library do you choose? I'm just going to take one as an example: GraphQL.NET. Is this production-ready? It's the most popular .NET implementation for GraphQL, but are 3,124 GitHub stars enough? Is it feature compliant? I don't know. I have a huge amount of respect for the community that maintains this, but not all companies have the resources and the culture to handle incomplete and evolving implementations of software in production.

Next we have cargo culting. In software, this is a term that refers to somebody who uses something because they saw a successful company doing it. Show of hands, please: how many people know the origin of this term? Not many people. Okay, you're in for a treat. Those of you who raised your hands, you get to hear a really cool story a second time.

In this picture, in the top right corner, you can see a wooden airplane on top of a hill on the island of Tanna. It can't fly, because it was built by Melanesian natives who had no understanding of flight or aerodynamics. During World War Two, the US established a base on the island, and the natives saw airplanes dropping cargo crates full of food, equipment, and technology that they had never seen before. They noticed that these airplanes landed on airfields surrounded by air traffic
control towers. Decades later, after the war, anthropologists revisited the island and discovered that the natives had built wooden air traffic control towers, wooden airplanes, and dirt runways so that they could induce more cargo-bearing airplanes to land. Of course, it didn't work.

This happens because we human beings have an incredible capacity for something called social learning. We're so hardwired for social learning that we often confuse correlation with causation. For example: Amazon uses microservices and Amazon is successful, so microservices cause success. Facebook uses React and Facebook is successful, so React causes success. I want to be successful, so I should be like those companies and use microservices and React.

I don't want to give you the wrong impression. Large tech companies do create amazing, useful technology. But using that technology without understanding the context and the cost is like building a wooden air traffic control tower and expecting real cargo planes to land.

I want to repeat the case that Lee made about why Facebook built GraphQL. Facebook built GraphQL to solve data-fetching needs for its native and web clients on mobile. This was a critical problem for Facebook because they had hundreds of millions of users in developing parts of the world with low-quality mobile networks, data caps, and devices that had limited capabilities on the client. At that scale, even minor inefficiencies can dent the company's user growth, which is the main metric that the company tried to optimize. But there are only a handful of companies that have that kind of problem at that kind of scale. Bottom line: don't use GraphQL just because some big company uses it.

After you adopt GraphQL, there are a lot of common problems that occur that you might not have recognized in advance. The first is control. If you've built a REST API, you know that the server has all the control. The server decides what data to return to the client. The server decides how much data to return to
the client. GraphQL intentionally inverts this: it puts the client in control. But that has a number of implications. Imagine that you're building an API where you charge a flat cost per API call. Two different queries to that API can have dramatically different costs on your back end when you try to fulfill them, because the client can send an arbitrarily deeply nested query. You can deal with this through a mechanism called persisted queries, where you whitelist in advance the set of queries that are allowed to be executed. You can isolate the problematic queries and say these are not allowed anymore. But if you're building a third-party API, persisted queries are awkward at best. You can also limit the query depth, but that might make it awkward to author what are otherwise legitimate queries.

Centralization. Let's talk about the schema for a moment. The GraphQL schema is a list of object types, their scalar fields, and their relations with other object types. But more than just a catalog, it's also a consistent snapshot of your API. This consistent snapshot includes very common-sense rules, like you shouldn't have two types that are named exactly the same thing. That makes no sense, right? But in a very large codebase you can get naming congestion, and naming is already hard enough. Now you need to choose a name that is not only expressive but also unique. This also happens on the client, where you want to name your queries. Around the time of my departure from Facebook, the GraphQL schema there had around 10,000 types, and we had some pretty strange names popping up on the client, like "newsfeed depth four". I still don't know what that means.

Centralization also introduces problems when it comes to federation. By federation I mean that you have multiple different GraphQL APIs that your organization might own, and you want to create one consistent GraphQL schema that describes all of them. That's the first-party case. The third-party case is that you
have a single GraphQL API that spans GraphQL endpoints owned by multiple different organizations. In both of these cases, I believe that centralization and federation are simply opposite requirements. It's very difficult to reconcile both of them at once. For example, imagine that the World Wide Web used one giant GraphQL schema. What would that schema look like? We would have to describe all the types of all the websites ever known, and they would all have to be consistent with one another. I just don't think centralization works at that kind of scale.

GraphQL's central organizing concept is the graph. It's a very powerful organizing concept, as we saw from Lee's talk. But that's different from the resource, which is the central organizing concept for REST APIs, and this leads to a lot of features that you might expect that aren't there. For instance, in a REST API you have a constraint called the uniform interface constraint, and if your REST API implements this authentically you also get a feature called manipulation through resource representation. I apologize for all the REST jargon. In layman's terms, it means that when you fetch a resource like /player/1234, the representation of that resource, the document that gets returned as a result of that fetch, includes hypermedia links for you to modify and delete that resource, if you're allowed to do so. If we were to translate this requirement into GraphQL, it would mean: whatever type I'm looking at right now, I want to get a list of all the queries and mutations that affect that type. GraphQL has no such feature. There are many other differences between GraphQL and REST. I'd love to meet you and talk offline about all of them; I don't have time to go over all of them here.

Another related problem is ORMs. Most mature ORMs today are built to work with the REST architectural style. By that I mean that they can often take metadata about
the request and use that to make more efficient queries and avoid situations like the N+1 problem, where you're querying for an object, the object needs to go and aggregate over some other field, and you end up running expensive queries over and over again. But because of the way GraphQL resolvers work, few ORMs are ready to tackle the challenges of integration with GraphQL. There are valid attempts, to be sure. If you're interested in this, check out a project called Join Monster that generates efficient SQL queries to do this. But again, my point is that if you're already using an ORM and you migrate to GraphQL, chances are that ORM is not prepared to work with you.

As you become well versed in GraphQL, as you start to use it frequently and extensively, you'll notice that some of the promises it makes initially are only half kept. As with previous sections, this is not a consequence of GraphQL itself. GraphQL is very thin. This is a consequence of all the decisions that you have to make about the stack around GraphQL, and these include type safety, performance, and simplicity.

First, type safety. This breaks down into three sub-arguments. We have descriptive vs.
prescriptive type safety, custom scalars, and impedance mismatch. Let me break each one down.

From the client's perspective, GraphQL acts as a descriptive type system, not a prescriptive one. What do I mean by that? Let's say we have this field here called id on the type Player, and GraphQL tells us that this id field is of type String. GraphQL also tells us that the String type is a UTF-8 character sequence that models free-form human-readable text. But what is the representation of a UTF-8 character sequence on the client? If you're using a language that has a dominant, unified string type, like JavaScript, that's already in UTF-8, great, not a problem. But what if you're using C++? These are all the string types in the standard library alone, and if you've worked on larger C++ projects you know that most of them have their own custom implementations of strings.

By contrast, take a look at a prescriptive type system: protocol buffers, which generates a prescriptive C++ client type. In this case we're looking at generated code for a client data type that models two fields, id and name. Notice that id is int32 and name is std::string. These might not be the types that you want, but nevertheless protocol buffers says this is the type it will generate for you. You can do some conversion afterward, but it's unambiguous as to what the type actually is once it gets onto the client. So by deciding to be descriptive instead of prescriptive, GraphQL essentially makes this an integration problem, and the person adopting GraphQL now needs to decide: which GraphQL code generator do I use? What are the trade-offs of each? In the chosen codegen framework, how easy is it to create deserializers for custom scalars?

Which leads us to the next type safety argument: custom scalars. A quick primer, since I saw that there were a lot of people who aren't familiar with GraphQL. GraphQL includes a number of basic types, such as string, int, and bool. But what happens when you want something more than
that, as a scalar field that doesn't go any deeper? The most common ones requested in the GraphQL community are Long, JSON, and Date.

Let's take a look at Long first. In many languages you have a long data type that's used to model a 64-bit integer. But in GraphQL, integers are 32 bits, while floats are 64 bits. So in theory you should define Long as a custom scalar, but in practice a lot of people just say "screw it" and piggyback on Float, which reduces the clarity of the API.

JSON is another interesting one. When it comes to expressing JSON you want to use a custom scalar, but what you're really saying is: the GraphQL type system ends here, and from here on I don't know what the types of the following data are. This becomes a problem if you're working in a language that has the ability to express dynamic types, such as C#, where you can model a JSON field as either a string or a dynamic type. Which is it? How do you customize that?

And for Date, I think this one's pretty simple: there's just no canonical encoding format. GraphQL can't exactly tell the browser, through the HTTP response, what MIME type it is, so you need to know how to deserialize a date of any given format, but you also can't ship the serialization code with the response.

The last one is impedance mismatch. Impedance mismatch is usually used to describe the phenomenon where the type system of your database and the type system of your application logic differ in subtle ways, subtle but annoying ways. The same is true when you put GraphQL in front of the application logic. It might not line up perfectly with your server-side code or your client-side code. Two examples I can give you are non-nullability and union types. In GraphQL, if you explore a GraphQL schema and you see a little bang at the end of a type, that means it's a non-nullable field. And if you have a client-side language that respects non-nullability, or can express it somehow, then you get a really good experience, because you have all these integrations that can tell you, "Hey, did you forget to do a null check before you access that field?" But if the language doesn't have that, you're left with two options: either you use some kind of Maybe wrapper to give you a type hint, or you just don't know and have to assume every field is nullable. Same thing with union types. With union types, this is GraphQL saying you can return type A or type B, but not all languages support this. Java is a perfect example.

Okay, moving on. We have performance. GraphQL claims that there are a number of performance guarantees. The first point I want to make about this is caching, and I need to disambiguate between caching on the client versus caching on the way to the client. Most of the time, when you see articles or people talking about caching with GraphQL, they're talking about how you cache the data once it arrives on the client. But that's only a part of the problem. The other part is caching on the way to the client. Let's back up a moment. In REST, for a long time, we've had this constraint called cacheability. Cacheability says that a REST response needs to use standard headers to indicate how cacheable the response is. We've seen these headers a lot, right? They're Cache-Control headers, ETag headers, Last-Modified. Intermediate proxies for that data on the way to the client frequently interpret these headers to create optimizations. But with GraphQL, the response has no such constraint, so you don't have an automatic way to translate a GraphQL response into these cache-control headers. And depending on the latency between the client and the CDN, you can have situations where multiple round-trip requests are actually more performant than a cache miss against the nearest CDN.

The second thing that you've probably all heard very often is that GraphQL eliminates over-fetching. This is the case
where you fetch a bunch of data from the server and you get the data that you wanted, but you also get a bunch of extra data that you just don't need. I know it's intuitive to believe that GraphQL helps you avoid this. It does help you avoid it, but it still happens. Let me demonstrate. I know that this is small text, so for people in the back I'm going to try to describe what this is.

Here I have a snippet from a GraphQL schema that's used for a social game. This is a player, and it has a whole bunch of scalar fields on it: email, ID, avatar URL, display name, Twitter handle, a whole bunch of stuff. If I create a query against this, this is what the query might look like. I named this query "get player details". Now if I run codegen on this query, I get a type that looks like this. This is TypeScript, by the way. And this is great. But if I create another query elsewhere in my application that fetches the exact same fields, the exact same selection set in other words, on the player type, then I'm going to get a duplicate type from codegen, and I have two types with the exact same fields. That's confusing.

But there's a quick and easy solution for that. It's called fragments. I can pull out the selection set from the query that I wrote, and now when I run codegen I have the "full player details" fragment as the type that gets generated. But if I'm a lazy developer, then any time I'm going to fetch anything approximating player details I might reach for this full player details fragment, in which case I'm over-fetching.

So bear with me, I think this is where it gets interesting. You can solve this, if you have enough discipline: instead of using the full fragment type everywhere you want to fetch anything about the player, you break it up into more granular fragments. For example, I have these three fragments that fetch distinct subsets of the fields on the player type. Okay, problem solved, right? But now
when you run codegen, you get three types over there. At Facebook we called this the type model versus fragment model problem. If you have a language on the client like JavaScript that supports duck typing, this is not really an issue. But if you're in a language like Java, which doesn't support duck typing, you end up having types for each of these. And if these are reference types, what happens when you instantiate them? You're allocating memory on the heap, and then you're copying all these fields over, and that's going to throw the GC into overdrive. Especially for Java, when you're building Android apps and those Android apps are running on phones with less memory and CPU, that's a problem. In order to deal with it, when we got this data from the root query we had to turn it into a giant byte buffer, and we had to create thin wrapper types that knew how far to offset into this byte buffer in order to fetch any given field, just so that we could avoid triggering a GC. Honestly, I don't know if the cure is better than the disease.

The last point I want to make is simplicity. When you first start using GraphQL, it seems really simple. That's great. But when you start to consider the whole picture, and you start to worry about client-side state management, codegen, fragment models, field usage tracking, execution metrics, query persistence, and real-time operations like subscriptions and live queries, the stack is not simple. It takes a lot of work to figure out how all that stuff fits together and how to fix it if it breaks. In fact, the GraphQL ecosystem appears to be gradually reinventing SOAP. And because GraphQL has externalized so many problems, like auth, caching, and ORMs, you really need to do a lot of work to understand how all those pieces have trade-offs
against one another. It's a lot to digest for newcomers.

So that's it. I rest my case against GraphQL. In case it wasn't clear, I use GraphQL every day. I love it, and I think you will too. The point of me making these arguments to you is not to scare you away from GraphQL. You should absolutely not be scared away from using GraphQL. It's so that when you hear them from other people, you're not surprised, you know what they mean, and you know how to answer them. I hope you'll learn from all the excellent speakers that are going to follow exactly how to deal with all these issues. Thank you very much for your time. And one last reminder: if you find any disagreements, if you come into these debates, please try to apply the principle of charity. It's the best chance you stand of winning your opponent over in a productive way. Thank you. [Applause]
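
The query-depth limit mentioned in the section on control can be sketched roughly as follows. This is an illustration only and was not shown in the talk: the nested-dict stand-in for a parsed query and the names `query_depth` and `is_allowed` are assumptions made for the sketch, and a real server would walk a parsed GraphQL document instead.

```python
from typing import Optional

# Hypothetical, simplified stand-in for a parsed query: each key is a field
# name and each value is either a nested selection (dict) or None for a scalar.


def query_depth(selection: Optional[dict]) -> int:
    """Return the nesting depth of a selection set (0 for a scalar leaf)."""
    if not selection:
        return 0
    return 1 + max(query_depth(child) for child in selection.values())


def is_allowed(selection: dict, max_depth: int) -> bool:
    """Accept the query only if it nests no deeper than max_depth."""
    return query_depth(selection) <= max_depth


shallow = {"player": {"id": None, "displayName": None}}
deep = {"player": {"friends": {"friends": {"friends": {"id": None}}}}}

print(query_depth(shallow), is_allowed(shallow, 3))  # 2 True
print(query_depth(deep), is_allowed(deep, 3))        # 5 False
```

A fixed cutoff like this also rejects deep queries that are perfectly legitimate, which is the awkwardness the talk points out.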
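
The point about Long can be made concrete with a small check. GraphQL's built-in Int is a signed 32-bit value and Float is a 64-bit double, so a 64-bit integer that piggybacks on Float silently loses precision above 2^53. The helper names below are assumptions for this sketch, not part of any GraphQL library; Python's `float` is used only because it is an IEEE 754 double.

```python
# Range of GraphQL's built-in Int scalar (signed 32-bit).
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def fits_graphql_int(value: int) -> bool:
    """True if the value can be sent as a built-in GraphQL Int."""
    return INT32_MIN <= value <= INT32_MAX


def survives_float_round_trip(value: int) -> bool:
    """True if the integer comes back unchanged after passing through a 64-bit float."""
    return int(float(value)) == value


big_id = 2**53 + 1  # a valid 64-bit integer, e.g. a database row id

print(fits_graphql_int(big_id))           # False: too large for Int
print(survives_float_round_trip(2**53))   # True
print(survives_float_round_trip(big_id))  # False: Float rounds it to 2**53
```

This is one reason a custom Long scalar (often serialized as a string) is clearer than reusing Float, at the cost of every client needing a deserializer for it.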
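
The byte-buffer workaround described for Android can be illustrated in miniature. This is only a sketch of the idea, not Facebook's implementation: the record layout, the field names, and the classes `PlayerIdView` and `PlayerStatsView` are all hypothetical, and Python is used purely to show fragment-shaped wrappers reading from one shared buffer at fixed offsets instead of copying fields into separate objects.

```python
import struct

# Assumed fixed layout of one player record inside the shared buffer:
# id (int32) at offset 0, score (int32) at offset 4, level (int32) at offset 8.
_ID_OFFSET, _SCORE_OFFSET, _LEVEL_OFFSET = 0, 4, 8


class PlayerIdView:
    """Thin wrapper for a fragment that only selects `id`."""

    def __init__(self, buf: bytes, base: int = 0) -> None:
        self._buf, self._base = buf, base

    @property
    def id(self) -> int:
        return struct.unpack_from("<i", self._buf, self._base + _ID_OFFSET)[0]


class PlayerStatsView:
    """Thin wrapper for a fragment that selects `score` and `level`."""

    def __init__(self, buf: bytes, base: int = 0) -> None:
        self._buf, self._base = buf, base

    @property
    def score(self) -> int:
        return struct.unpack_from("<i", self._buf, self._base + _SCORE_OFFSET)[0]

    @property
    def level(self) -> int:
        return struct.unpack_from("<i", self._buf, self._base + _LEVEL_OFFSET)[0]


# One response buffer, two fragment-shaped views over it, no field copying.
buf = struct.pack("<iii", 1234, 9001, 42)
id_view = PlayerIdView(buf)
stats_view = PlayerStatsView(buf)

print(id_view.id, stats_view.score, stats_view.level)  # 1234 9001 42
```

Each view holds only a reference to the buffer and a base offset, so adding another fragment type adds a wrapper class but no extra copies of the data.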
Info
Channel: GraphQL Asia
Views: 15,698
Rating: 4.8742137 out of 5
Keywords: graphql, graphql asia, rest vs graphql, bangalore
Id: djKPtyXhaNE
Length: 31min 58sec (1918 seconds)
Published: Tue May 14 2019