Backend Meetup: GraphQL at Scale

Hey, hey, hey. We're gonna start in a few minutes, one minute actually, so can I ask you to move to the presentation area slowly now? Can you start moving people into that? Just ask them. Okay.

Hello everyone, hello, hello. If I could just get your attention for a minute or two. Welcome, welcome. My name is Mikhail, I'm the group product manager and site lead of this beautiful Pipedrive Prague office, and it's a great pleasure to welcome you all here. It's actually our historically first external meetup in this office, which was quite unfortunate, because we opened this office in March last year, right before the pandemic hit us, so we haven't been able to do any meetups here yet. But hopefully you'll have fun. We have great, amazing speakers lined up for you from Netflix, Pipedrive, Twisto and STRV, so hopefully you'll be able to enjoy the event. After the official part is done, there will be networking, there will be beers, we can network, so if you want to ask any questions, just find anyone with a Pipedrive T-shirt, and I hope that you'll have a great time here. And without further ado, because we are already three minutes past the official start, but that's okay, I would like to welcome Gergie to the stage, who will be your host for this evening. Thank you, I hope you have a great time.

Thank you. Hello, hey everyone. So yeah, hey hey hey, or as we like to say at Pipedrive, "tere tere tere", which is the same thing in Estonian. Let me introduce myself first. I'm Georgie Machuck, and you might know me from all the LinkedIns and GitHubs of the world. You can find me with this username that you'll see on the screen later on. You might have heard about me from some previous events; I was a speaker at a few, and it turns out that I'm not really good at it, and I don't have that many smart things to say, so we decided that I'll be just hosting tonight and kind of guiding others to the treasure that I cannot possess myself. But enough about me.
Yeah, as Mikhail said, this is our first ever event in this space, and actually this is the first ever event that we're streaming here in Pipedrive Prague, if I'm not mistaken, so it's two firsts for us, which is super exciting. But at the same time we kind of expect some issues and stuff to learn from in the future. We actually already have one that we want to mention, and that's that we cannot stream the camera and the presentation to YouTube at the same time, so you won't be seeing the speaker at some times. We'll figure it out for the next time, but until then we'll try to make the best of it.

Some general info. We have Slido that you can use throughout the night. If you have any questions, just go to Slido at this link and ask the question. I'm guessing you all know how Slido works by now. Sorry about that. You can also vote for some of the questions there, and the most voted questions will be asked after each presentation, so just go there. There are also some polls prepared, so we might get some interesting data from all of this, so answer those questions as well.

There is a clear agenda. We're a bit late already, but can we get it on the screen? Yeah, okay. So in a few minutes we'll start with Robert's presentation, then we'll continue right away with Honza's presentation, and after that we'll have a short break, Mikhail's presentation, another short break, and a presentation of Datejust from Netflix.

Some office information for all the people inside here. Beer is on tap, as you all probably know, together with some other drinks in the kitchen area, and if you see some snacks that you want, just grab them. What else is there? Bathrooms are outside of the office: on the left there is women's and on the other side it's men's. And if there is anything you need help with, just grab any person with the Pipedrive
T-shirt and they will help you out. They can also show you around the office and kind of give you a tour. If you want to talk maybe about a career in Pipedrive engineering, wink wink, there is Jakub, who is head of engineering here in the Prague office, so you might as well just go to him straight away. I think that's it from me, so let's get to the presentations.

Robert Trustman from STRV is here, who is a backend engineer, and the name of his talk is "The Surprising and Unexpected GraphQL". Fun fact about Robert is that he has a history at STRV of being a bit of a shooter, because several times he deleted something that looked unused but was in fact required for the live project. Since then the term "Robert fix" has come to mean "completely destroy". It doesn't sound like too much fun to me, but go ahead, the stage is yours.

All right, hey guys. I just want to clarify: it usually really looked completely unused, and usually I checked with someone responsible and they said, "Yeah, it's unused," so I deleted it. And, you know, a few minutes later they called me, like, "Hey, why is it gone? Give it back," and it's like, "It's gone, man." So that's the fun fact.

All right, so today I'll tell you two stories about my experiences with GraphQL. One will be a little bit surprising, surprising for me, and one not really expected. But honestly, I want to set the stage here. My knowledge about GraphQL is like this: this is the start line, this is me, and this is probably the top. So I'm really somewhere here. Don't expect this to be too complex; it's really easy, and if you have hands-on experience, or maybe years of hands-on experience, maybe you'll find this obvious, but to me it was quite surprising. So here we go.

The first one is the unexpected GraphQL. Let me tell you a little bit of backstory. The project I currently work on that has GraphQL is quite simple. It's our own GraphQL interface and we are the only consumers, so we don't really care about
other clients. When the project started, it was a simple thing, right? There were a few simple queries, maybe some resolver here and there. But then the project grows, and you add more queries, and you have more data relations, and the models get a little bit more complex, so you start adding data loaders and doing all kinds of weird stuff with it. To me, when I started on this project, using data loaders felt quite natural. It was very nice, because it felt like I could just delegate this work to a later time; I don't need to fetch everything and then decide what I need or don't need. So data loaders seemed very nice at first. And of course it seems that if you can fetch less data from the database, it should be done quicker, right? Because less data. Yeah, probably. I guess so.

Let me just show you a little bit of what our current resolvers look like. Ignore the names, they are not important. What's important is that this is a data model, the thing at the top, and it has a bunch of things, like previous answers, original answers. All of these are loaded after you actually request some of them. They are not loaded in advance; they are just loaded on demand through this data loader. And we use this pattern quite a lot, I don't know, 60% of the time. When we need to fetch some dependency, we use a data loader for that most of the time. As I said, at first this seemed like a great idea, because the less data you load, the faster it's going to be.

But we have a problem. We have a serious problem right now: the main page takes eight seconds to load. I don't know if that's okay for you guys, but for me that's like ages, so I don't want that. So I started digging into that, and of course the first thing you want to check is the network, to see that everything's communicating nicely. But the network's fast; there's no problem with
the network. So you check your Postgres database, maybe you check your SQL queries to see if they are performing well. And yeah, of course you find some; there are quite a lot of queries that need optimizing. So you get into that task, and you spend the whole week optimizing the queries, and you are very satisfied, because now on average all the queries are done in 10 milliseconds, and you are so happy with it. But we still have a problem: it still takes six seconds. So yeah, good job on those two seconds, but six seconds is still too much for me. I don't know about you; for me it's too much. So I started digging, I started thinking about what could be going on.

To explain the reason why this is taking so long, let's have a look at the usual relations that we have in our data schemas. The company that I work for, part of their business, to oversimplify it, is to build very custom forms. You might think Google Forms, but for a very specific purpose. As you might imagine, a form might be divided into some sections, those sections might have some questions, and each question could be translated into multiple languages. Some questions might be answered by simple text, but some of them might have pre-made choices, you know, like a picker, and all those choices need to be translated as well. Each of these things has its own relation, its own database model. They are not all bunched up together; they have different models, they are separated. So that's good, that's what you want.

But here's the problem. You have a main query, like when the customer... oh, I forgot to mention one thing, let's go back here. From our perspective, the consumer is usually the web, and on the web you really want the whole deal: you want the form, you want the sections, you want the questions, you want all the choices. So when a customer comes to our site, the query usually selects
everything that you can see here on the screen, right? They are not interested in only the questions; they need the whole thing. I think you might know what the problem will be here, but let me explain.

You have a main query that's like "get me the form that I need to fill in". So GraphQL loads the form, but now it needs to load the sections, because we use the data loader for that. So it first loads the form, then it loads the sections, and it waits for the sections. GraphQL gets the sections, and now it needs to load the questions for those sections, so again it goes to the database and loads the questions. The questions come in, and then the data loader again needs to load the translations and all the choices. It's like a ping-pong game; it just keeps going left and right, left and right, and this adds a lot of latency to the whole request.

So the problem is really that, yeah, all the queries are fast, but if you add up all those network latencies, and all the parsing, and Postgres doing its thing, it really adds up, and for us it turned out to be quite a lot. For you it might be okay; it really depends on the hierarchy that you have in your models and how you implement them. But that was our case. I was really surprised to see that it could have this kind of, for me, really unexpected behavior. But if I look at it now, it seems quite obvious. So, I don't know.

All right, let me just show you a simple example of how we load the form. This is JavaScript, actually it's TypeScript, but don't worry about that. It's Sequelize; you don't need to know that, it's an object-relational mapper or something. So this is what we had before: you load the form, and that's it, nothing else, and you send it back to GraphQL to do its thing. And the solution that we actually applied to improve the response times was to... I don't know if you can see it, I'm
sorry it's so small... basically to preload all the dependencies, all the relations that are usually required; like 99% of the time they are required. So we load them in the same query, even though at this point we don't know yet if they are going to be used, but 99% of the time they are. So we load all the companies, all the sections, all the questions, all the translations, everything. We send it in one query, we get it into one nice object, we send it back to GraphQL, and then, instead of just going directly into data loaders, we first check in those resolvers whether we already have the data. If we have it, we just send it back to the client and we don't go to the database again. And yeah, that improved things quite a lot.

For your business needs it might not make sense, because maybe your usage patterns are so all over the place that sometimes it's needed and sometimes it's not. But if it makes sense, you might want to use this; you might want to fetch all the relations together. And sometimes it might be better to fetch them anyway and then just discard them, because joining a table is quite cheap in Postgres, so there's very little overhead; might as well do it.

We also had some asynchronous field resolvers. For example, a company might have a field that says how many employees it has, right? This is extra information that you need to fetch from the database, and we didn't pre-compute this field before, so only when it was requested did the resolver go to the database and ask for the data. So again, this is yet another request that GraphQL needs to make. So maybe you can pre-compute it in advance and not rely on the resolver itself to compute it. So that was the unexpected.

Let's go on to the surprising part of GraphQL, and this was really multi-layer surprises for me, so let me start from the beginning. So the first surprise was... this is a code
sample from our test helper for when we write GraphQL queries. It's very simple. The only important part was that when I looked at it for the first time, this was my first experience with GraphQL, I had a thought: if I make a query that I named "employees", the test helper actually returns that data wrapped in a nice "employees" object. So I thought, wait, is it possible to send two queries at the same time in the same HTTP request? That would be amazing, right? Because I have seen our web issue a new request for each query, and HTTP has a little bit of overhead. So I thought maybe we could get rid of that; maybe we could just group all the queries that we need and send them in one HTTP request. But sadly, nope, you can't do that. At least, this was some kind of technical implementation that we had in the test helper. I don't even know why it's there, but it's there.

But then I did dig a little bit deeper, because I was so interested in this. I really wanted to make this work, because it seemed like GraphQL could support this scenario. And then I found out that Apollo actually can do it, so that was very interesting. So the levels of surprise were: first, the thought that I could do it; then I had the unpleasant surprise that I couldn't do it; then I had a pleasant surprise when I found out that there is an alternative. The alternative is called Apollo BatchHttpLink; it's part of the whole Apollo package, so you can actually use this. The syntax is a little bit different: instead of getting the results in an object keyed by the name of the query that you use, you just send an array of queries and you get back an array of results. But it's still the same HTTP request, so that's awesome. You can use it, and if you can afford it, I'm sorry, not afford it, if it makes sense for your needs, you can actually speed up the response time of your web quite a
lot. But the next level of not-so-pleasant surprise was the realization that if you batch queries together, the whole batch will only be as fast as the slowest query in the batch, right? So that's kind of... yeah, I was sad about it, I'll be honest with you, but it's obvious. So it makes sense for some use cases where you know, for example, that the queries will be fast enough to make sense to batch together. But sometimes you might want to keep something heavier on data loading outside of that batch, so that you keep the other parts responsive enough while some content might take longer to load. So it's not a silver bullet, it doesn't fit all scenarios, but I was nicely surprised to see that this is actually something that GraphQL supports. And that was the surprising part, and that's it. That was my experience. Thank you very much.

Very nice presentation, good job. Thank you. Can we get the Slido up? We have one question at least. All right, nice. So, just for the evening, we're going with Slido first, so that everyone is on the same page, because I'm guessing we have a lot of users just online watching the stream. So we'll go through Slido first and then go to maybe some questions from the audience.

So the first question is: how do you fix the slow data loaders problem if you are loading from REST instead of directly from SQL?

That's a good question. We don't load data from REST, so I don't know. I would probably consider using some kind of cache for the REST responses. I mean, if it can be avoided, if you know in advance that you are actually going to need this resource, or if you think you are going to need it anyway, maybe you could fetch it in advance somehow. I don't know, it really depends on the use case, because the data loaders that I've seen, or at least that we used, are really used to fetch something that we own. So usually
when you have a REST interface, it's probably something that's external, I guess. But in this case, I don't know, I would probably try to use some kind of caching mechanism for that. I really don't know. It's a good question.

Yeah, I agree. Okay, I would consider that. But we're a bit behind and we don't have any questions on Slido, so let's move on to the next presentation, I guess. Okay, thank you. Yeah, thank you very much.

So next up is Jan Sallet, who is platform team lead at Twisto, and the name of his presentation is "Our Learnings from Adopting GraphQL". And the fun fact about him is that last week he walked 140 kilometers around Mont Blanc. I don't know, it doesn't sound like fun to me either. Just give us a minute to set it up, but we should be going in no time. Okay, I think you can start. Good luck.

Okay, hi, my name is Jan Salvad and I'm a platform team lead at Twisto, and today I would like to talk about how we managed to grow from a REST API to GraphQL at Twisto, and what we learned when we scaled the company from one small team to multiple bigger teams.

I'd like to start by introducing the problem that we had with scaling GraphQL. Then I would like to describe why we even chose GraphQL in the first place and what benefits it brought us. Then I would like to describe the workflow that we found works best for solving all those issues. I would also like to talk about some best practices that we found support this workflow. I would specifically like to talk about naming, which is a big part of GraphQL and could cause some friction and problems. And last but not least, I'd like to have some time for questions at the end of the presentation. Okay, so let's go, let me begin.

In the beginning at Twisto we had just one app, which was using a REST API. It was developed by just one team, so it
was quite easy to use the REST API. But later, fortunately, the company grew, and now we have three apps on three different platforms, which are web in React, an iOS native app and an Android native app. And instead of just one team, we have multiple cross-functional teams which are shipping features at the same time, using the same GraphQL schema. So naturally there were some issues that needed to be solved, and that's what I'd like to talk about.

Now let me explain why we even chose GraphQL as a solution to these problems. Like I said, at first we had just this one mobile app, which had just one design, and it worked quite well with the REST API. But then we wanted to introduce a new React app which would also work on desktop, so it was a completely different layout from the mobile app. It would be really difficult to pull this off with a REST API, because we would have to redo all of our endpoints and include completely different data in different places, so it would be lots of extra work. So we decided that GraphQL could be the solution to this. And we also wanted to iterate more quickly with GraphQL, so that's why we chose it.

So let me just reiterate some positives that we found GraphQL could bring us. We wanted to support multiple devices with different designs, as I said, both mobile and desktop, which have completely different layouts. We wanted to iterate more quickly, so the frontend guys can actually play with the design and move things around as they want without any work from backend developers. We wanted to be able to redesign the app without changing the API, so we don't waste the backend guys' time just changing things because someone wants something somewhere else. And we also wanted to be able to release features at different times for each app, so we are not locked in and the teams don't have to wait for each other just to be able to release the
feature at the same time.

But when you have more teams working with the same GraphQL schema, there are some difficulties that you have to solve. Obviously there is some cooperation needed between the teams working with the same GraphQL schema, because they need to know what the others are working on, so they're not working on the same thing. They need to be in sync; they need to know what's happening. You also need to have consistent naming, because from the point of view of the frontend developers this is just one GraphQL schema; they don't care that multiple teams are working on it. So the naming should be consistent throughout the API. You also need consistent styles. For example, when you decide on some type of error handling or something like that, you need to be consistent throughout the API, otherwise it would be really confusing to the frontend guys.

Now let me describe the workflow that we found works best for cooperation between our teams. First, when someone wants to create a new feature and introduce it in GraphQL, they just create a pull request with the proposed schema changes. This is possible because we have the schema committed in the Git repository, which means you don't even need to run the backend app to see the schema; you can just open your GitHub account and see it in the repository. So you just create a pull request, and anyone can comment on it. When you have the pull request with the changes, you just notify all of the participating teams, which in our case are all of the frontend teams. Usually just one guy from each team acknowledges it and comments on it. We also use, like, a derivative for that, but you can use anything you want for it. And it makes sense, because frontend teams can actually comment on it, so if there's something that they don't like, we can actually resolve
it before the implementation even begins, so there's not much wasted work there.

And there's another step, but this is optional. This is only possible in our app because our API is not public; it's only for our own applications. Instead of implementing the feature, we first commit a completely mock implementation; we just return fake data instead of proper functionality. The benefit of this is that the frontend guys can start implementing the features before the backend implementation is even finished, so it actually speeds up the whole implementation of the feature. The backend and frontend guys can then work in parallel to implement the feature, so this really sped things up for us.

Also, when you have some bigger features, it makes sense to release them under a feature flag for some internal testing before they're released to the public. In our case it's a closed group and some colleagues in our company who are using a beta version of the app. And when the thing is without bugs, which it almost never is, but anyway, when it passes through QA and testing, you just launch the feature. That's it.

Okay, now let me talk about some best practices that we found work best in supporting this workflow and are really useful. The first one I would like to talk about is, as I said, to have the schema committed in the repository, because this allows anyone who doesn't even have the app running on their laptop to just see what's there, and propose changes and comment on it, for example. Otherwise they would need to run the app, which is not required here.

Also, when you have the schema committed in the repository, it's a good idea to change your continuous integration pipeline to verify that the schema you have committed there is actually valid and matches the code itself. Otherwise what can happen is that some developer changes the code and forgets to update the
schema, and your schema would get out of sync. So it's a good idea to have this check in your CI pipeline to prevent this mistake.

Also, when you have GraphQL, I hope you have documentation for it; at least we do. And when you have documentation, it's a good idea to have some linter in your CI pipeline to verify that the snippets you have in your GraphQL documentation match your schema, because this will automatically prevent your documentation from getting outdated. So it's a good idea to introduce tools that can parse any Markdown, for example, find GraphQL snippets and validate them automatically. This is something that's a really good idea to automate.

And then, for example, when you have multiple frontends with many different teams developing them in different repositories, it's a good idea to introduce some additional checks in their CI pipelines to verify that their apps actually work with the latest schema you have on your backend. That way you prevent mistakes where someone releases a feature prematurely, before it is even deployed on the backend. So this is also a really good idea.

I would also like to add that whenever you make some guidelines or styles, it's best to have them documented as a part of the documentation for developers, because when someone new arrives, it's best to just refer him to this documentation instead of telling him what he did wrong.

Okay, now I would like to talk specifically about naming, because, as someone said, naming is one of the hardest things in computer science, and therefore it's one of the hardest things in GraphQL as well, right? So when we were designing the API and thinking about the names, we asked ourselves the question: who interacts the most with the API? And we found out that it's the frontend guys, basically, because in our case, for backend guys the GraphQL
names are just one line of configuration, right? But for the frontend guys the names are all over the auto-generated code; it's everywhere, basically, it's a huge part of the codebase. So it's best to just let them choose the names. In the past we wanted to push some naming style from backend to frontend, and it led to some really heated discussions, and we wasted everyone's time, and it was not worth it. So basically, just let them choose the names; that's the main takeaway.

Also, you should not forget that the GraphQL schema should be independent of the design, so you don't need to change the API when changing the design. Otherwise you would lose one of the biggest benefits of GraphQL, at least for us. So it's a really good idea to make it independent; that way you can support many different views seamlessly and automatically. This is a best practice for designing GraphQL.

Okay, this leads me to a conclusion. With introducing GraphQL and this workflow, we were actually able to have even faster delivery of features with many times bigger teams than before, so to us it was really worth it. It allowed us to have great flexibility in design, so the frontend guys can move things around in the app and it just works; they don't need help from the backend guys to change the API. It's very flexible for us. And we can also support many different designs on different platforms, which is very important for us, because we support every possible platform that is out there. Okay, that's everything from my presentation. Thank you for your attention, and now we have some time for questions.

Good job, very nice presentation. We have a lot of questions for you, so yeah, they're for sure coming.
coming let's start from the top so rossia asks how exactly does graphql help you do the releases on multiple platforms compared to the rest okay so basically graphql allows you to have like different queries from different devices at the same time on different platforms which would be really hard to do at graph interest because you have just one usually just one end point for each resource but but you know we have like different requirements for each app so this would not be really possible so graphql help us there to just support many different platforms because every platform has different design basically and different requirements so this is like the area where the graphql is most helpful i guess i think nice okay um martin asks who reviews and approves prs with schema changes does it need approval from platform theme yeah basically this this like the the step i mentioned like proposing the changes it's just for for like general comments but usually the the whole delivery of the pr is up to the delivery team or the cross-functional team so they do the reviews themselves so they are those that are responsible for delivering the features so so no platform team is not reviewing every pull request that happens in our company no not really that would not be visible at all make sense uh does the whole twister app run on graphql now well this is a good question well it we hope it will something no but like it will it will be it will in very near future so right now we have both rest and graphql which is something that we found worked best for our discus because we were able to migrate slowly from the rest to graphql with both of them functioning at the same time using the same authentication and stuff so it was really easy for our front-end guys to just switch all right well this is very straightforward and a quick question i would say how do you test it yeah manually pro no automate everything you can right no but we have some like tests that test like the most uh 
queries that contain most of the stuff but yeah i mean what do you mean like testing a graphql implementation because you know you should have some unit tests that verify that the graphql implementation is correct and also the front-end guys should have some tests that their implementation of graphql is correct and then you have some integration tests that actually verify that the apps themselves can connect to a real backend and actually communicate correctly so there are many layers of tests i guess and you should in the best case have all of them otherwise you could have some unexpected issues on production okay nice um let's do a last one uh so we have enough time you know for a break uh so have you tried versioning your graphql api oh we did not actually we have just one graphql api and it just works but it's a really good idea when you deprecate fields or something to track the usage by the apps otherwise it could happen that you remove some functionality that is actually in use so when you track the usage of the fields in the resolvers and you just deprecate them after they are no longer used it's not a problem you can actually keep the schema living and you don't need versioning basically yeah i think it's kind of an anti-pattern yeah it's on the bottom not considerable it's best to prevent this issue okay um well i think that's it uh let's have a short break before the next presentations so if you have some questions for hansa or feel free to just grab me and have a beer or something and talk about the technical stuff if you are interested in something like that there's plenty of beers so let's have a break and let's meet in like uh 10 minutes or something like that right yep oh hey hey let's go to next presentation all righty uh so next up is michael sanger who is principal developer at pipedrive here
at pipedrive and you know my dear colleague maybe friend even and his topic is graphql server code generator which is super interesting i already heard it and fun fact about him is for several years he wrote restaurant reviews for all major newspapers in czech republic and hospital so if you go to talk to him afterwards he might tell you something about programming but probably more about food so yeah good luck have fun so yeah now you can hear me you can hear me so yeah i did something in the food scene you can follow me on instagram if you are into food porn and this is apple juice with trench foam on the top so hello again i'm michal i work here in pipedrive and i really like graphql i would like to show you how we are dealing with graphql at scale and it's going to be a real sneak peek into what is currently under development it's kind of an experiment uh and it's also a pretty unconventional approach so open your mind and to understand the motivation i guess i need to show you how it all started so the part of pipedrive that is being developed here in prague is leads so users can manage their uh let's say business opportunities which is the short version of it they can filter them do an infinite scroll they can open each of these leads and edit convert archive multiple operations and the interesting part is that the data comes from a couple of different endpoints uh for example the person there it's a different endpoint and a different microservice the labels come from some endpoint emails here so that's the situation and initially it was implemented with react on the client with redux and it's a microservice architecture so a lot of requests redux managing and as the application grows it does not scale well so everyone knows that frontend loves graphql i assume everyone agrees with it i'm not going into the reasons why that is so actually graphql helped us with the first scalability issue so we introduced a node.js backend service the graphql
we incorporated relay into the front end client so the graphql schema follows the best practices that relay is recommending and i recommend whatever client you use on the front end follow the best practices of relay and it's kind of funny that this simple step is still a big topic in many companies i constantly hear that they are considering shall we even go this way so as i said let's assume that everyone is on this page and it's solved so we introduced this uh backend service but that doesn't come for free there are a couple of um issues coming out of the manual work so it costs time and money to build another service if you get new colleagues in your team you have to onboard them into new tech you are also obviously building something like technical debt because you are constantly adding new features and the previous ones are a little bit behind and this brings again the question of scalability can it scale how will it scale if we are going to add more features so those are the issues but kind of organically in pipedrive we introduced graphql federation it's kind of a mainstream approach you don't want to have one single uh graphql service you build multiple of those and you somehow want to federate them into one graph because one graph one graphql endpoint is what the frontend wants uh yeah so maybe this could be like the end of story we have a federation and it works somehow but again the federation brings new issues so the main one is you still have the issues of the manual work just multiplied by n where n is the number of these subgraphs you are maintaining then there is a question about the schema design and the ownership who should build which part of the schema how to reuse these principles and types if you want to use them who should for example build and maintain the node query which is pretty useful from the client point of view and actually will it scale
because it brings extra complexity so how to deal with it obviously you can expect you will need more back-end developers and if front-end developers want something in the schema they have to talk to these back-end developers so they can implement it for them so these are kind of obvious issues and you somehow need to deal with them so i was thinking about whether there are any alternatives and when i was thinking i was also watching youtube and there is an awesome talk by adam cramer who is a member of the graphql platform team at facebook and he had a nice talk about how graphql at facebook is being developed and he also has a wonderful talk about managing massive schemas with codegen at facebook and the very interesting idea is facebook doesn't write their graphql service at all they have the backend and they just annotate or describe what properties should be propagated to the graphql schema that's an interesting idea dramatic pause and the huge benefit of it is it's really simple to introduce something into the graphql schema basically every developer can create a pull request that adds that annotation and part of the backend model is propagated to the graphql schema so the biggest idea out of it is automate all the things which kind of resonates with the smart words by some famous developer i forget his name that the code easiest to maintain is the code that was never written but if you have this idea and you would like to create the service automatically how do you do it incrementally because we already have the federation so the overview of the whole idea is if we are able to start small and somehow automate and generate one of these subgraphs one of these federatable graphql services we can integrate it into the current infrastructure and see how it behaves and iterate on it and if it's convenient to use probably that generated service will grow more developers would like to use it and maybe
even some of these graphql services can be deprecated and the final stage could be that we won't need a federation at all because we will be able to generate the whole graph just like that so if we don't need federation a couple of issues and the complexity will be reduced then comes the question how to automate easy to say but how and you can immediately come up with two approaches you can think yeah i want to have some config i will create some code and this together will be an application that will run in the runtime and somehow behave as the graphql service and then you have to deal with the errors in the runtime and imagine you are committing changes into the config how do you decide that these changes will actually work because all the rest is happening in the runtime the code somehow takes the config and acts based on it it's not a simple thing to reason about then there is the second approach again inspired by facebook which is you have some config or some annotations you have the code gen something that takes the config and generates the code as you would write it and that's kind of a mind shift because usually you are not used to writing code that writes code you just write code and run it but if you can generate code as if you wrote it it's a huge boost to your productivity so we picked that way with config and code gen and our objectives here are we have to provide great developer experience because developer experience brings developers that would like to use it and uh if we produce human readable code we can still stay with code review and all the processes that we have which is alignment with processes because we are good at dealing with pull requests we can run the continuous integration all the tests everything on the generated code as we would do normally with manually written code so how to generate human readable code when you think about it it's not that simple maybe but then you
realize we probably do it every day actually because when we are using prettier and we are using eslint and we have babel we are actually doing it you can write messy code and prettier will make a beautiful output out of it and babel is an awesome powerful tool that gives you a way to generate the code so at the end you have the typescript output and maybe this is the way so what features do we want to support with this generated code we want to have automatic pagination because we follow the relay cursor-based pagination we have a global id because global id is great for the node query and node query is great for uh a simpler schema and the front-end can automate a lot of things with that query we want to use data loaders and we want to introduce some common scalars mostly for datetimes for example and another great feature we want to have is mocking because mocking is a great thing for fast prototyping and here we come with an example of how quickly you can build a graphql service that provides some mock data so we have a config yep and uh here you can see that you define some types actually here we have just one type it's called user and it has some fields id name and email both are of type string and you call some uh mocking uh or faker library methods called full name and email and then you have a data loader where you just say hey this data loader will provide some mock data through that helper and in the queries of that config you define there is going to be a users query which gives you a list of users and again it's going to be mocked with this config you can just call the tool the code gen itself and you say generate the output to my service folder and when you go to the my service folder you get the whole service there and you can see you get a couple of files basically something you would write manually if you were implementing that server just much faster so yeah if you run the service
you can throw this query uh yeah so this is the pagination with cursor and that thing so you got some data uh note that these uh global ids and the cursor are something you get for free and because we have an id field defined as uh non-null id we also get a node query for free so you can use that id to get the detail of the user and you immediately have a running graphql service your front-end developers can immediately start using and that's a huge boost in uh productivity uh and once you have your back end ready you can remove these mocks and start returning real data here you can see that the data loader instead of the mock starts using some data source with a rest api endpoint and you get a new field where i want to show you how easy it is to rename something that's coming from the rest api but you want to have it in your schema a little bit renamed so instead of returning last underscore login you want to have last login in camel case in your schema and that can be done easily just by the config and you add some useful description which is always very beneficial to have in your schema and that brings me to an interesting point because the more rest api services you start using the sooner you realize that it's a wild wild rest out there actually every service is a little bit different and tons of small differences make some generic generated code very difficult to use if you want to have some example let's say that we have another schema with user and tasks so you define the user id name some score as an int and a list of user tasks taken from some data source and then you have a task with uh id and title coming from a different data source but the user data source gives you something like this uh if you don't have the score it's not null but it's a string unknown for some legacy reasons whatever you got this response what can you do with it and the
tasks are not ids but some urls to some api endpoints that's how the data are coming what can you do with it if you want to have your code generated because the data loader expects ids but you get urls and that brings me to the most powerful feature of the code gen which is plugins so you can easily define a plugin so that one of these generated files can be modified with a plugin when it's generated so if you remember we get a couple of files for each feature defined in the schema and one of them is the mapper and the mapper is uh basically a function that is called after the data are fetched from the rest api you want to somehow map the response to the internal shape you are working with in your resolvers so we need to modify the mapper and actually the generated code of the mapper is kind of stupid it does nothing it just accepts the data and returns the data because when it's needed it expects to be modified and how can it be modified it's again the wonderful powerful tool called babel you just provide a babel plugin and if you are not familiar with babel plugins it basically works like this uh babel lets you work with what's called the visitor pattern and it basically tells you give me an object with a couple of defined functions and if the function matches the line i'm currently processing i call the function and you can alter the code of that line so what we need to do here is we have to find the return statement then we create with a babel template some different code we create like a typescript expression and we can insert that expression before the return itself and that way we get from the code before to this one after so we were able to modify the not so useful response of the rest api into something useful and reusable so this way you are basically open to change whatever you want when you need it but if the rest api is aligned with your graphql schema you don't need these plugins at all so
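As a rough illustration of the mapper step described above, here is a minimal TypeScript sketch. The interface names, field names, function names, and the example URL are assumptions made up for this sketch; the talk only says that the score arrives as the string unknown instead of null and that tasks arrive as URLs instead of IDs. It shows a pass-through mapper of the kind the talk describes as generated, next to a version of the kind a plugin might produce. The real generated code and the Babel plugin itself are not reproduced in the talk, so neither is attempted here.

```typescript
// Hypothetical shape of the legacy REST response from the talk's example:
// `score` is the string "unknown" instead of null, and `tasks` are URLs.
interface RestUser {
  id: string;
  name: string;
  score: number | "unknown";
  tasks: string[]; // e.g. "https://api.example.com/tasks/42" (made-up URL)
}

// Hypothetical internal shape that resolvers and data loaders work with.
interface User {
  id: string;
  name: string;
  score: number | null;
  taskIds: string[];
}

// The mapper as it might look straight out of the generator:
// it accepts the data and returns it unchanged.
function generatedMapper<T>(data: T): T {
  return data;
}

// The mapper as it might look after a plugin has inserted
// normalization code in front of the return statement.
function patchedMapper(data: RestUser): User {
  // Legacy API sends "unknown" where the schema expects null.
  const score = data.score === "unknown" ? null : data.score;
  // Use the last non-empty path segment of each task URL as its ID.
  const taskIds = data.tasks.map((url) => {
    const segments = url.split("/").filter((s) => s.length > 0);
    return segments[segments.length - 1] ?? url;
  });
  return { id: data.id, name: data.name, score, taskIds };
}

const raw: RestUser = {
  id: "1",
  name: "Jane",
  score: "unknown",
  tasks: ["https://api.example.com/tasks/42"],
};

console.log(generatedMapper(raw)); // unchanged REST shape
console.log(patchedMapper(raw)); // normalized internal shape
```

Keeping the normalization in the mapper means the data loader downstream only ever sees plain IDs, which matches the talk's point that the loader expects IDs rather than URLs.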
just to give you the overview of the development uh how we think we will soon use it in pipedrive it's like every developer can modify the config it's really simple hopefully you saw it in these examples sometimes you need plugins when the rest api is not well aligned with your schema and client expectations but then you just run the code gen you get the code you create a pull request you send it to your github your colleagues can review it your continuous integration can run all the static analysis tests whatever you use and if it's fine you deploy it that's it so that way we think we will be able to build the whole graphql service just with this thing and we probably won't need the federation at all yep so this was the peek into our kitchen we are really working on it and it's my daily job to develop it it's really work in progress it's an experiment maybe it will fail and my next talk will be about how we failed with graphql server codegen anyway i would love to hear your opinion and at least wish us luck thanks for your attention i'm just glad you were watching youtube and not some other video platforms when you were thinking um which is video platform email uh so yeah let's get to questions uh so we have a few of them and the first one is well kind of obvious is the codegen project production ready and will it be open source yeah define production ready uh so as i said it's currently under development the development in pipedrive is uh being organized as a mission so currently there is an ongoing mission we are trying to use the tool and we will see within the mission how it goes so definitely not yet open source the question is how to open source it if the output of the codegen currently is a service that is being used in pipedrive it definitely has potential but it's extra work to make it easy to adapt to different services different server implementations and whatever maybe if you are interested in that code join
pipedrive and you can work on it nice the next one is is this a monolithic config for all the teams or does each team have a config and it's put together in one app at some build good question uh currently it's basically yeah a monolithic config but of course the next level is probably that every microservice could somehow publish its own piece of the config that can be somehow gathered together and validated but currently to start small we are thinking about uh one single config of course in javascript or typescript so you can import parts of it but currently a single config approach okay then anonymous is asking what's faster learning the code gen config or adding graphql by yourself faster i would say yeah i'm a fan of code gen so i would say it's definitely faster to add something into the config and it definitely scales better if you are onboarding someone into the code gen you just have to explain to them a couple of things in the config it can be more complicated to teach someone how to write babel plugins but that's again not something that's required and the optimistic expectation is we will need just a couple of these plugins basically to handle different variations of our rest apis and others can just reuse them so i think it's definitely faster to start with the codegen once it's finished okay and uh let's have a last question vam i think vam has had the last question last time as well so we'll just be ending with your questions today uh how do you keep up with changes is it working when you adjust your api i'm not sure i understand uh so you mean yeah if you change something in your api uh and you need to deal with the changes in your graphql schema that's a problem and it would be definitely better if the config is part of the microservice itself currently i think we will deal with it with uh something like end to end tests or something like that if the back end is somehow changing something that's
published without proper notification of their users that's an issue but if you have the config you can quickly check if some part of your uh response is being used and act according to that that's probably the answer nice okay uh thank you very much uh very nice presentation and yeah let's have a short break before the last presentation that we have today which is gonna be remote from california as i understand uh yeah let's have a 10-minute break grab a beer talk to mikhail whatever bring up pages to the screen nice nice okay we can see the presentation we can see you everything so yeah uh next up is tejas shikhari i'm really sorry if i'm butchering your name so you can just repeat it the correct way um yeah so he's connected all the way from california as i understand and he's a senior software engineer at netflix and the name of his talk is graphql federation at netflix and fun fact about him is he grew up on the equator in singapore where the temperature is always more than 28 degrees but his favorite sport is skiing you can start now good luck and have fun thank you can you all hear me just a sound check yep well all good sounds good good evening everyone um i'm coming to you from here from sunny california and i'm hoping to present to you today about our graphql federation project at netflix um so my name is tejas and i'm a senior software engineer at netflix for the past two or three years i've been working on building the federated graphql platform with a wonderful team here and i'm excited to share our journey with you all today so the agenda for today is going to be uh most of you might already be familiar with graphql federation but i'll start with a brief intro next i'll talk about our journey building it then wins and challenges and trade-offs of building something federated and lastly whether the investment is worth it for you so let's jump right into it so as you might expect in the early days of a company you might have a simple
client with a server and a monolithic architecture talking to the database then as the company grows the users grow and features grow we might start breaking it into multiple services so that you know a larger team of back-end developers can contribute so this is good but you know it creates problems for the client you know they have to integrate with multiple backend systems and this is where graphql came along and was a really nice thing it uh you know provided the api that the client needed and it aggregated data from the microservice architecture this is a very popular architecture and very much used at netflix and then we might have you know multiple graphql services to serve different use cases and then over time the idea of one graph uh you know often implemented as a graphql monolith was popularized you know by apollo in the integrity principles uh you know there's one graph used in many places in the industry and it worked really well because it created leverage and reduced duplication and gave like the one api that the client needed and eventually this graphql service started growing and growing as more features were added and became a new monolith and oftentimes it's implemented by one graphql team so every time a new feature was added we would do a code change in both the microservices and the graphql layer and oftentimes this was by a different team so it required collaboration and meetings to make the change happen also the graphql team may have to become experts in many domains because you might have a lot of different domains within the company and it's all exposed through a single api they are also a first line of support for a lot of the upstream issues if there's some issue in the upstream the consumer of the api is going to first reach out to the graphql team and ask what's going on as the graph grows bigger the code grows bigger you add more dependencies on external libraries generated clients if you have grpc or thrift
or something like that and as a result of those dependencies uh whenever they're updated new code comes in potentially causing failures and if you have this graphql monolith you have a lot of frequent code changes pretty much every day since every new feature has to go through this uh layer and lastly cascading failures too so you know if a bug or a memory leak is introduced in one resolver it could cascade to a completely unrelated thing and uh so since this talk is about graphql federation i'm going to introduce graphql federation a bit and see how it aims to solve some of these problems so the key value proposition of graphql federation is to distribute the ownership of the graphql api and the core idea behind it is this federated type which is the ability to extend a type across the service boundary and then the ability to hydrate it further so what do i mean by that so in the monolith we might have this movie type which has you know personalized title image and run time in seconds you know and they might all be coming from different services in the back so when you move this movie to a federated type we might want to have a type that extends across the service boundary so movie extended across images and runtimes and then each of the services provides the fields it owns in the type and so the key thing here is all of these services know about the id it's the primary key of the type and they all agree on it and each hydrates the fields it owns and this is the core idea behind federation having this federated type and you know first apollo kind of popularized it with the apollo federation specification but then there are also other implementations for example the folks at atlassian have a similar thing called nadel which is uh in essence the same concept but a different uh syntax than apollo federation and with the ability to federate a type we envision this architecture where we
have these different services uh what we call dgss domain graph services and these are standalone graphql spec compliant services that you know have types that are extended across the service boundary uh there's a graphql gateway the graphql gateway exposes the client schema and you know client queries come to the gateway and the gateway then splits the queries into sub queries that go to the different services the schema registry is a workflow component that makes sure that the schemas of all these domain graph services can merge together to form a valid schema and uh you know to check it out in more detail if you're interested in the technical details we have a blog post and also a qcon talk that two of my colleagues did that was excellent so feel free to dig into that so let's jump into netflix's journey implementing graphql federation so at netflix we have two primary graph api technologies i want to say you know falcor which was actually invented at netflix and graphql so if you you know if you're browsing netflix and you're deciding what movie to watch and you can't figure it out when you're on that ui that ui is probably you know powered by falcor and then we have a new organization within netflix called studio which is using graphql and they're really similar uh to each other they just you know were invented around the same time and graphql caught on in the open source world so around early 2019 we saw that the graphql monolith in studio was growing pretty rapidly and it was uh you know facing some of the bottlenecks that i touched on in the earlier slide uh and we proposed the studio architecture similar to the diagram i showed earlier with the gateway registry and uh the dgs components the domain graph services and around the same time apollo also released the federation specification so uh we saw that and we were
really excited we thought it was a really good fit for what we were thinking within netflix and uh then you know we got a green light for the studio architecture and we also started collaborating with apollo uh at the time their gateway was implemented in typescript uh which was not a good fit for us because most of our stack is on the jvm so we rewrote the gateway in kotlin and integrated it with all our internal observability systems and so on and to do this we had two core teams the federation team which is my team and the developer experience team whose goal was to make graphql development easy at netflix we had volunteer resources across security observability and schema governance to make this uh easier and imagine about 10 people working full-time on this project about uh six months to a year later uh we had a first mvp of this uh platform ready and we started working on adoption and migration so the first step was to take the graphql monolith we had and make it the first dgs under the gateway and then we started chipping that dgs into different dgss like movie so we took out the movie type put it in the movie service uh took out the production type put it in the production service and so on and we did this without affecting any of the client api so essentially the clients were still executing the same query instead of going to the monolith it started going to the domain graph services we built out developer education documentation and then initially we didn't have full spec compliance with graphql so we worked on that as well and now uh in 2021 and beyond we're thinking of you know expanding this the architecture has been successful for the studio organization and we are thinking of how we can expand this to the netflix api so the netflix api is built in falcor and we first need to move it to graphql before we can apply this architecture and then also we have a lot of internal tools you might have heard of spinnaker and uh some of
these things that are open source which are using rest right now but they could be moved to graphql so there's the idea of an enterprise edge which could power some of the internal tools at netflix such as spinnaker customer support etc and then we're also looking at solving the federation challenges that i'll cover in a second so what are the wins of the federated architecture so first and foremost netflix studio is powered by graphql so just to give you an idea of what the studio is think of the studio organization as like hundreds of services and 60 plus applications that you know are used by our studio folks who work in the studio so think of different processes like movie production post production visual effects etc and basically the applications help to streamline this workflow so these apis are now largely powered by graphql across the organization and federation has been fully live for studio for over a year now the second thing is the federated architecture definitely delivered on the promise of faster product evolution so i pulled these stats from production last week so we have over 2 000 object types over 600 queries and over 900 mutations implemented in over 56 000 lines of schema and this would not have been possible for one graphql team to implement by themselves we had hundreds of engineers contributing to this across 90 dgss and i would say it was not the federated architecture alone that made this possible there are a few other things so first since the ownership of the api is distributed now many engineers within the organization have to learn graphql so investment in developer education is key and i think uh a bootcamp for federation and also a lot of example code for graphql development that developers could copy and paste to bootstrap their services additionally our developer experience team also created this this is open source uh the dgs framework where you can bring
up spring boot java kotlin uh graphql services very easily and this integrated with all of netflix's cross-cutting concerns across observability metrics etc and developers can now just focus on writing the graphql application we also created a ui in front of our registry which showed all the different services uh their schemas the schema changes that were happening and also support links so that people can reach out to the teams if they had an issue with the schema or the api and also netflix has a lot of legacy applications implemented in rest grpc etc and what the federated architecture allowed was we could just convert some of these existing applications to power graphql and then start contributing to the graph so this allows us to modernize legacy applications and put them in the graph in general another big one was reduced operational burden on the graphql team uh one of the biggest challenges we faced when my team was owning the graphql monolith was that we spent so much time on support uh issues and now the consumers of the api can directly reach out to the team that owns the api and this is possible in a few ways so let's say i write a query as a consumer of the api and i get this error in the error there is the uh location of the dgs where the error happened so in this case the error happened in the images dgs and also the sub query to that dgs and that field now i can take this information go to the images dgs team through the slack channel and ask them like what's going on can you help me with this issue uh another thing is we integrated a lot of these components from the top gateway all the way to the lowest microservice with our distributed tracing system that the netflix telemetry team implements and uh essentially it also showed you where the error happened so in this case this is yellow and there seems to be some kind of issue so you can click through it look at the logs and so on and so forth you can
go reach out to the team and also if you're debugging a latency issue we had a zipkin trace where you can identify where exactly the time was spent it showed you uh precisely and then you know maybe you forgot to implement a data loader let's go find the team who didn't do that and also i think uh for every http request we also emit uh metrics to our server so we actually uh augmented these metrics with some graphql tags so like what type of operation it is query mutation subscription the operation name and the query hash and this allowed us to then segment our metrics when i was debugging some issue you can segment them by what query was affected and then you can see all the queries that were happening in production so all of these things helped and you know those are the wins but like with any architecture there's always challenges and trade-offs so uh let's jump right into it and the first one is schema management is hard and i think you have seen across many of the talks in this meetup that this is a pretty hard problem so we got ahead of it by forming the schema working group and there are three goals review the schema changes that were happening help with the schema migration from the monolith to the federated architecture and also create best practices for schema design across pagination error handling uh and so forth but we quickly realized it was hard to scale the schema group uh as the api started to grow the api was growing much quicker since so many people were working on it so initially we tried to review every schema change but then the graph started growing and we pivoted to reviewing only new services and focused our energies on the core shared federated types netflix also has a culture of freedom and responsibility so you know it was hard to enforce like a review process i think a lot of teams tried their best to do it
but oftentimes they would review changes within their teams and just push it out and also the best practices were hard to enforce so this is something that we are still working through you know thinking about how we can make this process better through tooling or new ideas that will help collaboration become better the large one graph also posed a problem so many developers were building schema in isolation and although it wasn't as bad as before with the micro services we did see some duplication in similar functionality and this was largely because the developers weren't really talking to each other as much uh name collisions are inevitable and you know as you saw earlier we had like over 2000 types and 1500 fields on root types and they all live in a global name space so you have to think of some name spacing practices or consistent naming or logical naming so we went with this approach for our studio side where there were a lot of different uh teams uh so for example here i've shown an example we have a protocol suite of applications that help with the production of a movie well we have hubble which does talent management so how can you discover these apis even in the graphiql window easily so we wanted to make it easy to search this did affect the code gen a little bit on the client side but you know there was a trade-off that had to be made here and you have to kind of decide this for your own organization error handling is another area so error handling is already pretty vague in the graphql spec and federation amplifies those challenges even more so let's say you know there's many ways to do errors in graphql so let's say i have this simple query user and i give it a bad user id i could either have it in the errors block with not found or i can create an object and model it in the schema to support the errors block scenario we created a spec and mapped all of the common http codes uh to a classification and then people
built automated tooling around this so that people can throw those exceptions and this would be automatically populated but even then it was difficult to enforce and largely the error handling was pretty inconsistent and this actually in turn affected the clients because they weren't able to reuse their existing error handling code across use cases product driven api is another big one where historically graphql was designed for the clients and this is a quote from the graphql wikipedia page but the reality is the federated architecture does bring the graphql api closer to the back end so how do we work on some of those challenges so how do we prevent the graphql schema from looking like the service model so one idea was we have a schema first design that has worked well for us you know there's some extra boilerplate mapping code between the service types and the schema types but it keeps the schema models independent and they can evolve independently code first schema is also fine as long as you have a separate set of models from your service models and then the new thing we're thinking about is how can we have more ui client developers contribute to the graph directly so they're already involved in the schema design process but what if they want to actually contribute to the schema so we're thinking of maybe a serverless platform for domain graph services somewhere so this is like more future thinking security is another area so in a federated architecture it's really easy for anyone to expose a new field or a query to the outside world and having the right set of authorization is extremely important having a centralized system can help as you can declaratively define policies for the new fields you add and also you probably want some kind of a default authorization policy because when new services are coming up you might forget to set up the authorization for some fields and accidentally expose data to unauthorized users
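The default authorization policy described above can be sketched as a small deny-by-default registry. This is only an illustration under stated assumptions: `PolicyRegistry`, the `Type.field` path format, and the role names are hypothetical and are not taken from the talk or from Netflix's system.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class PolicyRegistry:
    """Hypothetical central registry mapping 'Type.field' to allowed roles."""

    policies: Dict[str, Set[str]] = field(default_factory=dict)

    def allow(self, field_path: str, *roles: str) -> None:
        # Declaratively attach roles to a field such as "Movie.title".
        self.policies.setdefault(field_path, set()).update(roles)

    def is_authorized(self, field_path: str, caller_roles: Set[str]) -> bool:
        allowed = self.policies.get(field_path)
        if allowed is None:
            # Default policy: a field nobody configured is denied,
            # so a forgotten policy cannot leak data.
            return False
        return bool(allowed & caller_roles)


registry = PolicyRegistry()
registry.allow("Movie.title", "studio_user")

# A configured field is readable by a caller holding the right role.
title_ok = registry.is_authorized("Movie.title", {"studio_user"})
# "Movie.budget" was never configured, so the default policy denies it.
budget_ok = registry.is_authorized("Movie.budget", {"studio_user"})
```

The design choice worth noting is that the absence of a policy resolves to a denial, which matches the concern about new services forgetting to set up authorization.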
some kind of audit system or penetration testing can also help to make sure that you're not leaking unauthorized data so security should be at the forefront because federation makes it so easy to expose an api in production some new challenges so currently we are migrating the netflix api experience part to graphql and graphql federation some of the challenges we are thinking about is a b testing so we do a lot of a b testing and iterative product improvements and how can we do that well with graphql and a federated architecture you know a lot of the a b tests fail and you know result in a lot of deprecated fields so how can we have a cleaner deprecation flow for failed a b tests uh we're thinking of using something like an experimental directive which is already something that's deprecated and uh has an easier deprecation workflow but that's something we need to iterate on another challenge is multi-platform so netflix has uh four ui platforms the mobile platform the tv platform and they all have different use cases so like the mobile platforms have lower connectivity but you know a large amount of ram while tv is always connected to wi-fi but you know has less ram so we might have to think about different pagination schemes we might have different partial error handling so how can we define one api for all these platforms so this is something we are also working on and then lastly there's also an idea of server-driven ui within graphql this is a big interesting area and we're thinking about how to do that as well in the federated world a few things we use the apollo federation spec so there's some spec compliance stuff and miscellaneous issues with apollo federation that we're working closely with them on one example is if you have a monolith schema with polymorphic types so if you have a lot then it will be more complex to migrate to federation so what do i mean by that so let's say we take that example from earlier with the movie
uh where we split it up into three services uh what if it had implemented an interface video now splitting that up becomes a little bit more challenging because this uh by itself is not valid according to a standalone graphql server so this is a challenge another challenge is like evolving shared value types so before i talk about what the value type is we know what a federated type is so a federated type is a shared type across services where each service kind of provides certain fields on the federated type value types are another kind of a shared type i like to use the example of page info so if you use relay pagination uh you might have this common object that you're using in multiple services uh paginating movies uh or paginating talent and the structure of this object is the same right so you want to be able to share this across and that's possible but what happens is when you try to change this type let's say if i try to add a field and it's used in 10 different services how am i going to deploy those services all at the same time so that this type is valid across the entire ecosystem so this is another challenge we are working on and hoping to you know come to a solution but you know there are uh challenges with any architecture and the last part i want to kind of cover is is it worth it for you so this is a pretty early stage in the maturity life cycle you know i think apollo is doing a fantastic job with their offering but there are several trade-offs and challenges that need to be addressed it is a time and resource intensive investment and i think we also partially got lucky in our organization because we had like a lot of champions for this architecture excitement was high which made it successful but you have to navigate this in your organization some other questions to ask if you're considering a graphql federated architecture is do you have a robust micro service architecture in
your organization then federation might be a really good fit has your monolith reached the critical mass for federation to make sense how many developers and federated services do you expect to have is it going to be hundreds then maybe federation is an awesome fit but if it's only going to be a handful of services you might want to consider keeping your graphql monolith and lastly i want to bring this slide back i showed all these bottlenecks at the beginning and i said federation solves most of them but it is not impossible to solve this without federation right we don't need federation to solve this so for instance we can take some of the concepts of distributed tracing and error handling and make sure that the graphql team is not the first line of support we can also create a way for multiple teams to contribute to the graphql api followed by you know if there's frequent code changes you can solve that by canarying them in production so by canarying them you can expose a small portion of traffic to the new code and then once that's all good you can say okay i'm gonna ship this out that way you can avoid uh outages and then cascading failures can also be solved by functionally sharding your app the same app but deploying it to different shards for different use cases so one use case uh does not cascade to other use cases so there are a lot of strategies to solve these problems and with that i hope this gives you a glimpse into the journey of federating graphql at netflix how we started how it's going and i'm really excited about graphql but also about how federation happens on graphql and lastly i would like to thank all the wonderful folks here at pipedrive for uh giving me an opportunity to present here thank you thanks thank you uh so yeah thank you very much um there was a lot of clapping i'm not sure if you could hear it um can you hear me yeah i can sorry perfect nice uh so yeah a lot of clapping um very good job
and yeah we have plenty of questions for you and yeah we have a lot of questions from arjun kurapov i can tell you it's all the way from tallinn estonia and yeah so do you support or implement the node query in your federation gateway any issues with that yeah actually we do so uh so this is a little bit of an interesting use case so by node you mean the object identification right just to clarify the relay object identification spec yeah so uh in one of the slides i showed you how we have to hydrate the federated type uh so essentially there is a concept of this thing called entity resolvers that exist within apollo federation that you can use to hydrate so basically what we do is we automatically generate for every federated type a node type and this is automatically generated on the gateway and so we extend that federated type and based on the key fields so you can have a composite key for the federated type uh we encode a global id for it uh and the logic for encoding so the key fields are like all the key fields in the federated type right we take those fields we concatenate them and we encode them into a global id and then that global id can be used to look up all of the federated types automatically so this all works out of the box so if any developer defines a federated type it will automatically become a node in our uh service does that make sense and uh yeah yeah i think it does uh any issues with that though uh no so far we haven't run into any issues and it actually really helps with client-side caching because so let's say if you have a query uh that you did and then you know you only want to refresh that one particular view in your client and you only want to look up a part of it and often times it's just one particular type then the node query is really useful for that so it's been working
really well okay thank you let's get to the next one so tomas is asking do any of your services use golang if so is there any package supporting federation in a similar way to your implementation in kotlin so unfortunately we don't have anything in golang most of our federated services are in java and kotlin and then javascript so i'm sure though i think since we use apollo's federation spec and one of the reasons we use apollo's federation spec is uh since it's very uh broadly used in the community i'm sure there is something that actually uh is available for golang by now i would have to probably look it up but i'm sure there's a federation package for go okay um i'm not sure about that i think we had some issues with that one but yeah it doesn't matter so let's go to the next one uh do your services register their schema at runtime or at compile build time yeah this is a really good question so actually we think of the services that are running uh they're independent of what the gateway is doing so the services are built and deployed independently and then there's plugins uh and pipelines so plugins allow you to push the schema to the registry uh and you can do that before you deploy your service or you can do it as part of the deployment pipeline but it's completely independent and uh what we're trying to do is we try to deploy the service first because most of the changes are backwards compatible so you're adding new fields so you can deploy them to your individual uh graphql service and the gateway still doesn't know about them but once they're deployed successfully then you can push the schema up and then the gateway knows about them the thing is it doesn't really matter because any new field you introduce it's not going to be used right away it's going to be used after a certain amount of time so uh there's some gymnastics that needs to happen there but i
think what we're doing is working well for us so far all right and another question from martial and uh so any plans to open source your gateway or schema registry maybe collaborate on pipedrive graphql schema registry which is open source as i understand yeah i mean this is always on top of our mind uh i think especially the gateway uh stuff is fairly open sourceable uh considering it's a rewrite of uh apollo's typescript implementation and we've built some new things on it but um i think we've considered doing that we just haven't had the time or the resources to make it happen the schema registry however is built in a way it's fairly tied to our uh data system inside so we have to do a lot of refactoring work to make it generic so that you can tie it to any data system uh so that will be a lot more effort to uh open source but uh the gateway is certainly at the top of our mind and maybe we will do it in the next year or so all right looking forward to that um thomas is asking are you by any chance using grpc as a communication channel between dgs and gateway uh right now we are not doing that in production uh but we're doing some experiments uh with it uh but not in production and maybe if there's more follow-up detail we can discuss offline like what the reason would be okay i guess we can give thomas your email address or something um so yeah another one from artem so i noticed subscriptions in your slide do you reference main entities in subscriptions too or only event data yeah so we do and essentially the subscription field itself is tied to one uh dgs usually and uh the event is processed by that dgs and then you can actually hydrate after that it's just a query uh to hydrate whatever type that subscription returns so yeah we have full spec compliance on subscription the hydration of federated types is
in progress but it will be spec compliant pretty soon okay works for me uh is the ui for end users meaning us in the audience using the graphql federation as well yeah the ui is actually powered by a monolith graphql api but as i said it's just a monolith graphql service and like i said you know we're thinking about moving graphql federation to our internal tools so in the future we can expect this monolith to be one of the dgs's inside the enterprise edge that i talked about okay yep and another three questions from artem he's very curious do you limit query complexity uh we have static query complexity implemented but we haven't seen the need to turn it on yet uh but i think as we launch it for the netflix api we might have to because that's a more public api the studio api is authenticated so you need to work in one of our studios or you know have a reason to access that api there's a fair number of people who have access to it but we haven't seen any uh reason for uh turning on query complexity yet but it is built into the gateway okay uh did you have any incidents when gateway or schema registry was down do you use it in internal network private requests too yeah certainly uh so i like to think about the gateway as the highly available component of the system and the schema registry as the workflow component so certainly the registry doesn't need to be as highly available so what we actually do is we have the registry uh create an artifact uh that is stored in s3 essentially every time a schema change happens and then the gateway only needs that artifact to run by itself so we try to decouple the gateway from the registry by having this extra layer of resilience in between was that the question or did i misunderstand it uh on the gateway side since we talked to so many dgs's but they were you
know related to dependency management but nothing major from the architecture itself okay yeah i think we can you know connect you and artem as well but yeah we have two more questions is it yeah uh and this is a funny one so are you aware of any easter egg in netflix ui uh let's see there is a lot of stuff that happens in the netflix ui i mean if you're in poland you can uh access the new netflix games uh tab on the ui and check it out but do you have to go to poland for that so vpn to poland you say yep cool okay and uh last one from artem as well you're best friends by now so do you register api consumers for tracing schema usages and deprecation how do you analyze its usage yeah sure so what we do right now for that is for each query document that the gateway gets and this is only on the studio side we need something more scalable because the traffic is significantly higher on the netflix api side but on the studio side what we do is every query document uh we log into elasticsearch and then count uh what fields are used over there and then basically uh join them with deprecated fields and see if there's any deprecated field usage stats uh and we actually show them on the registry ui and then that way uh when the developers are going through the deprecation workflow they can deprecate the field and then they can notify the clients uh saying you know we should remove the deprecated fields and you know when the usage goes to zero they can then uh make a backwards incompatible change to remove it yeah all right i think that's it for today finally uh so thank you very much uh that was fun and yeah have a well good night for us uh but good day for you and yeah that's it thank you very much thank you thank you all for uh it was wonderful okay uh so i guess that's uh pretty much it or well mainly for the online users
because that's the end of our plan uh we have a feedback form that we'll send out to you uh so if you want to spend a few minutes you know helping us out uh improving the next event yeah uh um it's uh you know it's gonna help us out so please do that and um yeah that's it uh i think we can turn off the online stream now uh so yeah but for everyone in the office we're not closing uh i think beer is there or on the way it's here okay so more beer is here and yeah the party continues
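In the Q&A above, the speaker describes taking the key fields of a federated type, concatenating them, and encoding them into a global id that the node query can later look up. A minimal sketch of that idea follows. The JSON plus URL-safe base64 encoding and the function names `encode_global_id` and `decode_global_id` are assumptions chosen for illustration; the talk does not state the actual encoding the gateway uses.

```python
import base64
import json
from typing import Dict, Tuple


def encode_global_id(type_name: str, key_fields: Dict[str, str]) -> str:
    """Combine the type name and its key fields into one opaque id (hypothetical format)."""
    # Sort by field name so the same composite key always yields the same id.
    payload = json.dumps([type_name, sorted(key_fields.items())], separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_global_id(global_id: str) -> Tuple[str, Dict[str, str]]:
    """Recover the type name and key fields so the owning service can be queried."""
    raw = base64.urlsafe_b64decode(global_id.encode("ascii"))
    type_name, items = json.loads(raw)
    return type_name, dict(items)


# Example with a composite key; "Movie", "movieId", and "region" are made-up names.
gid = encode_global_id("Movie", {"movieId": "42", "region": "US"})
decoded = decode_global_id(gid)
```

Sorting the key fields before encoding keeps the id stable regardless of the order in which a client or service supplies them, which matters if the id is used as a client-side cache key.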
Info
Channel: Pipedrive Talks
Views: 365
Rating: 5 out of 5
Id: UZWe5Usun7I
Length: 142min 20sec (8540 seconds)
Published: Wed Sep 15 2021