Decoupled Drupal in real life: Lessons from JSON:API at scale / DrupalCon Global 2020

Captions
...and it's really easy to turn on. Okay, so it's 12:15. I'm going to get started here in a minute, but I'm having people say what experience they have with decoupled Drupal. It looks like there are enough people here with very little experience with decoupled Drupal that I'll pitch the talk a little more generally.

So I'm going to go ahead and get started. My name is Aaron Rasmussen and I work at Acquia. I was a developer for a long time before I worked at Acquia, but I'm not a developer now. I work with our teams to help support enterprise-size sites as they deliver their Drupal and decoupled Drupal experiences out to site visitors on the internet. So this is going to be pitched a little bit DevOps-y, and it's that way on purpose, because I think that is a part of the story that we don't tell very often.

So this is decoupled Drupal in real life. It's really neat and really fun to do as a demo: you turn on JSON:API and build a front end and it works really great. But what does this start to look like as we deliver it? What does it look like as we deliver at scale, and what kinds of things have we learned that other people can learn from?

The other thing: I've noticed in other sessions that some people have had difficulty with slides advancing, so just keep refreshing; it really helps to see the next slide. So, switch... there we go. Oh, I'm back.

So this is my intro slide. I remember when I spoke at DrupalCon Seattle, we talked about caching. Decoupled Drupal was on the radar, and there were a lot of people who had really great projects, but we weren't yet seeing a lot of projects out there in the wild. I really feel like we started from the beginning stages of decoupled Drupal and we're now sort of hitting the teenager stage, where we're seeing more of these projects delivered and,
you know, visitors are accessing them, and we're getting more experience with delivering these sites at scale. So I have kind of the input there: in March 2019 JSON:API was in Drupal core, and pretty soon after that Varnish support was added for JSON:API, and that's one of the key things I'm talking about.

Part of what I want to talk about with decoupled Drupal isn't just the fact that you have a Drupal 8 site and a front end, and you're using your front end to deliver content to the web, or using something like Gatsby or Tome to create a static site and then deliver that to the web. Part of what I'm seeing our customers, and people who use Drupal on the internet, do is this move toward using Drupal 8 as a content aggregator: using migrations and other tools and APIs to pull in content from other sources, whether that's social media, product information systems, or e-commerce systems. Sometimes people have customer data platform information they're using in Drupal. And then they're also consuming that content in a variety of ways. You might be consuming it in a decoupled front end or a static HTML front-end site, or you're sending that piece of content from Drupal out to your marketing channels, or you're sending pieces from Drupal about your product, information you want to share, out through social media channels.

So Drupal is playing this role of a hub, almost, or a content aggregator between all these different systems and then out to the consumers, whether they choose to go through a regular website, an application, or some other software that consumes that information. It's really interesting, and I love the way people are using Drupal in these new, more dramatic ways, because it's using the API system and it's using those content
types and the way that you can be flexible about how you set up data, having it go out and having it come in.

So the first project we'll look at is more of a fully decoupled website. This is Union Bank. They did a beautiful website, and I'm really hoping that at some point the developers do a public talk about that project, because it was really great. Their site's really gorgeous. I'm going to talk a little bit about just the back end: when a site visitor visits the site, what process that request can go through before the final result is delivered to that same visitor. I think that helps you understand where the caching layers may be, and we've found that when delivering at enterprise scale, caching is really the key to success.

So with Union Bank, why did they choose to go decoupled? They were able to build in some security and performance features that help the visitors to their site. They're very security conscious, because they're a bank. They get separation of concerns: their JavaScript front end touches one part, and the Drupal back end has a different permission system and a different access level, so it's important to them to have those things separated out. They also chose to use a web application firewall and CDN on the front end.

So they've got site visitors that come in. The web request comes in; maybe it's just the home page, there's nothing special, it's been requested a billion times, it's saved in the CDN, and that response goes out right away. That's the first caching layer. Maybe it's a detail page, or some customization is required, so the request goes to, in this case, a Vue JavaScript application that's the front end, and it builds the request. If that needs additional information from Drupal, it sends a back-end request to Drupal; that may be cached in a caching layer
or might be built in Drupal and then sent to the front end, where it's assembled into a page and goes out to the site visitor. It's a little bit complicated, but it's that series of steps.

When you see it more simply: a visitor has a request, it goes to the front-end system, and if the front-end system needs something from Drupal, it grabs that request, Drupal builds it and sends it back to the front end. The front end usually packages the visual layers, accessibility things, and any additional tags it needs to add if it's sending not just to a website but to a screen; some people use these, for example in banks, as the display screens. So you have that packaging information you can put into the JavaScript front end, and then it sends the response back to the site visitor.

Throwing in those caching layers leads to the front-end caching layer, where you're doing that security step of having the web application firewall. It speeds the delivery of those pre-packaged things and gives that instant site experience that site visitors really like, and you get to keep all your gorgeous pictures and your formatting. It's pretty great.

This is just another cut on the same information. Site visitors' requests go to the WAF/CDN layer, and maybe some of those requests are handled there. The rest go into the decoupled application, which will handle a few more of them. Not all of those requests go to Drupal. For the ones that do, if you're using JSON:API and you've enabled caching, you can cache the responses in Varnish, and that helps lighten the load on Drupal. In addition, there are database caches, fast caches like Memcache or Redis, and PHP also has a small caching system of its own. It's important to know these things a little bit, to understand where you can introduce optimization, but you can also have
errors that crop up, or parts of the screen that don't render correctly, so it's nice to know where to look in case you have a glitch, and nice to know which cache you have to clear.

Okay, so in this diagram, just to answer the chat: the Node layer is doing server-side rendering, and that's why you can put a cache in front of it. And yeah, like Simon says, you can put the back end on a different subdomain, and that helps, so you have the one request going to the other subdomain. The ones I've seen are usually also on two different hardware layers, so you can have one level of provisioning for the server-side rendering and another for the Drupal stack, because they like to be on different software stacks.

If you haven't checked your caching layers with curl, it's really handy. It just tells you what's happening in the browser, but it's super nice to take a look at the header information. You can see X-Drupal-Dynamic-Cache: HIT, which is a Drupal cache hit. It'll show you, if you're using Varnish or Cloudflare, what the cache age is, which can help you figure out why you're still seeing something stale or not. Another thing to watch out for is that there's usually an X-Cache header, and it usually only tells you about the caching layer nearest to where you are, which might not be the caching layer where, like I was talking about before, an asset is getting stuck. So I just had to introduce this, because it's really handy. The curl man page is really great. You can also look at all of that in the inspector in Chrome or Firefox or any of the browsers, but sometimes it's nice to have it as a static readout and be able to refer to it later.

So with this site, the main lessons, the key takeaways, were things like: when you build the
beautiful front-end site experience, it's also good to take some time and build a beautiful web editor experience. That was really important on this project, because the stakeholders at the bank who were invested in this project really liked seeing the gorgeous front-end experience, but then they took a look at the same content in Drupal. That's what this is showing you: the front-end page, which is really nice, and then the back-end page, which is very Drupal-y. And I've got to tell you, they weren't that excited about it.

So if you're looking at a fully decoupled project, don't do a fully decoupled project just for the sake of doing decoupled, because they take a lot more time. But if you are working on a decoupled project, do something we sometimes call partially decoupled, or (I'll give Chris [unclear] a little credit here, I know he's at DrupalCon, for this way of saying it) mimic the front-end experience inside Drupal. It does take extra time and extra resources to do it, but the people who work on your project day to day, who have to live with it, will be a lot happier. We've found that some of these fully decoupled applications really go progressively decoupled just to improve that building experience.

The other issue: this was an early-stage project and it uses Paragraphs pretty extensively. An issue that came up a lot, especially when trying to render a really good-quality draft, is that there are some Paragraphs issues with nested fields. Paragraphs is kind of like a Russian doll: you've got the big field, then an image, and then maybe the image has details about it, and maybe that link gets updated or the original base image gets updated. When you do that, there's a revision entity attached to that, and it doesn't
always bubble up to that main entity. It does pretty well for nodes, but custom content and additional content types have a harder time. So just be aware of that if you have a project that's using JSON:API or the REST API and Paragraphs.

Also look out for observability in the JavaScript application. If we work with Drupal a lot, we're really used to Drupal logging, and Drupal does a great job of logging. You don't miss Drupal logging until all of a sudden you're working with a front-end application, you haven't built the same level of sophisticated logging, and something goes wrong. So either build really good logging or look into observability apps; you can use New Relic or Honeycomb. Those can really help when things go wrong. That's another part where it's harder to talk clients into spending money on logging, but when the site is in production it is one thing you really appreciate. Oh hey, look, [unclear] has a logger too. So there are definitely good loggers out there; take advantage of them.

Also, if you're doing server-side rendering with JavaScript, build in a local cache if you can; that helps a lot in terms of speeding the delivery of your site to the end user. So that's the bank site; those are the key things. That was a lot.

The next project I'm going to talk about, I don't have any screenshots of; it's still in the proof-of-concept phase. What they're doing is aggregating data from Drupal, social media, and also a store, and they're using Gatsby on the front end, but this is going to be pretty similar to what we were talking about before. What they need to do with this proof-of-concept site is deliver music artist information to all the fans who want to see it, and they all want really great experiences and they all want to see it right away. So the architecture they're considering, or they're
actually working with right now, uses Gatsby to pull in information from Drupal and from their e-commerce site. They're using Drupal, similar to what I was talking about before, as an aggregator, and then the Gatsby build creates a static site that they can load into a CDN, and they use the CDN network to push that out closest to their fans. The fans get that unified experience of having access to all the information the artist puts out. But maybe the artist isn't updating Drupal directly. Maybe the artist is updating social media; maybe they're updating things on YouTube, which is social but a little different; maybe they're sending information along with the liner notes to the record label. All of that gets pulled in, and it just looks like a great front-end user experience. Got some Gatsby [unclear], that's awesome.

In this example I don't show that Gatsby is self-hosted, but I want to use this as an example to show you a little bit of that aggregation I was talking about earlier. So they pull data into Drupal from all those different sources: marketing content, email, YouTube, Spotify, ticketing. There are a few more, but they made the slide cluttered. They're doing that through a series of migrations and through using JSON:API or the REST API directly to Drupal, using Drupal to categorize, unify, and sort the information, and then using Gatsby to do the presentation layer. Gatsby is also talking to their e-commerce platform. You do builds from Gatsby against Drupal to create the static HTML, which they can then load into the CDN.

Okay, so Gatsby itself uses GraphQL, and you can have it use JSON:API. GraphQL and JSON:API both work a little bit differently. Let me see... oh, gotcha. Okay, so I thought I'd lay this out because not everybody looks at the caching layer. GraphQL is great, and it is an amazing structured query language. It will send a
request, usually, to Drupal; Drupal builds the response and sends it directly to the front end. What we've found, practically speaking, is that if you're using GraphQL it really helps to add in caching at the database caching layer; we use Memcache at Acquia, but Redis is similar. JSON:API gives you the ability to cache in a front-end caching system like Varnish or a CDN. What that allows is that the front end can go back to that fast cache, in this case a Varnish cache, and that gets delivered quickly to the front end. If there's a unique JSON:API request, it goes to Drupal, which builds a response and sends it back to the front end. At Acquia we use the Purge module, so we set long cache lifetimes and use the Purge module to purge individual assets and JSON:API strings, so that when some content is updated, that asset is updated right away in Varnish and the front-end caching systems and is accessible to the front-end layer.

Cool. And JSON:API is in core, which is great, and Varnish caching is just a matter of enabling caches. You don't have to do anything separate or different to enable Varnish caching for JSON:API: you just have JSON:API enabled, then you go to the performance settings and set your cache lifetime.

We charted this out to show you a little of what that can do for you. We're just looking at it from the Drupal stack perspective. It's a little confusing because we use a bunch of software on our Drupal stack, but the total number of requests is the top-left graph, in purple, and you can see there are a lot of requests that come in for a build. If you're using JSON:API, you can send a lot of those to Varnish and have fewer requests actually going to Apache and Drupal, and that reduces the load on your database layer, so MySQL, and then also pulling from Memcache. So
practically speaking, it means you can do a lot more with less of a hardware spend. So with JSON:API, that part is kind of static: it builds the response and it can be saved in Varnish, but if you make a change to a node or other content on your Drupal site, you can use the Purge module and the right plugin to clear that out as soon as you need to. It depends a little bit on how you build that front-end application.

Okay, that's a good point, I should go through the numbers. In terms of total requests, these graphs are talking about 30,000 requests; that's the peak, the top line, over 15 minutes. In this example about 20,000 at peak, more like 10,000 on average, were sent to Varnish, which means you have occasional peaks of up to 20,000 requests to Drupal, but your average is a lot lower. And yeah, GraphQL can include caching, and that also just means that the number of requests to SQL is a little lower.

I talked about enabling this before, but just to review: JSON:API is included in core and you can just enable it; it really is that simple. Setting a cache lifetime, if you're already using a front-end cache like Varnish, is pretty easy. It's the Purge module and the plugin that works for your host or caching system that clear the front-end cache when content is updated. On the projects I've worked on, it's really helped to remove access to unneeded fields and APIs; the JSON:API Extras module has been super helpful for that.

And then, that project I'm talking about... yeah, you're usually using Varnish with anonymous traffic, or if you're doing something like securing all of the traffic. In this project, all of the traffic can come in and be handled by Varnish because Drupal is only delivering content to Gatsby. The access control is
actually handled through allow and deny lists. There's another way you can do it, which is to push the authentication to the edge, and that's what this next project I'm going to talk about does.

It's an event platform, like the event platform we're using right now, except it's built in React and has a Drupal back end. It's a completely closed system; all of the traffic is authenticated. It was built to be a trusted, secure place to host events for a particular company and to save progress through videos, certification programs, and learning programs. Drupal is also used to manage media, digital assets, and assets shareable with site visitors during events. So on this one it's really important that the Node front end is churning away handling live events while Drupal is being used to update and add information.

This one is a lot more complicated in terms of the diagram, and it's going to be too small and blurry, and I'm so sorry, so I'll just walk you through it. When the site visitor comes in, there's an initial request to the CDN, and assets that don't require a login are delivered right away. Then they get to the front-end application, and they need to log in to access any information. That's handled through Gigya, so it's in the browser, and it logs them into not just the platform site but the whole network of sites that company uses, which is really handy because it gives them a seamless experience across the entire network of sites. Once they're logged in, they usually go to a main page, like a landing page or a [unclear] page, very similar to this Hopin platform we're in right now. There's a chat that runs through and usually also grabs a headshot or icon that they loaded. They're all talking to each other, and that's delivered through the embedded video platform, so that's actually a separate system
that's also interacting with this front-end site. For everything that has to do with the individual consultant's progress, or something that's specific to them, there's a Drupal request that goes back to the Drupal layer, and some of the pages, menus, and assets also go back to the Drupal layer. I call it the Drupal layer, but it also includes a Varnish cache, which is really helpful for delivering that content quickly. I'm going to get to that Lift question later.

On this particular platform, because it's a live event platform, it's pretty quiet until you suddenly have a whole lot of people logging in at once. So it's been really helpful, in addition to the Varnish cache, to also have a front-end cache and an external CDN to deliver some of the smaller assets right out at the edge layer. It's also the project where we learned that it's very important to be careful about what you're using to vary in that CDN. At an early stage of this project they accidentally set it to vary by user and then had a ten thousand, well, seven thousand user event actually in that case, and our caching layers exploded on the front end. Varying by user was not a good idea. So double-check what you're using to vary; it's really important to get that right, especially before you deliver a bunch of traffic to it.

Also, if you're handling Varnish through a balancer layer like we do at Acquia, and you're caching all the assets and caching JSON:API on top of that, that can be a lot to cache, so make sure you beef up the balancer layer. It's really amazing what those caching layers can do. They can really speed the delivery of assets, but sometimes they can slow everything down. Also, JSON:API Extras is great, and Purge works really well.

Because I promised failures, right, this is looking at an actual real-life
failure that happened with this event site and what we did. This is one where we did not have the balancer layer beefy enough, so our Varnish cache was not on big enough hardware. What we see here is that the front-end stack, the Node stack, has this really smooth ramp-up in requests per second, and then all of a sudden the CPU just starts spiking. That's when the front-end stack was trying to get to the back-end stack, and the back-end stack's caching layer was just busy. It had too many requests coming in at once, and even the I/O, the requests coming in and the requests going back, was too fast and too much for it to handle. So the CPU went to the sky. In this case it didn't break the front-end site at all, but it did cause new users to not be able to log in as quickly as they wanted to. You can see what's happening with CPU and memory, and when that tails off, it's actually getting close to the end of the event, the number of people logging in tails off, and you can see the load going up and down at the same time. So this is not how you do it.

So what would you do? I was thinking about the best advice to give people who are trying to prepare for this kind of traffic, where it all comes in at once. We know that everybody tries to log into an event at the last possible minute. What can you do to prepare for something like this? Number one, if you're using Drupal, make sure you're optimizing your Drupal caching strategy. By that I mean check through your views, check through the way the site's built, and make sure the basics are covered and you're using good caching. Also take a look at your assets and make sure they're a size that can be delivered quickly and easily to the front-end consumer: images, PDFs, all those things. Audit your
entities and fields; where possible, disable access or turn things off. If you know a traffic event is coming and you have some seldom-used APIs, go ahead and shut them down. Features you don't need, shut them down, because the last thing you need in a high-traffic situation is somebody triggering something resource-intensive on the site. Use a CDN. Upsize your balancer. Test, test, load test. Cache warming. And I don't have it on here, and I should: JSON:API Extras is really great. Also, on drupal.org, in the JSON:API documentation, there's a security considerations document that's really great. If you run through everything in the security considerations document, you're in a really good spot for handling traffic.

The other thing I thought was a good solution: this is the same setup, the event site with the Node stack and the Drupal stack. This is actually more users logging in than the last one, and they were able to keep the CPU and everything under control. We did that with a combination of really going through and dialing in the caching strategy and upsizing the hardware we had, and they also did a really good job of incentivizing people to join the event sooner. What that did was lower the peak, the maximum number of requests, so you could actually handle a larger volume of site visitors by spreading out the load on the platform. Anyway, I thought that was a really great idea. So think about that too as you're going through this situation: sometimes the answer is technological, but sometimes the answer is social, and if you can incentivize people toward a behavior that helps your website, that can be good too.

That's the end of the prepared part of my talk. It looks like we have just a few more minutes, and there are a lot of great questions, and I
have really been appreciating how you guys have answered questions for each other in here; that's all amazing.

So to get to Steve's question: the way you would move authentication to the browser is by using a single sign-on solution. Gigya has one, and there are others that will allow you to manage the sign-ons through JavaScript in the browser layer. If you have authenticated users, you're really in that situation where the front-end caching layers won't work for you, and you have to look at back-end caching: what are ways you can improve that database performance? Memcache is one of them, Redis is one of them. Being able to spin up new pods might be... probably not, actually, now that I think about it. Basically, you're taking that same kind of performance problem and shifting it to a slightly different place.

With content localization and translation, you often have a vary by country. If you can vary by country, that opens up almost a different cache bucket, and you pull information out of that cache bucket. I think that's the simplest workable approach I've run into. Other solutions haven't worked as well, because if you aren't properly setting that cache ID or that vary-by in a country-specific way, what I've seen in real life is translated content accidentally being shown to users instead of the English content, or the other way around. So you can vary by country in the CDN; that's one way to handle it. Another way is to double-check that the cache ID Drupal is using is actually different enough, and that it's bubbling up in Drupal and then bubbling up into your external caching layers. It's a little esoteric, but hopefully that makes sense. Cool. Yeah, Alejandra Cordova just gave the same answer in the chat, so yeah. [Laughter]

So you can have external
applications consuming JSON:API endpoints. So you could version the API, I guess. You could use the cache, right, and then you're just using Purge to clear the cache, and that updates a specific object, if it's just specific object information. If you want to add a new feature or data model, I don't know that as well, but I'm hoping somebody else in here knows. There might be an answer with JSON:API Extras, being able to change the path to the specific thing.

I think that's just about everything; I think I'm running out of time. It's been really great talking to everybody. Oh, so Acquia Lift works in JavaScript, in the front end. The really important thing when you're working with a decoupled architecture and Acquia Lift is to make sure you're able to include the Lift code. Then, when you create a slot (a position for customization is called a slot in Acquia Lift, which is now being called Acquia Personalization), make sure that the slots you're going to be using are included, and that with the other tool you're setting position information that'll match the slot. Those are the key parts to have it all working together, and we're happy to help you figure that out in detail.

All right, thank you so much for coming. We really appreciate all of you.
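
The header check described in the talk (running curl against the site and reading the cache headers) can be sketched in code. This is a minimal sketch, not anything shown in the session: it assumes the response headers have already been collected into a dictionary, and the function name, the layer labels, and the sample values are all hypothetical. The header names themselves (`X-Drupal-Dynamic-Cache`, `X-Drupal-Cache`, `X-Cache`, `Age`) are the commonly seen ones, but which of them appear depends on how the site and its proxies are configured.

```python
from typing import Dict, List

# Headers worth checking, mapped to a human-readable layer name.
# The labels are illustrative; real stacks may expose different headers.
LAYER_HEADERS = {
    "x-drupal-dynamic-cache": "Drupal dynamic page cache",
    "x-drupal-cache": "Drupal page cache",
    "x-cache": "nearest proxy/CDN cache",
}


def summarize_cache_headers(headers: Dict[str, str]) -> List[str]:
    """Return one readable line per caching layer found in the headers."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    report = []
    for name, layer in LAYER_HEADERS.items():
        if name in lowered:
            # Report the raw value (HIT, MISS, UNCACHEABLE, ...) unchanged.
            report.append(f"{layer}: {lowered[name].upper()}")
    if "age" in lowered:
        # Age is the number of seconds the response has sat in a cache.
        report.append(f"age: {int(lowered['age'])}s")
    return report


# Invented sample values, purely for illustration.
SAMPLE = {
    "X-Drupal-Dynamic-Cache": "HIT",
    "X-Cache": "MISS",
    "Age": "0",
}

for line in summarize_cache_headers(SAMPLE):
    print(line)
```

As the talk notes, an `X-Cache` value usually describes only the nearest caching layer, so a miss there does not rule out a stale copy further back.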
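
The "vary by user" failure from the event platform can be illustrated with a toy count of cache entries. This is a sketch under stated assumptions, not a model of any real CDN: the cache is assumed to store one entry per combination of URL and varied-on value, the 7,000 users come from the talk, and the 20 URLs, the three countries, and the function name are hypothetical.

```python
from itertools import product


def cache_entries(urls, users, vary_by):
    """Count distinct cache entries if every user requests every URL.

    vary_by maps a user record to the value the cache varies on.
    """
    keys = {(url, vary_by(user)) for url, user in product(urls, users)}
    return len(keys)


# Hypothetical request paths and user records, for illustration only.
urls = [f"/jsonapi/node/session/{i}" for i in range(20)]
users = [{"id": i, "country": ["US", "DE", "JP"][i % 3]} for i in range(7000)]

per_user = cache_entries(urls, users, lambda u: u["id"])
per_country = cache_entries(urls, users, lambda u: u["country"])
print(per_user, per_country)
```

Varying by user makes every entry unique to one visitor, so the cache grows with the audience and almost never serves a hit; varying by country keeps the number of entries bounded by the number of countries.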
Info
Channel: Drupal Association
Views: 510
Rating: 5 out of 5
Keywords: DrupalCon Global 2020
Id: IZqZ9I-BznY
Length: 44min 26sec (2666 seconds)
Published: Fri Aug 28 2020