Monolith Decomposition Patterns • Sam Newman • GOTO 2019

This is a talk from GOTO Berlin 2019 by Sam Newman, expert in helping people ship software fast and author of “Building Microservices”. Check out the full talk abstract below:

Big Bang rebuilds of systems are so 20th century. With our users expecting new functionality to be shipped more frequently than ever before, we no longer have the luxury of a complete system rebuild. In fact, a big bang migration of a monolithic architecture into a microservice architecture can be especially problematic, as we’ll explore in this talk.

We want to ship features, but we also want to improve our architecture, and for many of us this means breaking down existing systems into microservices. But how do you do this while still regularly releasing new features?

In this talk, I’ll share with you some key principles and a number of patterns which you can use to incrementally decompose an existing system into microservices. I’ll also cover off patterns that can work to migrate functionality out of systems you can’t change, which are useful when working with very old systems or vendor products. We'll look at the use of strangler patterns, change data capture, database decomposition and more.

👍︎︎ 1 👤︎︎ u/mto96 📅︎︎ Feb 26 2020 🗫︎ replies

Mate hurry up and publish building microservices second edition please

👍︎︎ 1 👤︎︎ u/doniseferi 📅︎︎ Feb 26 2020 🗫︎ replies

Thanks for sharing! Learn more from our conference chair, Sam Newman, at the inaugural O'Reilly Infrastructure & Ops Conference, happening in Santa Clara, California this June. In his 2-day training course, Moving to microservices and beyond, Sam details framings for microservice architectures that explore the various forces that can drive the design and evolution of microservices, then leads you through a series of interactive architectural kata exercises to put your newfound knowledge to the test. You can also check out all other related training courses and sessions in our Moving to microservices learning path. We hope to see you there!

👍︎︎ 1 👤︎︎ u/oreillyinfraops 📅︎︎ Mar 17 2020 🗫︎ replies
Captions
[Music] [Applause] Everybody, thank you so much for coming along. I know it's the second day, towards the end of the second day; your energy levels are starting to dip a little bit, you've got too much new information in your head, you're feeling groggy. We're going to cap it off later with some alcohol, and Sam Aaron's going to be playing some loud music to keep you awake, so please do hang around for the party later.

I'm here to talk to you about monoliths: how we go from a monolith to microservices, and whether we should even do it. I'm going to share with you about three or four different concrete patterns that you might want to use as part of your decomposition of a system, when thinking about how you might move towards a microservice architecture. Fundamentally, that is how I think you should use microservices. I'm not really for adopting a microservice architecture up front; I think there are some challenges there. So this is mostly going to be about how we move towards a microservice architecture and, fundamentally, when something gets too big, what you do about it. I do think it's the natural state of software to get too big. I think people feel bad: "It got too big, I can't deal with it." But if the software was the right size, you wouldn't split it apart; it's got to get bigger for you to want to split it. It's just life. Deal with it, move on. What I'm hoping to do is share with you some concrete patterns that you can bring into action when these things happen.

I previously wrote a book about microservices, and I now run my own company, where I do training and advisory work; you can find out information about what I do on the Internet. I've finished my work and handed over the proofs of a new book called Monolith to Microservices, which should be in print next month. It's ironic that I spent a lot of my life trying to help people ship things quickly and frequently, with this idea of
iterative, incremental change, and then I spent about the last four years of my life creating books, which have very much a transactional, one-off, big bang release going for them. The path to production for books seems to go on forever, but it should be out soon, and you can get access to an early version of it on the Internet.

But we're here to talk about microservices, or more to the point, not really microservices, but how we get to something that looks a bit like this, from this. This is the monolith. Of course, some monoliths are picturesque, and unfortunately I think we've come to use the term monolith as a derogatory term. We use the word monolith now instead of the word legacy. I would imagine there's probably some Google Trends search I could do to show this happening, and I could cherry-pick the data to prove my point, but just go with me: I'm going to tell you it's a fact and we can move on.

We do talk about the monolith like it's a well-understood, shared concept, but I don't think that's actually true, because I think monoliths come in all types and sizes, and it's worth exploring that very briefly. The first thing is that when I talk about a monolith, I'm primarily talking about a unit of deployment. A monolithic system would be one where all the functionality has to be deployed en masse as part of a release process. There's some nuance around that, but we can start with the de facto simplest type of monolith you will ever see, which is so simple that you probably never actually see it, because things are always a bit more complicated than this. Basically, all of our code is packaged together and deployed as a single process. We might have multiple copies of that process for increased scale or robustness, and all of our data is in a database. I always like to tell people that this is a distributed
system; it's just a really simple distributed system. It gets slightly more complicated if you also have a web-based interface, because now you've got at least two networks involved, but most of the time you can sidestep the challenges associated with building distributed systems in this kind of topology.

We have a minor variation on this, of course, which is what some people refer to as the modular monolith. This is where we're embracing cutting-edge ideas from the late 60s about how to break software apart along modular boundaries. The idea here is that those modules can theoretically be worked on independently. I actually think a number of the teams I work with would be better off with a modular monolith than with a microservice architecture. The challenge is often in how you find and identify good modular boundaries that allow this work to be done independently. With this sort of approach, I could have one team working on module C, a different team working on module A, and so on, allowing a degree of parallel working. But fundamentally, the unit of deployment is still monolithic: I have to integrate that code together, I statically link that code together, and I deploy the thing as a single unit.

At least, that is the case for most runtimes. There are, of course, runtimes out there which really do allow for the hot deployment of modules in a really robust way. Some of you may have heard of Greenspun's law, which is the statement that every system evolves to contain a half-broken implementation of Lisp. The microservice variation of that is that every microservice architecture eventually evolves to the point where it contains a half-broken implementation of Erlang, because Erlang does actually allow you to do some really funky stuff around modular deployments. I'm hoping that, at the current pace of change in the Java world, in about 15 to 20 years we too might have something
that Erlang had 15 years ago. Cross fingers, we'll see what happens. In all seriousness, we may well start moving back away from microservices if more of us get access to runtimes which allow us to hot-swap modules. The reality is most of us don't, and therefore we're forced into different ways of working and different architectural topologies.

A traditional modular monolith, as well as being all bundled together in a single process, will typically have its data bundled up in a single database too, which does cause you issues if later on you want to split module E out into a separate microservice. So you can look at variations of this approach, which is something I've advocated with a few of the teams I've worked with. If you think that maybe you'll do microservices in the future, but you're not sure, one of the ways you can hedge your bets is to start with module boundaries where you think your microservice boundaries would be, but also isolate the data associated with those microservices, because decomposing the database is often the most problematic aspect. We'll come back to the idea of how we pull databases apart a bit later on. It turns out it's really difficult.

Then, of course, we've got the other types of monolith; two more to look at. We've got the third-party monolith. This might be your customizable off-the-shelf software, a finance-based CRM; it could be cloud software, it could be SaaS-based software. This is software which is completely out of your control, and yet we're often in situations where we want to migrate functionality away from these sorts of things. You have limited ability to change the system, and you have very little control over how it's implemented or built. If you're lucky you'll get a database, and that can open up interesting possibilities for doing things like change data capture. You might even have APIs, if you're very, very lucky. Again, I
have worked with clients that have had to migrate away from their own software but treat it like black-box software, because they'd lost the source code. Don't lose your source code. Please check your source code in. It is 2019, but I still have to say that every now and then.

The worst type of monolith we deal with, and by worst I mean the type of monolithic architecture which tends not to be very beneficial for what we want nowadays, is what's called the distributed monolith. The distributed monolith is one where the deployment topology of our software is functionality deployed as separate processes, and those processes communicate over networks. But because of how we've broken that system apart, or maybe other factors, we're in a situation where code often needs to be changed across module boundaries. Maybe business functionality is smeared across these boundaries. The classic three-tiered architecture would fall under this banner. Now you enter the world of either having to deploy everything together or running quite complex release coordination activities to roll out changes; we have to replace the whole system. These sorts of distributed monoliths have all the downsides of being a distributed system, but also many of the downsides of being a monolithic system. This is a bad place to be. If you're here, the secret is to not add any more services; maybe you should be merging things back together again.

Sometimes these distributed monoliths emerge partly because of how your deployment processes work. But fundamentally these systems have a high cost of change: the scope of deployments is much larger, there are many more things to go wrong, and you typically have much higher release coordination activity. Those of you who are practicing a release technique called the release train (choo choo, everybody aboard the release train) may sometimes sleepwalk into a
distributed monolith. The way release trains work is you set a cadence: you say that every four weeks, all the software we've created will go live at the same time. The whole point of a release train is that it's a remedial activity. It is a stepping stone if you're looking to improve your delivery practices. The release train is not an aspirational process for shipping software; it's like having training wheels on your bike. If you stay with the release train for too long, you codify the idea that we bundle all of our software together and release it periodically. That's not the point; we're trying to move through that towards proper continuous delivery. There are some processes out there, such as SAFe, which is anything but, that actually codify the release train as being the best way to release software, which is patently insane. If you want to know more on this topic, I suggest you read a book that's quite old now called Continuous Delivery, which explains why a release train may not be what you want to aim for. But it can actually lead to these really tightly integrated architectures.

So monoliths have properties, and those properties may or may not be a problem for you. Fundamentally, you have to accept that the monolith isn't necessarily the enemy. It is extremely rare that your goal is to kill the monolith. It sometimes happens, but most of the time you're in a situation where you're trying to achieve something as a business and your current architecture won't let you achieve that goal. So what you need to do is change the architecture enough to do what you need to do: maybe to handle more scale, or to allow more developers to work side by side effectively and efficiently. It is vanishingly rare that I work with teams who are trying to completely remove the old monolithic system. They just need to change it enough to solve their immediate problems, and then they get on with other things. The idea, I think too often, is that we're going to engage
in some kind of big bang rewrite of our systems. As Martin Fowler once said, if you do a big bang rewrite, the only thing you're guaranteed is a big bang. Which is great; I like explosions in action films. I don't necessarily like explosions in my IT projects.

The reality is that microservices bring an awful lot of pain and suffering, but it's very hard to assess how that's going to impact you until you've actually started using them. For this reason, we need to think about adopting microservices as not like flicking a switch; it's like turning a dial. As you turn that dial up and you have more services, you get more opportunity to take advantage of those services. You also have more opportunity to experience, in a really true, visceral way, how horrifying distributed systems can become. For that reason it's a good idea to turn that dial up gradually. Create one or two services, maybe just the one. Extract a piece of functionality from your monolith, integrate it with the monolith first, deploy it into a production environment, and learn from that before you move on. You will not appreciate the true horror, pain, and suffering of microservices until you're running them in production, and you have not completed a migration until it is in production.

So this is really important, not just from the point of view that we're looking to change our architecture incrementally so it doesn't take us years to change something, but also in terms of improving our feedback cycles. Start with something easy, extract it, deploy it, learn from that experience, and use that to refine the next thing you do. It therefore follows that any techniques we use in this area must allow us to make incremental changes to a system without requiring the whole thing to be re-architected. This also, of course, has the added benefit that any incremental migration techniques we use can allow us to interleave a bit of
architectural refactoring with shipping some features to our customers, which might actually be a good idea. All the techniques I'm going to take you through are designed to be used in an incremental fashion, to allow you to turn that dial gradually, to migrate your system and learn as you go. Then maybe, as you become more experienced, you become bolder and can turn that dial more quickly, although maybe you don't need to.

You've also got to remember that microservices are not the goal. You don't win by doing microservices. I win, because I sell copies of my book; that's great for me, but I'm not sure that's necessarily a life goal for yourselves. This is why it's also silly when people start comparing how many microservices they've got, like somehow that's important. Comparing yourself with another company is not helpful, because they're in a different situation with different drivers, different challenges, different skills, and different technology. So Monzo have got 1,400 microservices. It seems nuts, but they seem to be working quite well as a company. You might need four. If you've got four, and your developers are happy and your customers are happy, move on with your life and don't worry too much about it. But above all, please, please do buy my book.

So, incremental migration patterns. We're going to look at basically two patterns for how we might pull apart our application code, and a couple around how we deal with the data; we'll talk a little bit about data and data management as well. Let's start with one of the most well known in this area, a pattern called the strangler fig pattern. It is named after a type of plant that you can find in Australia. What we're seeing here is a tree in the rainforest in Queensland, and wrapped around it is a type of fig, which is a vine. It descends and wraps itself around the plant. The idea with the strangler fig is
that by itself it couldn't possibly establish itself in a jungle setting. It couldn't be tall enough to get its roots into the soil and also get its leaves up into the canopy, where it can do photosynthesis and everything else. So it wraps itself around an existing structure. The strangler fig pattern in our application sense is the same idea: we wrap new functionality around old functionality. This has also proved out another thing I've been saying for many years, after having lived in Australia for quite a while: the people in Australia are lovely and the food is nice, but all the things on the land want to kill you, all the things in the sea want to kill you, the weather wants to kill you, and some plants apparently want to kill other plants. So please do go, but just look up and watch for drop bears.

The strangler fig pattern is very useful, it's surprisingly simple, and it can be used in many different contexts. The way it works is we wrap something new around the old system. What we're really doing is intercepting calls to the old piece of functionality and diverting them to where the new functionality exists. This has the nice added property that the existing monolithic system effectively remains unchanged: while functionality hasn't been migrated, calls go to the old system as before.

There are some types of architecture that make it extremely easy to implement. A classic example would be a system driven via HTTP. In this situation we have a monolithic system which is receiving calls from some upstream source via the HTTP protocol. HTTP works really well here because it's very easy to transparently intercept and redirect calls without the client really needing to be aware of it. This could be us looking to intercept HTTP calls underneath the user interface, or this could be a headless application. The first thing we would do in this situation is put a proxy in place.
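
[Editor's sketch] The interception idea described above can be illustrated with a minimal routing decision in Python. This is only a sketch under stated assumptions, not anything shown in the talk: the `StranglerRouter` class, its method names, and the internal URLs are all hypothetical, and a real setup would more likely express the same rule in the configuration of an off-the-shelf HTTP proxy.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StranglerRouter:
    """Hypothetical helper: picks the backend for an intercepted HTTP call."""

    monolith_url: str
    # Path prefix -> base URL of the new service that now owns that prefix.
    migrated: Dict[str, str] = field(default_factory=dict)

    def divert(self, prefix: str, service_url: str) -> None:
        # Start sending calls under this prefix to the new service.
        self.migrated[prefix] = service_url

    def rollback(self, prefix: str) -> None:
        # Send calls under this prefix back to the monolith.
        self.migrated.pop(prefix, None)

    def route(self, path: str) -> str:
        # Anything not migrated passes straight through to the monolith.
        matches = [p for p in self.migrated if path.startswith(p)]
        if not matches:
            return self.monolith_url + path
        # If several prefixes match, the longest (most specific) one wins.
        return self.migrated[max(matches, key=len)] + path


# Assumed example hosts; with nothing migrated, the proxy only passes through.
router = StranglerRouter("http://monolith.internal")
router.divert("/invoices", "http://invoicing.internal")
print(router.route("/invoices/42"))
print(router.route("/orders/7"))
```

The client keeps calling the same paths throughout; only the routing table behind the proxy changes, which is what keeps both the cut-over and the rollback cheap.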
That proxy is going to sit between our monolithic system and the upstream services. I would advocate that you put this proxy in place, deploy it into production, and do nothing else. At this point it will just be passing calls through. The reason for doing this is that when we move to a microservice architecture we are adding network hops, and network hops add latency. It's a good idea to find out how bad your network is quite early on in this process. If you find that sticking a network proxy between your upstream and your downstream service adds 400 milliseconds of latency, your network sucks and you need to get a better one. I remember in 2005, with a banking client, we spent ages tracking down a performance bottleneck, only to find that all traffic between two servers in the same data center in London was being routed via Luxembourg. That puts a bit of a cramp on your day. If you do have a network which suffers from issues like that, it's good to find that out early.

This is also where even our incremental refactorings can be broken down into a series of smaller steps, each of which can be put into a production environment and assessed. At this point we're able to assess the impact of adding an additional network hop, and we've also got the proxy in place to start intercepting our calls.

Now you start working on your brand-new service. You deploy that service into a production environment and you test it in situ. We can do this because it is not released to our users. Too often, I think, we combine these two concepts in our heads: the deployment of software into a production environment and the release of that software to our users. Those are actually two separate activities. I can deploy something without releasing it. When I hide something behind a feature flag, it is deployed but not released. When I'm doing dark launching, it is deployed but not released. In this situation here, I'm deploying my
service into production. I can make sure the deployment process works and I can test it in situ, but it's a safe operation because at this point no calls have been diverted. Once I'm happy that my new service is working properly and I think it's ready to take the traffic, all I need to do is change the configuration of the proxy to divert calls that used to go to the monolith over to my new microservice. If I have a problem, it's a quick rollback: all I need to do is change my proxy configuration.

We will have to talk about state. If you're going to have a period of time where you might want to flick back to serving that functionality out of the monolithic system, and the service you've created needs to store or retrieve state from a database, that does mean that for a short period of time both would have to access the same database during that migration period. But nonetheless it's an incredibly useful idea. I've done variations of this with user interfaces and other protocols. I chatted to a real estate company in Zurich called Homegate, who used a strangler pattern, but they were intercepting FTP, not HTTP calls. I've seen it done with message brokers. I think there are a lot of different ways to make this pattern work.

One of the nice things about this is that at no point is the monolith aware that anything is going on. So these sorts of ideas can work very well when you're looking to migrate functionality away from existing black-box systems that you have no control over. This could also work well if you don't want to disturb the team looking after the monolith; you're doing a refactoring around the edge. So that's quite useful.

When we start looking inside our monolith, we realize there are only certain pieces of functionality it's likely to work for. This is showing the relationships between bits of business functionality inside our monolith; we've got a bit of a directed acyclic graph of
dependencies. This is the kind of thing that would come out of a domain modeling exercise, so these boxes might represent your bounded contexts, for example, and therefore your candidates for which things should be a service. If I'm looking at something like invoicing or order management, those things are likely to be higher up in my call stack, and it's a bit easier to intercept those calls coming in. But if I take a look at something like the ability to award points for making orders, or sending notifications to our customers, those are things which are deeper inside our system. There's no call coming in that I can map to that functionality. When I place an order or receive a payment, I might send notifications as a side effect of those operations. So if I wanted to extract notifications-related functionality or loyalty-related functionality out of that monolith, I'm going to have to get my hands dirty, go inside the monolith, and sort that out.

This is where another pattern comes in, and that's a pattern called branch by abstraction. This pattern is something you might know of if you've done any trunk-based development; it's very useful in that context. But it's also incredibly useful for microservice migrations, because it gives us a safe way to refactor critical code paths in a way that can also be verified. We'll come back to that verification a bit later on.

The way this works is we need to create an abstraction over the current functionality. You need to get all the functionality you're trying to move into a box, so it's a bit of refactoring. You start looking around, gather all the notifications functionality, and now I've got a notifications class. That's easy. I change the existing code to use that class, so this is just a good bit of creating a nice abstraction, and then I create an abstraction point which allows me to toggle between implementations. This
is what's known as an interface in an object-oriented system. At this point, this would be a safe refactoring to do. Most of you are probably using languages that give you a refactoring browser, and creating a new abstraction point over an existing bit of functionality is an easy refactoring. At this point nothing's got exciting; you can be chipping away at this over a period of a few days while you're doing other things. What this now gives us is a point at which we can swap in an alternative implementation of our notifications functionality, and we can do that inside the same running system.

The reason this pattern is called branch by abstraction is that it's being compared to the normal way we would handle this, which is potentially doing the refactoring in a different source code branch. The issue is that this denies us a whole load of interesting possibilities when it comes to how we roll this functionality out. Also, merging. So this is actually a much smarter technique for these kinds of situations.

So we've got our abstraction point; this is easy. Next, we can start working on our brand-new service implementation. This is going to be a nice new notification service that's going to receive requests. The functionality that used to be inside our monolith has been copied into that notification service, assuming we can do a straight copy and paste. The service-calling implementation is effectively just client code inside the monolith that calls out to our service, and it implements the same interface. It is checked in and deployed into production, but again, it is deployed but not released, because the implementation of the abstraction we're using is still the existing functionality. When we're ready, and we think our new functionality is working well, which we know because we've been able to test it in production, because we've
been deploying our notification service in production, we can switch it over, and that again is a simple use of a feature flag. Now the implementation of the interface we're using is our new service-calling implementation. If that works, everyone is happy, and lo, there was much rejoicing. Then you can obviously clean up the feature flag after a period of several weeks. Once you feel that you no longer want the option of switching back cleanly, you can remove the feature flag, and potentially even remove the abstraction point and the old implementation, thereby cleaning up the code base. The nice thing about having both implementations in there is that it opens up the possibility of doing things like dark launching and parallel runs, which are really interesting techniques.

Before I forget, if you want to do that little refactoring before we get to the service bit, there's a really good book by Michael Feathers called Working Effectively with Legacy Code, which I'm going to get Michael to rename to Working Effectively with Monolith Code so he can jump on the same bandwagon that I've been riding for the last ten years. This is a great book about how to refactor existing systems. He talks a lot about how we identify seams, encapsulate code safely inside those seams, and add tests to do that operation in a really nice, managed way. There are different versions of the code examples in his book for different languages as well.

If we can have both implementations of our notifications functionality inside the same monolith at the same time, it allows us to do something like a parallel run. With a parallel run, when a call comes in for that piece of functionality, we actually dispatch it to both implementations. This might look a bit odd, but these patterns are architectural refactoring patterns. A refactoring is something where we change the structure of the code, but we don't want to change
the behavior of the code. We want to be able to verify that the system is still functionally the same as it was before, but we've now got an architecture that lets us do other things. So how do we make sure it's functionally equivalent? We do a comparison, and if both implementations can coexist, we can do a live comparison. In this situation, a call that comes in will call both the existing implementation and the brand-new service-calling implementation, and we'll be able to verify whether or not we get the same answers. We can also verify whether or not the brand-new service-calling implementation responds quickly enough. This allows us to say: we are executing both pathways, it seems to be behaving fine, OK, now we're ready to switch over. During this period, the existing implementation would normally be the source of truth in terms of which results you actually keep going forward.

This is an incredibly useful technique. GitHub actually use this technique a lot when they've been restructuring their Ruby code bases, and they created a tool called GitHub Scientist, which is a library for managing these parallel run situations. It allows you to run two implementations of the code and set up something like a scoring system to say whether the new implementation is working appropriately. It is written for Ruby, but there's a whole load of ports for different languages. Right now there are a lot of Perl ones there. Anyway, I didn't realize that was still a thing, but it is still a thing, and it's got Scientist ports for it now. This is quite useful.

OK, so we can now run these things side by side; again, that's only something we can do because of the branch by abstraction pattern. Plus it also means we get to use trunk-based development, which means everyone's going to be happier. So that's good; down with branches.

I have sidestepped the data question, and we need to talk about data very briefly. With our normal monolithic
system we have all of our data in a database. I've extracted my new functionality from my monolith, but my invoicing service needs to access data, so the question is what should I do in this situation. Now, when we're looking at situations where we may want to switch which functionality is live, either the invoicing functionality in the monolith or the invoicing functionality in our new invoicing service, and we're in that state where we're switching between them, it's quite appropriate for a short period of time to just access the same database. Once you've decided, though, that the invoicing service is now where this functionality is going to live, we need to avoid this direct database connection, because that's going to be an ongoing source of coupling. As we all know, if any of you have spent any time looking into microservices, virtually everybody will say don't share databases, and there's a really good reason behind that. So I don't like doing this, but for short periods of time, as a migration technique, it's okay, as long as you're doing this for a few weeks. Just know in your head that you haven't actually finished extracting that invoicing service, because you're still sharing a database. Until we've dealt with that situation, your extraction is not complete.

So we then need to think: okay, if I can't do this, what are the scenarios? The first thing to think about is, what is the data I'm trying to access? If the data I'm trying to access is data that I don't really own, like it's non-invoice-related data but data I need to be aware of, for example I might need to look up a customer's details or information about an order, then that data and the processes that manage that data are all still inside the monolithic system. In that situation it's completely appropriate for our brand-new service to go to the monolith and ask. So in this situation we expose some kind of service interface;
in this example here I've made that an API. So I go and fetch the information I want. I've avoided direct database access, and by creating this API I'm allowing myself the ability to decide what is shared and what is hidden.

Now, if the data you want is really data that you should own, we need to invert this relationship. In that situation we need to grab that data from the monolithic system and pull that data over into our schema, into our world. We now want to own and manage that data. It is ours; we should now be the source of truth for this data. The monolith of course needs invoicing data, so it now needs to come and talk to our brand-new service. So you've got to decide, when you're pulling a service out, what data you need. Some data is yours and that should come with you; some data should stay where it is, and then you use a well-defined API. The difficult situation is that sometimes with these monoliths you can end up in a situation where the monolith depends on you and you depend on the monolith, and then things get a bit hairy. But I can only cover so much stuff in this talk.

So let's go a bit deeper into this idea of moving data, because that transition is quite nice, look: I made a box appear and then the box traveled across the system. If I was being mean I could say "and that's how you do database refactoring", which of course is hiding a lot of information and detail. So we'll show a couple of examples of database refactorings.

Now, the first thing I'd say when we look at database refactoring is do take a look at this book by Scott Ambler and Pramod Sadalage. It's probably a little bit old now, but this is for database refactoring what Martin's book "Refactoring" is for code refactoring. It talks about the low-level patterns of how you refactor a relational database. Most of the stuff I'm talking about here and
I talk about in the book is for refactoring relational databases, because in many ways they have more challenges in this space, but a lot of these techniques work just as well for non-relational databases.

The other thing you're going to want if you're doing this work is some kind of toolchain that allows you to make incremental changes to databases, allows you to version-control those changes, and runs them in a deterministic fashion. An example tool would be something like Flyway; Flyway is excellent for doing this. There are other similar toolchains for different languages. I'm much, much more hesitant about the use of schema diffing tools for these kinds of purposes. But take a look at Flyway and you'll find the equivalent for whatever kind of environment you want. Ruby migrations is another great example.

So we've got a tool like this, we've got some basic refactorings, it all looks like it's going to be quite straightforward, and then we start getting into some nasty issues. We can start off with a pretty simple case here. We've got a monolithic e-commerce site. We actually sell compact discs. I've been selling compact discs for my fake company since 2011 and we're starting to feel the pinch a little bit, because it turns out no one wants them anymore. We're thinking of diversifying into selling 8-track cassettes, because we assume that after vinyl that's where the smart money's going to be. Maybe some LaserDiscs.

So we've got some catalog-related functionality, and this knows the name of the artist, and we've got some ID for a piece of information here. So we've got "The Best of Death Polka, Volume 4". Death polka is a thing: I thought I had made that musical genre up, until one of my workshop attendees played me some death polka and my ears were bleeding afterwards. And we've got finance-related functionality, and this is where we're going to store our financial transactions, and that may be as simple as putting something into a ledger table. Okay, so
as we make a transaction we store stuff in a ledger table. We've got a very simple use case: we want to display a top ten list of our bestsellers. So what are the CDs that have sold the best in the last week? In this kind of situation, with just these two tables, that's actually a very simple, straightforward query. We would do a select on the ledger table, we would limit that select to items sold in the last week, we would group by the ID of the CD, and count up how many copies of that CD we sold. At that point we've got ten IDs and the counts of how many copies we sold. That ID by itself is not very useful for a web page listing top ten releases, but the information about those albums is over in that line items table, and so we would need to make effectively a join over into the line items table to pull back that data.

But we're thinking about making catalog and finance two separate services. The issue is we've got a join relationship here, so we have to deal with that join relationship. What we're instead going to want to do is move the join up into the application tier. We're going to do a query to fetch things from the ledger and pull back the data we need, and at that point it's only going to be the IDs. Using those IDs, we're then going to go to the catalog service and say "please give me the information for these items", and that will give us the same functionality as the old join query. But now we're increasing network hops, and we're increasing database calls as well. We end up with this sort of system. Now, for something like a top ten list that's generated once a week, the increased latency of this operation isn't going to be an overt concern; this is a very easy thing to cache. But there are other, more insidious problems that we have here.

If you look inside that monolithic schema, we've probably got a relationship like this: we've got a ledger table with the rows in it, we've got the line items table, and we are going to
have a foreign key relationship. That SKU column, the stock keeping unit, is a foreign key that refers to rows in the line items table. There are two main things you do this for in relational databases. One, it improves performance for join operations. Two, it enforces referential integrity: I wouldn't be able to delete row 123, because the database would say "uh-uh, you can't do that, because you've got inbound references". Normally we're using the database as a safety net, right? The application layer would hopefully stop us from doing that invalid operation, but if we've missed that, it would still be enforced in the database layer.

When we move these things into separate schemas owned by two separate databases, we have some questions. The first thing is, how do we even denote that this relationship exists anymore? In this particular example, what I've done is replace the SKU column with a sort of templated URI. I figured this would be a way to wind up people that don't like REST and also people that are really pure about their REST, and annoy both camps at the same time. In this example, if I was building a REST-based system, this would allow the finance service to directly dereference that pointer, effectively that HTTP reference, and pull back information about that album. That's nice, because from the point of view of the finance service, I don't need to know how to look that thing up; I just dereference that pointer. A lot of people get worried about this: what happens if I change my mind about how I look these things up? I'm then going to have to go and rewrite the foreign key relationships, effectively, in the system. So a lot of people just leave the IDs as they were before, and then the application code of the finance service needs to know how to do that lookup. The bigger issue, of course, is
that we've got nothing enforcing our referential integrity in this situation. There's nothing to stop us deleting row 123 from the catalog, and that would potentially cause an interesting problem. Now, sometimes when I suggest this, people say "well, it's okay if you do a cascading delete". So you could implement a cascading delete in this situation by saying: when I delete an item from the catalog table, I fire an event. That event could say "item 123 deleted", and the things that care about that could also delete a row from their location if they've got a reference to it. Now I will say this: deleting things from a financial ledger is a great way to go to prison, so don't do that in this particular situation. But nonetheless you can see how you could implement processes similar to a cascading delete in a services-based environment. If you're worried about deletion breaking referential integrity, you could also just do a sort of soft delete: now a status column says this thing is no longer available, and that has solved this problem. But that's not a problem that could have existed in the old system. We potentially have had to change the behavior of our system to cope with this lack of enforced data integrity, and we've also made this operation slower. So you've got to decide if all of that was worth it. Did you get enough out of these new services? Maybe you did, maybe you didn't, but you've got to be aware of these things going in. And again, this is something it's much easier to assess the impact of if you've made one cut at a time, deployed it into production, measured it, and understood the behavior. But fundamentally, we've got more network hops and we've lost enforcement of data integrity.

So we'll look at the last example now. I figured it's towards the end of the day, so we'll talk about one of the more contentious areas of distributed systems just before you get some beer. So this is a very simple example of our music shop:
we've got the catalog and warehouse functionality. We're storing stuff in an item table, and a row of the item table looks like this: "Bee Gees Hits", $4.99, 425. 425 is the number of items in stock, and $4.99 is the recommended retail price for this masterpiece. And we're looking at maybe making catalog and warehouse two separate services. So to prepare the groundwork for that, a pretty obvious refactoring would be to just take this table and split it into two separate tables. Again, this is really bread-and-butter refactoring stuff. So this is nice, it looks great. Now we can have catalog and warehouse as two separate services.

But then we start putting that in a production running context. We might end up with a quite simple topology like this: I've got my catalog service, I've got my payment gateway, I've got my warehouse service. My warehouse service tells me what I've got in stock. Now, in a distributed system we are always on the lookout for things called partitions. A partition is what occurs when one part of a distributed system cannot see another part of the distributed system. These things are not under your control. They will happen, and if you don't plan for them they're going to happen anyway, so you have to plan for them.

So we're going to take a look at a scenario that has been created by our brave new world. The warehouse, for some reason, cannot currently be reached. This means we cannot check our stock levels. Remember, in the old monolithic system this is a failure mode that could not exist: it was either all up or all down. Now we've got a partial failure. We know how much stuff costs, we can take people's money, but we don't know how much stuff we've got in stock. Now I'll ask you a question. If you were running this online CD emporium and you had a failure mode like this, so you know how much stuff costs but you don't know if you've got those items in stock, put your hands up and tell us
if you'd just keep selling CDs. You capitalist bastards. Right, put your hands up if you wouldn't sell CDs. You would be out of business in a heartbeat, right? In this business context, e-commerce companies keep selling, and the reason they keep selling is they go through a very simple thinking process. If we don't sell any CDs, then we make no money, we can't sell CDs that we do have in stock, our customers are upset, and they go and buy the CDs somewhere else and they don't come back. If we sell the CDs and it turns out we don't actually have them in stock, we can apologize, we can give those people a refund, we can offer to back-order that CD. We're probably going to upset fewer of our customers and we're going to make more money. So they will always make that trade-off.

Change the context and you get a different outcome. If you're selling tickets for a concert, they tend to make a different trade-off, because if I sell tickets and you think you've got a ticket to go and see the World Cup final in Japan, and you've booked flights, and then we email you and go "oh, you don't have tickets to see the World Cup", you're a bit annoyed. The other thing, of course, is if tickets can't be purchased right now on that website, what do you do as the person buying a ticket? You go back later, because those tickets are only held by that supplier, because basically these online ticketing companies are effectively a cartel.

This trade-off actually speaks to one of the more fundamental trade-offs we look at in distributed systems, and a lot of rubbish has been talked about it, and that's a thing called the CAP theorem. The CAP theorem basically talks about the trade-offs that happen in a distributed system. I'm simplifying this to the point where what you have to think about is: when a partition occurs, you have to decide whether consistency or availability is more important to you. When we say we're going to take the money, what we're basically saying is
it's more important to us to maintain availability of the operation to sell CDs, and in exchange for doing that we're now going to have an inconsistent view of our stock. If we don't take the money, we're saying it's more important for us to have a consistent view of our stock: consistency in that context is more important to us, and we're going to trade off availability. That is about as much as any of you ever need to know about the CAP theorem. Don't look any further into it; it's just going to make everybody's heads hurt. What you can also do is, whenever you meet someone that says "I've beaten the CAP theorem", you can just ignore them and walk past them, because they haven't. They've just made a trade-off, and that's fine.

We've gone through a lot and I'm out of time, so I won't talk about that. There are a lot more patterns in my book. If you want to go over to O'Reilly's online learning platform you can read an early production version of it; it will be in print next month. More information is over on my website. But thank you very much for your time. [Applause] [Music] [Applause]
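The bestseller example from the talk, where the join between the ledger and the catalog moves out of the database and into application code, can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the speaker's implementation: the `ledger` table layout, the `fetch_catalog_items` function (a dictionary lookup standing in for a network call to the catalog service), and the sample data are all hypothetical.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical stand-in for the catalog service's data. In a real system this
# would sit behind the catalog service's API, not in the finance process.
CATALOG = {
    "123": {"title": "The Best of Death Polka, Volume 4"},
    "456": {"title": "Bee Gees Hits"},
}


def fetch_catalog_items(skus):
    """Pretend catalog-service call: returns details for the SKUs it knows."""
    return {sku: CATALOG[sku] for sku in skus if sku in CATALOG}


def top_sellers(conn, since, limit=10):
    """Replace the old SQL join with two steps done in application code."""
    # Step 1: query only the finance-owned ledger for SKUs and sales counts.
    rows = conn.execute(
        "SELECT sku, COUNT(*) AS sold FROM ledger "
        "WHERE sold_on >= ? GROUP BY sku ORDER BY sold DESC, sku LIMIT ?",
        (since.isoformat(), limit),
    ).fetchall()
    # Step 2: ask the catalog for details of those SKUs (an extra hop).
    details = fetch_catalog_items([sku for sku, _ in rows])
    # No foreign key protects us now, so a SKU may be missing from the catalog.
    return [
        (details.get(sku, {"title": "(unknown item)"})["title"], sold)
        for sku, sold in rows
    ]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (sku TEXT, sold_on TEXT)")
    today = date(2019, 10, 24)  # arbitrary fixed date so the demo is repeatable
    # (sku, days ago) pairs; the 30-day-old sale falls outside the last week.
    sales = [("123", 1), ("123", 2), ("456", 3), ("789", 1), ("123", 30)]
    conn.executemany(
        "INSERT INTO ledger VALUES (?, ?)",
        [(sku, (today - timedelta(days=d)).isoformat()) for sku, d in sales],
    )
    return top_sellers(conn, since=today - timedelta(days=7))
```

Because nothing enforces the reference any more, the sketch has to decide what to do with a SKU the catalog no longer knows about; here it falls back to a placeholder title, which is one possible choice among several.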
Info
Channel: GOTO Conferences
Views: 72,145
Rating: 4.8987985 out of 5
Keywords: GOTO, GOTOcon, GOTO Conference, GOTO (Software Conference), Videos for Developers, Computer Science, Programming, GOTOber, GOTO Berlin, Sam Newman, Microservices, Software Architecture, Microservice Architecture, Monolithic Architecture, Monolith
Id: 9I9GdSQ1bbM
Length: 43min 57sec (2637 seconds)
Published: Wed Feb 26 2020