The One Question To Haunt Everyone: What is a DDD Aggregate? - Thomas Ploch - DDD Europe 2022

Captions
Good morning, everyone. How are you feeling? Ready for some awesome DDD stuff today? You're at the right conference. So first I have to excuse myself: I have been speaking at DDD Europe quite a few times, but it's the first time that I'm here on the beautiful red stage, so excuse me a little bit if I'm nervous.

I'm Thomas. I'm a principal engineer at Flix. We're doing all these FlixBuses, FlixTrains, maybe Flix submarines, Flix roller coasters; we're always looking for new ways of mobility. I'm going to talk about a topic today that some say I'm kind of, well, obsessed with. I talk to a lot of people about aggregates, and I get a lot of questions during conversations, and people who are just starting out with DDD are especially interested. So let's get started.

I want to give you a little bit of an outlook. The first question is, of course, why do we need aggregates in the first place? Why do we need this pattern? There should be a reason for us to actually use it. The second question is: what is an aggregate? What is the essence of it, what is this pattern about? And the third question might sound a little bit confusing, but it is actually very relevant: how big should an aggregate actually be? What should be inside, what should be outside?

OK, so why do we need aggregates? Of course I'll start with a quote by Eric Evans from the Blue Book, as every experienced domain-driven designer would do. It's quite a bit of text, so I'm going to read it for you as well: "It is difficult to guarantee the consistency of changes to objects in a model with complex associations. Invariants need to be maintained that apply to closely related groups of objects, not just discrete objects. Yet cautious locking schemes cause multiple users to interfere pointlessly with each other and make a system unusable."

So it's quite a bit of text and quite a few things to unpack here. So what I want to tell you about is, OK,
what kind of situation is he actually talking about? What are the scenarios in which this actually becomes useful? So what I want to show you right now is what "complex" can actually look like. Does anybody have an idea what this is? What is it? No? OK, it's an execution plan for a complex SQL query that involves more than 100 tables. And you might think, well, our system will never get there, right? There are just, I don't know, four or five entities right now. The problem is, when your product or your system becomes successful and you scale up operations and you scale up development, you get there very fast. And it's fun to get there, actually, but it's not fun to be there. The ride there is very nice: you're in this box, you get your feature requests, and we're super successful, a lot of people are using our product, and we have features, more features, and everything is going nicely. And at some point you say: hold on, why is it taking three months to add this small little thing to the system?

So we need aggregates so that successful systems do not spin out of control, because the complexity grows with the success of a system. It's caused by scaling up the operation, scaling up the development, getting all of these new requirements, all these new features, and there's some unbounded growth. We want these successful systems to actually stay successful; that's the reason for that. We don't only need aggregates. Aggregates are a very small pattern; there's much, much more in the domain-driven design community that we will use, and you are at the very right conference to learn about them. But today I will focus on the pattern of aggregates.

So let's dive right into what an aggregate actually is. Again, it's a quote from Eric Evans from the Blue Book: an aggregate is a cluster of associated objects that we treat as a unit for the purpose of data changes. Each aggregate has a root and a boundary. The
boundary defines what is inside the aggregate, and the root is a single, specific entity contained inside the aggregate.

So let's visualize this. We start with the boundary. This boundary is a consistency boundary. That means that everything within this boundary should be immediately consistent whenever data changes occur, preferably in some kind of atomic operation, maybe an ACID transaction, or some other way of making it consistent. I'm not saying you have to use ACID transactions to do that; there are multiple ways to achieve consistency. But it should be atomic; it should be immediately consistent.

So if we have an outside and an inside, there is something inside. And to get from the outside to the inside, you need some kind of doorway, a door that you can go through to get signals from the outside to the inside. That's where the aggregate root comes in. The aggregate root is the doorkeeper that guards the entry path from the outside to the inside of the aggregate. So all of these data changes are handled by the root. And of course it's not just the root that does everything; the root can delegate to things that are inside the boundary. It's a closely related group of objects, a closely related group of models. In this example we have an account entity as the root, so it would be an account aggregate, and we have some value objects or some other child entities inside, like a token.

And there's also this invariant thing. People starting out with domain-driven design sometimes get confused about what invariants actually are. So what does "invariant" as a word actually mean? It means never changing, even if other circumstances are changing. So these are rules that need to be satisfied at all times, and it's also an aggregate's job to enforce these invariants.

So we have talked about the outside and the inside. When the outer world tries to communicate with an
aggregate, it should only have references to the aggregate root. It should only be able to communicate with this aggregate, and pass it around, through the aggregate root, because the boundary is artificial; the aggregate root is something concrete in your code. How you implement the boundary is up to you: you can use a namespace, you can use a module, you can use an object, it doesn't matter. But you have to somehow turn this invention of the boundary into something concrete that your development team will understand.

So if I say the outer world can only have references to the root, exposing references to the insides of the aggregate, on the other hand, should be something to avoid. The reason for this is not very simple in that case, but I will show you a few examples. I gave this presentation to my team during preparation and asked for feedback. I had another diagram here, and they were like, "Huh, Thomas, no, just show some code." And I said yes. So here we go, it's a little bit of code. It's PHP, don't blame me; it has become a very nice language over the last years.

So I'm going to show some examples here that violate these principles. We get an account from some kind of repository, so we have an account aggregate here. The first violation is giving out a reference to the outer world. You have an account, you get the token, a child entity, and you expose it to the outer world. So you have already given out the reference, and now you do some data-change operations on that child entity that are not guarded by the root. You're violating the principle that all these data changes should go through the root.

Or, for example, an invariant checked outside of the aggregate. There might be a challenge from a request that you create, and you want to confirm the account, and you do this not inside the aggregate but somewhere in the service
layer. It's what we often call leaking logic, when logic is leaking from a lower layer to some upper layers.

So what are some better approaches here? Again, we get the account aggregate from some repository, and now we invalidate the token through the account aggregate root. We're not exposing the token; we're actually calling the operation, and the implementation of the aggregate is responsible for processing this data change. You're not giving out references anymore; you just return a read-only thing. So, OK, I give you the value, but you cannot change it. And now we move the invariants inside of the aggregate. We have the challenge again, but now the account confirms it with this challenge, and if there's something wrong, it will throw some kind of domain exception.

So what is an aggregate? It's first a unit of consistency, and it's a unit of concurrency. If you remember the quote from Eric Evans in the beginning, it's about keeping things consistent even under high concurrent pressure. When a system gets successful and you have many more customers, users, whatever, using your product, of course the concurrent pressure will increase.

The next thing I want to show you is a picture of two people. These are two identical twins; they're called Daniel Tarr and Ben flathletar. Who here thinks that Daniel is on the left? One, two, three. Who here thinks that Daniel is on the right? OK, thank you. Daniel is on the right; I switched the names to confuse you. But it's impossible to say who is who with the information you have here right now. They wear the same clothes, they have the same hair color, the same haircut, the same eye color, they probably have the same voice. They're pretty identical. So without them or someone else telling you who is who, you won't be able to tell them apart. So in order to compare two aggregates, there needs to be some kind of identity
that is attached to that aggregate, and then we are able to tell two of these aggregates apart. So aggregates are distinct: one aggregate with an identity is not the same as the other, even if all the values inside that aggregate are the same. If you compared all the values and they were the same, but the aggregates have different identities, they are different things. That's also a difference from value objects, where, if the values are the same, they are actually the same thing. And if I say "value object" and you haven't heard it before, you will probably learn about that later in the conference, or you can ask me about it later in the lobby.

Again I want to come back to the quote of Eric Evans, and he especially mentions "for the purpose of data changes". So it's about changing things. But for aggregates to be able to process data changes, they have to exist first. So this is a timeline, time flowing from left to right, and at some point there's a genesis of this thing; this thing is created somehow. And this creation is driven by the domain. It could be that a contract has been signed, or that a customer has registered, or that a car has been produced. So there is something in the domain that triggers this kind of birth of an aggregate. Then, during its lifetime, it goes through different phases or stages, it processes a lot of data changes, it does the things it is supposed to do. I hope it's useful and does useful stuff within your product or your software system. But at some point there is also an end to it. It's not just born and then lives forever; often there is an end, for example a contract has expired, or a partner company was liquidated, or a car has been shredded. So there is this kind of life cycle that aggregates have, so
they represent life cycles. So an aggregate is a distinct life cycle, and it has an identity.

OK, so again, I'm decomposing this quote by Eric Evans and focusing on small parts of it, because there are these tiny details. When you read the Blue Book the first time, you think you understood it. Then you read it a second time and a third time, and after the third time you understand that you haven't understood it at all. And then maybe after the fifth time you start to realize, OK, that's what he meant.

So closely related things are normally used together. Imagine your knives and forks and spoons, the cutlery. You probably have them in the same drawer in your kitchen. You probably don't put the forks in the living room and the spoons somewhere in your children's room under the bed; you put them together. The same applies to software systems, and the same applies to aggregates.

So we have two databases here, database A and database B. I'm specifically not mentioning the technologies; you can just add your own. On the left side we have the account entity, on the right side the token entity. And then you start reconstituting this thing from multiple data sources, and imagine writing to these data sources again: you have to deal with a distributed transaction. Now it gets very, very hard to maintain this consistency. What this means is that you have put your forks and your spoons in two different cupboards. An even worse example of this is this one. Who thinks this is a good idea? Who has this running in production? You have now put your forks and spoons onto different planets; you actually need to take a spaceship to get the fork for your dinner.

So a much better way is to put these closely related groups that the aggregates represent together: data locality. It's much easier to guarantee the consistency if you store these things that are inside the aggregate
boundary together; it will be much easier to guarantee the consistency. I'm showing database shards here as an example, but it could also be microservices. It's about putting the stuff together. So in this scenario the cutlery is neatly organized in our drawers. So an aggregate is also a unit of distribution.

So let's recap what we have learned so far about aggregates.

Aggregates are first and foremost part of the domain and its language. They identify some concept there. And if they don't, if you find an aggregate inside your code and you cannot map it to the language that you hear when you speak with your domain experts or your team members, then you should trigger the discussion: why is this concept here in an aggregate although we're not talking about it at all? We should put it into our ubiquitous language. And I'll give you a hint: it's always nice to keep a language dictionary in some kind of wiki or something, where you just put all the concepts with their definitions and invite everyone to add, modify, and remove things.

They are built around closely related models, as we talked about. The aggregate itself is artificial; the boundary that you put around these closely related models is something we sometimes have to invent. It's nothing concrete.

They are life cycles with an identity: they're somehow born, or go through a genesis, and they end at some point. Nothing on this Earth or in this universe is actually infinite; everything will end at some point.

They are units of consistency: they guard the inside from all these messy concurrency and parallel-processing problems that you have on the outside, and make sure that it's consistent at all times.

They're units of concurrency, as we have just shown. It's the reason this pattern exists: it's about managing this increasing concurrency, managing this increasing load of your successful systems.

And they're units of distribution: putting your
forks and your spoons together.

So what if I need two aggregates? How do we keep two units of consistency consistent? We have talked about a single unit of consistency, and it's easy to do that inside the boundary. But what if two of these units of consistency have to interact, and there is some kind of consistency rule between them? So we have this customer registration bounded context, and we have an account there, and whenever this account is opened, our CRM module should also have an account entity. So there's a rule between these two things. And Eric again has a quote: "Any rule that spans aggregates will not be expected to be up to date at all times. Through event processing, batch processing, or other update mechanisms, other dependencies can be resolved within some specified time." You have probably heard this term very often: what he talks about is eventual consistency. The consistency between two aggregates is eventual. And it's clear why: you would have to build another consistency boundary around these two things in order to make them immediately consistent. And now add a third one. Now the boundary gets really huge, and at some point it will be impossible to actually guarantee this consistency.

So this brings me to the third question, which sounds a little bit confusing, but it actually goes very much to the essence of domain-driven design: how big should an aggregate be? How do we find out what is inside and outside of this boundary? And to be honest, I can only give you a very unsatisfying answer: it's as big as necessary and as small as possible. I'm not sure who said this; it was in a conversation, so if you're the person, please talk to me and I will add a proper attribution.

So this question of what is inside and what is outside represents the essence of domain-driven design. It's about where we place the boundaries, how we decide what is inside and what is outside. And these boundaries are not only in the
small tactical world of the aggregate; these boundaries are everywhere. You will probably learn about context mapping a little bit later, where we talk about boundaries that are on a larger scale and about defining the interactions between those things. So we use a lot of methods as domain-driven designers to define exactly that. We use the knowledge of the domain experts, we use heuristics, we use a lot of other techniques that you will learn about today, and you're at the very right conference to understand that at a much deeper level. So thank you very much.

Thank you, Thomas.
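
The PHP slides referred to in the talk are not part of the captions. As a rough sketch of the "better approach" the speaker describes (token invalidated through the root, a read-only value handed out, the invariant checked inside the aggregate, and identity-based comparison), the idea might look like the following. It is written in Python rather than the PHP shown on stage, and every name here (`Account`, `TokenView`, `DomainError`, `confirm`, `invalidate_token`) as well as the exact confirmation rule are assumptions for illustration, not the speaker's code.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


class DomainError(Exception):
    """Raised when a change would break an invariant of the aggregate."""


@dataclass(frozen=True)
class TokenView:
    """Read-only copy of the token state; it cannot be used to change the aggregate."""
    value: str
    valid: bool


class _Token:
    """Child entity inside the boundary; never handed to callers directly."""

    def __init__(self, value: str) -> None:
        self.value = value
        self.valid = True


class Account:
    """Aggregate root: the only object the outer world may reference."""

    def __init__(self, account_id: UUID, token_value: str) -> None:
        self._id = account_id
        self._token = _Token(token_value)
        self._confirmed = False

    @property
    def id(self) -> UUID:
        return self._id

    @property
    def confirmed(self) -> bool:
        return self._confirmed

    def token(self) -> TokenView:
        # Return a frozen snapshot instead of the child entity itself.
        return TokenView(self._token.value, self._token.valid)

    def invalidate_token(self) -> None:
        # The data change goes through the root, which delegates to the child.
        self._token.valid = False

    def confirm(self, challenge: str) -> None:
        # Invariant (assumed for illustration): an account can only be
        # confirmed with a challenge that matches a still-valid token.
        if not self._token.valid or challenge != self._token.value:
            raise DomainError("challenge rejected")
        self._confirmed = True

    def __eq__(self, other: object) -> bool:
        # Identity, not attribute values, decides whether two aggregates are the same.
        return isinstance(other, Account) and self._id == other._id

    def __hash__(self) -> int:
        return hash(self._id)


account = Account(uuid4(), "s3cret")
account.confirm("s3cret")    # invariant is checked inside the aggregate
account.invalidate_token()   # change is routed through the root
twin = Account(uuid4(), "s3cret")  # same values, different identity
```

Here `account` and `twin` start with identical values but compare as different aggregates, which mirrors the twins example, and a caller holding a `TokenView` cannot alter the token behind the root's back.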
Info
Channel: Domain-Driven Design Europe
Views: 38,749
Keywords: ddd, dddeu, ddd europe, domain-driven design, software, software architecture, cqrs, event sourcing, modelling, microservices, messaging, software design, design patterns, sociotechnical, event-driven architecture, domain modelling
Id: zlFqjD2LKlE
Length: 26min 37sec (1597 seconds)
Published: Thu Jan 26 2023