RailsConf 2017: Built to last: A domain-driven approach to beautiful systems by Andrew Hao

Captions

(marching band music) - Oh there you are, I've been looking for you. Welcome to your first day at Delorean, you're gonna love it here. Now here at Delorean, we're revolutionizing the industry of time travel. With one touch of a button on our app, you're gonna be able to summon a driver, and a Delorean will come roaring around the corner, you jump in, and you get taken to whatever time period you want. Now, oh by the way, I might mention to you it's a little bit messy because we got a lot of teams here, you know? So you might be a little surprised when you open the codebase, but don't worry, it's totally normal for everyone to feel a little bit surprised when they first join. One thing you should also know is that we have several of these codebases, so when you implement a feature, you're gonna have to check out this codebase and that one and that one and that one, and not only that, we have some funny naming conventions around here, so when your product owner tells you about the fizzbang widget, don't forget it's actually called the foobar doohickey, which someone actually called the bar thingamabobber for some reason. Don't worry about it, it's just the way things are around here. And I know what you're thinking, you're thinking we need to really clean things up around here. Which I'm sure, I promise you we're gonna get around to, but for the time being I just really need you to be really heads down on our biggest, latest product development: puppy deliveries. And I guarantee you it's gonna be a hit. Well okay it's time for stand up so I'll see you around, you'll get started with your team, welcome to Delorean. Hi, I'm Andrew and I'm a software developer at Carbon Five. And like many of you I've been a Rails developer for several years. And at Carbon Five and in prior jobs I've been a part of teams that worked on large Rails codebases that have struggled to really scale as they've grown in size and complexity.
Now I've been thinking a lot about beautiful systems. What are things that make a system beautiful? We've talked a lot about beauty here at RailsConf, so beauty, as many of us Rubyists might think, comes from the language. In its syntax, in its form, in its expressiveness, whether we have nice DSLs that read like English. Or it could come from the tooling. It could come from developer ergonomics with beautiful error messages that are very helpful, or an amazing debugger. Or if you're in a different language, it could come from a great type system or a compiler. Some of us might consider beauty to come in the form of tests. Whether our community encourages us to write great tests, whether there is a test suite that makes our code resistant to breaking changes. But what I want to propose to us today is that a system is beautiful when it's built to last. When it has longevity and it stands the test of time with changing business and product requirements. These long-lasting systems are just large enough. They know their boundaries. They don't grow past them, they know what they're responsible for. They are highly cohesive and loosely coupled, and what that means is that they contain the necessary set of concepts within themselves and these concepts are all close together, no need to reach outside to actually go fetch a concept somewhere else. And when I say they're loosely coupled I mean that they minimize their dependencies on each other. And they have precise semantics that fully express their business domain. So when you jump into the codebase, there's no confusion as to what it means, as to what business process it's representing or trying to implement. Now some coworkers and I, for the past couple years or so, have been reading papers from computer scientists from the past. And we came across David Parnas's paper from 1972, On the Criteria To Be Used in Decomposing Systems into Modules.
And he called this criterion information hiding. Here's what it said. He took a software system whose job was to process text: it took an input of words, did some processing on the text, shifted things and alphabetized things. And he compared two approaches to turning this system into modules. So in the first approach he treated the program like a script and he said step one goes into a module, step two goes into a module, step three goes into a module, and he says that that's probably the approach that most people would've taken with this program. But in the other approach, he divided it up by responsibilities. So he said this module is responsible for line storage, this module is responsible for alphabetizing, this module is responsible for writing things out to the disk. And what might seem obvious to us 45 years later is that this is a good idea. And so what he concluded in the paper was that we divide up modules by difficult design decisions or design decisions which are likely to change. And by doing that we insulate the rest of the system from those decisions when they change. So I wanted to bring this out and draw this out a little bit more because I thought this applied very well to software systems. Meaning systems in our business. Where are the design decisions that are gonna change in our company? Well I wanna put forth that this happens within the business groups that generate these changes. Here's an example. My team on marketing wants us to generate 5,000 promo codes. Now on your team, finance wants you to implement a new audit log every time someone charges a credit card. And on your team, product wants you to implement food delivery. And then on my team marketing says oh actually we want 2,000 of those 5,000 codes invalidated.
And then finance needs us to add yet another attribute to the log and your product team wants you to now launch a second market, and to me that sounds like change divided up from the parts of the business that are driving them. So if we've ever worked in a nice, greenfield Rails app, we know it feels really nice. So marketing asks us to do this, finance asks us to do that, it's easy to add features as it gets spread across the code. But as time wears on we know it feels a little bit more like this. And then so the question now becomes how do we get out of this large system that's doing too much stuff? Well I heard about microservices, and I know that they're not easy. If there's anything that I've learned from attending any of these conferences is that they come with an operational complexity that most people have failed to consider and realize only too late after stepping in. How much do we extract? Do I extract like one little feature? Do I extract an entire area of the codebase? And where should I draw those boundaries? What if I extract something that's too specific and then on the other hand what if I extract something that's too generic? I once worked at a Rails company with a very large Rails monolith and we realized as the engineering team, we needed to show something to our CTO about like where we wanted to take the architecture. Well we had no idea, we didn't know if we needed in the end maybe like something on the order of 10 systems or 90 systems. I don't know. If only we had something to help us visualize what we need. Well in 2003 Eric Evans, author Eric Evans came out with a book called Domain-Driven Design, and in it are both a set of high-level strategic design activities and also very concrete software patterns. I must also warn you, there's a lot of enterprise speak in it, Java code, or .net code, you'll find it on the internet, that will be very confusing which certainly confused me when I first got my head into it. 
But my coworker told me at the time you should really look into Domain-Driven Design, 'cuz I think it will help us. So today what we're gonna do is we're gonna pick an activity from Domain-Driven Design called building a context map. And in this, through the context map we're gonna learn some concepts from Domain-Driven Design that will help us understand our systems. And then we're gonna learn some refactoring patterns that we can then apply to incrementally organize your systems around the boundaries that we'll find out in our context map. So let's get started. Domain-Driven Design is very, very much focused on language. That's the first distinction that I usually tell people when they ask about the subject. And in the language we have something called a ubiquitous language, and a ubiquitous language is not, it's not meant to be a global language for the entire company to use, but it's simply a shared set of terms and definitions that your team in your area of the business can agree on. And we typically use this language to drive the design of the system. So through the development of something called the glossary, we get people together in a room and we simply get together and we write out the list of terms and definitions, this is a very, it seems like a very straightforward exercise, so we simply come up with nouns and verbs that we use in our domain or within our team. So for example over lunch we might sit together and write on the whiteboard, okay well a driver is this thing, a driver owns the Delorean, the driver drives around and provides driver services, the rider does this and then your product owner might be like wait we call that a passenger. And you scribble out rider and you write oh actually we're gonna say passenger. And then you might talk about events as verbs, so there might be an event in which we hail a driver. Or there might be an event where we charge a credit card. So on and so forth. 
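The glossary exercise above (a driver, a passenger rather than a rider, and the verb "hail") could be carried into code roughly as follows. This is a minimal plain-Ruby sketch, not code from the talk: the class shapes, the `hail` return value and the example names are assumptions made purely for illustration.

```ruby
# Hypothetical sketch: class and method names follow the team glossary.
# Only the terms "driver", "passenger" and "hail" come from the talk.
Driver = Struct.new(:name)

class Passenger
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # The glossary verb: a passenger hails a driver.
  # Returns a plain description so the sketch stays self-contained.
  def hail(driver)
    "#{name} hails #{driver.name}"
  end
end

puts Passenger.new("Marty").hail(Driver.new("Doc"))
```

The point of the sketch is only that the nouns and verbs in the code match the words written on the whiteboard.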
And the idea is that this list of terms and definitions is something that we codify either in a document or in the code so that we can all be in agreement about what words to use. And then we go on and we start renaming things in the code to follow the business domain. So for example, we might have something in which we have a user requesting a trip. Now there are two language things in here that we realized don't actually follow the business domain. So we go rename it. It's actually a passenger, and the passenger hails a driver. Now let's move on and let's go visualize our system. I'm gonna go generate an ERD for us, and there are gems that do this for us, one of which is called RailRoady, another one's called rails-erd. And simply the goal right now is to get a lay of the land of the architecture of the system, using Active Record associations to drive the diagram. So here's what something like that might look like. It's a little hard to read, it's very hard to read, nobody will be able to read it. Don't worry, I've done the work for us. Most likely if your company has a very large codebase, this diagram's gonna be gigantic. We once printed out ours at a prior company and it was maybe like six feet long printed out on roll paper. Most likely it's gonna be gigantic and it may or may not actually be usable. If it's not usable, you may have to generate one by hand or something. So let's start by defining a few core concepts around domains. The core domain of your business is the thing that makes the business do what it uniquely does. So at Delorean, our core domain is transportation. If you were Google, your core domain would be search. If you were Flickr, your core domain would be photo sharing. And then there are things known as supporting domains. Supporting domains are simply areas of the business that play supporting roles to make the core domain happen.
So here at Delorean, we have a team that's devoted to driver routing and all they do is they come up with maps and fancy algorithms to route the driver to the right places. Or we have a domain for financial transactions in which we charge users' credit cards or we pay the drivers. And then we have an optimization team that tracks business events and makes recommendations to the rest of the business on how to optimize certain business processes. And then we have a customer support team, which manages user tickets and keeps people happy. So now what we're gonna do is we're gonna go discover these domains by using the diagram to help us think. So we're gonna look for clusters on that diagram and we might discover a few domains we haven't thought about. You take a look here, there may or may not be clusters that pop out at you. I'm gonna do the work for us here. And I have come up with this but as a team you might come up with it together, make this a group exercise. Most likely it will not be as clean as I've artificially made it out to be here. But for the sake of illustration, let's go with this. So now we've got a list of domains in our system, and we have a rough idea of what models belong in which domains. Now let's talk boundaries. Because boundaries are an important concept that will help us divide up our systems. Now in our Rails app or a Ruby app, we might have a boundary of a class. The class is the definition for a certain concept but it's concrete in code, and it's meant to be a single boundary around one concept. A module could be a boundary across a collection of concepts. And it's simply a namespace for multiple classes to live in or something. Gems are another way to package up code that belongs together and ship it around. And then finally things like Rails engines, Rails applications, external applications, external APIs, can also be boundaries within which other concepts are contained. So a bounded context is simply a software system.
So when I put it out there and I say bounded context today, I will simply mean a running software system, somewhere in production, or a software system that could be run within your business. But since this is domain-driven design, there's a language component to it. So linguistically, it's actually a part of our domain in which concepts live and are bounded in their applicability. So what that means, you might think about it a bit as a playground for concepts to live, but they're not allowed outside of that playground. And I'm gonna spend a little bit of time explaining these a little bit more. The bounded context allows us to have precise language. Because when we have terms that are conflicting or overloaded, it lets us separate them and give them their own playgrounds to run around in. Here's an example. We have a trip class. And to us a trip is simply the thing where a passenger jumps into a car, and they go for a ride. So that's a trip, there is a time and there's a cost of the trip. However it's a little overloaded. In the financial transaction world, the concept of trip time is when the vehicle is moving. Folks in our finance department just made the decision that we're not gonna charge the customer when the car is stopped. I don't know what that means when you're time traveling but bear with me here. Trip time in the routing context is calculated when the passenger's in the car no matter whether the car is moving or stopped. So you can see right here that depending on what context you're in in the business, whatever software system's using that, there are nuances in the behaviors for the same concept. Or what about trip cost? How much money is the customer gonna be charged? That's what cost means in finance, but when you go to the routing domain, trip cost is actually a completely different term. It's some made up metric for trip efficiency, some sort of scale or metric.
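One way to picture the two meanings of trip time and trip cost described above is to give each context its own `Trip` class, each with its own definition of the shared words. The following is a rough plain-Ruby sketch under stated assumptions: the module names, constructor arguments and units (seconds, cents, a numeric efficiency score) are hypothetical and are not taken from the talk's slides.

```ruby
# Hypothetical sketch: the word "trip" modeled separately per bounded context.
module FinancialTransactions
  class Trip
    def initialize(moving_seconds:, fare_cents:)
      @moving_seconds = moving_seconds
      @fare_cents = fare_cents
    end

    # In finance, trip time counts only the time the vehicle was moving.
    def time_in_seconds
      @moving_seconds
    end

    # In finance, trip cost is the amount the customer is charged.
    def cost
      @fare_cents
    end
  end
end

module Routing
  class Trip
    def initialize(moving_seconds:, stopped_seconds:, efficiency_score:)
      @moving_seconds = moving_seconds
      @stopped_seconds = stopped_seconds
      @efficiency_score = efficiency_score
    end

    # In routing, trip time covers the whole ride, moving or stopped.
    def time_in_seconds
      @moving_seconds + @stopped_seconds
    end

    # In routing, trip cost is an efficiency metric, not money.
    def cost
      @efficiency_score
    end
  end
end
```

With this split, a caller no longer has to remember which flavor of `time` or `cost` applies; the namespace it is working in settles the question.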
So those two concepts have similar names but wildly different actual definitions. So what do we do? Well if you're like me what we might've done in the past is we might've simply just made the method names a little bit more specific and then we would've just like closed the box and like walked away from it. And then what we've done here now is that engineers in the code will actually have to understand the nuances of these methods and understand that one is meant to be used in this context, and one is meant to be used in that context. Just briefly here, we can fix this by introducing two bounded contexts. There could be a trip that belongs to the financial transaction bounded context, and another one for the routing context. But we'll get into that a little bit later. So here's what we're gonna do, we're gonna go find the boundaries within our existing systems using that diagram as a guide. So we'll also keep in mind that there are other systems in the landscape of our business. So things like other teams' services, or other cloud providers. So over here I've started out with our diagram, I'm gonna pull out the ERD from behind it just to make it more clear, and I'm gonna draw a big circle around what I know to exist in our monolith. Here it is, the MonorailApp. And I thought a little bit more about it, and I realized that your team has the email service and oh we actually use an AWS service called SNS to send push notifications to people's phones, we use Braintree to charge credit cards, and actually in the process of drawing out these other systems, I actually realized there's some other domains that I haven't thought about yet. So I'm gonna write them in, so customer notifications, another domain I haven't thought about, and there's some marketing because marketing sends targeted emails to people. And then finally I'm gonna draw out dependencies between these bounded contexts. So I'm connecting these bounded contexts and then I'm writing a U or a D.
The U stands for upstream, the D stands for downstream. What that means is the upstream system is a system that is the source of truth for certain types of data. So the upstream system may provide the API, the upstream system may fire the message, whereas if you're the downstream system, you are consuming or you are dependent on whatever the upstream service provides. And drawing out these directional dependencies will actually help us understand the lay of the land, to understand the relationship our system has with other systems in the world of our business. You might notice a few things. A few things about the context map, you might notice that one bounded context has multiple supporting domains. So this is very intuitive to many of us because we felt the pain of the monolith. That monolith did too much because it was trying to manage the code for all these different domains. Another thing we might notice is that there's multiple bounded contexts that have to support a single domain. So over here we might notice that financial transactions and customer notifications both span several software systems. And that's just kind of a call out to us to make us realize that if we ever have to implement a feature in any of these domains, we're gonna have to end up touching a few of these systems. And then in the end there is an ideal or a suggested outcome that DDD suggests to us. That every domain is matched up with its own bounded context. So this might look something like this. This is certainly not a practical architecture for many of us but if we took it to the extreme every domain would have its own software system running behind it. You might also imagine this to be maybe the ideal microservice architecture, when I said that I almost immediately wanna take it back, but the idea is that everything is segregated to be highly cohesive within itself. Okay now let's get to the actual code.
So when we begin our first refactoring step, we only wanna change a little bit of things at a time. So what I'm gonna do now is I'm gonna draw out one domain, and I'm gonna make it a module. So let's say I'm gonna implement a feature somewhere in ridesharing and while I touch those features, I'm gonna actually bring in some of those concepts into my ridesharing domain. So I'm gonna start with the model, and maybe its related classes. And I'm gonna modularize it. I'm gonna introduce a ridesharing module. And then I'm gonna have to do the Rails-y things to get the rest of the application understanding that this thing is now namespaced. Now I'm gonna go move that code into a new folder. I've now made a second level folder called domains and then within that I'm gonna start a new folder for every single module I've introduced. So over here I've made a ridesharing domain, and I'm basically just dumping in all the code that I've collected for my models, my controllers, my views, or maybe even my services. And the idea is to move all that code that's related together into their own folders. This will temporarily make that ridesharing folder a little messy, but I wanna put forth that it's okay in the interim. Now there's something also called aggregate roots. And the idea behind this is that an aggregate root helps us address the problem of god objects in Active Record. So we often times have objects that know too much about the outside world or the outside world knows too much about our object. So over here we have an Active Record model that might be explicitly bound to other models in the ridesharing domain. And this payment confirmation as I've illustrated here has a lot of relationships that may not be necessary, but the fact that they're all explicitly defined here makes it difficult to refactor, makes it hard to write tests, and is just kind of awful to look at. Additionally, outside actors may actually know a lot about the internals of my domain.
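The first refactoring step described above, wrapping one domain's classes in a module, might look roughly like this. It is a plain-Ruby sketch rather than the talk's code: in a Rails app `Trip` would be an Active Record model, and the file paths in the comments, the `CreateTrip` service and its arguments are assumptions for illustration only.

```ruby
# Hypothetical layout (assumed paths, one folder per domain):
#   app/domains/ridesharing/trip.rb
#   app/domains/ridesharing/create_trip.rb
module Ridesharing
  # Stand-in for the model that used to live at the top level as ::Trip.
  class Trip
    attr_reader :passenger_name

    def initialize(passenger_name:)
      @passenger_name = passenger_name
    end
  end

  # Hypothetical service living in the same namespace as the model it uses.
  class CreateTrip
    def call(passenger_name:)
      # Inside the module, Trip resolves to Ridesharing::Trip.
      Trip.new(passenger_name: passenger_name)
    end
  end
end

trip = Ridesharing::CreateTrip.new.call(passenger_name: "Marty")
puts trip.class
```

Code outside the domain now has to spell out `Ridesharing::Trip`, which makes cross-domain references easy to spot in a search or a code review.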
So I might have a payment flow that's a web UI that calls in and like it updates all these models, or I might have an external ETL process that runs nightly and picks and chooses what it wants out of my domain, or I might have a thing that pushes notifications to drivers and has to reach into my domain. And so the idea of an aggregate root is that I'm gonna expose only a single graph of objects to the outside world, so I'm gonna simplify what I expose to the outside world so that the outside world has a reliable interface into my data models and I protect my internal data models from change. So over here I've decided my trip is gonna be the root. And then the aggregate is gonna be all these other models that flow out from the trip. The idea here is that this trip now will expose everything else but only through itself. And so any time someone makes a direct method call to me, every time someone asks for the trip, I'm only gonna expose that aggregate root through a JSON payload or through an API endpoint. You might have multiple aggregate roots per domain, which is okay. But expose just enough to the outside world such that it makes sense. Now here's a quick thing that we can do, we can build a service object that will provide this aggregate root. So the idea is that I'm gonna make a service object that will create an aggregate root that is basically a facade. So here's what I'm gonna do, I'm gonna introduce a thing called a fetch trip. And this fetch trip essentially is gonna wrap a, over here I've written it as an Active Record query that simply returns passengers and drivers on top of a trip, but alternatively it could be a Ruby struct or something like that, just something to have data to pass back to the outside world. And now callers which used to be tied to my domain through Active Record relationships will now simply call the service object.
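The fetch-trip service object described above could be sketched as follows. The talk's version wraps an Active Record query; this stand-in uses structs and an in-memory hash so it runs on its own. `TripRecord`, `TripAggregate`, the hash-backed store and the field names are all assumptions introduced for illustration.

```ruby
module Ridesharing
  # Stand-ins for the internal models (Active Record models in the talk).
  Passenger  = Struct.new(:name)
  Driver     = Struct.new(:name)
  TripRecord = Struct.new(:id, :passenger, :driver)

  # Hypothetical read-only view of the aggregate handed to outside callers.
  TripAggregate = Struct.new(:id, :passenger_name, :driver_name, keyword_init: true)

  # Facade: the single entry point outsiders use to read a trip.
  class FetchTrip
    # store is an assumed in-memory hash of id => TripRecord.
    def initialize(store)
      @store = store
    end

    def call(id)
      record = @store.fetch(id) # raises KeyError for an unknown id
      TripAggregate.new(
        id: record.id,
        passenger_name: record.passenger.name,
        driver_name: record.driver.name
      )
    end
  end
end

store = {
  1 => Ridesharing::TripRecord.new(1, Ridesharing::Passenger.new("Marty"),
                                   Ridesharing::Driver.new("Doc"))
}
puts Ridesharing::FetchTrip.new(store).call(1).passenger_name
```

Because callers only ever see `TripAggregate`, the internal models can be renamed or restructured without breaking them, as long as `FetchTrip` keeps returning the same shape.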
And the service object will then return to them the related data models that they need. Finally let's talk about event driven architectures. In the past you might've had to go somewhere else and do a side effect after we finish processing code in our domain. So over here when a trip is being created, this code then reaches out and does something in an unrelated domain which ends up coupling the two domains. So over here you can see that a ridesharing concern then has to perform something in the analytics domain as well. What if we flipped the data dependencies? So instead what we're gonna do is we're gonna go publish an event, and then we're gonna go have the other domains subscribe to that event. And so therefore we lower the coupling between our domains. I will now introduce a thing, so I'm essentially introducing a message bus. And within this message bus, I'm gonna introduce a publisher, and we're gonna have subscribers or handlers to handle these events. And I'm doing this through a gem called Wisper. Wisper provides publish-subscribe semantics for Ruby applications. And so here the domain event publisher simply passes through an event and then calls through Wisper to publish the event. And on the other side, or sorry, here in the original code instead of reaching into the other domain, I'm just gonna fire the event. Now on the other side every bounded context is now gonna handle or respond to the event depending on whether it needs it or not. These handlers will also use things called command objects to basically perform their side effect. So over here I've made a domain event handler. This domain event handler will listen to the trip created event, through the definition of a class method called trip created. And so therefore every time an external publisher publishes trip created, this domain event handler turns around and fires the log trip created command.
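The publisher, handler and command-object flow described above can be sketched without any gems. To be clear, this is a hand-rolled in-process stand-in and not the Wisper API: `DomainEventPublisher#subscribe` and `#publish`, the payload shape, and the in-memory log inside `LogTripCreated` are all assumptions made for illustration.

```ruby
# Hypothetical in-process publish/subscribe stand-in (not the Wisper API).
class DomainEventPublisher
  def initialize
    @handlers = []
  end

  def subscribe(handler)
    @handlers << handler
    self
  end

  # Calls a method named after the event on every handler that defines one.
  def publish(event_name, payload)
    @handlers.each do |handler|
      handler.public_send(event_name, payload) if handler.respond_to?(event_name)
    end
  end
end

module Analytics
  # Command object: a small wrapper around one specific side effect.
  class LogTripCreated
    # In-memory log so the sketch has an observable side effect.
    def self.log
      @log ||= []
    end

    def call(payload)
      self.class.log << "trip #{payload[:trip_id]} created"
    end
  end

  # Handler: reacts to trip_created through a class method of the same name.
  class DomainEventHandler
    def self.trip_created(payload)
      LogTripCreated.new.call(payload)
    end
  end
end

publisher = DomainEventPublisher.new
publisher.subscribe(Analytics::DomainEventHandler)
# The ridesharing code only fires the event; it knows nothing about analytics.
publisher.publish(:trip_created, { trip_id: 42 })
puts Analytics::LogTripCreated.log.last
```

The dependency now points from the analytics domain to the event name rather than from ridesharing into analytics, which is the flip the paragraph above describes.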
Here's the glue code in which Wisper is set up to subscribe to events, so the event handler is subscribing to events from the publisher. And this is an illustration of what the command object looks like, the command object is a simple lightweight service wrapper around a specific side effect that has to happen. We might also see that other domains need to also respond to these events and so over here we are introducing some extra behavior that is now decoupled from that controller, and so in the financial transaction world we would also do things like creating audit logs, or we deduct gift card amounts, et cetera, et cetera. So this is kind of the end result of our new architecture where we introduced a message bus. It should also be noted that this is technically not a message bus in the asynchronous manner yet. Wisper actually just has some nice wrappers that allow you to make it look like it's async but it's still synchronous in the web request. Until you introduce the Active Job wrapper and now your handlers will be queued up as Active Job jobs. And so this allows you to actually make your side effects asynchronous with your worker queue framework of choice, like Sidekiq or Resque. And then here's the glue code to make that happen with Active Job. Finally you can also introduce a real message queue, so if you actually wanna decouple your systems from other external systems, you might introduce a thing like RabbitMQ and there's a few nice gems that let you do that, Stitch Fix has a nice one called Pwwka, there's also a gem called Sneakers. Let's talk about a couple of more advanced topics. The first one of which is what happens when we want to share models between domains? Let's say I have a system that's off here and I have a system here but they actually still need to access the same data. Well there's a concept called a shared kernel in which we simply, we make it okay to ship around a certain shared package.
And so what I suggest in that case is we simply namespace our models under that shared namespace. And this could actually become a gem if you have to ship it to external libraries. But I would also say that if you find yourself sharing a lot, you're maybe not thinking about your domain clearly enough. Because there might be an actual thing that you wanna do, which I'm gonna talk about next. It's when you have one model that actually needs to belong in two domains. Sometimes you have a concept that just has to be broken up. And how can you get these concepts codified within their respective domains? Now there's something called an anti-corruption layer, which I'm gonna introduce, which is simply an adapter that maps a concept from the outside world into a concept that we can use within our domain. So here's an example, so remember that trip that I introduced earlier on? Well we know that it actually has the semantics for two domains. So what if I introduced a nice, very expressive domain model for the routing context here? So over here that's a really beautiful domain model that has language that really reads and flows nicely and matches the business domain. And what I'm gonna do is I'm gonna make an adapter that simply maps us from that legacy data model into our internal data model. So there's simply a mapping function that just maps things together, and then it converts and instantiates our pure domain model internally. And then we're gonna add a repository in which we simply grab the thing from the outside and then we instantiate an adapter which converts us to the internal world. And now internal domain code is gonna call the repository instead of directly reaching for the outside world. Let's talk about what happens next. So one would imagine that you can apply this incrementally. So the beauty of this is that you can apply one or a few or maybe all of the refactoring patterns.
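The anti-corruption layer described above (a legacy model, an adapter that maps it, and a repository that internal code calls) might be sketched like this. Everything named here is hypothetical: `LegacyTrip` and its fields, the `Itinerary` domain model, and the hash standing in for the legacy data source are assumptions chosen so the example runs without Rails.

```ruby
# Stand-in for the legacy, outside-world trip (an Active Record model in the talk).
LegacyTrip = Struct.new(:id, :pickup_ts, :dropoff_ts, :cost_metric)

module Routing
  # Hypothetical internal domain model using the routing team's own language.
  class Itinerary
    attr_reader :trip_id, :duration_in_seconds, :efficiency

    def initialize(trip_id:, duration_in_seconds:, efficiency:)
      @trip_id = trip_id
      @duration_in_seconds = duration_in_seconds
      @efficiency = efficiency
    end
  end

  # Adapter: the only place that knows the legacy field names.
  class TripAdapter
    def initialize(legacy_trip)
      @legacy_trip = legacy_trip
    end

    def to_itinerary
      Itinerary.new(
        trip_id: @legacy_trip.id,
        duration_in_seconds: @legacy_trip.dropoff_ts - @legacy_trip.pickup_ts,
        efficiency: @legacy_trip.cost_metric
      )
    end
  end

  # Repository: fetches from the outside and returns the internal model.
  class ItineraryRepository
    # legacy_source is an assumed in-memory hash of id => LegacyTrip.
    def initialize(legacy_source)
      @legacy_source = legacy_source
    end

    def find(id)
      TripAdapter.new(@legacy_source.fetch(id)).to_itinerary
    end
  end
end

legacy_source = { 1 => LegacyTrip.new(1, 1_000, 1_600, 3) }
puts Routing::ItineraryRepository.new(legacy_source).find(1).duration_in_seconds
```

If the legacy model later changes a column name, only `TripAdapter` has to change; code inside the routing context keeps talking to `Itinerary`.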
And you might move things first into incremental domain oriented folders, so you might simply pick and choose, move things and drop them into folders. Next what you could do is you could turn those folders into Rails engines, so they're actually self contained applications. And then next you can move these actually into their own Rails services. And then finally you can move them to whatever language or whatever you wanna use of your choice. And the idea here is that actually as you continue decoupling these things, you allow your systems and your teams to scale because they're gonna actually be able to run in isolation from each other, slowly but surely. Okay I wanna throw up a few caveats because this happens to be very true. DDD will work very well for you if one, you have a complex domain. You need the linguistic precision. You find yourself really caught up or tripping up over words or names or meanings, or maybe your business is just very complicated, you have a lot of regulation in your domain or something, there's just a lot of nuances to your domain. Second of all, you might work in a very large team, you just might have a massive team working on a massive codebase. So this might actually help you isolate your systems well. Third, you're open to experimentation, you have buy-in from your product owner. And fourth, your whole team is willing to try it out. Including other teams. So if you're the lone wolf that wants to kinda slide this under the door, this is not gonna work. You need to have buy-in, you need to have agreement that hey, we're gonna try out this new architecture and we're gonna try out a few of these refactoring steps. How does it feel? If it doesn't feel good maybe you need to have second thoughts. Or if somebody's really against it, this may not work. This is in response to a presentation I gave earlier, there was a conversation on Twitter, and somebody said, Hey you're just doing Java.
And then I had this thought like oh my god like are we becoming the thing we hate? Just kidding, I actually have a lot of love for Java. But I would say that in Ruby and Rails we often times have this search, we're on this quest for simplicity. But that may not necessarily be the best thing in every case. Because in domains where there's some necessary and essential complexity, maybe what we need to be searching for is clarity. So with that in mind, I wanna share with you some things to be on the watch for. If it doesn't feel right. When do you stop? So hopefully you've been applying these incremental refactorings but at a certain point you might feel like wait I feel like I'm Overdesigning, I feel like there's a little too much going on here or it feels like it's kinda silly just to make this thing do that thing and then plug this thing into here. Second of all you might feel like maintaining these abstractions is kind of a burden, you're just, you're just like doing things just for the sake of following the patterns or the book. It actually might be okay to simplify things. To instead of creating a pure domain, instead of creating a service that does this to an adapter that does this to this other repository, you might be able to get away by smashing those three things together into one object and then calling it a day. It's okay. And then finally if other teams are silently grumbling or they're not, they're, they're very obviously grumbling about it then maybe it's time to stop and have a conversation. It's okay, you don't have to follow this by the book, once again. The beauty of this once again is that we incrementally refactor, we incrementally apply things. So in summary we discovered the domains in our business and we developed a shared language with our business. We built a context map and so we saw some strategic insights and we saw the lay of the land. 
And then finally we found some refactoring patterns and some organization strategies to help us organize our codebases to hopefully get us to that next step so we can build out systems that will really scale. And with that I wanna thank RailsConf for inviting me here, here's all my contact information. If you'd like to talk, I'd love to talk with you afterwards down here. Thank you very much. (audience applauds)

Info

Channel: Confreaks

Views: 8,886

Id: 52qChRS4M0Y

Length: 37min 18sec (2238 seconds)

Published: Wed May 10 2017